From shared contexts to let blocks, we have lots of options for DRYing out our tests in the RSpec world. But how far should we take it? The appeal comes from our urge for clever perfection, but I don't think that urge always serves your test suite.

This post is about focusing on what exactly is tested rather than obsessing over how it's tested, because that obsession frequently yields duplication and inefficiency.

A recent hefty refactor reminded me that knowing when, and when not, to DRY up pieces of your tests can make them an order of magnitude easier to change. I'll point to (1) a good chunk of experience in both the DRY and non-DRY camps, (2) the opinions of my current and former colleagues, and (3) my own research around the net (e.g. Joe Ferris at thoughtbot) to conclude that, generally, we shouldn't be concerned about allowing some duplication.

Following complex code paths can make our brains melt. Simple, readable tests bring us back to the moment, reminding us of the nuances of our implementation. Lean too far into the DRY camp when writing them, and you're right back at square one.

This article is geared toward small-to-medium size applications. At enterprise scale, testing is a whole different beast and quite probably its own department.

Ease of reading & refactoring

Over at xUnit Patterns, Gerard Meszaros identifies the concept of the "mystery guest":

When either the fixture setup and/or the result verification part of a test depends on information that is not visible within the test and the test reader finds it difficult to understand the behavior that is being verified without first having to find and inspect the external information, we have a Mystery Guest on our hands.

Simply put, hunting for a piece of a test is not fun. For instance, what if you came across this in a review?

include_context 'creates a car', 'hubcaps' do
  before do
    expect(VehicleSettings).to receive(:get).with(:wheel_size) { :r15 }
    expect(VehicleSettings).to receive(:get).with(:hubcap_type) { :standard }
  end
  let(:expected_value) { 42 }
end

You have to find the shared_context and follow not only what's passed in, but also which method expectations are overridden in the block, and how expected_value is used.

When we have to jump between references in several places, in and out of the current file, our brain gears turn extra hard to fully understand the example. It makes much more sense to centralize a test's dependencies and make an example's inputs as clear as possible.
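
For contrast, here's a hypothetical inlined version of the same test (the factory trait and the computed_value method are assumptions for illustration). Every input is visible inside the example itself:

it 'computes the value for a standard-hubcap car' do
  # all stubs and data live right here, not in a distant shared_context
  allow(VehicleSettings).to receive(:get).with(:wheel_size).and_return(:r15)
  allow(VehicleSettings).to receive(:get).with(:hubcap_type).and_return(:standard)
  car = create(:car, :with_hubcaps) # hypothetical factory trait

  expect(car.computed_value).to eq 42 # hypothetical method standing in for expected_value
end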

Efficiency of setup data

But centralizing dependencies could lead to repetitive setup, you may say. So you abstract away the setup and share it between two examples, which works great until just one of your objects needs a slight change. No problem, just tweak it:

it 'does something' do
  subject.reload
  subject.update!(another_attribute: 'foo')

  expect(subject.something_type).to eq 'red'
end

Not really. Last-minute alterations like this are commonly the result of overly DRY tests. It's better to define what the test requires upfront, in its own let block or test variable; otherwise, it's easy to create extraneous data. An abstracted set of data may provide the subset a test actually requires, but it usually drags along lots of unrelated baggage too.
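
A minimal sketch of the alternative, reusing the names from the snippet above (the factory name is hypothetical); the example builds exactly what it needs, upfront:

it 'does something' do
  # no reload or last-minute update! required; the record is built
  # with exactly what this example needs
  record = create(:some_model, another_attribute: 'foo')

  expect(record.something_type).to eq 'red'
end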

Take this:

describe 'properties of a car' do
  let(:car) { create(:car) }

  it 'has wheels' do
    expect(car.wheels).to be_present
  end

  it 'has an engine' do
    expect(car.engine).to be_present
  end

  it 'has doors' do
    expect(car.doors).to be_present
  end
end

Idiomatic RSpec at first glance, but suppose that the :car factory pulls in lots of dependencies. The way this is written, that expensive call will happen three times. Contrast with:

describe 'properties of a car' do
  let(:car) { create(:car) }

  it 'has wheels, an engine, and doors' do
    expect(car.wheels).to be_present
    expect(car.engine).to be_present
    expect(car.doors).to be_present
  end
end

This builds the test data only once (and is fewer lines of code!). Anyone who has spent considerable time with factories knows they require continuous optimization effort. By their nature, they grow heavy and bloated as you add more associations and callbacks.

Consolidate examples that use heavy factories when their assertions don't require fresh state.

Aim to create the minimal amount of test data for each particular example in its own right. Not only should anyone be able to easily understand the setup, but your tests should be as efficient as possible. One extra UPDATE may seem innocent, but it carries a hidden cost that's too easy to ignore. Even a slight performance win is worth it, because those savings compound over many runs. Your test suite stands between you and shipping your code; don't let it consume more resources than it needs.
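
As one illustration, assuming FactoryBot and a hypothetical display_name method: when an assertion doesn't need persisted state, build_stubbed skips the database entirely.

it 'formats the display name' do
  # build_stubbed returns an in-memory object with a stubbed id: no INSERTs, no UPDATEs
  car = build_stubbed(:car, make: 'Honda', model: 'Civic')

  expect(car.display_name).to eq 'Honda Civic'
end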

Running a single shared example

At the end of a dev cycle, we run the test that covers the corresponding changes. If it's green, we step back and pull in more tests until the whole suite is green.

You might think extracting multiple examples via shared_examples makes sense sometimes, for example when testing a success and a failure case under several conditions.

RSpec.shared_examples "foo" do
  it "does this and that" do
    expect(something).to eq(:bar)
  end

  it "does another thing" do
    expect(something).to eq(:baz)
  end
end

describe "FoosController" do
  it_behaves_like "foo"
end

$ bin/rspec spec/controllers/foos_controller_spec.rb:2

  1) Big Failure #1
     Failure/Error: one serious failure
     # spec/controllers/foos_controller_spec.rb:79
     # -e:1:in `<main>'

  2) Big Failure #2
     Failure/Error: ah maaan, another failure
     # spec/controllers/foos_controller_spec.rb:79
     # -e:1:in `<main>'

2/2 |====================== 100 =======================>| Time: 00:00:04

Finished in 4.14 seconds (files took 0.27248 seconds to load)
2 examples, 2 failures

$ wtf

Whether through your tooling or a direct command, the spec runner targets a line number. Point it at a shared_examples include and you'll run every example in the set at once. The noise created when some pass and some fail is distracting and confusing when you're working against a single point of failure.

Duplicated testing

Nobody wants to over-test. A particular code path is either tested or not; we shouldn't cover it twice just in the name of "better safe than sorry."

Service Mock

In my experience, an overlap in test coverage often nullifies the DRY question. The classic example is DRYed setup data shared between the test for a skinny controller and the test for the service that controller calls, when in reality only the service needs the data; the controller test needs to do nothing but verify the right methods are called.
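
A minimal sketch of that controller spec, assuming a hypothetical CarCreationService; note that it needs no setup data at all:

describe 'POST #create' do
  it 'delegates to the service' do
    # the service's own spec covers the data-heavy cases
    expect(CarCreationService).to receive(:call)

    post :create, params: { car: { make: 'Honda' } }
  end
end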

If you have a few large blocks of repeated code, probe through to form an understanding of what exactly is covered. Then, try carving away the excess to tailor each bit of setup to the actual requirements of the test.

Documentation

An added bonus of more verbose specs is that they can serve as documentation. Remember that the word "spec" stands for specification. Your colleagues can easily read your tests to determine cause and effect, but only if you utilize RSpec's remarkably human-readable DSL. Following a code path through too many levels of abstraction is about as useful as reading the actual implementation.

General Tips

  • Write test examples in a visually grouped 3-section pattern of (1) creating setup data, (2) calling the code you're testing, and (3) asserting against the result (see the sketch after this list).

  • Avoid DRYing up the actual call to the code under test (the 2nd section) as much as possible. By nature it is usually short and uncomplicated, and nobody should have to hunt for it; keeping it inline is a great way for others to quickly learn how to call it.

  • Extract out items that have to do with environmental setup, such as current user authentication or anything that makes an HTTP request. Look to shared_context or simple spec helpers.

  • Try to define POST data for every example in a controller test, and employ FactoryBot.attributes_for to reduce duplication. If you must extract a large amount of POST data, do so in a modular way.

  • Reserve subject for the result of some action, and keep each subject definition close to the examples that use it. subject should be idempotently callable.

  • Let identical assertions remain. They define the results of particular test cases, and are meaningful in their own right. Changing the assertions of one shouldn't imply a change in those of the other.
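
For the first tip, here's a minimal sketch of the three-section pattern (the car behavior is an assumption for illustration):

it 'marks the car as sold' do
  car = create(:car)      # 1. create setup data

  car.sell!               # 2. call the code under test

  expect(car).to be_sold  # 3. assert against the result
end

And for the last tip, compare the two approaches below.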

Bad:

describe 'token verification' do
  shared_examples 'invalid token' do |expected_error|
    it 'responds with the expected error' do
      subject # triggers the request defined below

      response_json = JSON.parse(response.body)

      expect(response).to be_forbidden
      expect(response_json.fetch("error")).to eq expected_error
    end
  end

  subject { get :test }

  context 'missing header' do
    include_examples 'invalid token', 'Authorization header not found'
  end

  context 'bad header' do
    before { request.headers.add('Authorization', 'badvalue') }
    include_examples 'invalid token', 'Incorrect authorization token'
  end
end

Good:

describe 'token verification' do
  let(:response_json) { JSON.parse(response.body) }

  it 'requires an authorization header' do
    get :test
    expect(response).to be_forbidden
    expect(response_json.fetch("error")).to eq 'Authorization header not found'
  end

  it 'returns an error with a bad authorization token' do
    request.headers.add('Authorization', 'badvalue')

    get :test
    expect(response).to be_forbidden
    expect(response_json.fetch("error")).to eq 'Incorrect authorization token'
  end
end

I'll leave you with a quote from Sandi Metz:

DRYing out ‘concepts’ exposes sameness and enables reuse; DRYing ‘incidental duplication’ lies about sameness and impedes change.

If you have read nothing else, this is the takeaway: it's really about creating the right abstraction.

For example, factory definitions are DRYed concepts. Lots of thought has gone into the design of FactoryBot's API, as well as into your application's own factory definitions. That's what the correct abstraction looks like.
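
A sketch of what that looks like, with assumed attributes: the concept of a valid car lives in exactly one place, and traits name meaningful variations.

FactoryBot.define do
  factory :car do
    make  { 'Honda' }
    model { 'Civic' }

    # a trait names a concept, not an incidental pile of attributes
    trait :with_hubcaps do
      hubcap_type { :standard }
    end
  end
end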

I don't claim to have the exact formula for producing it, because there isn't one. It's the art side of the science. Just keep asking yourself whether you're writing something truly understandable by your teammates, or merely DRYing up incidental duplication.

Otherwise, I'm betting on lots of needless pain when the next substantial change arrives.
