Fixing a State-Related Random RSpec Failure
Today I fixed a random test failure in RSpec. An Active Record callback was being fired despite an if: :bla
condition that returned false
, causing a not_to
method expectation to fail. Futhermore, the example passed when running the example directly. It was only when running the whole test suite that it failed, sometimes.
These types of failures are a huuuge pain in the ass, so I wanted to document my thought process and the steps I took to fix it, to hopefully get my future self and anyone reading this a few steps ahead next time it happens.
First thing we know: it’s random. What is random in our test suite? The first thing that pops to mind is the order of examples. There definitely could be other random pieces too. For example, at one point we randomly assigned made-up names to a title attribute on out of our factories. That caused a spec failure occasionally on a uniqueness constraint. We quickly did away with it. I had a pretty good hunch it was the order of examples, a good starting point anyway.
Based on that, I decided to use the
--seed
option of RSpec to try to reproduce the failure. I looked to one of our random failures and pulled the seed #, which determines the order of examples. Running the full 7-minute suite with the seed passed in, I saw the failure every time. Yess!Even though I reproduced it at this point, running the whole suite for every cycle of trying was too slow. The spec file in question was a model spec, so I tried running just model specs via
rspec spec/models/ --seed #
. A bit of a lucky break, but this still caused the failure every time. That reduced the cycle to only about 20 seconds, and further isolated the issue to a small subset of specs. Great!I then put in a
fail
call in thebla
method that determined if the callback ran (if: :bla
). Running the suite again, I saw noRuntimeError
, meaning thatfail
did not get called. This told me something else was calling the Active Record callback in question. Whaaaat? How could that be?Knowing that some spec executed before this one somehow messed with this callback, the only question was which one. I produced a list of the spec file ordering for that particular seed by running
rspec spec/models/ --seed # -f json > /tmp/out.json
, and then parsing through it via ruby:j = JSON.parse(File.read('/tmp/out.json')).with_indifferent_access j[:examples].map { |a| a[:file_path] }.uniq
Armed with the list of files, I opened each file, deleted the contents, and re-ran model specs only. It only took a few tries before the original spec passed as I looked first to specs could be related to the Active Record model in question, or specs that are big/complex. It turned out to be a spec helper that we'd written to temporarily skip this callback:
TheModel.skip_callback(:validation, :after, :the_callback) yield TheModel.set_callback(:validation, :after, :the_callback)
That's it! Any examples that test the_callback
will mostly fail after running this bit of a hackjob as set_callback
will redefine the callback without the original if: :bla
condition.
Happy friday!