Better Specs: Testing Guidelines for Developers

September 3, 2025

A healthy codebase relies on the robust foundation of a comprehensive test suite. Large, evolving codebases—those that grow over years of development with countless features—present unique challenges in maintaining code integrity and ensuring smooth evolution. A well-crafted test suite is the cornerstone of this effort, acting as a safety net against regressions and a guide for future modifications. However, building and maintaining such a suite requires careful attention to several key principles.

Reliability / Reproducibility

Flaky tests—those that sometimes pass and sometimes fail without any code changes—are a significant problem in software development. They undermine confidence in the test suite and can mask real issues. The primary goal is to ensure that tests yield consistent and predictable results.

Avoid using .last, .first, or .count

When tests run in parallel and share the same database, these methods can cause race conditions and unpredictable results because they depend on table-wide state (row order or row count) that other tests can change at any moment.

❌ Don't

it "creates a user record" do
  User.create(name: "test")
  # PROBLEM: This could return ANY user from ANY test
  user = User.last # Flaky
  # If another test creates a user between User.create and User.last,
  # this test fails unpredictably
  assert_equal "test", user.name
end

Why this is bad:

  • Race conditions: In parallel test execution, another test might create a user between your User.create and User.last calls
  • Database state dependency: The test relies on the assumption that your user will be the "last" one, which is never guaranteed
  • Non-deterministic behavior: The same test code can produce different results depending on timing and other tests running concurrently
  • Difficult debugging: When this test fails, it's hard to determine why because the failure depends on external factors

✅ Do

it "creates a user record" do
  user = User.create(name: "test")
  # SOLUTION: Test the exact object you created
  # This is deterministic and isolated from other tests
  assert_equal "test", user.name
end

Why this is good:

  • Direct reference: You're testing the exact object you created, not searching for it in the database
  • No race conditions: Other tests can't interfere with your specific user object
  • Predictable: The test will always behave the same way, regardless of other tests
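To make the race concrete, here is a minimal plain-Ruby sketch. FakeUserTable is a hypothetical in-memory stand-in for a users table shared by parallel tests (it is not ActiveRecord), and the second create call simulates another test writing between our insert and our lookup.

```ruby
# Hypothetical in-memory stand-in for a users table shared by parallel tests.
class FakeUserTable
  Row = Struct.new(:id, :name)

  def initialize
    @rows = []
  end

  # Appends a row and returns it, like a simplified User.create.
  def create(name:)
    row = Row.new(@rows.size + 1, name)
    @rows << row
    row
  end

  # Returns whichever row was inserted most recently, by any caller.
  def last
    @rows.last
  end
end

table = FakeUserTable.new

mine = table.create(name: "test")   # our test inserts its row
table.create(name: "someone else")  # another test inserts before we read

looked_up = table.last              # picks up the other test's row

puts looked_up.name # prints "someone else"
puts mine.name      # prints "test"
```

The direct reference (mine) is unaffected by the interleaved write, while the lookup through last depends entirely on who wrote most recently.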

Time Constraints

Time-based logic creates tests that pass or fail depending on when they are run, making them unreliable. If your logic relies on time, it's better to freeze time to create predictable conditions.

❌ Don't

it "returns age" do
  user = Fabricate(:user, birth_date: 29.years.ago)
  # PROBLEM: The birth date moves with the clock, so every run lands
  # exactly on the birthday boundary
  assert_equal 29, user.age # Can flip around midnight or across time zones
end

Why this is bad:

  • Birthday boundary issues: 29.years.ago is recalculated on every run, so the test always executes on the user's 29th birthday, exactly where an off-by-one in the date comparison flips the result
  • Non-reproducible input: The birth date is different on every run, so a failure seen in CI cannot be replayed later with the same data
  • Time zone issues: Different server time zones can affect age calculations

✅ Do

it "returns age" do
  user = Fabricate(:user, birth_date: "2000-01-01")
  # SOLUTION: Freeze time to create predictable, repeatable conditions
  Timecop.freeze(Time.local(2024, 12, 31, 0, 0, 0)) do
    assert_equal 24, user.age
  end
  Timecop.freeze(Time.local(2025, 1, 1, 0, 0, 0)) do
    assert_equal 25, user.age
  end
end

Why this is good:

  • Controlled time: Timecop.freeze eliminates time-based variability
  • Edge case testing: You can test specific scenarios, like the transition over a birthday
  • Future-proof: The test will work the same way regardless of when you run it
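Freezing time works because age is a function of two inputs: the birth date and "today". The sketch below makes that explicit with a hypothetical age_on helper. The article does not show how User#age is implemented, so treat this as one plausible implementation, not the actual one.

```ruby
require "date"

# Hypothetical helper: whole years elapsed between birth_date and today.
def age_on(birth_date, today)
  years = today.year - birth_date.year
  # Compare [month, day] pairs to see whether this year's birthday has arrived.
  had_birthday = ([today.month, today.day] <=> [birth_date.month, birth_date.day]) >= 0
  had_birthday ? years : years - 1
end

birth_date = Date.new(2000, 1, 1)

day_before = age_on(birth_date, Date.new(2024, 12, 31))
on_birthday = age_on(birth_date, Date.new(2025, 1, 1))

puts day_before  # prints 24
puts on_birthday # prints 25
```

With both inputs fixed, the result cannot drift. Timecop achieves the same effect for code that reads the clock internally instead of taking the date as an argument.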

Test Coverage

Incomplete test coverage leaves edge cases untested, leading to production bugs. Test cases should cover expected successes ("happy path"), expected failures, and edge cases.

✅ Do

describe "#withdraw" do
  let(:balance) { Fabricate(:balance, amount: 1000) }
  subject { balance.withdraw(amount: withdrawal) }

  # GOOD: Testing the happy path
  context "when withdrawing an amount within the balance" do
    let(:withdrawal) { 1000 }
    it "withdraws the money" do
      expect { subject }.to change { balance.amount }.by(-1000)
    end
  end

  # GOOD: Testing expected failure scenarios
  context "when withdrawing more than the remaining balance" do
    let(:withdrawal) { 2000 }
    it "raises an error" do
      expect { subject }.to raise_error(ActiveRecord::RecordInvalid)
    end
  end

  # GOOD: Testing edge cases and input validation
  context "when the withdrawal amount is negative" do
    let(:withdrawal) { -1000 }
    it "raises an error" do
      expect { subject }.to raise_error(ActiveRecord::RecordInvalid)
    end
  end
end

Why this is good:

  • Happy path testing: Verifies the method works correctly under normal conditions
  • Boundary testing: Tests what happens when you try to withdraw more than is available
  • Input validation: Ensures invalid inputs, such as negative amounts, are properly rejected
  • Error documentation: Shows exactly what errors are expected and when
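It can help to see which guard in the implementation each test case pins down. The sketch below is a hypothetical plain-Ruby stand-in for the Balance model: SketchBalance, InvalidWithdrawal, and the rejected? helper are illustrative names, and the real model presumably raises ActiveRecord::RecordInvalid through validations instead.

```ruby
# Hypothetical plain-Ruby stand-in for the Balance model.
class SketchBalance
  class InvalidWithdrawal < StandardError; end

  attr_reader :amount

  def initialize(amount)
    @amount = amount
  end

  def withdraw(amount:)
    # Guard exercised by the "negative amount" test.
    raise InvalidWithdrawal, "amount must be positive" unless amount.positive?
    # Guard exercised by the "more than the remaining balance" test.
    raise InvalidWithdrawal, "insufficient funds" if amount > @amount

    # Path exercised by the happy-path test.
    @amount -= amount
  end
end

# Returns true when the withdrawal is rejected, false when it succeeds.
def rejected?(balance, amount)
  balance.withdraw(amount: amount)
  false
rescue SketchBalance::InvalidWithdrawal
  true
end

puts rejected?(SketchBalance.new(1000), 1000)  # prints false
puts rejected?(SketchBalance.new(1000), 2000)  # prints true
puts rejected?(SketchBalance.new(1000), -1000) # prints true
```

Deleting either guard leaves the happy-path check green, which is why the failure and edge cases need tests of their own.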

Readability / Clarity

Poorly structured tests are difficult to understand, maintain, and debug. When a test fails, developers should immediately understand what functionality is broken and under what conditions. Use describe, context, and it blocks to give every test an explicit subject, condition, and expected outcome:

  • describe: Groups related tests and identifies what is being tested (e.g., #instance_method or .class_method)
  • context: Describes the specific conditions or state for a test (e.g., "when something specific happens...")
  • it: Describes the expected outcome or behavior (e.g., "returns the expected value")

❌ Don't

class BalanceTest < ActiveSupport::TestCase
  let(:balance) { Fabricate(:balance, amount: 1000) }

  it "can withdraw" do # Vague description: withdraw what? how much?
    balance.withdraw(100)
    assert_equal 900, balance.amount
  end

  it "can deposit" do # No context about conditions or expected behavior
    balance.deposit(100)
    assert_equal 1100, balance.amount
  end

  # MAJOR PROBLEM: This description is way too long and unclear
  it "can not withdraw a greater amount than the balance amount and throw the error" do
    expect { balance.withdraw(2000) }.to raise_error(ActiveRecord::RecordInvalid)
  end
end

Why this is bad:

  • Unclear test scope: It's hard to tell what method is being tested in each case
  • Missing context: There is no indication of the conditions under which each test runs
  • Vague descriptions: "can withdraw" doesn't specify amounts, conditions, or expected results
  • Poor organization: Related tests for withdrawing are scattered instead of grouped logically

✅ Do

class BalanceTest < ActiveSupport::TestCase
  let(:balance) { Fabricate(:balance, amount: 1000) }

  # GOOD: Clearly identifies which method is being tested
  describe "#withdraw" do
    # GOOD: Specific context about the test conditions
    context "when withdrawal amount is less than the balance" do
      it "withdraws the money" do # GOOD: Clear, specific expectation
        balance.withdraw(100)
        assert_equal 900, balance.amount
      end
    end

    # GOOD: Different context for a different scenario
    context "when withdrawal is greater than the balance" do
      it "raises an error" do # GOOD: Clear about the expected behavior
        expect { balance.withdraw(2000) }.to raise_error(ActiveRecord::RecordInvalid)
      end
    end
  end

  # GOOD: Separate method gets its own describe block
  describe "#deposit" do
    context "when depositing money" do
      it "increases the balance amount" do
        balance.deposit(100)
        assert_equal 1100, balance.amount
      end
    end
  end
end

Why this is good:

  • Clear organization: Each method (#withdraw, #deposit) has its own describe block
  • Specific contexts: Each context block describes the exact test conditions
  • Easy debugging: Failed tests immediately tell you which method and scenario broke
  • Self-documenting: Anyone can read the test structure and understand the expected behavior of the code

Speed / Performance

Slow tests create a poor developer experience, leading to longer feedback cycles. Developers may skip running tests if they take too long, defeating their purpose.

Database I/O Operations

Unnecessary database operations dramatically slow down test execution. Pay attention to how you construct objects for tests; you may be spending too much time on DB operations.

❌ Don't

# This setup uses fixtures that must be saved to the database first
@transfers = [
  transfers(:visa_card_transfer),      # 1 DB lookup
  transfers(:generic_card_transfer),   # 1 DB lookup
]

@transfer_attempts = @transfers.map do |transfer|
  transfer.send(:generate_uid)
  transfer.save # 1 DB update
  transfer.bank_account.update(name: @long_bank_name, number: @short_bank_account) # 2 DB ops
  transfer.transfer_attempts.find_by(batch_transfer_id: nil) # 1 DB fetch
end # 4 DB operations per iteration * 2 items = 8 DB operations

# In total, this section performs 10 DB operations just to prepare test data

Why this is bad:

  • Excessive DB operations: 10 separate database calls just for test setup are very slow
  • Unnecessary persistence: Objects are saved to the database when in-memory objects would suffice
  • Brittle: Changes to the fixture data can break unrelated tests

✅ Do

# Fabricate.build constructs objects, including the associations below,
# in memory; nothing is written to the database unless you save explicitly.
# (A plain Fabricate(...) call would persist each record.)
@transfer1 = Fabricate.build(:transfer_visa, uid: "123") do
  bank_account { Fabricate.build(:bank_account, name: @long_bank_name, number: @short_bank_account) }
  transfer_attempts { [Fabricate.build(:transfer_attempt)] }
end

@transfer2 = Fabricate.build(:transfer_live_card, uid: "456") do
  bank_account { Fabricate.build(:bank_account, name: @long_bank_name, number: @short_bank_account) }
  transfer_attempts { [Fabricate.build(:transfer_attempt)] }
end

@transfers = [@transfer1, @transfer2]

# This approach avoids the DB operations of the fixture-based setup

Why this is good:

  • Fewer DB operations: Fabricate.build creates objects in memory when persistence isn't needed, which is much faster than saving every record
  • Self-contained: Each test creates exactly the data it needs without external dependencies
  • Flexible: It's easy to customize objects for specific test scenarios
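The cost difference between building and persisting can be shown with a small counter. CountingRecord and masked_number below are hypothetical plain-Ruby stand-ins (not the Fabrication or ActiveRecord API): build only allocates an object, while create also performs one simulated write.

```ruby
# Hypothetical record class that counts simulated database writes.
class CountingRecord
  @writes = 0

  class << self
    attr_accessor :writes

    # Builds the object in memory only.
    def build(**attrs)
      new(**attrs)
    end

    # Builds the object and persists it, costing one simulated write.
    def create(**attrs)
      build(**attrs).tap(&:save)
    end
  end

  attr_reader :attrs

  def initialize(**attrs)
    @attrs = attrs
  end

  def save
    self.class.writes += 1
    true
  end
end

# Logic under test: it only reads attributes, so it never needs a saved record.
def masked_number(record)
  "****" + record.attrs.fetch(:number)[-4, 4]
end

built = CountingRecord.build(number: "12345678")
masked = masked_number(built)
writes_after_build = CountingRecord.writes

CountingRecord.create(number: "12345678")
writes_after_create = CountingRecord.writes

puts masked              # prints "****5678"
puts writes_after_build  # prints 0
puts writes_after_create # prints 1
```

When the code under test only reads attributes, as masked_number does, the in-memory object gives the same result with zero writes.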

Mock Out External Dependencies

A good test should focus on one concern at a time. Testing multiple things at once (e.g., an API response and an email delivery) makes the test slow, brittle, and unclear.

❌ Don't

describe "#create" do
  it "creates a user and delivers a welcome email" do
    # PROBLEM: Forces synchronous email processing, which is slow
    Sidekiq::Testing.inline! do
      api_post(:create, params: { name: "John" })
    end

    assert_response 200

    # PROBLEM: Testing the implementation details of email delivery,
    # which is an external dependency
    welcome_email = ActionMailer::Base.deliveries.last
    assert_equal "Welcome John", welcome_email.subject
  end
end

Why this is bad:

  • Multiple concerns: This tests both user creation AND email delivery in one test
  • Slow execution: Sidekiq::Testing.inline! processes the background job synchronously, adding significant time
  • Brittle: If the email template changes, this API test could break, even if the API itself is working correctly
  • Unclear failures: If the test fails, is it the API's fault or the email system's?

✅ Do

describe "#create" do
  subject do
    api_post(:create, params: { name: "John" })
  end

  # GOOD: Test focuses solely on the API endpoint behavior
  it "creates a user" do
    subject
    assert_response 200
  end

  # GOOD: Test verifies that the right job is queued with correct parameters.
  # The actual email functionality is tested separately in WelcomeMailer specs.
  it "queues a welcome email worker" do
    WelcomeMailer.expects(:perform_async).with(name: "John").once
    subject
  end
end

Why this is good:

  • Single responsibility: Each test verifies one specific behavior: one for the API response, one for queueing the job.
  • Fast execution: Mocking the mailer is instantaneous and avoids slow external operations.
  • Isolated: The API tests no longer depend on the email system's functionality.
  • Maintainable: Changes to the email content won't break the API tests.
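The same isolation can be sketched without a mocking library by injecting a hand-rolled fake. FakeWelcomeWorker and CreateUser below are hypothetical names used only for illustration. The fake records each perform_async call instead of enqueuing a job, so the test can check what was requested without running any email code.

```ruby
# Hypothetical stand-in for a background worker: records calls, enqueues nothing.
class FakeWelcomeWorker
  attr_reader :calls

  def initialize
    @calls = []
  end

  def perform_async(name:)
    @calls << { name: name }
  end
end

# Hypothetical service object; the worker is injected so tests can swap it.
class CreateUser
  def initialize(worker:)
    @worker = worker
  end

  def call(name:)
    user = { name: name } # stands in for persisting a user record
    @worker.perform_async(name: name)
    user
  end
end

worker = FakeWelcomeWorker.new
user = CreateUser.new(worker: worker).call(name: "John")

puts user.inspect
puts worker.calls.inspect
```

The check on worker.calls plays the role of the expects(...).once line above: it verifies the handoff to the worker, not what the worker does afterwards.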

Extensibility / Maintainability

Tests that are hard to extend become technical debt. Using a data-driven approach, your tests can become a clear specification by example.

✅ Do

class DivideTest < ActiveSupport::TestCase
  # GOOD: Comprehensive test cases that serve as a specification
  SAMPLE_CASES = [
    { dividend: 100,    divisor: 10,   result: 10 },   # Basic integer division
    { dividend: 100.0,  divisor: 10,   result: 10.0 }, # Float dividend
    { dividend: 5,      divisor: 2,    result: 2 },    # Integer division truncation
    { dividend: 5.0,    divisor: 2,    result: 2.5 },  # Float division precision
    { dividend: 0,      divisor: 10,   result: 0 },    # Zero dividend
    # And so on...
  ].freeze

  # GOOD: Data-driven tests that are easy to extend
  # NOTE: Math.divide is assumed to be a project-defined helper;
  # Ruby's built-in Math module has no divide method.
  SAMPLE_CASES.each do |test_case|
    it "returns #{test_case[:result]} when dividing #{test_case[:dividend]} by #{test_case[:divisor]}" do
      assert_equal test_case[:result], Math.divide(test_case[:dividend], test_case[:divisor])
    end
  end

  # GOOD: Explicit error case testing
  context "when divisor is zero" do
    it "raises a ZeroDivisionError" do
      assert_raises ZeroDivisionError do
        Math.divide(100, 0)
      end
    end
  end
end

Why this is good:

  • Comprehensive coverage: Multiple scenarios are tested systematically.
  • Easy to extend: Adding a new test case only requires adding a new hash to the SAMPLE_CASES array.
  • Self-documenting: The dynamic test names clearly describe the expected behavior for each set of inputs.
  • Maintainable: If the function's logic changes, you only need to update the results in one central place.
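Here is a runnable plain-Ruby version of the same table-driven idea. SketchMath.divide is a hypothetical helper standing in for the divide method under test, assumed here to wrap Ruby's / operator. The last row is an extra case added to show that extending the specification takes a single line.

```ruby
# Hypothetical helper standing in for the divide method under test.
module SketchMath
  def self.divide(dividend, divisor)
    dividend / divisor
  end
end

DIVIDE_CASES = [
  { dividend: 100,   divisor: 10, result: 10 },   # basic integer division
  { dividend: 100.0, divisor: 10, result: 10.0 }, # float dividend
  { dividend: 5,     divisor: 2,  result: 2 },    # integer division drops the remainder
  { dividend: 5.0,   divisor: 2,  result: 2.5 },  # float division keeps it
  { dividend: 0,     divisor: 10, result: 0 },    # zero dividend
  { dividend: -7,    divisor: 2,  result: -4 },   # added case: Ruby rounds toward negative infinity
].freeze

# Collect every case whose actual value or numeric type differs from the table.
failures = DIVIDE_CASES.reject do |c|
  actual = SketchMath.divide(c[:dividend], c[:divisor])
  actual == c[:result] && actual.class == c[:result].class
end

puts failures.empty? ? "all #{DIVIDE_CASES.size} cases pass" : failures.inspect
```

The class comparison is a deliberate choice in this sketch: 10 == 10.0 is true in Ruby, so a value-only check would not tell the integer row apart from the float row.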

Conclusion

Writing better tests isn't just about preventing bugs; it's about building a more professional and sustainable engineering practice. A great test suite is more than a safety net—it's a form of living documentation that enables developers to refactor with courage, add features with confidence, and onboard new team members with ease. By focusing on reliability, readability, and performance, you're not just improving your specs; you're investing in the long-term health and velocity of your entire codebase. Ship with confidence.

Learn More

For additional testing guidelines and best practices, visit Better Specs.