Test Doubles Overview
- Stand-ins for real implementations in tests (like stunt doubles)
- Three techniques for using them: faking, stubbing, interaction testing
- Google learned the hard way: overusing mocking frameworks produces tests that need constant maintenance yet rarely find bugs
- Practices vary widely across Google teams
Making Code Testable
- Testability requires upfront investment (harder to retrofit later)
- Use dependency injection to create "seams" for test doubles
- Mocking frameworks reduce boilerplate but come with major caveats
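
A minimal sketch of a dependency-injection seam, assuming a payment example. All names here (CreditCardService, RecordingCreditCardService, PaymentProcessor) are hypothetical and chosen only for illustration. The point is that the dependency arrives through the constructor, so a test can pass in a double without touching production code.

```python
class CreditCardService:
    """Hypothetical production dependency; a real one would call a remote API."""

    def charge(self, card_number: str, amount_cents: int) -> bool:
        raise NotImplementedError("would talk to a real payment backend")


class RecordingCreditCardService(CreditCardService):
    """Hypothetical test double: records charges instead of sending them."""

    def __init__(self):
        self.charges = []

    def charge(self, card_number: str, amount_cents: int) -> bool:
        self.charges.append((card_number, amount_cents))
        return True


class PaymentProcessor:
    """Code under test. The dependency is injected, which creates the seam."""

    def __init__(self, credit_card_service: CreditCardService):
        self._service = credit_card_service

    def make_payment(self, card_number: str, amount_cents: int) -> bool:
        # Reject non-positive amounts before touching the dependency.
        if amount_cents <= 0:
            return False
        return self._service.charge(card_number, amount_cents)
```

Had PaymentProcessor constructed its own CreditCardService internally, a test would have no way to substitute the double.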
Prefer Real Implementations
- First choice: use real implementations (same as production)
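- "Classical testing" (prefer real implementations) vs "mockist testing" (prefer mocking frameworks) - Google found classical scales better
- Real implementations give confidence that the system works; test doubles isolate the code under test but don't prove it works with its real dependencies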
- Use real implementations when: fast, deterministic, simple dependencies
- Trade-offs to consider: execution time, determinism/flakiness, dependency construction complexity
Fakes: The Best Test Double
- Lightweight implementation behaving like real thing (e.g., in-memory database)
- Single fake can radically improve testing experience across organization
- Must maintain fidelity to the real implementation's API contract (same inputs → same outputs and same state changes)
- Fakes need their own tests (contract tests against real implementation)
- Team owning real implementation should own the fake
- If no fake exists: ask owners to create one, write your own (wrap all calls to the API in a single class, then fake that class), or use real implementation
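
A rough sketch of a fake with a contract test, assuming a small file-system wrapper. DiskFileSystem, InMemoryFileSystem, and check_file_system_contract are hypothetical names, not an existing API. The same contract check runs against both the real implementation and the fake, which is one way to keep the fake's fidelity from drifting.

```python
import tempfile
from pathlib import Path


class DiskFileSystem:
    """Hypothetical real implementation: stores files under a root directory."""

    def __init__(self, root):
        self._root = Path(root)

    def write(self, path, contents):
        (self._root / path).write_text(contents)

    def read(self, path):
        # Raises FileNotFoundError if the file does not exist.
        return (self._root / path).read_text()


class InMemoryFileSystem:
    """Hypothetical fake: same contract, but backed by a dict instead of disk."""

    def __init__(self):
        self._files = {}

    def write(self, path, contents):
        self._files[path] = contents

    def read(self, path):
        if path not in self._files:
            raise FileNotFoundError(path)
        return self._files[path]


def check_file_system_contract(fs):
    """Contract test: must pass for the real implementation and the fake."""
    fs.write("a.txt", "hello")
    assert fs.read("a.txt") == "hello"

    # Overwriting replaces the previous contents.
    fs.write("a.txt", "updated")
    assert fs.read("a.txt") == "updated"

    # Reading a missing file fails the same way in both implementations.
    try:
        fs.read("missing.txt")
    except FileNotFoundError:
        pass
    else:
        raise AssertionError("expected FileNotFoundError")


def run_contract_tests():
    with tempfile.TemporaryDirectory() as root:
        check_file_system_contract(DiskFileSystem(root))
    check_file_system_contract(InMemoryFileSystem())
```

Note that the fake can store state across calls, which is something a stub cannot do.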
Stubbing: Use Sparingly
- Quick way to hardcode return values inline
- Dangers: tests become unclear (extra code obscures intent), brittle (leaks implementation details), less effective (no fidelity guarantee, can't store state)
- Appropriate use: when you need specific return value to reach certain state, and each stub directly relates to assertions
- Still prefer fakes/real implementations even when stubbing seems appropriate
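
A small sketch of stubbing with Python's unittest.mock, assuming a hypothetical AccessManager that depends on a user service with a get_user method. The single stubbed return value exists only to put the system into the state the assertion checks, which is the appropriate use described above.

```python
from unittest import mock


class AccessManager:
    """Hypothetical code under test; user_service is an injected dependency."""

    def __init__(self, user_service):
        self._users = user_service

    def can_access(self, user_id):
        user = self._users.get_user(user_id)
        return user is not None and user["active"]


def test_inactive_user_is_denied():
    user_service = mock.Mock()
    # Stub: hardcode the return value needed to reach the "inactive" state.
    user_service.get_user.return_value = {"id": 7, "active": False}

    # The assertion relates directly to the stubbed value.
    assert AccessManager(user_service).can_access(7) is False


def test_unknown_user_is_denied():
    user_service = mock.Mock()
    user_service.get_user.return_value = None

    assert AccessManager(user_service).can_access(99) is False
```

Nothing guarantees that the real user service ever returns data in this shape, which is the fidelity gap that makes stubs less effective than fakes.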
Interaction Testing: Avoid When Possible
- Validates how a function is called (whether, how many times, with which arguments) without executing its real implementation
- Problems: can't prove system works (only that functions were called), exposes implementation details ("change-detector tests")
- Appropriate when: can't do state testing (no real implementation/fake), or call count/order matters (e.g., caching)
- Best practices: only for state-changing functions (sendEmail, not getUser), avoid overspecification (use any() for irrelevant args)
- Not a replacement for state testing - supplement with integration tests
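
A brief sketch of an interaction test with unittest.mock, assuming a hypothetical WelcomeNotifier and a mailer with a send_email method. It verifies a state-changing call and uses mock.ANY (Python's counterpart to any()) for the argument the test does not care about, to avoid overspecification.

```python
from unittest import mock


class WelcomeNotifier:
    """Hypothetical code under test; mailer is an injected dependency."""

    def __init__(self, mailer):
        self._mailer = mailer

    def welcome(self, address, name):
        body = f"Hi {name}, thanks for signing up."
        self._mailer.send_email(address, "Welcome", body)


def test_welcome_sends_one_email():
    mailer = mock.Mock()

    WelcomeNotifier(mailer).welcome("ada@example.com", "Ada")

    # Verify the state-changing call. The exact body text is irrelevant to
    # this test, so mock.ANY keeps the test from breaking on wording changes.
    mailer.send_email.assert_called_once_with(
        "ada@example.com", "Welcome", mock.ANY
    )
```

This test shows only that send_email was called as expected, not that an email would actually be delivered; a larger-scope test is still needed for that.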
Key Principles
- Order of preference: real implementations > fakes > stubbing > interaction testing
- Test behavior through state, not through validating internal calls
- No exact answers - engineer judgment and trade-offs required
- Eventually need larger-scope tests to exercise real dependencies