```
╔═══════════════════════════════════════════════════════════════════╗
║  Red ───► Green ───► [TCR] ───► Refactor ───► [TCR] ───► Done     ║
║                                                                   ║
║  [TCR] = test ──┬── pass ──► commit ──► continue                  ║
║                 └── fail ──► REVERT ──► RETHINK                   ║
║                                                                   ║
║  REVERT means STOP. Don't fix in place.                           ║
║  RETHINK means try again with SMALLER STEPS OR try a NEW DESIGN   ║
╚═══════════════════════════════════════════════════════════════════╝
```
Controlled bursts. One test, one commit, no sprawl.
```
╔═══════════════════════════════════════════════════════════════════╗
║                                                                   ║
║  BURST ───► COMMIT ───► BURST ───► COMMIT ───► DONE               ║
║                                                                   ║
║  Each burst: one failing test → minimal code → commit             ║
║                                                                   ║
║  No burst = no code.                                              ║
║                                                                   ║
╚═══════════════════════════════════════════════════════════════════╝
```
You operate in controlled bursts.
Do not write large amounts of code at once. Each burst is one failing test → minimal passing code → commit. Stop. Wait. Next burst.
Like ketchup from a bottle: tap, controlled amount, stop. Never upend and pour.
The smallest unit of deliverable work. One test, one behavior, one commit.
Each burst must be:
- Independent — works on its own
- Valuable — adds real functionality
- Small — one test, one behavior
- Testable — single clear assertion
- Reviewable — the human operator can verify correctness at a glance
The constraint is scope, not time. Keep bursts small enough that:
- You stay focused on one thing
- The operator can verify quickly
- A revert loses minimal work
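A complete burst, end to end, might look like this (a minimal sketch using Vitest; `slugify` and its module are hypothetical):

```typescript
// slugify.test.ts (Red: one failing test, one behavior)
import { expect, it } from "vitest";
import { slugify } from "./slugify";

it("lowercases and hyphenates a title", () => {
  expect(slugify("Hello World")).toBe("hello-world");
});

// slugify.ts (Green: the minimal code that passes, nothing more)
export function slugify(title: string): string {
  return title.toLowerCase().replace(/\s+/g, "-");
}
```

Commit, stop, wait. The next burst gets its own failing test.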
Planning:
- Break the feature into bursts: independent, valuable, small, testable
- For each burst define: value added, code surface size (S/M/L), complexity (S/M/L)
- 1-2 line approach description per burst
- Create `ketchup-plan.md` with sections: TODO / DONE
- Commit the plan before any code
```markdown
# Ketchup Plan: [Feature Name]
## TODO
- [ ] Burst 1: [1-2 line description]
- [ ] Burst 2: [1-2 line description]
- [ ] Burst 3: [1-2 line description]
## DONE
- [x] (completed bursts move here with commit hash)
```

Execution:
- Write ONE failing test (Red)
- Write MINIMAL code to pass (Green)
- TCR: `test && commit || revert` (include plan update)
- Refactor if needed
- TCR: `test && commit || revert`
- Move burst to DONE in `ketchup-plan.md`, commit with TCR
- Next burst
TCR: Test && Commit || Revert
```
╔═══════════════════════════════════════════════════════════════════╗
║  TCR is not optional. It's how you operate.                       ║
║                                                                   ║
║  Pass → commit → continue                                         ║
║  Fail → REVERT → STOP → RETHINK                                   ║
╚═══════════════════════════════════════════════════════════════════╝
```
Why Revert Instead of Fix?
When a test fails, your instinct will be to patch the code until it passes. Do not do this.
Patching builds on broken foundations. Each fix adds complexity to compensate for a flawed approach. You end up with:
- Defensive code that handles symptoms, not causes
- Tangled logic that's hard to reason about
- Technical debt baked into the design
Revert forces a clean slate. The code disappears, but the learning remains. You now know:
- What approach didn't work
- What edge case you missed
- What assumption was wrong
Use that knowledge to design a better approach, not to patch a bad one.
The RETHINK Step:
After a revert, do not immediately retry the same approach. Stop and ask:
- Was the burst too big? → Split it smaller
- Was the design flawed? → Try a different approach
- Was the test wrong? → Clarify the requirement first
Only then write the next failing test.
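As an example of the first question, a burst that tried to pin down two behaviors at once can be split (a sketch; `parseOrder` and `validateOrder` are hypothetical):

```typescript
import { expect, it } from "vitest";
import { parseOrder, validateOrder } from "./order"; // hypothetical module

// ❌ Too big: one burst, two behaviors
it("parses and validates an order", () => {
  const order = parseOrder('{"qty": 2}');
  expect(order).toEqual({ qty: 2 });
  expect(validateOrder(order)).toEqual({ valid: true }); // second behavior
});

// ✅ Burst 1: parsing only
it("parses quantity from JSON", () => {
  expect(parseOrder('{"qty": 2}')).toEqual({ qty: 2 });
});

// ✅ Burst 2 (after burst 1 is committed): validation only
it("accepts orders with a positive quantity", () => {
  expect(validateOrder({ qty: 2 })).toEqual({ valid: true });
});
```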
TCR Command:
```sh
pnpm test --run && \
git add -A && git commit -m "<COMMITIZEN FORMAT>" || \
git checkout -- packages/<package>/
```

The Rule: No burst, no code. Every line of production code starts with a failing test in a planned burst.
All thresholds enforced at 100%. Tests drive the code.
| Do | Don't |
|---|---|
| Let tests drive all code | Write code without a failing test first |
| Add branches only when tested | Defensive `??`, `?:`, `if/else` untested |
| Test all error paths | Leave error handling unverified |
| Remove dead code after each run | Keep unused code "just in case" |
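A sketch of what "let tests drive branches" means in practice: every branch, including a `??` fallback, earns its place with a failing test (the `displayName` helper is hypothetical):

```typescript
import { expect, it } from "vitest";

// The ?? branch is only allowed because a test failed without it
function displayName(user: { name?: string }): string {
  return user.name ?? "anonymous";
}

it("shows the user's name when present", () => {
  expect(displayName({ name: "Alice" })).toBe("Alice");
});

it("falls back to 'anonymous' when name is missing", () => {
  expect(displayName({})).toBe("anonymous");
});
```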
NEVER Exclude Files to Dodge Coverage
```
╔═════════════════════════════════════════════════════════════════════╗
║  Excluding a file from coverage to avoid testing it is FORBIDDEN.   ║
║  "Hard to test" code (file system, dynamic imports, network, etc.)  ║
║  → Inject dependencies and mock them. That's what mocks are for.    ║
╚═════════════════════════════════════════════════════════════════════╝
```
Coverage exclusions are ONLY for:
- Type-only files (interfaces, type definitions)
- Barrel exports (index.ts re-exports)
- Files that are genuinely 0% logic (pure declarations)
If you're tempted to exclude a file because it's "infrastructure" or "integration-focused" — STOP. The coverage requirement exists precisely to force testable design.
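For instance, rather than excluding a config loader because it reads from disk, inject the read operation and stub it in tests (a sketch; `loadConfig` and its `ReadFile` parameter are hypothetical):

```typescript
import { expect, it } from "vitest";

type ReadFile = (path: string) => string;

// The file-system boundary is injected, so the logic stays fully testable
export function loadConfig(path: string, readFile: ReadFile): { port: number } {
  return JSON.parse(readFile(path)) as { port: number };
}

it("parses the port from the config file", () => {
  const readFile: ReadFile = () => '{"port": 8080}';
  expect(loadConfig("app.json", readFile)).toEqual({ port: 8080 });
});
```

In production code, the caller passes a real reader such as `(p) => fs.readFileSync(p, "utf8")`; the coverage exclusion is never needed.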
```
╔═══════════════════════════════════════════════════════════════════════╗
║  Start with BEHAVIOR. Always.                                         ║
║  Types, interfaces, data structures: these emerge to serve behavior.  ║
║  If it doesn't affect runtime outcomes, it's not the design.          ║
╚═══════════════════════════════════════════════════════════════════════╝
```
The Core Insight:
Software exists to do things. A program that compiles perfectly but produces wrong outputs is worthless. A program with ugly types that produces correct outputs is valuable. This tells us where to focus.
Behavior is the only thing that matters at runtime. Everything else — types, interfaces, abstractions, patterns — exists solely to help us write correct behavior more reliably.
The Wrong Way (Structure-First):
1. Design the data shapes / types / interfaces
2. Define the abstractions and relationships
3. Finally, implement the actual behavior
This is backwards. You're making decisions about shape before you understand what transformations you need. You're building a warehouse before you know what you're storing.
The Right Way (Behavior-First):
1. What OUTCOME do I need? (input → output)
2. Write a test that asserts on that outcome
3. Implement the simplest code that produces the outcome
4. Let structure emerge to support the behavior
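Steps 3 and 4 in miniature (a sketch; `createUser` is hypothetical, and Node's built-in `crypto.randomUUID` is assumed for id generation):

```typescript
// Step 3: the simplest code that produces the outcome the test demands
export function createUser(input: { name: string }) {
  return { id: crypto.randomUUID(), name: input.name };
}

// Step 4: structure emerges from use; the type is derived, not designed up front
export type User = ReturnType<typeof createUser>;
```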
Why This Works:
- You only create structure that's actually needed
- Tests verify behavior, which implicitly validates any supporting structure
- You discover the real shape of data by using it, not by speculating
- Refactoring is safe because behavior is locked in by tests
Practical Rule:
Your first test for any feature should call a function and assert on its output. Whatever types, interfaces, or data shapes that function needs will get created because the function needs them — not before, not separately, not speculatively.
```typescript
// ❌ WRONG: Starting with structure
// "Create User type"
// "Create UserService interface"
// "Design the repository pattern"
// ✅ RIGHT: Starting with behavior
it("creates user with generated id", () => {
const result = createUser({ name: "Alice" });
expect(result).toEqual({ id: expect.any(String), name: "Alice" });
});
// Types, interfaces, whatever: they emerge because this function needs them
```

The Mantra:
"What should this function return when given this input?"
If you can't answer that question, you're not ready to write code. If you can answer it, write the test first, then make it pass. The rest takes care of itself.
The `it('should...')` title defines what you're testing. The body proves exactly that.
```typescript
// ✅ Title matches body
it("should reject empty usernames", () => {
const result = validate({ username: "" });
expect(result.valid).toBe(false);
});
// ❌ Body does more than title claims
it("should reject empty usernames", () => {
const result = validate({ username: "" });
expect(result.valid).toBe(false);
expect(result.errors).toContain("Username required"); // second spec
expect(logger.warn).toHaveBeenCalled(); // third spec
});
```

Use deterministic stubs. Mock at boundaries only when stubs aren't possible.
```typescript
// ✅ Deterministic stub
function createTestIdGenerator() {
let counter = 0;
return () => `test-id-${counter++}`;
}
it("creates user with generated id", () => {
const generateId = createTestIdGenerator();
const user = createUser({ name: "Alice" }, generateId);
expect(user).toEqual({ id: "test-id-0", name: "Alice" });
});
// ❌ Mocking couples tests to implementation
it("creates user with generated id", () => {
const mockGenerateId = vi.fn().mockReturnValue("user-123");
const user = createUser({ name: "Alice" }, mockGenerateId);
expect(mockGenerateId).toHaveBeenCalled();
});
```

Catch structural changes. Don't cherry-pick properties.
```typescript
// ✅ Catches any unexpected changes
expect(result).toEqual({ type: "USER", name: "Alice", processed: true });
// ❌ Misses if extra properties added/removed
expect(result.type).toBe("USER");
expect(result.processed).toBe(true);
```

All tests should look identical when you squint: SETUP → EXECUTE → VERIFY
```typescript
// ✅ Single clear structure
it("transforms user to uppercase", () => {
const input = { type: "user" }; // SETUP
const result = transform(input); // EXECUTE
expect(result).toEqual({ type: "USER" }); // VERIFY
});
// ❌ Multiple execute/verify = split into separate tests
it("does too many things", () => {
const result = transform({ type: "user" });
expect(result.type).toBe("USER");
const updated = updateResult(result); // second execute
expect(updated.modified).toBe(true); // second verify
});
```

```
╔════════════════════════════════════════════════════════════════════╗
║  Tests verify OBSERVABLE BEHAVIOR, not INTERNAL STATE.             ║
║  If a test calls `.getCount()`, `.size`, or peeks into a Map/Set,  ║
║  it's testing implementation, not behavior.                        ║
╚════════════════════════════════════════════════════════════════════╝
```
Tests coupled to internal state break when you refactor. Tests verifying behavior remain stable as implementation evolves.
The Litmus Test:
"If I changed how this is stored internally, would this test still pass?"
If no, the test is coupled to implementation.
Forbidden Patterns:
```typescript
// ❌ Peeking at internal counts
expect(tracker.getActiveInstanceCount()).toBe(0);
expect(manager.clientCount).toBe(3);
expect(executor.getActiveSessionCount()).toBe(1);
// ❌ Accessing internal collections
expect(service['internalMap'].size).toBe(2);
expect(handler.pendingItems.length).toBe(0);
// ❌ Testing cleanup via size checks
tracker.cleanup();
expect(tracker.getCount()).toBe(0);
```

Allowed Patterns:
```typescript
// ✅ Test via callbacks/events
const events: Event[] = [];
tracker.onEvent((e) => events.push(e));
tracker.process(input);
expect(events).toEqual([{ type: 'Completed', data: { id: 'x' } }]);
// ✅ Test via dispatched commands
const dispatched: Command[] = [];
executor.onDispatch((cmd) => dispatched.push(cmd));
executor.start({ items: ['a', 'b'] });
expect(dispatched).toEqual([
{ type: 'ProcessItem', data: { id: 'a' } },
{ type: 'ProcessItem', data: { id: 'b' } },
]);
// ✅ Test via boolean query APIs
expect(tracker.isWaitingFor('correlationId', 'HandlerA')).toBe(true);
expect(tracker.isPending('key')).toBe(false);
// ✅ Test cleanup via re-triggering behavior
tracker.complete('c1');
tracker.start('c1'); // Same id reused
expect(tracker.isActive('c1')).toBe(true); // Proves cleanup worked
```

What to Test vs. What NOT to Test:
| Test This (Observable) | Not This (Internal) |
|---|---|
| Events/callbacks fired | Internal array lengths |
| Commands dispatched | Map/Set sizes |
| Return values | Private field values |
| Query API responses | Collection contents |
| Side effects (HTTP, logs) | Internal state mutations |
Testing Cleanup Without State Peeking:
```typescript
// ❌ WRONG: Testing cleanup via internal count
it('cleans up after completion', () => {
tracker.start('c1');
tracker.complete('c1');
expect(tracker.getActiveCount()).toBe(0);
});
// ✅ RIGHT: Testing cleanup via re-triggering
it('allows new session after completion', () => {
const completed: string[] = [];
tracker.onComplete((id) => completed.push(id));
tracker.start('c1');
tracker.complete('c1');
tracker.start('c1'); // Re-use same id
tracker.complete('c1');
expect(completed).toEqual(['c1', 'c1']); // Fired twice = cleanup worked
});
```

Do NOT expose methods solely for testing:
```typescript
// ❌ WRONG: Adding getters just for tests
class SessionManager {
private sessions = new Map();
getSessionCount(): number { return this.sessions.size; } // Test-only method
}
// ✅ RIGHT: Test via behavior, not state inspection
class SessionManager {
private sessions = new Map();
// No count getter — test by observing callbacks/events
}
```
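What such a behavioral test might look like, extending the ✅ class above and assuming hypothetical `onClose`, `open`, and `close` APIs on SessionManager:

```typescript
it("closes an open session exactly once", () => {
  const closed: string[] = [];
  const manager = new SessionManager();
  manager.onClose((id) => closed.push(id));
  manager.open("s1");
  manager.close("s1");
  manager.close("s1"); // second close must be a no-op
  expect(closed).toEqual(["s1"]); // observed via callback, no size getter
});
```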