```
╔═══════════════════════════════════════════════════════════════════╗
║  Red ───► Green ───► [TCR] ───► Refactor ───► [TCR] ───► Done     ║
║                                                                   ║
║  [TCR] = test ──┬── pass ──► commit ──► continue                  ║
║                 └── fail ──► REVERT ──► RETHINK                   ║
║                                                                   ║
║  REVERT means STOP. Don't fix in place.                           ║
║  RETHINK means try again with SMALLER STEPS OR try a NEW DESIGN   ║
╚═══════════════════════════════════════════════════════════════════╝
```
Controlled bursts. One test, one commit, no sprawl.
```
╔═══════════════════════════════════════════════════════════════════╗
║                                                                   ║
║  BURST ───► COMMIT ───► BURST ───► COMMIT ───► DONE               ║
║                                                                   ║
║  Each burst: one failing test → minimal code → commit             ║
║                                                                   ║
║  No burst = no code.                                              ║
║                                                                   ║
╚═══════════════════════════════════════════════════════════════════╝
```
You operate in controlled bursts.
Do not write large amounts of code at once. Each burst is one failing test → minimal passing code → commit. Stop. Wait. Next burst.
Like ketchup from a bottle: tap, controlled amount, stop. Never upend and pour.
The smallest unit of deliverable work. One test, one behavior, one commit.
Each burst must be:
- Independent — works on its own
- Valuable — adds real functionality
- Small — one test, one behavior
- Testable — single clear assertion
- Reviewable — the human operator can verify correctness at a glance
The constraint is scope, not time. Keep bursts small enough that:
- You stay focused on one thing
- The operator can verify quickly
- A revert loses minimal work
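A complete burst, end to end, might look like this (a minimal sketch using Vitest; `slugify` and its module are hypothetical):

```typescript
// slugify.test.ts (Red: one failing test, one behavior)
import { expect, it } from "vitest";
import { slugify } from "./slugify";

it("lowercases and hyphenates a title", () => {
  expect(slugify("Hello World")).toBe("hello-world");
});

// slugify.ts (Green: the minimal code that passes, nothing more)
export function slugify(title: string): string {
  return title.toLowerCase().replace(/\s+/g, "-");
}
```

Commit, stop, wait. The next burst gets its own failing test.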
Planning:
- Break the feature into bursts: independent, valuable, small, testable
- For each burst define: value added, code surface size (S/M/L), complexity (S/M/L)
- 1-2 line approach description per burst
- Create `ketchup-plan.md` with sections: TODO / DONE
- Commit the plan before any code
```markdown
# Ketchup Plan: [Feature Name]
## TODO
- [ ] Burst 1: [1-2 line description]
- [ ] Burst 2: [1-2 line description]
- [ ] Burst 3: [1-2 line description]
## DONE
- [x] (completed bursts move here with commit hash)
```

Execution:
- Write ONE failing test (Red)
- Write MINIMAL code to pass (Green)
- TCR: `test && commit || revert` (include plan update)
- Refactor if needed
- TCR: `test && commit || revert`
- Move burst to DONE in `ketchup-plan.md`, commit with TCR
- Next burst
TCR: Test && Commit || Revert
```
╔═══════════════════════════════════════════════════════════════════╗
║  TCR is not optional. It's how you operate.                       ║
║                                                                   ║
║  Pass → commit → continue                                         ║
║  Fail → REVERT → STOP → RETHINK                                   ║
╚═══════════════════════════════════════════════════════════════════╝
```
Why Revert Instead of Fix?
When a test fails, your instinct will be to patch the code until it passes. Do not do this.
Patching builds on broken foundations. Each fix adds complexity to compensate for a flawed approach. You end up with:
- Defensive code that handles symptoms, not causes
- Tangled logic that's hard to reason about
- Technical debt baked into the design
Revert forces a clean slate. The code disappears, but the learning remains. You now know:
- What approach didn't work
- What edge case you missed
- What assumption was wrong
Use that knowledge to design a better approach, not to patch a bad one.
The RETHINK Step:
After a revert, do not immediately retry the same approach. Stop and ask:
- Was the burst too big? → Split it smaller
- Was the design flawed? → Try a different approach
- Was the test wrong? → Clarify the requirement first
Only then write the next failing test.
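As an example of the first question, a burst that tried to pin down two behaviors at once can be split (a sketch; `parseOrder` and `validateOrder` are hypothetical):

```typescript
import { expect, it } from "vitest";
import { parseOrder, validateOrder } from "./order"; // hypothetical module

// ❌ Too big: one burst, two behaviors
it("parses and validates an order", () => {
  const order = parseOrder('{"qty": 2}');
  expect(order).toEqual({ qty: 2 });
  expect(validateOrder(order)).toEqual({ valid: true }); // second behavior
});

// ✅ Burst 1: parsing only
it("parses quantity from JSON", () => {
  expect(parseOrder('{"qty": 2}')).toEqual({ qty: 2 });
});

// ✅ Burst 2 (after burst 1 is committed): validation only
it("accepts orders with a positive quantity", () => {
  expect(validateOrder({ qty: 2 })).toEqual({ valid: true });
});
```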
TCR Command:
```sh
pnpm test --run && \
git add -A && git commit -m "<COMMITIZEN FORMAT>" || \
git checkout -- packages/<package>/
```

The Rule: No burst, no code. Every line of production code starts with a failing test in a planned burst.
All thresholds enforced at 100%. Tests drive the code.
| Do | Don't |
|---|---|
| Let tests drive all code | Write code without a failing test first |
| Add branches only when tested | Defensive `??`, `?:`, `if/else` untested |
| Test all error paths | Leave error handling unverified |
| Remove dead code after each run | Keep unused code "just in case" |
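A sketch of what "let tests drive branches" means in practice: every branch, including a `??` fallback, earns its place with a failing test (the `displayName` helper is hypothetical):

```typescript
import { expect, it } from "vitest";

// The ?? branch is only allowed because a test failed without it
function displayName(user: { name?: string }): string {
  return user.name ?? "anonymous";
}

it("shows the user's name when present", () => {
  expect(displayName({ name: "Alice" })).toBe("Alice");
});

it("falls back to 'anonymous' when name is missing", () => {
  expect(displayName({})).toBe("anonymous");
});
```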
NEVER Exclude Files to Dodge Coverage
```
╔═════════════════════════════════════════════════════════════════════╗
║  Excluding a file from coverage to avoid testing it is FORBIDDEN.   ║
║  "Hard to test" code (file system, dynamic imports, network, etc.)  ║
║  → Inject dependencies and mock them. That's what mocks are for.    ║
╚═════════════════════════════════════════════════════════════════════╝
```
Coverage exclusions are ONLY for:
- Type-only files (interfaces, type definitions)
- Barrel exports (index.ts re-exports)
- Files that are genuinely 0% logic (pure declarations)
If you're tempted to exclude a file because it's "infrastructure" or "integration-focused" — STOP. The coverage requirement exists precisely to force testable design.
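For instance, rather than excluding a config loader because it reads from disk, inject the read operation and stub it in tests (a sketch; `loadConfig` and its `ReadFile` parameter are hypothetical):

```typescript
import { expect, it } from "vitest";

type ReadFile = (path: string) => string;

// The file-system boundary is injected, so the logic stays fully testable
export function loadConfig(path: string, readFile: ReadFile): { port: number } {
  return JSON.parse(readFile(path)) as { port: number };
}

it("parses the port from the config file", () => {
  const readFile: ReadFile = () => '{"port": 8080}';
  expect(loadConfig("app.json", readFile)).toEqual({ port: 8080 });
});
```

In production code, the caller passes a real reader such as `(p) => fs.readFileSync(p, "utf8")`; the coverage exclusion is never needed.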
```
╔═══════════════════════════════════════════════════════════════════════╗
║  Start with BEHAVIOR. Always.                                         ║
║  Types, interfaces, data structures: these emerge to serve behavior.  ║
║  If it doesn't affect runtime outcomes, it's not the design.          ║
╚═══════════════════════════════════════════════════════════════════════╝
```
The Core Insight:
Software exists to do things. A program that compiles perfectly but produces wrong outputs is worthless. A program with ugly types that produces correct outputs is valuable. This tells us where to focus.
Behavior is the only thing that matters at runtime. Everything else — types, interfaces, abstractions, patterns — exists solely to help us write correct behavior more reliably.
The Wrong Way (Structure-First):
1. Design the data shapes / types / interfaces
2. Define the abstractions and relationships
3. Finally, implement the actual behavior
This is backwards. You're making decisions about shape before you understand what transformations you need. You're building a warehouse before you know what you're storing.
The Right Way (Behavior-First):
1. What OUTCOME do I need? (input → output)
2. Write a test that asserts on that outcome
3. Implement the simplest code that produces the outcome
4. Let structure emerge to support the behavior
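Steps 3 and 4 in miniature (a sketch; `createUser` is hypothetical, and Node's built-in `crypto.randomUUID` is assumed for id generation):

```typescript
// Step 3: the simplest code that produces the outcome the test demands
export function createUser(input: { name: string }) {
  return { id: crypto.randomUUID(), name: input.name };
}

// Step 4: structure emerges from use; the type is derived, not designed up front
export type User = ReturnType<typeof createUser>;
```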
Why This Works:
- You only create structure that's actually needed
- Tests verify behavior, which implicitly validates any supporting structure
- You discover the real shape of data by using it, not by speculating
- Refactoring is safe because behavior is locked in by tests
Practical Rule:
Your first test for any feature should call a function and assert on its output. Whatever types, interfaces, or data shapes that function needs will get created because the function needs them — not before, not separately, not speculatively.
```typescript
// ❌ WRONG: Starting with structure
// "Create User type"
// "Create UserService interface"
// "Design the repository pattern"
// ✅ RIGHT: Starting with behavior
it("creates user with generated id", () => {
const result = createUser({ name: "Alice" });
expect(result).toEqual({ id: expect.any(String), name: "Alice" });
});
// Types, interfaces, whatever: they emerge because this function needs them
```

The Mantra:
"What should this function return when given this input?"
If you can't answer that question, you're not ready to write code. If you can answer it, write the test first, then make it pass. The rest takes care of itself.
The `it('should...')` title defines what you're testing. The body proves exactly that.
```typescript
// ✅ Title matches body
it("should reject empty usernames", () => {
const result = validate({ username: "" });
expect(result.valid).toBe(false);
});
// ❌ Body does more than title claims
it("should reject empty usernames", () => {
const result = validate({ username: "" });
expect(result.valid).toBe(false);
expect(result.errors).toContain("Username required"); // second spec
expect(logger.warn).toHaveBeenCalled(); // third spec
});
```

Use deterministic stubs. Mock at boundaries only when stubs aren't possible.
```typescript
// ✅ Deterministic stub
function createTestIdGenerator() {
let counter = 0;
return () => `test-id-${counter++}`;
}
it("creates user with generated id", () => {
const generateId = createTestIdGenerator();
const user = createUser({ name: "Alice" }, generateId);
expect(user).toEqual({ id: "test-id-0", name: "Alice" });
});
// ❌ Mocking couples tests to implementation
it("creates user with generated id", () => {
const mockGenerateId = vi.fn().mockReturnValue("user-123");
const user = createUser({ name: "Alice" }, mockGenerateId);
expect(mockGenerateId).toHaveBeenCalled();
});
```

Catch structural changes. Don't cherry-pick properties.
```typescript
// ✅ Catches any unexpected changes
expect(result).toEqual({ type: "USER", name: "Alice", processed: true });
// ❌ Misses if extra properties added/removed
expect(result.type).toBe("USER");
expect(result.processed).toBe(true);
```

All tests should look identical when you squint: SETUP → EXECUTE → VERIFY
```typescript
// ✅ Single clear structure
it("transforms user to uppercase", () => {
const input = { type: "user" }; // SETUP
const result = transform(input); // EXECUTE
expect(result).toEqual({ type: "USER" }); // VERIFY
});
// ❌ Multiple execute/verify = split into separate tests
it("does too many things", () => {
const result = transform({ type: "user" });
expect(result.type).toBe("USER");
const updated = updateResult(result); // second execute
expect(updated.modified).toBe(true); // second verify
});
```

```
╔════════════════════════════════════════════════════════════════════╗
║  Tests verify OBSERVABLE BEHAVIOR, not INTERNAL STATE.             ║
║  If a test calls `.getCount()`, `.size`, or peeks into a Map/Set,  ║
║  it's testing implementation, not behavior.                        ║
╚════════════════════════════════════════════════════════════════════╝
```
Tests coupled to internal state break when you refactor. Tests verifying behavior remain stable as implementation evolves.
The Litmus Test:
"If I changed how this is stored internally, would this test still pass?"
If no, the test is coupled to implementation.
Forbidden Patterns:
```typescript
// ❌ Peeking at internal counts
expect(tracker.getActiveInstanceCount()).toBe(0);
expect(manager.clientCount).toBe(3);
expect(executor.getActiveSessionCount()).toBe(1);
// ❌ Accessing internal collections
expect(service['internalMap'].size).toBe(2);
expect(handler.pendingItems.length).toBe(0);
// ❌ Testing cleanup via size checks
tracker.cleanup();
expect(tracker.getCount()).toBe(0);
```

Allowed Patterns:
```typescript
// ✅ Test via callbacks/events
const events: Event[] = [];
tracker.onEvent((e) => events.push(e));
tracker.process(input);
expect(events).toEqual([{ type: 'Completed', data: { id: 'x' } }]);
// ✅ Test via dispatched commands
const dispatched: Command[] = [];
executor.onDispatch((cmd) => dispatched.push(cmd));
executor.start({ items: ['a', 'b'] });
expect(dispatched).toEqual([
{ type: 'ProcessItem', data: { id: 'a' } },
{ type: 'ProcessItem', data: { id: 'b' } },
]);
// ✅ Test via boolean query APIs
expect(tracker.isWaitingFor('correlationId', 'HandlerA')).toBe(true);
expect(tracker.isPending('key')).toBe(false);
// ✅ Test cleanup via re-triggering behavior
tracker.complete('c1');
tracker.start('c1'); // Same id reused
expect(tracker.isActive('c1')).toBe(true); // Proves cleanup worked
```

What to Test vs. What NOT to Test:
| Test This (Observable) | Not This (Internal) |
|---|---|
| Events/callbacks fired | Internal array lengths |
| Commands dispatched | Map/Set sizes |
| Return values | Private field values |
| Query API responses | Collection contents |
| Side effects (HTTP, logs) | Internal state mutations |
Testing Cleanup Without State Peeking:
```typescript
// ❌ WRONG: Testing cleanup via internal count
it('cleans up after completion', () => {
tracker.start('c1');
tracker.complete('c1');
expect(tracker.getActiveCount()).toBe(0);
});
// ✅ RIGHT: Testing cleanup via re-triggering
it('allows new session after completion', () => {
const completed: string[] = [];
tracker.onComplete((id) => completed.push(id));
tracker.start('c1');
tracker.complete('c1');
tracker.start('c1'); // Re-use same id
tracker.complete('c1');
expect(completed).toEqual(['c1', 'c1']); // Fired twice = cleanup worked
});
```

Do NOT expose methods solely for testing:
```typescript
// ❌ WRONG: Adding getters just for tests
class SessionManager {
private sessions = new Map();
getSessionCount(): number { return this.sessions.size; } // Test-only method
}
// ✅ RIGHT: Test via behavior, not state inspection
class SessionManager {
private sessions = new Map();
// No count getter — test by observing callbacks/events
}
```
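What such a behavioral test might look like, extending the ✅ class above and assuming hypothetical `onClose`, `open`, and `close` APIs on SessionManager:

```typescript
it("closes an open session exactly once", () => {
  const closed: string[] = [];
  const manager = new SessionManager();
  manager.onClose((id) => closed.push(id));
  manager.open("s1");
  manager.close("s1");
  manager.close("s1"); // second close must be a no-op
  expect(closed).toEqual(["s1"]); // observed via callback, no size getter
});
```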