Vibe Coding Best Practices For One Person Projects

Vibe Coding Best Practices

This document contains practices specifically designed for AI-first development. Browse through it, find the sections that resonate with your workflow, and copy them into your repository for your AI assistant to follow.

The loop descriptions are comprehensive templates—you can copy them directly to create your own loop specifications tailored to your project. The specification examples provide building blocks for creating your own project specs, architecture docs, and other living documentation.

📁 Quick Guide: Where to Save Each Section

When adopting sections from this document, copy them into these suggested spec files:

  • Operating Loops → OPERATING_LOOPS.md or individual files like CI_GREEN_LOOP.md
  • Testing Patterns → TESTING_STRATEGY.md or TEST_GUIDELINES.md
  • Code Standards → CODE_STANDARDS.md or ENGINEERING_PRACTICES.md
  • AI Practices → AI_DEVELOPMENT.md or CLAUDE.md
  • Architecture Patterns → ARCHITECTURE.md or TECHNICAL_DECISIONS.md
  • Process Specs → PROCESS_SPEC.md or DEVELOPMENT_WORKFLOW.md

🎯 The Most Important Thing: Choose Your Loop

When you're vibe coding, you're moving fast with AI assistance. The key to maintaining quality at speed is choosing the right operating loop for your current situation and letting your AI execute it.

Many developers struggle because they're in the wrong loop at the wrong time. They're optimizing performance when CI is broken. They're polishing UI when users can't log in. They're adding features when existing bugs are driving users away.

This document describes 8 different AI-driven operating loops, each designed for a specific situation. Understanding when to use each one—and how to guide your AI through it—helps maintain momentum without sacrificing quality.

📋 What's Covered Here

  • 🔄 8 Operating Loops: AI-driven patterns from 5-minute CI fixes to 90-minute performance deep dives
  • 🤖 AI-First Development: Practices designed for working with AI pair programmers
  • 🏃‍♂️ Loop Transitions: When to switch from one loop to another
  • 🔬 Race Condition Testing: Tools for making non-deterministic bugs reproducible
  • 🛡️ Type Safety: TypeScript settings that prevent entire categories of bugs

🗺️ Quick Navigation

Jump Based on Your Current Situation:

  • CI is red → 🟢 The CI Green Loop
  • Building a new feature → 🎯 The User Story → Test → UI Loop
  • Chasing a bug → 🐛 The Bug Investigation Loop
  • The app feels slow → 🚀 The Performance Optimization Loop
  • UI feels rough or inaccessible → 🎨 The UI Improvement Loop
  • Complex feature ahead → 🧪 The Test-Driven Feature Loop
  • Technical debt piling up → 🔍 The Code Quality Loop
  • Your process itself feels slow → 🚀 Meta-Loop: AI Process Optimization


📚 Table of Contents

🚀 Start Here

🔄 AI-Driven Operating Loops

💡 Key Concepts

⚡ Quick Wins: Implement These in Under 5 Minutes

🌍 Universal Practices (All Languages)

📱 TypeScript/JavaScript Excellence

🦀 Rust Excellence

📋 Code Review Checklist

🐛 Bug-Driven Type Safety Improvement

🔧 Development Workflow

🌶️ Strong Opinions

💥 Lessons from Production

🚀 AI Pair Programming Best Practices

🎯 Success Metrics

🔄 Continuous Improvement

🏆 Checklist


🔄 AI-Driven Operating Loops

📁 Copy to: OPERATING_LOOPS.md or create individual files for each loop

🤖 Key Principle: These loops are designed to be AI-driven. Once you initiate a loop, your AI assistant takes the wheel—investigating, implementing, and iterating. Your role shifts to approving transitions between steps and catching if the AI heads in the wrong direction. Think of yourself as the navigator while the AI drives.

Modern development is about choosing the right operating loop for your current objective. Here are the 8 key loops to master:

When vibe coding with AI, these operating loops define how work gets done. Each loop serves a different purpose and operates on a different timescale. The skill is recognizing which loop you need, initiating it with clear intent, then letting your AI execute while you guide and approve.

How AI-Driven Loops Work:

Human: "We have a failing test in CI"

AI: "I'll enter the CI Green Loop. Let me check the status..."

[AI investigates and presents findings]

AI: "I found a type error in Button.test.tsx. Should I fix it?"

Human: "Yes, go ahead"

[AI implements fix]

AI: "Fix implemented and tests pass locally. Ready to push?"

Human: "Yes"

[AI pushes and monitors]

AI: "CI is green! Should I check for similar issues elsewhere?"

Human: "No, we're good. Exit the loop."

The human provides direction and catches mistakes, while the AI handles the execution. This is the essence of vibe coding—you provide clear instructions and context, the AI transforms them into code through its reasoning and pattern-matching capabilities.

🟢 The CI Green Loop

📁 Copy to: CI_GREEN_LOOP.md

Goal: Get all CI checks passing and keep them green
Cycle Time: 5-15 minutes per iteration
AI Role: Execute investigation and fixes
Human Role: Approve fixes and verify approach

When CI is red, development stalls. A broken build blocks deployments and creates cascading problems. In this loop, your AI investigates failures, implements fixes, and monitors results while you ensure the fixes are appropriate.

How to Initiate This Loop

Human: "CI is failing on our latest commit"

AI: "I'll enter the CI Green Loop to investigate and fix. Let me start by checking the CI status..."

The Loop Steps

1. Check Status (1-2 min)

AI executes:

# Quick status check
gh run list --limit 5

# If failed, get details
gh run view <run-id> --log | grep -A 20 "Error:"

What to look for:

  • Which jobs failed? (tests, lint, typecheck, build)
  • Is it environment-specific? (works locally but not CI)
  • Are multiple PRs failing? (indicates systemic issue)
2. Identify Root Cause (2-5 min)

Common failure patterns:

  • Type errors: Usually from incomplete refactoring
  • Test failures: Often timing/race conditions in CI environment
  • Lint errors: Format issues or new rule violations
  • Build failures: Missing dependencies or configuration

Investigation checklist:

  • Read the full error message (not just the summary)
  • Check if it's a new failure or existing flaky test
  • Verify your local environment matches CI
  • Look for recent changes that could cause the issue
3. Fix Locally (5-10 min)
# Reproduce CI environment locally
yarn install --frozen-lockfile
yarn typecheck
yarn lint
yarn test

# For specific test failures
yarn test -- --run path/to/failing.test.ts

Fix strategies by type:

  • Type errors: Let your AI read the error and suggest fixes
  • Test failures: Add logging, increase timeouts, fix race conditions
  • Lint errors: Run yarn lint:fix first, manual fix if needed
  • Build errors: Clear caches, check dependency versions
4. Verify Fix (2-3 min)

Before pushing, ensure your fix is complete:

# Run the exact CI commands locally
yarn typecheck && yarn lint && yarn test && yarn build
5. Push & Monitor (1-2 min)
# Push your fix
git add -A && git commit -m "fix: [specific description of what you fixed]"
git push

# Watch the CI run
gh run watch

Success Metrics

  • ✅ All CI checks green within 15 minutes
  • ✅ Zero flaky tests
  • ✅ Fast feedback on every commit
  • ✅ Local checks match CI results

When to Use

  • After making any code changes
  • Before merging PRs
  • When onboarding new team members
  • Daily health checks

Common Anti-patterns to Avoid

  • ❌ Pushing "just to see if CI passes"
  • ❌ Skipping tests instead of fixing them
  • ❌ Ignoring "minor" CI failures
  • ❌ Not running checks locally first

Template for Your CI Green Loop Log

# CI Green Loop Log

## Overview
Maintaining green CI through rapid iteration cycles.

## Iteration [N]: [Description]
**Date**: YYYY-MM-DD
**Issue**: [What broke]
**Root Cause**: [Why it broke]
**Fix**: [What you did]
**Prevention**: [How to avoid this in future]
**Time**: [How long it took]

### Lessons Learned
- [Key insight from this iteration]

🎯 The User Story → Test → UI Loop

📁 Copy to: USER_STORY_LOOP.md

Goal: Build features that actually solve user problems
Cycle Time: 30-60 minutes per iteration

The User Story Loop helps ensure you're building features that solve real problems. When vibe coding, it's easy to get caught up in technical solutions. This loop starts with the user's actual need and validates that your solution works for them.

AI's Dual Role: Your AI assistant plays both user and developer here. First, it helps generate realistic user stories from a user's perspective. Then it switches to developer mode to help implement them.

The Loop Steps

1. Generate User Story with AI (5-10 min)

Start with a problem, not a solution:

Human: "Users are having trouble finding recent changes"

AI (as user): "As a developer reviewing code, I want to see which files have uncommitted changes so I can quickly navigate to files I'm actively working on. Currently I have to run git status in terminal and manually search for files."

Good user stories include:

  • Context: When/where this happens
  • Current pain: What's wrong now
  • Desired outcome: What success looks like
  • Value: Why this matters
2. Create E2E Test First (10-15 min)

Write the test before any implementation:

// user-story-uncommitted-files.test.ts
it('shows uncommitted file indicators in file tree', async () => {
  // Given: A file with uncommitted changes
  await modifyFile('src/components/Button.tsx');
  
  // When: User opens the file explorer
  await page.click('[data-testid="file-explorer"]');
  
  // Then: Modified file has visual indicator
  const fileItem = await page.locator('[data-file="src/components/Button.tsx"]');
  await expect(fileItem).toHaveClass(/modified/);
  await expect(fileItem.locator('.status-badge')).toContainText('M');
});

Test checklist:

  • Tests the user journey, not implementation
  • Uses realistic data
  • Includes edge cases (no changes, many changes); see the sketch below
  • Verifies accessibility
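
To cover the "no changes" edge case from the checklist above, a companion test might look like this. It is a sketch that reuses the same page helpers as the test above; resetWorkingTree is a hypothetical setup helper, not from the original:

it('shows no indicators when the working tree is clean', async () => {
  // Given: no uncommitted changes (hypothetical helper)
  await resetWorkingTree();

  // When: user opens the file explorer
  await page.click('[data-testid="file-explorer"]');

  // Then: no modified badges are rendered
  await expect(page.locator('.status-badge')).toHaveCount(0);
});
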
3. Watch Test Fail (2-3 min)

Run the test and verify it fails for the right reason:

# This should fail because feature doesn't exist yet
yarn test user-story-uncommitted-files.test.ts

What to verify:

  • Test fails because UI doesn't exist (not test errors)
  • Failure message clearly shows what's missing
  • Test structure makes sense
4. Build Minimal UI (15-20 min)

Implement just enough to make the test pass:

// First iteration - just make it work
const FileExplorer = ({ files }) => {
  const [gitStatus, setGitStatus] = useState<GitStatus>({});
  
  useEffect(() => {
    // Fetch git status
    getGitStatus().then(setGitStatus);
  }, []);
  
  return (
    <div data-testid="file-explorer">
      {files.map(file => (
        <div 
          key={file.path}
          data-file={file.path}
          className={gitStatus[file.path] ? 'modified' : ''}
        >
          {file.name}
          {gitStatus[file.path] && (
            <span className="status-badge">M</span>
          )}
        </div>
      ))}
    </div>
  );
};
5. Verify Test Passes (2-3 min)
# Should now pass
yarn test user-story-uncommitted-files.test.ts
6. AI User Review (5-10 min)

Show the working UI to your AI for user feedback:

Human: "Here's the implementation" [screenshot]

AI (as user): "Good start! Issues I notice:

  1. The 'M' badge is too subtle - consider orange background
  2. No indication for staged vs unstaged changes
  3. What about new files (untracked)?
  4. Badge placement crowds the filename on long names"
7. Iterate Based on Feedback (10-15 min)

Enhance based on user feedback:

  • Add visual improvements
  • Handle more git states
  • Improve accessibility
  • Add hover states

Success Metrics

  • ✅ Tests accurately represent user workflows
  • ✅ UI solves the actual user problem
  • ✅ Stories evolve based on UI insights
  • ✅ Real users find it intuitive

When to Use

  • Building new features
  • Improving existing workflows
  • Fixing UX issues
  • Understanding user needs

Common Patterns

  • Start broad, then narrow: "File management" → "See changes" → "Git status badges"
  • Test accessibility early: Screen readers, keyboard nav (see the sketch below)
  • Get feedback often: Every iteration, not just at end
  • Document decisions: Why you chose certain UI patterns
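
For the accessibility point above, even a minimal keyboard-navigation check catches regressions early. This is a sketch reusing the Playwright-style helpers from the earlier test; the selectors and the assumption that the first Tab lands inside the explorer are illustrative:

it('supports keyboard navigation in the file explorer', async () => {
  // Move focus into the file explorer with the keyboard only
  // (assumes the explorer is the first focusable region on the page)
  await page.keyboard.press('Tab');
  await expect(page.locator('[data-testid="file-explorer"] [data-file]').first()).toBeFocused();

  // Activate the focused file entry without a mouse
  await page.keyboard.press('Enter');
  await expect(page.locator('[data-testid="file-viewer"]')).toBeVisible();
});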

Template for Your User Story Loop Log

# User Story Loop Log

## Story: [Feature Name]
**User Problem**: [What users struggle with]
**Success Criteria**: [How we'll know it's solved]

### Iteration 1: [Aspect]
**Date**: YYYY-MM-DD

#### User Story
"As a [user type], I want [capability] so that [benefit]"

#### Test First
// What we wrote before implementing

#### Implementation
- What we built
- Key decisions made

#### User Feedback
- AI feedback as user
- Real user feedback if available

#### Next Iteration
- What to improve based on feedback

🐛 The Bug Investigation Loop

📁 Copy to: BUG_INVESTIGATION_LOOP.md

Goal: Not just fix bugs, but prevent similar bugs systematically
Cycle Time: 15-45 minutes per iteration

The Bug Investigation Loop turns debugging into systematic problem-solving. Instead of making random changes, you follow the evidence, form hypotheses, and test them methodically. This is especially important in vibe coding where you need to understand code you didn't personally write.

The Loop Steps

1. Reproduce Reliably (5-10 min)

First rule of debugging: If you can't reproduce it, you can't fix it.

// Create minimal reproduction test
it('reproduces the reported bug', () => {
  // Exact steps from bug report
  const timeline = render(<Timeline entries={mockEntries} />);
  fireEvent.click(timeline.getByText('FileA.tsx'));
  
  // Bug: File doesn't open
  expect(screen.queryByTestId('file-viewer')).toBeNull();
});

Reproduction checklist:

  • Can reproduce locally
  • Minimal test case created
  • Happens consistently (not flaky)
  • Screenshot/recording if UI bug
2. Trace Root Cause (10-20 min)

Detective work starts here. Use all available tools:

# Add strategic console.logs
console.log('Timeline onClick:', entry);
console.log('Navigation state:', navigationState);

# Use debugger
debugger; // Step through the problematic flow

# Check git blame
git blame -L 93,100 NavigationDemoApp.tsx

Investigation techniques:

  • Binary search: Comment out half the code, see if bug persists
  • Time travel: When did this last work? Check git history (see the bisect sketch below)
  • Assumption checking: List what you think is happening, verify each
  • Ask your AI: Show the bug and relevant code, get hypotheses
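
The time-travel technique above can be automated with git bisect whenever your reproduction test runs from the command line. A sketch; the known-good tag and the test path are placeholders:

# Binary-search history for the commit that introduced the failure
git bisect start
git bisect bad                      # current commit reproduces the bug
git bisect good v1.4.0              # last known-good tag or commit (placeholder)
git bisect run yarn test -- --run path/to/reproduction.test.ts
git bisect reset                    # return to your original checkout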

⚠️ Critical Step - Confirm Before Fixing: Before implementing any fix, ALWAYS confirm your hypothesis with debug logs:

// Don't assume - verify with logs first!
console.log('HYPOTHESIS: Navigation fails because state is:', navigationState);
console.log('HYPOTHESIS: Entry path is undefined:', entry.path);
console.log('HYPOTHESIS: Component unmounted:', component.isMounted());

// Only proceed with fix after logs confirm your theory

This prevents fixing the wrong problem and introducing new bugs.

3. Fix the Immediate Issue (5-10 min)

Fix the bug with minimal changes first:

// Before: Race condition
onClick={() => {
  setTimeout(() => navigateToFile(entry.path), 500);
}}

// After: Direct navigation
onClick={() => {
  navigateToFile(entry.path);
}}

Fix principles:

  • Minimal change that fixes the issue
  • Don't refactor while fixing (separate concerns)
  • Verify fix with your reproduction test
4. Improve Type Safety (5-10 min)

Critical step: How could types have prevented this?

// Before: Loose types allowed the bug
type NavigationHandler = (path: string) => void;

// After: Stricter types prevent future bugs
type NavigationHandler = (path: string) => Promise<NavigationResult>;
type NavigationResult = 
  | { success: true; view: 'file' | 'timeline' }
  | { success: false; error: string };

Type improvements to consider:

  • Make impossible states impossible
  • Use branded types for IDs
  • Add exhaustiveness checking (see the sketch below)
  • Remove any types near the bug
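
As a sketch of exhaustiveness checking, building on the 'file' | 'timeline' union above (the 'diff' view mentioned in the comment is hypothetical):

// Exhaustiveness check: if the view union ever gains a new member
// (say 'diff'), this switch stops compiling until it is handled.
type View = 'file' | 'timeline';

function titleForView(view: View): string {
  switch (view) {
    case 'file':
      return 'File viewer';
    case 'timeline':
      return 'Timeline';
    default: {
      const unreachable: never = view;
      return unreachable;
    }
  }
}
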
5. Add Comprehensive Tests (5-10 min)

Beyond the fix: Test the entire flow

describe('Timeline to FileViewer navigation', () => {
  it('opens file immediately when clicked', async () => {
    // Your reproduction test, now passing
  });
  
  it('handles rapid clicks without race conditions', async () => {
    // Click multiple files quickly
    fireEvent.click(getByText('FileA.tsx'));
    fireEvent.click(getByText('FileB.tsx'));
    
    // Should show FileB (last clicked)
    await waitFor(() => {
      expect(screen.getByTestId('file-viewer')).toHaveTextContent('FileB.tsx');
    });
  });
  
  it('handles navigation errors gracefully', async () => {
    // Test error cases too
  });
});
6. Clean Up Failed Attempts (2-5 min)

Remove debug code and failed fixes:

# Review all changes made during debugging
git diff

# Remove debug logs that didn't help
# Remove code changes that didn't fix the bug
# But wait - some might be valuable!

Decision tree for cleanup:

  • Debug logs that revealed the issue: Keep temporarily, remove in follow-up PR
  • Code improvements found while debugging: Create separate PR if valuable
  • Failed fix attempts: Remove completely
  • Helpful refactoring: Extract to separate PR titled "refactor: improvements found while fixing #123"
# Example: Extract valuable improvements
git add -p  # Selectively stage only the bug fix
git commit -m "fix: navigation race condition"

# Then in a new branch
git checkout -b improvements-from-bug-123
# Add the valuable improvements here
7. Document the Learning (3-5 min)

Make this bug impossible for others:

// Add comment explaining the fix
// IMPORTANT: Direct navigation required. setTimeout caused race
// condition where component unmounted before navigation completed.
// See bug #123 for details.
onClick={() => navigateToFile(entry.path)}

Success Metrics

  • ✅ Bug fixed and prevention measures added
  • ✅ Type system strengthened
  • ✅ Tests prevent regression
  • ✅ Team learns from the bug

When to Use

  • Any time you encounter a bug
  • Code review findings
  • Production issues
  • Improving code quality

Common Bug Patterns & Prevention

Race Conditions

  • Symptom: Works sometimes, fails sometimes
  • Fix: Remove timing dependencies
  • Prevention: Proper async handling, no setTimeout for logic

Type Mismatches

  • Symptom: "Cannot read property of undefined"
  • Fix: Add null checks
  • Prevention: noUncheckedIndexedAccess, strict null checks

State Synchronization

  • Symptom: UI shows stale data
  • Fix: Single source of truth
  • Prevention: Unidirectional data flow

Template for Your Bug Investigation Log

# Bug Investigation Log

## Bug #[N]: [Description]
**Date**: YYYY-MM-DD
**Reported by**: [User/Test/Monitoring]
**Severity**: [Critical/High/Medium/Low]

### 1. Reproduction
**Steps**:
1. [Step one]
2. [Step two]

**Expected**: [What should happen]
**Actual**: [What actually happens]

### 2. Root Cause
**Investigation**:
- [What you checked]
- [What you discovered]

**Root cause**: [The actual problem]

### 3. Fix
**Code changed**:
- [old code]
+ [new code]

### 4. Prevention
**Type improvements**: [What types you strengthened]
**Tests added**: [What test coverage you added]

### 5. Lessons Learned
- [Key insight that helps prevent similar bugs]

🚀 The Performance Optimization Loop

📁 Copy to: PERFORMANCE_LOOP.md

Goal: Identify bottlenecks and improve them systematically
Cycle Time: 45-90 minutes per iteration

The Performance Loop emphasizes measurement over guesswork. Profile first, identify actual bottlenecks, fix them systematically, and measure the impact. This prevents wasting time optimizing code that isn't actually slow.

The Loop Steps

1. Measure Baseline (10-15 min)

You can't improve what you don't measure

// Add performance marks
performance.mark('timeline-render-start');
renderTimeline(entries);
performance.mark('timeline-render-end');

// Measure the duration
performance.measure(
  'timeline-render',
  'timeline-render-start', 
  'timeline-render-end'
);

// Log it
const measure = performance.getEntriesByName('timeline-render')[0];
console.log(`Timeline render: ${measure.duration}ms`);

Key metrics to capture:

  • Initial load time: Time to interactive
  • Interaction latency: Click to response
  • Memory usage: Baseline and after operations
  • Frame rate: For animations/scrolling (see the sampling sketch below)
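
A rough way to capture the frame-rate metric without extra tooling is to count requestAnimationFrame callbacks. A browser-only sketch; the function name is illustrative:

// Roughly sample frames-per-second over a window of time
function sampleFps(durationMs = 1000): Promise<number> {
  return new Promise((resolve) => {
    let frames = 0;
    const start = performance.now();
    const tick = (now: number) => {
      frames += 1;
      if (now - start < durationMs) {
        requestAnimationFrame(tick);
      } else {
        resolve((frames * 1000) / (now - start));
      }
    };
    requestAnimationFrame(tick);
  });
}

// Usage while scrolling/animating:
// sampleFps().then((fps) => console.log(`~${fps.toFixed(0)} fps`));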

Tools for measuring:

# Browser DevTools
- Performance tab recording
- Memory profiler
- Network waterfall

# React specific
- React DevTools Profiler
- why-did-you-render

# Build size
- webpack-bundle-analyzer
- source-map-explorer
2. Identify the Bottleneck (15-20 min)

Find the actual slow part (not what you think is slow)

Profile analysis checklist:

  • Look for long tasks (>50ms)
  • Check for layout thrashing
  • Identify unnecessary re-renders
  • Find memory leaks
  • Analyze network waterfall
// Common bottlenecks and how to spot them

// 1. Rendering too many items
<Timeline entries={allEntries} /> // 10,000 items = slow

// 2. Expensive calculations in render
const filtered = entries.filter(complexFilter); // Runs every render

// 3. Large bundle size
import { entire } from 'huge-library'; // Imports 500KB for one function

// 4. Synchronous operations
const data = readFileSync('large.json'); // Blocks main thread
3. Form Hypothesis (5-10 min)

Based on data, not assumptions

Example hypotheses:

  • "Timeline is slow because we render all 10K items instead of virtualizing"
  • "Bundle is large because we import entire lodash instead of specific functions"
  • "Re-renders happen because we create new objects in render"

Write your hypothesis down:

**Hypothesis**: Git status component re-renders 60 times/second because 
file watcher events create new arrays even when content hasn't changed.

**Expected fix**: Memoize the git status array to prevent unnecessary renders.

**Success criteria**: Reduce re-renders to only when files actually change.
4. Implement Fix (20-30 min)

One change at a time

// Before: Creates new array every time
const gitStatus = files.map(f => ({
  ...f,
  status: getGitStatus(f.path)
}));

// After: Memoized to prevent re-renders
const gitStatus = useMemo(
  () => files.map(f => ({
    ...f,
    status: getGitStatus(f.path)
  })),
  [files, lastGitUpdate] // Only recalc when these change
);

Common optimization patterns:

  • Virtualization: Only render visible items
  • Memoization: Cache expensive calculations
  • Code splitting: Load code when needed
  • Debouncing: Batch rapid updates (see the hook sketch below)
  • Web Workers: Move heavy compute off main thread
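
As one example, debouncing can be wrapped in a small hook so rapid updates (keystrokes, file-watcher events) only trigger work after a pause. A sketch; the hook name and delay are illustrative:

import { useEffect, useState } from 'react';

// Returns the value only after it has stopped changing for `delayMs`
function useDebouncedValue<T>(value: T, delayMs = 250): T {
  const [debounced, setDebounced] = useState(value);

  useEffect(() => {
    const timer = setTimeout(() => setDebounced(value), delayMs);
    return () => clearTimeout(timer); // reset the timer on every change
  }, [value, delayMs]);

  return debounced;
}

// Usage: filter against debouncedQuery instead of searchQuery
// const debouncedQuery = useDebouncedValue(searchQuery, 250);
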
5. Measure Impact (10-15 min)

Did it actually help?

// Same measurement as baseline
performance.mark('timeline-render-start-optimized');
renderTimeline(entries);
performance.mark('timeline-render-end-optimized');

// Compare
console.log(`Before: ${baselineDuration}ms`);
console.log(`After: ${optimizedDuration}ms`);
console.log(`Improvement: ${(1 - optimizedDuration/baselineDuration) * 100}%`);

Decision matrix:

  • >20% improvement: Keep it
  • 5-20% improvement: Keep if code isn't too complex
  • <5% improvement: Revert unless code is simpler
  • Made it worse: Definitely revert
6. Document & Iterate (5-10 min)

Record what worked (and what didn't)

// Add performance documentation
/**
 * Timeline renders 10K+ items efficiently using virtualization.
 * 
 * Performance characteristics:
 * - Initial render: <100ms for any number of items
 * - Scroll: 60fps maintained
 * - Memory: O(visible items) not O(total items)
 * 
 * See PERF_OPTIMIZATION_LOG.md for details
 */

Success Metrics

  • ✅ Measurable performance improvements (with numbers)
  • ✅ User-perceived speed increases
  • ✅ Resource usage reduction
  • ✅ No functionality regression

When to Use

  • App feels slow
  • CI takes too long
  • Memory usage high
  • Resource constraints
  • Before major releases

Performance Budget Template

// performance.budget.js
module.exports = {
  bundles: [{
    name: 'main',
    maxSize: '200KB'
  }],
  
  metrics: {
    'first-contentful-paint': 1000,    // 1 second
    'time-to-interactive': 3000,       // 3 seconds
    'total-blocking-time': 300,        // 300ms
  },
  
  resourceCounts: {
    script: 10,
    stylesheet: 5,
    image: 50,
  }
};
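
The budget file above is a convention for your own checks rather than a specific tool's format. If you want CI to enforce a bundle budget automatically, one option (assuming you add size-limit and a preset such as @size-limit/file as dev dependencies) looks roughly like this:

// package.json (sketch)
{
  "scripts": {
    "size": "size-limit"
  },
  "size-limit": [
    { "path": "dist/main.js", "limit": "200 KB" }
  ]
}

// Fail the build when the budget is exceeded:
// yarn size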

Template for Your Performance Optimization Log

# Performance Optimization Log

## Optimization #[N]: [What you're optimizing]
**Date**: YYYY-MM-DD
**Trigger**: [What made you investigate]

### 1. Baseline Metrics
- **Metric**: [Value] (method used)
- **User impact**: [What users experience]

### 2. Bottleneck Analysis  
**Profile findings**:
- [What profiler showed]
- [Root cause identified]

### 3. Hypothesis
**Theory**: [What you think will help]
**Expected improvement**: [Quantified goal]

### 4. Implementation
- [Code before]
+ [Code after]  

### 5. Results
- **Metric**: [New value] ([X]% improvement)
- **Trade-offs**: [Any downsides]

### 6. Lessons Learned
- [Key insight about performance]

🎨 The UI Improvement Loop

📁 Copy to: UI_IMPROVEMENT_LOOP.md

Goal: Continuous UI enhancement through autonomous AI analysis
Cycle Time: 30-60 minutes per iteration

The UI Improvement Loop empowers your AI to proactively identify UI issues, generate user stories, create tests, and suggest improvements. Instead of waiting for user complaints, your AI continuously analyzes the interface and drives improvements. This creates a virtuous cycle where every iteration makes the UI more intuitive.

Key difference: The AI doesn't wait for you to ask—it actively hunts for UI improvements and presents them with tests and implementations ready to go.

The Loop Steps

1. AI Analyzes Current UI (5-10 min)

Give your AI broad access to analyze the interface:

Human: "Analyze our Timeline component for UX improvements. Here's the current state: [screenshot/code]"

AI: "I've identified these improvement opportunities:

  1. Accessibility: No keyboard navigation for timeline entries
  2. Visual Hierarchy: All entries look the same, recent changes not emphasized
  3. Information Density: Users must click to see any file details
  4. Performance: Renders all entries even when 1000+ items

Let me create user stories for each..."

2. AI Generates User Stories (5-10 min)

The AI creates stories from a user perspective:

// AI generates without prompting:
const userStories = [
  {
    story: "As a developer reviewing changes, I want to see the most recent changes highlighted so I can quickly identify what's new",
    acceptance: [
      "Recent items (< 1 hour) have visual emphasis",
      "Today's items are grouped together",
      "Older items are progressively de-emphasized"
    ],
    testCase: "timeline-highlight-recent.test.ts"
  },
  {
    story: "As a keyboard user, I want to navigate timeline entries without a mouse so I can maintain my flow",
    acceptance: [
      "Tab key moves between entries",
      "Enter opens the selected file",
      "Escape closes any open previews"
    ],
    testCase: "timeline-keyboard-nav.test.ts"
  }
];
3. AI Creates Tests First (10-15 min)

Before any implementation, AI writes comprehensive tests:

// AI generates: timeline-highlight-recent.test.ts
describe('Timeline Recent Highlighting', () => {
  it('highlights entries from the last hour', () => {
    const recentEntry = { 
      timestamp: Date.now() - 30 * 60 * 1000, // 30 min ago
      file: 'recent.ts' 
    };
    const oldEntry = { 
      timestamp: Date.now() - 25 * 60 * 60 * 1000, // 25 hours ago
      file: 'old.ts' 
    };
    
    render(<Timeline entries={[recentEntry, oldEntry]} />);
    
    const recent = screen.getByText('recent.ts').closest('.timeline-entry');
    const old = screen.getByText('old.ts').closest('.timeline-entry');
    
    expect(recent).toHaveClass('recent-highlight');
    expect(old).not.toHaveClass('recent-highlight');
  });
  
  it('groups today\'s entries together', () => {
    // Test implementation
  });
});
4. AI Implements Improvements (15-20 min)

With tests in place, AI provides implementation:

// AI provides implementation
const TimelineEntry = ({ entry }) => {
  const isRecent = Date.now() - entry.timestamp < 60 * 60 * 1000;
  const isToday = isDateToday(entry.timestamp);
  
  return (
    <div 
      className={classNames('timeline-entry', {
        'recent-highlight': isRecent,
        'today-entry': isToday,
        'older-entry': !isToday
      })}
      tabIndex={0}
      onKeyDown={(e) => {
        if (e.key === 'Enter') {
          onSelectEntry(entry);
        }
      }}
    >
      {isRecent && <span className="new-badge">NEW</span>}
      <TimeAgo timestamp={entry.timestamp} />
      <FileName path={entry.file} />
    </div>
  );
};
5. AI Reviews Own Implementation (5-10 min)

The AI critiques its own work from a user perspective:

AI: "Review of my implementation: ✅ Keyboard navigation works well ✅ Visual hierarchy is clearer ⚠️ Color contrast on 'NEW' badge might be insufficient ⚠️ No transition animation when items age out of 'recent' ❌ Missed opportunity: Could add hover preview of file changes

Shall I iterate on these issues?"

6. Document and Loop (5 min)

AI documents what was learned and identifies next improvements:

## Iteration Summary
**Problem**: Timeline lacked visual hierarchy and accessibility
**Solution**: Added recency highlighting and keyboard navigation
**Impact**: 
- Keyboard users can now navigate without mouse
- 73% faster to identify recent changes (measured)

**Next Opportunities**:
1. Add hover previews for quick file inspection
2. Implement virtual scrolling for performance
3. Add filtering by file type/author

Success Metrics

  • ✅ UI issues found and fixed before user reports
  • ✅ Accessibility score improvements
  • ✅ Reduced time to complete common tasks
  • ✅ Each iteration has tests and documentation

When to Use

  • Weekly UI review sessions
  • After shipping new features
  • When metrics show user friction
  • Before major releases

Setting Up Autonomous UI Improvement

1. Give AI Context

// UI_CONTEXT.js - give this to your AI
export const UI_PRINCIPLES = {
  accessibility: {
    keyboardNav: "All interactive elements must be keyboard accessible",
    screenReaders: "Use semantic HTML and ARIA labels",
    colorContrast: "WCAG AA minimum"
  },
  performance: {
    renderTime: "<100ms for user interactions",
    bundleSize: "Keep components under 50KB"
  },
  patterns: {
    feedback: "Every action needs immediate visual feedback",
    errors: "Show errors inline, not in modals",
    loading: "Skeleton screens over spinners"
  }
};

2. Schedule Regular Reviews

# Weekly AI UI review
"Review all components modified this week for UI improvements"
"Focus on accessibility and performance"
"Generate at least 3 improvement stories with tests"

Template for Your UI Improvement Log

# UI Improvement Loop Log

## Overview
Continuous UI enhancement through AI-driven analysis and testing.

## Week of [Date]

### Components Analyzed
- [Component 1]
- [Component 2]

### Iteration 1: [Component/Feature]
**AI-Identified Issues**:
1. [Issue with severity]
2. [Issue with severity]

**Generated User Story**:
"As a [user], I want [improvement] so that [benefit]"

**Test Coverage**:
- [Test file created]
- [Scenarios covered]

**Implementation Status**:
- [x] Tests written
- [x] Implementation complete  
- [ ] Deployed to users

**Metrics**:
- Before: [Measurement]
- After: [Measurement]
- Improvement: [Percentage]

### AI Learning Notes
- [Pattern identified for future improvements]
- [New test case type to include]

### Next Iteration Focus
- [What AI will analyze next]

🧪 The Test-Driven Feature Loop

📁 Copy to: TDD_FEATURE_LOOP.md

Goal: AI builds features test-first with minimal human input
Cycle Time: 20-30 minutes per iteration

In this loop, your AI takes a feature request and autonomously cycles through TDD, asking only for approval at key checkpoints. The human's role is simply to confirm direction and approve implementations.

How to Start the Loop

Human: "Add search functionality to the file explorer"

AI: "I'll build this using TDD. Let me start by writing tests for the search feature..."

[AI enters the loop]

The Autonomous Loop Steps

1. AI Writes Failing Test

AI: "First, I'll write a test for basic search functionality:

describe('FileExplorer Search', () => {
  it('filters files by search query', () => {
    const files = [
      { name: 'Button.tsx', path: '/src/Button.tsx' },
      { name: 'Input.tsx', path: '/src/Input.tsx' },
      { name: 'App.tsx', path: '/App.tsx' }
    ];
    
    const explorer = render(<FileExplorer files={files} />);
    const searchBox = explorer.getByPlaceholderText('Search files...');
    
    fireEvent.change(searchBox, { target: { value: 'button' } });
    
    expect(explorer.getByText('Button.tsx')).toBeVisible();
    expect(explorer.queryByText('Input.tsx')).not.toBeInTheDocument();
  });
});

This test will fail. Shall I proceed with implementation?"

Human: "Yes, go ahead"

2. AI Implements Minimal Solution

AI: "Now I'll implement just enough to make the test pass:

const FileExplorer = ({ files }) => {
  const [searchQuery, setSearchQuery] = useState('');
  
  const filteredFiles = files.filter(file => 
    file.name.toLowerCase().includes(searchQuery.toLowerCase())
  );
  
  return (
    <div>
      <input 
        placeholder="Search files..."
        onChange={(e) => setSearchQuery(e.target.value)}
      />
      {filteredFiles.map(file => (
        <div key={file.path}>{file.name}</div>
      ))}
    </div>
  );
};

Test is now passing. Ready for refactoring?"

Human: "Yes"

3. AI Refactors and Adds Next Test

AI: "I'll refactor for better performance and add the next test:

// Refactored with useMemo for performance
const filteredFiles = useMemo(
  () => files.filter(file => 
    file.name.toLowerCase().includes(searchQuery.toLowerCase())
  ),
  [files, searchQuery]
);

// Next test for regex search
it('supports regex search patterns', () => {
  const explorer = render(<FileExplorer files={files} />);
  const searchBox = explorer.getByPlaceholderText('Search files...');
  
  fireEvent.change(searchBox, { target: { value: '/\\.tsx?$/' } });
  
  expect(explorer.getByText('Button.tsx')).toBeVisible();
  expect(explorer.getByText('App.tsx')).toBeVisible();
  // .js files would not be visible
});

This test fails. Should I implement regex support?"

4. AI Continues Autonomously

The AI continues cycling:

  • Identifies what tests are missing
  • Writes the next logical test
  • Implements to pass
  • Refactors for quality
  • Asks for human approval only at major decision points

What Makes This Loop Autonomous

  1. AI Decides Next Steps: Based on the feature request, AI determines what tests to write
  2. AI Evaluates Completeness: Knows when the feature is "done enough"
  3. Minimal Human Input: Human just approves direction, not implementation details
  4. AI Tracks Progress: Maintains a mental model of what's built and what's needed

Starting Prompts for Autonomous TDD

"Build [feature] using TDD. Write comprehensive tests first." "Add [capability] to [component]. Start with failing tests." "Refactor [module] using TDD. Don't break existing functionality."

When AI Should Check In

  • Major architectural decisions
  • External dependency choices
  • Performance vs simplicity tradeoffs
  • When tests reveal ambiguous requirements

Template for AI's TDD Progress

## TDD Progress: [Feature]

### Completed Tests
- ✅ Basic search functionality
- ✅ Case-insensitive search
- ✅ Empty state handling

### Current Test
// Working on: Regex pattern support

### Next Tests Planned
- Search highlighting
- Search history
- Fuzzy matching

### Questions for Human
- Should search include file contents or just names?

🔍 The Code Quality Loop

📁 Copy to: CODE_QUALITY_LOOP.md

Goal: AI continuously improves code quality without prompting
Cycle Time: 10-20 minutes per iteration

In this loop, your AI proactively scans the codebase, identifies quality issues, and fixes them systematically. The human just approves the improvements. This prevents technical debt from accumulating while you focus on features.

How to Start the Loop

Human: "Run a code quality sweep on the components folder"

AI: "I'll analyze code quality and fix issues systematically. Starting scan..."

[AI enters autonomous loop]

The Autonomous Loop Steps

1. AI Runs Comprehensive Checks

AI: "Running quality analysis on /components:

# Type checking
✗ 12 type errors found
  - Button.tsx: Missing return type (line 23)
  - Modal.tsx: Implicit any parameter (line 45)
  
# Linting
✗ 8 linting issues
  - Unused imports: 5 files
  - Missing deps in useEffect: 3 instances

# Complexity analysis  
⚠ 2 functions exceed complexity threshold
  - FormValidator.validate(): complexity 15 (max: 10)
  - DataTable.render(): 127 lines (max: 80)

# Security audit
✗ 1 vulnerable dependency
  - lodash@4.17.19 has a known vulnerability

Starting with type errors. Shall I proceed?"

Human: "Yes, fix them"

2. AI Fixes Issues Systematically

AI: "Fixing type errors:

// Button.tsx - Before
const Button = ({ onClick, children }) => { // implicit any

// Button.tsx - After  
interface ButtonProps {
  onClick: () => void;
  children: React.ReactNode;
}
const Button: React.FC<ButtonProps> = ({ onClick, children }) => {

✅ Fixed 12/12 type errors

Moving to linting issues..."

3. AI Shows Progress and Continues

AI: "Linting fixes:

  • Removed 5 unused imports
  • Added missing useEffect dependencies
  • Fixed 8/8 linting issues

Now addressing complexity issues. The FormValidator.validate function is too complex. I'll split it:

// Before: Single 50-line function with complexity 15
// After: Split into 5 focused functions
const validateRequired = (value: unknown): ValidationError | null => {...}
const validateFormat = (value: string, format: RegExp): ValidationError | null => {...}
const validateLength = (value: string, min: number, max: number): ValidationError | null => {...}

Shall I continue with the security vulnerability?"

4. AI Handles Dependencies

AI: "Security fix needed:

// package.json update
- "lodash": "4.17.19",
+ "lodash": "4.17.21",

This fixes the prototype pollution vulnerability. Running npm audit to confirm...

✅ All vulnerabilities resolved

Quality sweep complete! Summary:

  • Type errors: 12 → 0
  • Linting issues: 8 → 0
  • High complexity: 2 → 0
  • Security vulnerabilities: 1 → 0"

What Makes This Loop Autonomous

  1. AI Identifies All Issues: Runs multiple analysis tools
  2. Prioritizes Fixes: Tackles errors before warnings
  3. Shows Clear Progress: Human sees what's being fixed
  4. Completes Full Cycle: Doesn't stop until quality standards met

Starting Prompts

"Run code quality analysis on [folder/file]" "Fix all TypeScript errors in the codebase" "Reduce complexity in [module]" "Security audit and fix vulnerabilities"

AI Quality Checklist

The AI tracks:

  • TypeScript: strict: true, zero errors
  • Linting: Zero errors/warnings
  • Complexity: All functions under threshold (see the ESLint sketch below)
  • Security: No known vulnerabilities
  • Test Coverage: Maintained or improved
  • Bundle Size: No unexpected increases
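
One way to keep the complexity items above enforceable rather than aspirational is ESLint's built-in rules. A sketch in legacy .eslintrc.js format; the thresholds mirror the limits mentioned earlier (complexity 10, 80 lines):

// .eslintrc.js (sketch)
module.exports = {
  rules: {
    // Cyclomatic complexity ceiling per function
    complexity: ['error', 10],
    // Guard against very long functions
    'max-lines-per-function': ['error', { max: 80, skipBlankLines: true, skipComments: true }],
  },
};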

Template for Your Code Quality Log

# Code Quality Log

## Sweep #[N]: [Date]
**Scope**: [What was analyzed]

### Issues Found
- Type errors: [count]
- Lint issues: [count]
- Complexity violations: [count]
- Security issues: [count]

### Fixes Applied
[List of significant changes]

### Metrics
- Time taken: [duration]
- Files modified: [count]
- Quality score: [before] → [after]

### Patterns Noticed
- [Common issue that keeps appearing]
- [Suggested preventive measure]

🎯 Loop Selection Strategy

Choose your loop based on current priorities:

🔥 High Priority Loops

  1. CI Green Loop - Always maintain green CI
  2. Bug Investigation Loop - Fix issues systematically
  3. User Story Loop - Build what users actually need

📈 Medium Priority Loops

  1. Performance Loop - When app feels slow
  2. UI Improvement Loop - Improve daily-use features
  3. Mobile UX Loop - Ensure cross-platform quality

🛠️ Maintenance Loops

  1. Test-Driven Loop - When building complex features
  2. Code Quality Loop - Regular maintenance
  3. Bug Investigation Loop - Turn bugs into type improvements

🚀 Meta-Loop: AI Process Optimization

📁 Copy to: META_LOOP.md

Goal: AI analyzes your development patterns and optimizes your loops
Cycle Time: Weekly automated analysis

In the Meta-Loop, your AI becomes a process engineer. It analyzes your development patterns, measures loop effectiveness, and suggests concrete improvements. Instead of manual reflection, the AI presents data-driven insights and actionable changes.

How to Start the Loop

Human: "Analyze my development loops from this week"

AI: "I'll analyze your commit history, CI runs, and loop logs to identify optimization opportunities..."

[AI enters analysis mode]

The Autonomous Loop Steps

1. AI Gathers Loop Metrics

AI: "Analyzing development patterns from the past week:

Loop Usage Statistics

  • CI Green Loop: 12 iterations (avg 18 min, target 15 min)
  • Bug Investigation: 5 iterations (avg 38 min, target 45 min)
  • UI Improvement: 3 iterations (avg 45 min, target 60 min)
  • Performance Loop: 1 iteration (took 2 hours, target 90 min)

Bottleneck Analysis

🔴 CI Green Loop exceeding target by 20%

  • TypeScript compilation: 8 min (44% of time)
  • Test execution: 7 min (39% of time)
  • Lint/format: 3 min (17% of time)

Shall I analyze deeper?"

Human: "Yes, focus on the CI slowdown"

2. AI Identifies Root Causes

AI: "Deep analysis of CI performance:

TypeScript Compilation Bottleneck

  • Not using incremental compilation
  • Type checking test files unnecessarily
  • Missing project references

Test Execution Issues

  • Running all tests even for small changes
  • No test parallelization
  • Docker setup/teardown for each test suite

Immediate Optimization Opportunities

  1. Enable TypeScript incremental builds (save ~3 min)
  2. Parallelize test execution (save ~2-3 min)
  3. Cache Docker layers properly (save ~2 min)

Expected improvement: 18 min → 11 min (39% faster)

Shall I implement these optimizations?"

3. AI Implements Process Improvements

AI: "Implementing CI optimizations:

// tsconfig.json
{
  "compilerOptions": {
    "incremental": true,
    "tsBuildInfoFile": ".tsbuildinfo"
  }
}
# .github/workflows/ci.yml
jobs:
  test:
    strategy:
      matrix:
        shard: [1, 2, 3, 4]
    steps:
      - run: yarn test --shard=${{ matrix.shard }}/4
# Dockerfile with better caching
FROM node:18 AS deps
COPY package*.json ./
RUN npm ci --only=production

Changes implemented. Next CI run will show improvements."

4. AI Suggests New Loop Patterns

AI: "Based on this week's patterns, I suggest:

New Loop: Morning Standup Loop (5 min)

  1. AI summarizes overnight CI failures
  2. Lists PRs ready for review
  3. Identifies blocked work
  4. Suggests priority for the day

Enhanced Bug Loop

  • Add automatic type strengthening step
  • Create regression test template
  • Link similar past bugs

Loop Transition Triggers

  • If CI red for >30 min → Switch to CI Green Loop
  • If 3+ similar bugs → Create Bug Pattern Loop
  • If performance degrades 20% → Trigger Performance Loop

Would you like me to set up these automated triggers?"

What Makes This Loop Autonomous

  1. Data-Driven Analysis: AI uses real metrics, not guesses
  2. Root Cause Investigation: Goes beyond symptoms to find causes
  3. Implements Solutions: Doesn't just suggest—it does
  4. Learns Patterns: Identifies when to trigger specific loops

Weekly Meta-Loop Triggers

"Analyze my loop performance this week" "Which loops are slowing me down?" "Optimize my development process" "What patterns do you see in my workflow?"

Template for Your Meta-Loop Analysis

# Meta-Loop Analysis: Week of [Date]

## Loop Performance
| Loop | Uses | Avg Time | Target | Status |
|------|------|----------|--------|--------|
| CI Green | 12 | 18 min | 15 min | 🔴 |
| Bug Fix | 5 | 38 min | 45 min | 🟢 |

## Optimizations Implemented
1. [Optimization]: [Time saved]
2. [Optimization]: [Impact]

## Process Improvements
- [New pattern discovered]
- [Automation added]

## Next Week Focus
- [Which loops need attention]
- [Tools to investigate]

💡 Key Concepts

📁 Copy to: KEY_CONCEPTS.md

1. Make Bugs Impossible, Not Just Unlikely

// Before: Bugs are possible
const item = items[index].name; // 💥 Crashes if index out of bounds

// After: Bugs are impossible
// With noUncheckedIndexedAccess: true in tsconfig.json
const item = items[index]?.name; // TypeScript FORCES you to handle undefined

💡 Note: The TypeScript flag noUncheckedIndexedAccess was controversial because it requires handling undefined cases, but it prevents an entire category of runtime errors.

2. Turn Race Conditions into Deterministic Bugs

// This test WILL find your race condition and give you a seed to reproduce it
it('detects race conditions deterministically', async () => {
  await fc.assert(
    fc.asyncProperty(fc.scheduler(), async (s) => {
      // Your async operations here
      // Fast-check will try EVERY possible execution order
    })
  );
  // Output: "Failed with seed: 1337" - now you can debug deterministically!
});

With deterministic testing, race conditions become as debuggable as simple logic errors. You can reproduce the exact failure case consistently.
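
A more concrete version of the template above might look like this. It is a sketch assuming fast-check and Vitest; the cache shape and keys are illustrative, not from the original:

import fc from 'fast-check';
import { expect, it } from 'vitest';

it('explores interleavings of two cache writes', async () => {
  await fc.assert(
    fc.asyncProperty(fc.scheduler(), async (s) => {
      const cache = new Map<string, string>();

      // Wrap each async write so fast-check decides when it resolves
      const writeA = s.schedule(Promise.resolve('A')).then((v) => cache.set('key', v));
      const writeB = s.schedule(Promise.resolve('B')).then((v) => cache.set('key', v));

      await s.waitAll(); // release scheduled tasks in the order fast-check picks
      await Promise.all([writeA, writeB]);

      // Invariant that must hold under every ordering
      expect(['A', 'B']).toContain(cache.get('key'));
    })
  );
  // On failure, fast-check prints a seed you can replay deterministically
});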

3. AI as Your Fresh-Eyes Test User

Your AI functions as an always-available test user, processing UI patterns and interactions without the muscle memory or assumptions that human users develop. This transforms how you think about UI development.

// In the UI Improvement Loop:

Human: "Look at this screenshot of our new feature"

AI: "I notice the 'Submit' button is grayed out but there's no indication why. Users might think it's broken."

Human: "Good catch! What else do you see?"

AI: "The error message appears 200px below the form. On mobile, users would need to scroll to see why their submission failed."

💡 Advantage: AI doesn't develop muscle memory for your UI quirks, so it can consistently spot usability issues you've gotten used to.

4. Building Context Through Documentation

By maintaining living specs, your AI assistant has access to your system's architecture, past decisions, and design patterns. Each documented bug fix provides patterns the AI can reference to prevent similar issues.

# In PROJECT_SPEC.md
## Decision: Use Event Sourcing (2024-01-15)
**Why**: Need audit trail and time-travel debugging
**Trade-off**: More complex, but provides complete history
**Revisit**: When we reach 1M events/day

# AI now knows this context for all future suggestions

5. The 15-Minute CI Rule

The "CI must be green" practice has roots in manufacturing quality control. When CI takes longer than 15 minutes, developers stop running it, and quality suffers. Fast CI enables tight feedback loops.

6. Every Bug Should Improve Your Types

// Bug: User with null email crashed the system
// Don't just fix it - make it impossible:

// Before
type User = {
  email: string | null;
  name: string;
}

// After - use branded types
type VerifiedEmail = string & { _brand: 'VerifiedEmail' };
type User = {
  email: VerifiedEmail; // Can't be null, must be verified
  name: string;
}

Quick test: You can check how many potential array access bugs exist in your code:

echo '{"compilerOptions":{"noUncheckedIndexedAccess":true}}' > tsconfig.strict.json
npx tsc --project tsconfig.strict.json --noEmit

Each error represents a potential runtime crash.


⚡ Quick Wins: Implement These in Under 5 Minutes

📁 Copy to: QUICK_WINS.md

Before diving deep, here are changes you can make RIGHT NOW that will immediately improve your code:

1. Enable TypeScript's Strictest Setting

When to use this: At project start or when onboarding to an existing codebase. This is your preventative baseline—enable it once and avoid entire categories of bugs from day one. Unlike bug-driven improvements (which react to problems), this proactive setting stops bugs before they're written.

// Add to tsconfig.json
{
  "compilerOptions": {
    "noUncheckedIndexedAccess": true  // Prevents 90% of "Cannot read property of undefined"
  }
}

2. Set Up AI-Friendly Development

# Create a CLAUDE.md file for your AI assistant
echo "# Project Context for AI

## Key Decisions
- We use yarn, not npm
- We use Vitest, not Jest
- All arrays must be accessed safely

## Current Focus
- [Add your current task here]
" > CLAUDE.md

3. Install This Pre-Commit Hook

#!/bin/sh
# Save as .git/hooks/pre-commit and make it executable: chmod +x .git/hooks/pre-commit
yarn typecheck && yarn lint || {
  echo "❌ Fix type/lint errors before committing"
  exit 1
}

4. Add VS Code Quick Fix Settings

// .vscode/settings.json
{
  "editor.formatOnSave": true,
  "editor.codeActionsOnSave": {
    "source.fixAll.eslint": true
  },
  "typescript.preferences.includePackageJsonAutoImports": "on"
}

5. Create Your First Operating Loop Tracker

# Today's Loops (add to your README)
- [ ] Morning: CI Green Loop (get everything passing)
- [ ] Feature: User Story Loop (what are we building?)
- [ ] Afternoon: UI Improvement Loop (get AI feedback on screenshots)
- [ ] Evening: Code Quality Loop (clean up for tomorrow)

These five changes take minutes to implement but provide immediate value in your vibe coding workflow.


🌶️ Strong Opinions

📁 Copy to: STRONG_OPINIONS.md

1. Delete All .skip() Tests Immediately

A skipped test is worse than no test: it gives false confidence and hides broken functionality. If a test is skipped, fix it right now; delete it only if it genuinely tests nothing valuable. No exceptions.

// This is a lie to yourself and your team
it.skip('should handle user logout', () => {
  // "TODO: fix this later" = never
});

2. Your AI Should Have Commit Access

Not to main branch, but to feature branches. Let your AI fix lint errors, update tests, and make small improvements directly. Review the commits, but let it work autonomously for mechanical tasks.

3. 100% Code Coverage is the Bare Minimum

When using AI to write code, you need strong verification methods. 100% coverage (excluding explicitly unreachable code) is your baseline, not your goal. It's the foundation that lets you confidently accept AI-generated code and refactor rapidly.

// 100% coverage is necessary but not sufficient
test('user service', () => {
  const user = new UserService();
  user.getUser('123'); // No assertions!
  expect(true).toBe(true); // This passes coverage but tests nothing
});

// Real testing goes beyond coverage
test('user service handles all edge cases', () => {
  // Property-based tests
  // Race condition tests  
  // Error scenarios
  // Performance boundaries
});

Why this matters in vibe coding: When AI writes most of your code, you need comprehensive tests to verify behavior. 100% coverage is your safety net.

4. UI Tests Are More Important Than Unit Tests

Users interact with UI, not your perfectly isolated functions. A working UI with poor unit tests ships value. Perfect unit tests with broken UI ships nothing.

5. Comments Are Usually a Code Smell

If you need a comment to explain what code does, the code is too complex. The only good comments explain WHY, not WHAT.

// Bad: explains what
// Increment the counter by one
counter++;

// Good: explains why
// We retry 3 times because the API has intermittent failures on Mondays
const MAX_RETRIES = 3;

These practices may seem extreme, but they address real problems in modern AI-assisted development where you need strong guardrails to maintain quality at speed.


💥 Lessons from Production

📁 Copy to: PRODUCTION_LESSONS.md

The Most Common Production Failures (And How to Prevent Them)

1. Array Access Errors (40% of crashes)

The Problem: Cannot read property 'x' of undefined when accessing array elements

Prevention Checklist:

// ❌ BAD: Causes crashes in production
const firstItem = items[0].name; // Crashes if array is empty
const userId = match[1].trim(); // Crashes if regex doesn't match

// ✅ GOOD: Safe array access patterns
const firstItem = items[0]?.name ?? 'Default';
const userId = match?.[1]?.trim() ?? '';

// Enable TypeScript protection
// tsconfig.json:
{
  "compilerOptions": {
    "noUncheckedIndexedAccess": true // Forces you to handle undefined
  }
}

2. Race Conditions in Async Operations (25% of crashes)

The Problem: Multiple async operations updating the same state

Prevention Checklist:

// ❌ BAD: Race condition waiting to happen
let cache = {};
async function fetchData(id) {
  const data = await api.get(id);
  cache[id] = data; // What if two calls happen simultaneously?
}

// ✅ GOOD: Use proper async state management
const cache = new Map();
const pending = new Map();

async function fetchData(id) {
  // Check if already fetching
  if (pending.has(id)) {
    return pending.get(id);
  }
  
  // Start fetch and store promise
  const promise = api.get(id).then(data => {
    cache.set(id, data);
    pending.delete(id);
    return data;
  });
  
  pending.set(id, promise);
  return promise;
}

3. Performance Cliffs with Real Data (20% of incidents)

The Problem: Code works fine with 10 items, crashes with 10,000

Prevention Checklist:

// Add performance tests with realistic data volumes
test('handles large datasets efficiently', () => {
  const items = Array.from({ length: 10000 }, (_, i) => ({
    id: i,
    name: `Item ${i}`,
    data: generateRealisticData()
  }));
  
  const start = performance.now();
  const result = processItems(items);
  const duration = performance.now() - start;
  
  expect(duration).toBeLessThan(1000); // Should complete in under 1 second
  expect(result).toHaveLength(10000);
});

// Use pagination and virtualization
// ✅ GOOD: Only render visible items
<VirtualList
  items={items}
  height={600}
  itemHeight={50}
  renderItem={renderItem}
/>

4. CI/Production Environment Differences (15% of incidents)

The Problem: Works locally, fails in production

Prevention Checklist:

# .github/workflows/ci.yml
# Match production environment exactly
jobs:
  test:
    runs-on: ubuntu-latest
    container:
      image: node:18-alpine # Same as production
    env:
      NODE_ENV: production
      TZ: UTC # Same timezone as production
    steps:
      - uses: actions/checkout@v3
      - run: npm ci --production=false
      - run: npm test
      - run: npm run build
      - run: npm run test:production # Run tests against built code

Action Items for Your Codebase

  1. Enable TypeScript Strict Mode Today:

    // Add to tsconfig.json under "compilerOptions"
    "strict": true,
    "strictNullChecks": true,
    "noUncheckedIndexedAccess": true
  2. Add Production-Like Tests:

    // tests/production-scenarios.test.ts
    describe('Production Scenarios', () => {
      test('handles empty arrays safely', () => {
        expect(() => processItems([])).not.toThrow();
      });
      
      test('handles concurrent requests', async () => {
        const promises = Array.from({ length: 100 }, (_, i) => 
          fetchUser(i)
        );
        const results = await Promise.all(promises);
        expect(results).toHaveLength(100);
      });
    });
  3. Add Monitoring for Common Issues:

    // monitoring.ts
    window.addEventListener('error', (event) => {
      if (event.message.includes('Cannot read property')) {
        // This is likely an array access error
        reportToMonitoring({
          type: 'array_access_error',
          message: event.message,
          stack: event.error?.stack
        });
      }
    });

🌍 Universal Practices (All Languages)

📁 Copy to: UNIVERSAL_PRACTICES.md

🎯 Core Philosophy

Always Fix, Never Delete

When encountering broken code, fix it rather than deleting it. That broken code often contains valuable business logic and edge case handling that took time to develop.

How to Apply This Principle

  1. When Tests Fail:

    # ❌ WRONG: Don't skip or delete the test
    test.skip('user authentication', () => {...})
    
    # ✅ RIGHT: Fix the underlying issue
    # Step 1: Understand why it's failing
    npm test -- --verbose auth.test.ts
    
    # Step 2: Check recent changes
    git log -p -- src/auth/
    
    # Step 3: Fix the root cause, not the symptom
  2. When Code Seems Complex:

    // ❌ WRONG: Deleting because it's hard to understand
    // delete this entire complex validation function
    
    // ✅ RIGHT: Refactor while preserving behavior
    // Step 1: Write tests to document current behavior
    test('complex validation preserves all business rules', () => {
      // Test each edge case the complex code handles
    });
    
    // Step 2: Refactor incrementally
    // Step 3: Ensure all tests still pass
  3. When Features Seem Unused:

    # Before removing any feature:
    # 1. Check usage analytics
    # 2. Search for references
    grep -r "featureName" src/
    
    # 3. Check commit history for context
    git log --all --grep="featureName"
    
    # 4. Ask stakeholders before removing
  4. AI Assistant Instructions:

    # Add to your AI instructions (e.g., CLAUDE.md):
    - ALWAYS FIX, NEVER DELETE
    - When encountering errors, fix at root cause
    - NO SHORTCUTS - Don't skip tests or remove functionality
    - MAINTAIN FUNCTIONALITY - All features must continue working

This principle becomes especially important in vibe coding where your AI assistant might suggest removing complex code rather than analyzing and fixing it.

Vibe Coding Needs Guardrails

When you're coding through conversation with AI, you need strong safety nets to verify the generated code:

Setting Up Your Guardrails

  1. 100% Code Coverage Setup:

    // package.json
    {
      "scripts": {
        "test:coverage": "vitest --coverage",
        "test:coverage:enforce": "vitest --coverage --coverage.threshold.lines=100"
      }
    }
    // vitest.config.ts
    export default {
      test: {
        coverage: {
          provider: 'v8',
          reporter: ['text', 'html', 'lcov'],
          exclude: ['**/*.config.*', '**/*.d.ts'],
          thresholds: {
            lines: 100,
            functions: 100,
            branches: 100,
            statements: 100
          }
        }
      }
    }
  2. CI Always Green Enforcement:

    # .github/workflows/ci.yml
    name: CI
    on: [push, pull_request]
    
    jobs:
      test:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v3
          - run: npm ci
          - run: npm test
          - run: npm run test:coverage:enforce
          - run: npm run typecheck
          - run: npm run lint
    
    # Branch protection rules:
    # - Require status checks to pass
    # - Require branches to be up to date
    # - Include administrators
  3. No Skipped Tests Policy:

    // .eslintrc.cjs (legacy ESLint config format)
    module.exports = {
      rules: {
        'no-restricted-syntax': [
          'error',
          {
            selector: 'CallExpression[callee.property.name="skip"]',
            message: 'Skipped tests are not allowed. Fix the test or remove it.'
          },
          {
            selector: 'CallExpression[callee.object.name="test"][callee.property.name="skip"]',
            message: 'test.skip is not allowed'
          }
        ]
      }
    };
  4. Type Safety Configuration:

    // tsconfig.json
    {
      "compilerOptions": {
        "strict": true,
        "noUncheckedIndexedAccess": true,
        "noImplicitAny": true,
        "strictNullChecks": true,
        "strictFunctionTypes": true,
        "noImplicitThis": true,
        "noImplicitReturns": true,
        "noFallthroughCasesInSwitch": true,
        "noUnusedLocals": true,
        "noUnusedParameters": true
      }
    }
  5. Pre-commit Hooks:

    // package.json
    {
      "husky": {
        "hooks": {
          "pre-commit": "npm run typecheck && npm run lint && npm test"
        }
      }
    }

These guardrails enable speed, not restrict it. With comprehensive tests and type checking, you can accept AI suggestions confidently.
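
As a concrete illustration of why step 4's noUncheckedIndexedAccess matters, here is a minimal sketch (the function is invented for illustration): the flag turns a whole class of "cannot read properties of undefined" crashes into compile-time errors.

interface User { id: number; name: string }

function firstUserName(users: User[]): string {
  const first = users[0];
  // Without noUncheckedIndexedAccess this compiles and crashes on an empty array.
  // With it, `first` is typed as `User | undefined`, so the compiler forces a check:
  if (first === undefined) {
    return 'anonymous';
  }
  return first.name;
}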

Code Quality Standards

Code quality isn't about perfectionism—it's about sustainability. These standards emerge from decades of collective experience showing what makes code maintainable over time. When functions grow too complex, they become impossible to understand. When parameter lists grow too long, the function is trying to do too much. When we allow silent failures, we create systems that fail mysteriously in production.

Maximum Function Complexity Examples

The key to maintainable code is keeping functions simple enough that you can understand them at a glance. Complex functions hide bugs, resist testing, and terrify other developers (including future you). Here's how different languages encourage simplicity:

Go Example:

// Good: Low complexity, single responsibility
func calculateDiscount(price float64, customerType string) float64 {
    discountRates := map[string]float64{
        "premium": 0.20,
        "regular": 0.10,
        "new":     0.15,
    }
    
    rate, exists := discountRates[customerType]
    if !exists {
        rate = 0.0
    }
    
    return price * rate
}

// Bad: High complexity, multiple responsibilities
func processOrderBad(order Order) (Result, error) {
    // Too many nested conditions and responsibilities
    // Split into smaller functions
}

Go's simplicity forces you to be explicit about error handling and avoid clever abstractions. The calculateDiscount function does one thing well - it maps customer types to discounts. No hidden complexity, no surprising behavior.
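
For the TypeScript-heavy projects this guide mostly targets, the same ideas translate directly. A minimal sketch (the type and function names are illustrative, not from the original): keep each function to one job, and collapse a growing parameter list into a single typed options object.

// Good: low complexity, single responsibility
type CustomerType = 'premium' | 'regular' | 'new';

const DISCOUNT_RATES: Record<CustomerType, number> = {
  premium: 0.2,
  regular: 0.1,
  new: 0.15,
};

function calculateDiscount(price: number, customerType: CustomerType): number {
  // The union type makes an unknown customer type a compile error, not a silent 0
  return price * DISCOUNT_RATES[customerType];
}

// Good: a long parameter list collapsed into one typed options object
interface CreateOrderOptions {
  customerId: string;
  items: ReadonlyArray<{ sku: string; quantity: number }>;
  couponCode?: string;
  giftWrap?: boolean;
}

function createOrder(options: CreateOrderOptions): void {
  // Validation, pricing, and persistence each belong in their own small function;
  // if this body grows past a screen, that's the signal to split it.
}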

Immutable Data Structures

Mutability is the root of countless bugs. When data can change anywhere, anytime, reasoning about program behavior becomes impossible. Immutable data structures force you to be explicit about state changes, making programs easier to understand and debug.

Scala Example:

// Good: Immutable case classes and collections
case class User(
    id: UserId,
    name: String,
    email: Email,
    preferences: Set[Preference]
)

def updateUserPreferences(user: User, newPrefs: Set[Preference]): User =
    user.copy(preferences = user.preferences ++ newPrefs)

// Working with immutable collections
val users = List(user1, user2, user3)
val premiumUsers = users.filter(_.isPremium)
val updatedUsers = users.map(u => u.copy(lastSeen = Instant.now()))

Scala's case classes are immutable by default. The copy method creates a new instance with selected fields changed, leaving the original untouched. This makes it impossible to accidentally modify shared state - a common source of bugs in concurrent programs.
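
TypeScript doesn't give you immutability by default the way Scala's case classes do, but readonly types plus non-mutating updates get you most of the way there. A minimal sketch with illustrative names:

interface User {
  readonly id: string;
  readonly name: string;
  readonly email: string;
  readonly preferences: ReadonlySet<string>;
}

// The equivalent of Scala's copy: spread into a new object, never mutate the original
function updateUserPreferences(user: User, newPrefs: ReadonlySet<string>): User {
  return { ...user, preferences: new Set([...user.preferences, ...newPrefs]) };
}

// Non-mutating collection operations return new arrays
const alice: User = {
  id: '1',
  name: 'Alice',
  email: 'alice@example.com',
  preferences: new Set(['dark-mode']),
};

const users: ReadonlyArray<User> = [alice];
const darkModeEmails = users
  .filter((u) => u.preferences.has('dark-mode'))
  .map((u) => u.email);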

Explicit Error Handling

Silent failures are time bombs in your codebase. When errors are hidden or ignored, they surface at the worst possible moments - usually in production, usually at 3 AM. Explicit error handling forces you to consider and handle failure cases at compile time, not debug time.

Rust Example:

// Rust - Explicit error handling with Result type
#[derive(Debug, thiserror::Error)]
enum ConfigError {
    #[error("File not found: {0}")]
    FileNotFound(String),
    #[error("Parse error: {0}")]
    ParseError(#[from] serde_json::Error),
    #[error("Invalid value: {field} must be {requirement}")]
    InvalidValue { field: String, requirement: String },
}

fn load_config(path: &str) -> Result<Config, ConfigError> {
    let content = std::fs::read_to_string(path)
        .map_err(|_| ConfigError::FileNotFound(path.to_string()))?;
    
    let config: Config = serde_json::from_str(&content)?;
    
    validate_config(&config)?;
    
    Ok(config)
}

fn validate_config(config: &Config) -> Result<(), ConfigError> {
    if config.timeout_ms == 0 {
        return Err(ConfigError::InvalidValue {
            field: "timeout_ms".to_string(),
            requirement: "greater than 0".to_string(),
        });
    }
    
    Ok(())
}

Rust makes error handling impossible to ignore. The Result type forces you to handle both success and failure cases. The ? operator provides convenient error propagation while maintaining explicitness. Custom error types with thiserror make errors self-documenting - when something fails, you know exactly what went wrong and why.
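
TypeScript has no built-in Result type, but a discriminated union gets you the same "impossible to ignore" property: callers cannot reach the value until they have checked for failure. A minimal sketch (the Result and ConfigError shapes are illustrative, not a library API):

import { readFileSync } from 'node:fs';

type Result<T, E> =
  | { ok: true; value: T }
  | { ok: false; error: E };

type ConfigError =
  | { kind: 'file_not_found'; path: string }
  | { kind: 'parse_error'; message: string }
  | { kind: 'invalid_value'; field: string; requirement: string };

interface Config {
  timeoutMs: number;
}

function loadConfig(path: string): Result<Config, ConfigError> {
  let content: string;
  try {
    content = readFileSync(path, 'utf8');
  } catch {
    return { ok: false, error: { kind: 'file_not_found', path } };
  }

  let parsed: unknown;
  try {
    parsed = JSON.parse(content);
  } catch (e) {
    return { ok: false, error: { kind: 'parse_error', message: String(e) } };
  }

  const config = parsed as Partial<Config>;
  if (typeof config.timeoutMs !== 'number' || config.timeoutMs <= 0) {
    return {
      ok: false,
      error: { kind: 'invalid_value', field: 'timeoutMs', requirement: 'greater than 0' },
    };
  }

  return { ok: true, value: { timeoutMs: config.timeoutMs } };
}

// The compiler forces callers to narrow on `ok` before touching `value`
const result = loadConfig('./config.json');
if (result.ok) {
  console.log(`timeout: ${result.value.timeoutMs}ms`);
} else {
  console.error('config error:', result.error);
}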

📋 Collaborative Specification Development

📁 Copy to: SPECIFICATION_DEVELOPMENT.md

Living Documentation Philosophy

Traditional software development often treats specifications as contracts written in stone before coding begins. This approach fails because our understanding of the problem evolves as we build the solution. Living documentation embraces this reality.

Specifications should be living documents that evolve with your project, not static requirements written once and forgotten. They serve as both a guide and a historical record of decisions. When you discover that a feature needs to work differently than originally specified, you update the spec alongside the code. When you learn why a particular approach doesn't work, you document that learning in the spec. This way, future developers (including yourself in six months) understand not just what the system does, but why it does it that way.

Key Specification Documents

Every project needs its North Star documents—the specs that guide everything else. These aren't dusty requirements documents that nobody reads. They're living guides that answer the fundamental questions: Where are we going? How will we get there? How will we know when we've arrived? When these documents are clear and current, every team member (including your AI assistant) can make decisions aligned with the project's true goals.

1. Project Vision & Goals (PROJECT_SPEC.md)

This is your project's manifesto. It captures why the project exists and what success looks like. When feature requests pile up and scope creep threatens, this document reminds everyone what you're actually trying to achieve.

# Project Name Specification

## Vision
One paragraph describing what success looks like for this project.

## Core Principles
- Principle 1: Explanation
- Principle 2: Explanation

## Success Metrics
- Metric 1: How we measure it
- Metric 2: How we measure it

## Non-Goals
Things we explicitly choose NOT to do.

2. Technical Architecture (ARCHITECTURE.md)

Your architecture spec is the map of your system. It shows how pieces fit together, why certain patterns were chosen, and what assumptions underpin the design. This isn't about documenting every class—it's about capturing the big decisions that shape everything else.

# Architecture Specification

## System Overview
High-level architecture diagram and description.

## Key Design Decisions
### Decision: Use Event Sourcing
**Context**: Need audit trail and time-travel debugging
**Decision**: Implement event sourcing for state management
**Consequences**: More complex, but provides complete history
**Date**: 2024-01-15
**Revisit**: When we reach 1M events/day

## Component Specifications
### Component Name
- **Purpose**: What it does
- **Interfaces**: How it connects
- **Invariants**: What must always be true
- **Example Usage**: Code example

3. Process Specification (PROCESS_SPEC.md)

In AI-assisted development, your process documentation helps the AI understand how you prefer to work and what patterns to follow. It's like giving your AI assistant an employee handbook—here's how we do things here, here's what good looks like, here's how we measure success.

# Process Specification

## Our Development Philosophy
We practice AI-assisted development - iterative, feedback-driven coding where the AI writes most of the implementation based on our specifications.

## Operating Loops We Use
1. **CI Green Loop** (5-15 min) - Our default state
2. **Bug Investigation Loop** (15-45 min) - When issues arise
3. **UI Polish Loop** (20-40 min) - After features work
4. **Performance Loop** (45-90 min) - When things feel slow

## Iteration Cadence
- **Micro**: Every commit (5-15 minutes)
- **Minor**: Every feature (2-4 hours)
- **Major**: Every week (retrospective)

## How We Learn
1. **Test First**: Write tests that describe what we want
2. **Implement**: Make the tests pass
3. **Reflect**: What did we learn? Update specs
4. **Iterate**: Apply learnings to next cycle

## UI Improvement Process
Following UI_IMPROVEMENT_LOOP.md:
1. Test current UI with real use cases
2. Identify friction points
3. Design improvements
4. Implement with tests
5. Validate with users
6. Document learnings

## Measuring Success
- CI stays green >95% of time
- Features ship within estimated loops
- Bug fix includes prevention measure
- Each iteration improves velocity

## Process Evolution
This process itself is a living document. We update it when:
- A loop consistently takes longer than expected
- We discover a new effective pattern
- Team feedback suggests improvements
- Metrics show process bottlenecks

Collaborative Spec Development Process

The best specifications emerge from dialogue. When you're working with AI, this dialogue becomes explicit and traceable. You bring the domain knowledge and business context. The AI brings technical patterns and edge case awareness. Together, you iterate toward specifications that are both ambitious and achievable.

Initial Creation

The birth of a spec is a conversation. You start with a need, often vague. Through back-and-forth with your AI assistant, that need crystallizes into concrete interfaces, clear constraints, and testable requirements. This process isn't just about getting to the answer—it's about discovering the right questions.

Human + AI Collaboration Pattern:

// Human provides context
"I need a file sync system that handles conflicts"

// AI asks clarifying questions
"What types of conflicts? How should they be resolved?"

// Human provides constraints
"Last-write-wins for now, but log all conflicts"

// AI drafts initial spec
interface FileSyncSpec {
    conflictResolution: "last-write-wins" | "manual" | "merge";
    conflictLog: ConflictEvent[];
    syncStrategy: "immediate" | "batched" | "scheduled";
}

// Human refines
"Add offline support and partial sync"

// Iterate until complete

Spec Evolution Examples

Specifications aren't carved in stone—they grow like living organisms. Each bug fixed, each feature added, each performance bottleneck resolved teaches you something new about your domain. The best teams capture these learnings by evolving their specs. What starts as a simple interface grows richer as you discover edge cases, add error handling, and optimize for real-world usage patterns.

Java Example - Evolving API Spec:

// Version 1.0 - Initial spec
public interface UserService {
    User createUser(String email, String password);
    User getUser(Long id);
}

// Version 1.1 - After discovering auth needs
public interface UserService {
    User createUser(String email, String password);
    User getUser(Long id);
    // ADDED: v1.1 - Need for API authentication
    User getUserByToken(String authToken);
}

// Version 2.0 - After performance issues
public interface UserService {
    CompletableFuture<User> createUser(String email, String password);
    CompletableFuture<User> getUser(Long id);
    // CHANGED: v2.0 - Made async for better performance
    CompletableFuture<User> getUserByToken(String authToken);
    // ADDED: v2.0 - Batch operations for efficiency
    CompletableFuture<List<User>> getUsers(List<Long> ids);
}

TypeScript Example - Growing Feature Spec:

// specs/search-feature.spec.ts - Version 1
export interface SearchSpec {
    capabilities: {
        textSearch: boolean;
        filters: string[];
        maxResults: number;
    };
    
    requirements: {
        responseTime: "<100ms for 95% of queries";
        accuracy: "90%+ relevance score";
    };
}

// Version 2 - After user feedback
export interface SearchSpec {
    capabilities: {
        textSearch: boolean;
        fuzzySearch: boolean;        // ADDED: Users need typo tolerance
        filters: string[];
        maxResults: number;
        pagination: boolean;         // ADDED: Large result sets
    };
    
    requirements: {
        responseTime: "<100ms for 95% of queries";
        accuracy: "90%+ relevance score";
        typoTolerance: "1-2 character errors"; // ADDED
    };
    
    // ADDED: Specific examples to clarify behavior
    examples: {
        fuzzySearch: [
            { input: "teh", expected: ["the", "tea", "tech"] },
            { input: "pythn", expected: ["python"] }
        ];
    };
}

Specification Best Practices

📁 Copy to: SPEC_BEST_PRACTICES.md

Great specifications tell a story. They capture not just what your system does, but why it does it that way. They record the battles you've fought, the trade-offs you've made, and the lessons you've learned. When done right, specs become the collective memory of your project—saving future developers from repeating past mistakes.

1. Include Both What and Why

The most dangerous specifications are the ones that tell you what to build but not why. Six months later, when requirements change, you're left staring at code wondering: "Is this still needed? Can I change it? What breaks if I do?" The 'why' is what makes specs truly valuable—it's the context that enables intelligent decisions.

Go Example:

// specs/rate-limiter.md
/*
## Rate Limiter Specification

### What
- Limit API calls to 100 requests per minute per user
- Use sliding window algorithm
- Return 429 status when limit exceeded

### Why
- Prevent API abuse (we had DDoS in Q3 2023)
- Ensure fair resource usage across customers
- Sliding window prevents burst exploitation

### Implementation Notes
*/

type RateLimiter interface {
    // Check returns true if request is allowed
    // Spec: Must be O(1) operation for performance
    Check(userID string) bool
    
    // Reset clears limits for testing
    // Spec: Only available in test builds
    Reset(userID string)
}

C# Example:

// Specs/CacheSpec.cs
namespace ProjectSpecs
{
    /// <summary>
    /// Cache Specification v2.1
    /// 
    /// Purpose: Reduce database load by 80%
    /// Strategy: Two-tier cache (memory + Redis)
    /// 
    /// History:
    /// - v1.0: Memory only (failed at scale)
    /// - v2.0: Added Redis tier
    /// - v2.1: Added cache warming
    /// </summary>
    public interface ICacheSpec
    {
        // Requirement: 99.9% cache availability
        TimeSpan DefaultExpiration { get; }
        
        // Requirement: <10ms read latency
        Task<T?> GetAsync<T>(string key);
        
        // Requirement: Write-through to database
        Task SetAsync<T>(string key, T value, TimeSpan? expiration = null);
    }
}

2. Track Decisions and Trade-offs

Every technical decision is a bet on the future. You're betting that your requirements won't change, that your scale won't explode, that your team's skills will evolve in certain ways. Smart teams document these bets. They record what options they considered, what they chose, and most importantly—under what conditions they'd reconsider. This turns past decisions from mysterious edicts into learning opportunities.

Python Example:

# specs/data_pipeline_spec.py
"""
Data Pipeline Specification

## Decision Log

### 2024-01: Chose Batch over Streaming
- **Options Considered**: 
  1. Real-time streaming (Kafka + Flink)
  2. Micro-batching (Spark Streaming) 
  3. Traditional batch (Airflow + Spark)
- **Decision**: Traditional batch
- **Rationale**: 
  - 15-minute data freshness acceptable
  - Simpler operations (team expertise)
  - 70% lower infrastructure cost
- **Revisit When**: 
  - Need <5 minute freshness
  - Team gains streaming expertise
  
### 2024-03: Added Incremental Processing
- **Problem**: Full reprocessing taking 6+ hours
- **Solution**: Track high watermarks, process only new data
- **Trade-off**: More complex state management
"""

from dataclasses import dataclass
from typing import Protocol, List
from datetime import datetime

class DataPipelineSpec(Protocol):
    """Specification for data pipeline components"""
    
    def process_batch(
        self, 
        start_time: datetime, 
        end_time: datetime
    ) -> BatchResult:
        """Process data within time window"""
        ...
    
    def get_watermark(self) -> datetime:
        """Get last successfully processed timestamp"""
        ...

3. Use Specs as Test Contracts

The best specifications are executable. Instead of hoping your implementation matches the spec, you encode the spec as tests that must pass. This creates a powerful feedback loop: when the spec changes, tests fail, forcing you to update the implementation. When you discover edge cases, you add them to both the spec and its tests. The specification becomes a living contract that the code must honor.

Rust Example:

// specs/reliability_spec.rs

/// Reliability Specification
/// 
/// This spec defines the reliability guarantees our system provides.
/// All implementations MUST pass these tests.

pub trait ReliabilitySpec {
    type Error;
    
    /// Messages must be delivered exactly once
    async fn deliver_message(&self, msg: Message) -> Result<DeliveryReceipt, Self::Error>;
    
    /// System must auto-recover from transient failures
    async fn handle_failure(&self, error: Self::Error) -> RecoveryAction;
}

#[cfg(test)]
mod spec_tests {
    use super::*;
    
    /// Any implementation of ReliabilitySpec must pass this test
    async fn test_exactly_once_delivery<T: ReliabilitySpec>(system: &T) {
        let msg = Message::new("test");
        
        // Send same message twice
        let receipt1 = system.deliver_message(msg.clone()).await.unwrap();
        let receipt2 = system.deliver_message(msg.clone()).await.unwrap();
        
        // Must get same receipt (idempotent)
        assert_eq!(receipt1.id, receipt2.id);
        
        // Must have delivered exactly once
        assert_eq!(get_delivery_count(msg.id), 1);
    }
}
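
The same idea carries over to TypeScript with Vitest: express the spec as a reusable suite that every implementation must pass. A sketch with illustrative names (it echoes the rate-limiter spec shown earlier):

import { describe, expect, it } from 'vitest';

// The spec, as a contract every implementation must honor
interface RateLimiter {
  check(userId: string): boolean; // true if the request is allowed
  reset(userId: string): void;
}

// Reusable spec suite: call this once per implementation
export function rateLimiterSpec(name: string, make: () => RateLimiter, limit: number) {
  describe(`${name} satisfies the rate limiter spec`, () => {
    it('allows requests up to the limit and rejects the next one', () => {
      const limiter = make();
      for (let i = 0; i < limit; i++) {
        expect(limiter.check('user-1')).toBe(true);
      }
      expect(limiter.check('user-1')).toBe(false);
    });

    it('tracks users independently', () => {
      const limiter = make();
      expect(limiter.check('user-a')).toBe(true);
      expect(limiter.check('user-b')).toBe(true);
    });
  });
}

// Usage (hypothetical implementation):
// rateLimiterSpec('InMemoryRateLimiter', () => new InMemoryRateLimiter(100), 100);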

Maintaining Specs with AI

Your AI assistant can be your spec's best friend. It never forgets to check if the implementation matches the spec. It notices when bug reports reveal gaps in your specifications. It can even help evolve your specs based on patterns it sees across your codebase. The key is establishing regular review cycles where you and your AI examine specs against reality.

Regular Review Pattern

📁 Copy to: SPEC_REVIEW_PATTERN.md

Kotlin Example:

// Weekly spec review with AI
class SpecReview {
    fun reviewWithAI() {
        """
        Human: "Review our search spec against last week's bug reports"
        
        AI: "Found 3 issues:
        1. Spec doesn't cover empty query behavior (Bug #123)
        2. No mention of special character handling (Bug #125)  
        3. Performance requirement unrealistic for fuzzy search (Bug #130)"
        
        Human: "Update spec to address these"
        
        AI: "Here's the updated spec with additions marked..."
        """.trimIndent()
    }
}

// Spec evolves based on real-world learning
interface SearchSpecV3 {
    fun handleEmptyQuery(): SearchResult  // ADDED: Based on Bug #123
    fun escapeSpecialChars(query: String): String  // ADDED: Bug #125
    
    companion object {
        // UPDATED: Relaxed for fuzzy search based on Bug #130
        const val FUZZY_SEARCH_TARGET_LATENCY = "200ms"
        const val EXACT_SEARCH_TARGET_LATENCY = "100ms"
    }
}

Spec Organization in Repository

Where you put your specs matters as much as what you put in them. Scattered across random directories, they become forgotten artifacts. Centralized in one place, they become the beating heart of your project's knowledge. The best organization makes specs discoverable by both humans browsing the repo and AI assistants trying to understand your system.

project-root/
├── README.md                     # Points to specs
├── specs/
│   ├── README.md                 # Spec overview & index
│   ├── PROJECT_SPEC.md           # Overall vision
│   ├── PROCESS_SPEC.md           # How we work & iterate
│   ├── ARCHITECTURE.md           # Technical architecture
│   ├── API_SPEC.md               # API contracts
│   ├── features/
│   │   ├── search.spec.md
│   │   ├── auth.spec.md
│   │   └── sync.spec.md
│   └── decisions/
│       ├── 2024-01-database-choice.md
│       ├── 2024-02-caching-strategy.md
│       └── 2024-03-api-versioning.md
├── src/
│   └── [implementation following specs]
└── tests/
    └── spec-compliance/          # Tests that verify spec compliance

The Specification Loop

The specification operating loop transforms documentation from a chore into a powerful development tool. This isn't about bureaucracy—it's about learning faster and building better software.

  1. Write initial spec (Human + AI collaboration)
  2. Implement against spec (With AI assistance)
  3. Discover gaps/issues (Through usage)
  4. Update spec (Document learning)
  5. Refactor if needed (Maintain alignment)
  6. Repeat

This creates a virtuous cycle where specifications improve based on real-world experience, and implementations stay aligned with evolved understanding. Each iteration makes the spec more accurate and the code more purposeful. The AI assistant becomes more helpful over time because it has access to your accumulated wisdom in the specs. New team members onboard faster because the specs explain not just the what, but the why and the why-not.

🏗️ Environment & Tooling

📁 Copy to: ENVIRONMENT_TOOLING.md

Development Environment

What is nix-shell?

Many large companies use Nix to ensure developers have identical environments, eliminating "works on my machine" problems.

nix-shell is a tool that creates isolated, reproducible development environments. Think of it as a more powerful version of Python's virtualenv that works for ANY language and tool.

Why use it?

  • Consistency: Everyone gets the exact same versions of all tools
  • No "works on my machine": If it works in nix-shell, it works everywhere
  • Clean system: Doesn't pollute your global system with dependencies
  • Easy onboarding: New developers just run nix-shell and have everything

How to use it:

# Install Nix (one-time setup)
curl -L https://nixos.org/nix/install | sh

# Enter the development environment
nix-shell

# Now you have all project tools available
which node  # Specific Node.js version for this project
which cargo # Specific Rust version for this project

# All commands should be run inside nix-shell
nix-shell --run "yarn install"
nix-shell --run "cargo build"

# Or enter interactive shell and run commands
nix-shell
yarn test
cargo test
exit

Integrating with Your Development Workflow:

  1. Make it mandatory: Add to your CLAUDE.md or README: "ALWAYS use nix-shell for development"
  2. CI consistency: Use the same shell.nix in CI to ensure parity
  3. Editor integration: Configure your editor to use nix-shell's language servers
  4. Git hooks: Run pre-commit hooks inside nix-shell for consistency

Example shell.nix file:

{ pkgs ? import <nixpkgs> {} }:

pkgs.mkShell {
  nativeBuildInputs = with pkgs; [
    # Version control and basic tools
    git
    gh
    ripgrep
    fd
    jq
    
    # JavaScript/TypeScript development
    nodejs_22
    nodePackages.yarn
    nodePackages.typescript
    nodePackages.typescript-language-server
    
    # Rust development
    rustc
    cargo
    rustfmt
    clippy
    rust-analyzer
    cargo-nextest
    cargo-watch
    cargo-audit
    
    # Language servers for editor integration
    pyright
    gopls
    yaml-language-server
    
    # Build dependencies
    pkg-config
    openssl
    libffi
    
    # Testing tools
    docker
    docker-compose
  ];
  
  shellHook = ''
    # Set up environment variables
    export RUST_BACKTRACE=1
    export CARGO_HOME="$PWD/.cargo"
    export PKG_CONFIG_PATH="${pkgs.openssl.dev}/lib/pkgconfig:$PKG_CONFIG_PATH"
    
    # Docker environment
    export DOCKER_BUILDKIT=1
    export COMPOSE_DOCKER_CLI_BUILD=1
    
    # Welcome message
    echo "🚀 Development environment loaded!"
    echo "   Rust $(rustc --version | cut -d' ' -f2)"
    echo "   Node $(node --version)"
    echo "   Yarn $(yarn --version)"
    echo ""
    echo "📦 Run 'make help' to see available commands"
  '';
}

Key Features of a Good shell.nix:

  • Language-specific tools: Include compilers, package managers, formatters, and linters
  • Language servers: Enable smart editor features for all languages in your project
  • Testing tools: Docker, test runners, and CI tools should match production
  • Environment setup: Configure paths, environment variables, and tool settings
  • Build dependencies: System libraries needed for compilation (openssl, pkg-config)
  • Consistent versions: Pin specific versions when needed for reproducibility

Advanced Tips:

  • Use nativeBuildInputs instead of buildInputs for better cross-compilation support
  • Set up project-specific environment variables in shellHook
  • Include all tools your CI uses to ensure local/CI parity
  • Add a helpful welcome message showing versions and common commands
  • Consider using direnv with use nix for automatic environment activation

Common Issues and Solutions:

Python Conflicts:

  • Python virtual environments can conflict with nix-shell
  • Use pipenv or poetry inside nix-shell, or use python311.withPackages
  • Set PYTHONPATH carefully to avoid conflicts

macOS Issues:

  • If you get SSL certificate errors, add cacert to your packages
  • For GUI applications on macOS, you may need additional setup
  • System Integrity Protection can interfere - use nix packages instead of system tools

Performance:

  • First run downloads packages (can take 10-20 minutes)
  • Use Cachix to share binary caches across team members
  • Consider lorri or direnv for faster shell activation

Real-World shell.nix Patterns:

For React Native projects:

# Include iOS/Android development tools
watchman
cocoapods
fastlane
android-studio

For projects with databases:

# Include database clients and tools
postgresql
redis
mysql80
mongosh

For CI/CD consistency:

# Match your CI environment exactly
shellHook = ''
  # Same environment variables as CI
  export CI=true
  export NODE_ENV=test
  
  # Verify tools match CI versions
  if [[ "$(node --version)" != "v22.0.0" ]]; then
    echo "⚠️  Node version mismatch with CI!"
  fi
'';

Package Management

📁 Copy to: PACKAGE_MANAGEMENT.md

Package management is where good intentions meet harsh reality. Every npm install or yarn add is a trust decision—you're inviting someone else's code into your project, along with all their dependencies, and their dependencies' dependencies. A single compromised package can take down thousands of projects, as we've seen with incidents like left-pad and event-stream.

Core Principles:

  • Consistency is non-negotiable: Choose yarn or npm at project start and stick with it. Mixing package managers creates subtle bugs that waste hours of debugging time.
  • Lock files are your safety net: yarn.lock or package-lock.json ensures everyone gets exactly the same versions. Commit these files always—they're as important as your source code.
  • Audit regularly, update thoughtfully: Run yarn audit weekly, but don't blindly update everything. Each update is a potential breaking change. Update security patches immediately, minor versions carefully, major versions with full testing.
  • Document everything: Your README should tell a new developer exactly how to get from zero to running code. If it takes more than three commands, you're doing it wrong.

GitHub Actions CI Setup

📁 Copy to: GITHUB_ACTIONS_SETUP.md

What is GitHub Actions?

GitHub Actions provides 2,000 free minutes per month for private repos, which is sufficient for most small teams.

GitHub Actions is GitHub's built-in CI/CD platform. It runs your tests, builds, and deployments automatically when you push code.

Key Concepts:

  • Workflow: A complete CI/CD process (defined in .github/workflows/)
  • Job: A set of steps that run on the same runner
  • Step: Individual task (run tests, build, deploy)
  • Runner: Virtual machine that executes your jobs
  • Action: Reusable unit of code (like a function)

Basic CI Workflow Example

# .github/workflows/ci.yml
name: CI

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    
    steps:
      - uses: actions/checkout@v4
      
      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '18'
          
      - name: Install dependencies
        run: yarn install --frozen-lockfile
        
      - name: Run tests
        run: yarn test
        
      - name: Run type checks
        run: yarn typecheck
        
      - name: Run linter
        run: yarn lint

Caching for Faster CI

Why cache?

  • Speed: Avoid re-downloading dependencies every run
  • Cost: Fewer API calls to package registries
  • Reliability: Less dependent on external services

Yarn/NPM Caching Example:

- name: Cache node modules
  uses: actions/cache@v4
  with:
    path: |
      ~/.cache/yarn
      node_modules
    key: ${{ runner.os }}-yarn-${{ hashFiles('**/yarn.lock') }}
    restore-keys: |
      ${{ runner.os }}-yarn-

Rust/Cargo Caching Example:

- name: Cache cargo registry
  uses: actions/cache@v4
  with:
    path: |
      ~/.cargo/registry
      ~/.cargo/git
      target
    key: ${{ runner.os }}-cargo-${{ hashFiles('**/Cargo.lock') }}

Nix Caching with Cachix

What is Cachix? Cachix is a binary cache service for Nix that dramatically speeds up CI builds by caching compiled packages.

Setting up Cachix (Free Tier):

- uses: cachix/install-nix-action@v24
  with:
    nix_path: nixpkgs=channel:nixos-unstable
    
- uses: cachix/cachix-action@v14
  with:
    name: your-cache-name  # Use the public cache
    # No authToken needed for public caches
    
- run: nix-shell --run "yarn test"

Benefits of Cachix:

  • Fast builds: Download pre-built binaries instead of compiling
  • Free tier: Public caches are free
  • Shared cache: Team members benefit from each other's builds

Complete CI Example with Caching

name: Complete CI

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    
    steps:
      - uses: actions/checkout@v4
      
      # Nix with Cachix for reproducible environment
      - uses: cachix/install-nix-action@v24
      - uses: cachix/cachix-action@v14
        with:
          name: nix-community  # Using public community cache
          
      # Node.js caching
      - name: Cache node modules
        uses: actions/cache@v4
        with:
          path: |
            ~/.cache/yarn
            node_modules
          key: ${{ runner.os }}-yarn-${{ hashFiles('**/yarn.lock') }}
          
      # Run everything in nix-shell
      - name: Install dependencies
        run: nix-shell --run "yarn install --frozen-lockfile"
        
      - name: Run tests
        run: nix-shell --run "yarn test"
        
      - name: Type check
        run: nix-shell --run "yarn typecheck"
        
      - name: Lint
        run: nix-shell --run "yarn lint"
        
      # Upload test results
      - name: Upload coverage
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: coverage
          path: coverage/

  docker-e2e:
    runs-on: ubuntu-latest
    needs: test  # Only run after tests pass
    
    steps:
      - uses: actions/checkout@v4
      
      # Docker layer caching
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
        
      - name: Build and test
        run: |
          docker-compose -f docker-compose.test.yml build
          docker-compose -f docker-compose.test.yml up --abort-on-container-exit
          
      - name: Upload test artifacts
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: e2e-results
          path: test-results/

CI Performance Tips

1. Parallel Jobs:

jobs:
  lint:
    runs-on: ubuntu-latest
    steps: [...]
    
  test:
    runs-on: ubuntu-latest
    steps: [...]
    
  typecheck:
    runs-on: ubuntu-latest
    steps: [...]
    
  # These run in parallel!

2. Matrix Builds:

strategy:
  matrix:
    node: [16, 18, 20]
    os: [ubuntu-latest, macos-latest]
    
runs-on: ${{ matrix.os }}
steps:
  - uses: actions/setup-node@v4
    with:
      node-version: ${{ matrix.node }}

3. Conditional Steps:

- name: Deploy
  if: github.ref == 'refs/heads/main' && github.event_name == 'push'
  run: ./deploy.sh

GitHub CLI Integration

📁 Copy to: GITHUB_CLI_INTEGRATION.md

The GitHub CLI (gh) transforms how you interact with CI/CD. Instead of constantly refreshing browser tabs to check if your build passed, you can monitor and control everything from your terminal. This tool is especially powerful when paired with an AI assistant—you can share CI failures directly and get immediate help debugging.

Installation and Setup:

# Install GitHub CLI
brew install gh  # macOS
# or for Linux:
curl -fsSL https://cli.github.com/packages/githubcli-archive-keyring.gpg | sudo gpg --dearmor -o /usr/share/keyrings/githubcli-archive-keyring.gpg

# Login (choose browser auth for simplicity)
gh auth login

Essential Commands for CI Debugging:

# Quick status check - see your last 5 workflow runs
gh run list --limit 5

# Watch a running workflow in real-time (like tail -f for CI)
gh run watch

# Deep dive into failures - see full logs
gh run view <run-id> --log

# Failed due to flaky test? Re-run just the failed jobs
gh run rerun <run-id> --failed

# Download artifacts from a workflow run
gh run download <run-id>

The Power Move - Integrating with AI:

# Capture failure logs and send to your AI assistant
gh run view <run-id> --log | grep -A 20 "Error:" > failure.txt
# Now share failure.txt with your AI for debugging help

Security Best Practice: When automating with scripts, never hardcode tokens. Instead:

# Store token securely
echo "ghp_yourtoken" > /tmp/gh_token.txt
chmod 600 /tmp/gh_token.txt

# Use in scripts
export GH_TOKEN=$(cat /tmp/gh_token.txt)
gh run list  # Now authenticated

# NEVER do this:
# echo $GH_TOKEN  # This exposes your token!

The GitHub CLI becomes indispensable once you realize you can fix CI issues without leaving your editor. Combined with an AI assistant that can read logs and suggest fixes, you'll resolve CI failures in minutes instead of hours.

🧪 Testing Excellence

📁 Copy to: TESTING_EXCELLENCE.md

Universal Testing Principles

In vibe coding, testing is non-negotiable. When AI generates most of your code, comprehensive tests are essential for verification. 100% code coverage provides the foundation for confident refactoring and rapid iteration.

  • NEVER SKIP TESTS - Fix failing tests instead of using .skip or .todo
  • Test individual functions and components in isolation
  • Use real services where possible in integration tests
  • Fast unit test execution (< 100ms per test)
  • High coverage of edge cases and error conditions
  • Write tests first for complex features (TDD)
  • Tests should be deterministic and repeatable
  • Use property-based testing for invariants
  • Test race conditions in concurrent code
  • Focus on testing business logic, not implementation details

The principle of never skipping tests deserves special attention. When a test fails, it's telling you something important—either your code is broken, your test is wrong, or your understanding of the requirements has evolved. Skipping the test silences this feedback. Instead, fix the issue or update the test to match new requirements. Every skipped test is a landmine waiting for the next developer.

Testing Categories

Each type of test serves a specific purpose in your safety net. Think of them as different zoom levels on a microscope—unit tests examine individual cells, integration tests watch how organs work together, and E2E tests verify the whole organism functions. Choosing the right test type for each scenario is as important as writing the test itself.

Unit Tests

Unit tests are your first line of defense. They're the fastest to write, fastest to run, and fastest to debug when they fail. The key to great unit tests is ruthless isolation—each test should examine exactly one piece of behavior.

Characteristics of Great Unit Tests:

  • Lightning fast: If a unit test takes more than 100ms, it's not a unit test. Speed matters because you'll run these thousands of times.
  • Surgical precision: Test one specific behavior. When it fails, you should know exactly what's broken without debugging.
  • Edge case hunters: This is where you test the weird stuff—empty arrays, null inputs, Unicode strings, negative numbers. If it can happen in production, test it here.
  • Deterministic: Same input, same output, every single time. No random data, no time dependencies, no network calls.
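
A short sketch of those characteristics in practice, using Vitest (which the coverage setup earlier in this document already assumes); priceWithTax and its cases are invented for illustration:

import { describe, expect, it } from 'vitest';

// Pure function under test: no I/O, no clock, no randomness
function priceWithTax(price: number, taxRate: number): number {
  if (price < 0) throw new Error('price must be non-negative');
  return Math.round(price * (1 + taxRate) * 100) / 100;
}

describe('priceWithTax', () => {
  it('applies the tax rate', () => {
    expect(priceWithTax(100, 0.1)).toBe(110);
  });

  it('handles the zero edge case', () => {
    expect(priceWithTax(0, 0.1)).toBe(0);
  });

  it('rejects invalid input explicitly instead of failing silently', () => {
    expect(() => priceWithTax(-1, 0.1)).toThrow('non-negative');
  });
});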

Integration Tests

Integration tests reveal the lies that unit tests tell. Your perfectly isolated components might work flawlessly alone but fail spectacularly when connected. Integration tests catch the impedance mismatches between systems.

What Makes Integration Tests Valuable:

  • Real collaborations: Test how your code actually talks to databases, APIs, and file systems. Mock as little as possible.
  • Data flow validation: Follow data as it moves through your system. Does that user input actually make it to the database correctly?
  • Error propagation: When the database is down, does your API return a proper error? When the API fails, does your UI show a helpful message?
  • Boundary testing: This is where you test timeouts, retries, and circuit breakers—all the stuff that only matters when systems interact.
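
Here is a hedged sketch of "mock as little as possible": an integration test that exercises a tiny store against the real filesystem instead of a mocked one, so serialization bugs and real I/O errors actually surface. JsonUserStore is invented for illustration.

import { mkdtempSync, rmSync } from 'node:fs';
import { readFile, writeFile } from 'node:fs/promises';
import { tmpdir } from 'node:os';
import { join } from 'node:path';
import { afterAll, describe, expect, it } from 'vitest';

// A tiny store that really touches the filesystem -- nothing is mocked
class JsonUserStore {
  constructor(private readonly filePath: string) {}

  async save(users: Array<{ id: number; email: string }>): Promise<void> {
    await writeFile(this.filePath, JSON.stringify(users), 'utf8');
  }

  async load(): Promise<Array<{ id: number; email: string }>> {
    const raw = await readFile(this.filePath, 'utf8'); // real I/O, real errors
    return JSON.parse(raw);
  }
}

describe('JsonUserStore integration', () => {
  const dir = mkdtempSync(join(tmpdir(), 'user-store-'));
  afterAll(() => rmSync(dir, { recursive: true, force: true }));

  it('round-trips data through the real filesystem', async () => {
    const store = new JsonUserStore(join(dir, 'users.json'));
    await store.save([{ id: 1, email: 'alice@example.com' }]);
    expect(await store.load()).toEqual([{ id: 1, email: 'alice@example.com' }]);
  });

  it('propagates real errors when the file is missing', async () => {
    const store = new JsonUserStore(join(dir, 'missing.json'));
    await expect(store.load()).rejects.toThrow(); // ENOENT surfaces to the caller
  });
});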

E2E Tests

E2E tests are your users' advocates. They don't care about your beautiful architecture or clever algorithms—they care that clicking the button does what it's supposed to do. These tests are expensive to write and slow to run, but they catch the bugs that users actually experience.

The E2E Philosophy - Keep It Real:

  • NO MOCKING: The moment you mock in an E2E test, it's not E2E anymore. Use real databases, real files, real network calls. Yes, it's slower. Yes, it's worth it.
  • Real File Operations: Don't simulate file changes—actually write files to disk and verify your file watcher notices. Create real git commits and check that your git integration works.
  • Live System Integration: Start your actual backend, connect to real services, use genuine authentication. If it's flaky, fix the flakiness—don't hide it with mocks.
  • User-Centric Workflows: Don't test implementation details. Test what users actually do: "I drag a file here, type some code, hit save, and see my changes in version control."

User Stories as E2E Tests

📁 Copy to: USER_STORY_E2E_TESTING.md

The most powerful E2E tests are written as user stories. Each test tells a complete story of a user accomplishing a real task. This approach ensures your tests validate actual user value, not just technical implementation.

User Story Test Structure:

describe('User Story: Developer fixes a bug', () => {
  it('should allow a developer to find, fix, and verify a bug fix', async () => {
    // Given: A developer starts with a failing test
    await openProject('/test-project');
    await runTests();
    await expectTestFailure('Button.test.tsx');
    
    // When: They investigate the failure
    await clickOnFailingTest();
    await readErrorMessage();
    await navigateToSourceFile('Button.tsx');
    
    // And: They fix the bug
    await editCode({
      find: 'onClick={handleClick}',
      replace: 'onClick={() => handleClick()}'
    });
    await saveFile();
    
    // Then: The test passes
    await runTests();
    await expectTestSuccess('Button.test.tsx');
    
    // And: They commit their fix
    await openGitPanel();
    await stageChanges();
    await commitWithMessage('Fix: Button click handler binding issue');
  });
});

Why User Stories Work:

  • Complete workflows: Tests validate entire user journeys, not isolated features
  • Business value: Each test directly maps to user value
  • Natural documentation: Tests read like user documentation
  • Regression prevention: Protects complete workflows from breaking

Chaining User Stories: Task Transitions

📁 Copy to: CHAINED_USER_STORY_TESTS.md

Real users don't work in isolation—they flow from task to task. Your E2E tests should validate these transitions. This is where you catch the subtle bugs that occur when state from one task affects another.

Example: Development Flow Chain

describe('User Story Chain: Morning Development Flow', () => {
  let projectState: ProjectState;
  
  it('Chapter 1: Developer reviews PR feedback', async () => {
    projectState = await openProject('/my-project');
    await checkoutBranch('feature/user-auth');
    await openPullRequest('#123');
    
    const comments = await readReviewComments();
    expect(comments).toContain('Consider adding input validation');
    
    // State carries forward: PR is open, specific branch checked out
  });
  
  it('Chapter 2: Developer addresses feedback', async () => {
    // Continues from previous state
    await navigateToFile('src/auth/LoginForm.tsx');
    await addCodeBlock(`
      if (!email.includes('@')) {
        throw new ValidationError('Invalid email format');
      }
    `);
    
    await runTests();
    await expectNewTestsNeeded(); // AI suggests test for validation
  });
  
  it('Chapter 3: Developer adds tests', async () => {
    // Builds on previous changes
    await createNewFile('src/auth/LoginForm.test.tsx');
    await writeTestCase('should reject invalid emails');
    await runTests();
    await expectAllTestsPass();
    
    // Commit the complete feature
    await stageAllChanges();
    await commit('feat: Add email validation with tests');
    await pushBranch();
  });
  
  it('Chapter 4: Developer switches context', async () => {
    // Critical transition: Can they cleanly switch tasks?
    await checkoutBranch('main');
    await pullLatestChanges();
    
    // Start new task with clean state
    await createBranch('fix/performance-issue');
    
    // Verify no contamination from previous task
    await expectCleanWorkingDirectory();
    await expectNoUncommittedChanges();
  });
});

Task Transition Patterns:

  1. State Preservation: Verify data persists appropriately between tasks
it('preserves user preferences across tasks', async () => {
  await setEditorTheme('dark');
  await completeTask('implement-feature');
  await startNewTask('fix-bug');
  await expectEditorTheme('dark'); // Preferences maintained
});
  2. Clean Context Switching: Ensure no task contamination
it('provides clean slate for new tasks', async () => {
  await workOnFeature('complex-refactoring');
  await stashChanges();
  await switchToUrgentBug('customer-issue');
  
  // Verify clean context
  await expectNoLeftoverState();
  await expectCorrectBranchContext();
});
  3. Interruption Handling: Test real-world interruptions
it('handles interruptions gracefully', async () => {
  await startDebugging('memory-leak');
  await setBreakpoints([152, 167, 203]);
  
  // Urgent interruption
  await receiveCriticalAlert('Production down!');
  await switchToEmergencyFix();
  
  // Can return to previous task?
  await returnToPreviousTask();
  await expectBreakpointsPreserved([152, 167, 203]);
  await expectDebugContextRestored();
});

Multi-User Story Scenarios

Some bugs only appear when multiple users interact with the system. E2E tests should cover these scenarios:

describe('Multi-User Story: Collaborative Editing', () => {
  it('handles simultaneous edits without data loss', async () => {
    // Developer A starts editing
    const devA = await openProjectAsUser('alice');
    await devA.openFile('config.json');
    await devA.startEditing();
    
    // Developer B opens same file
    const devB = await openProjectAsUser('bob');
    await devB.openFile('config.json');
    
    // Both make changes
    await devA.updateConfig('apiUrl', 'https://api-v2.example.com');
    await devB.updateConfig('timeout', 5000);
    
    // Both save
    await devA.saveFile();
    await devB.saveFile();
    
    // Verify conflict resolution
    await expectMergeConflictUI(devB);
    await devB.resolveConflict('keep-both');
    
    // Verify both changes preserved
    const finalConfig = await readConfig();
    expect(finalConfig.apiUrl).toBe('https://api-v2.example.com');
    expect(finalConfig.timeout).toBe(5000);
  });
});

User Story Test Best Practices

  1. Name tests as stories: Use descriptive names that explain the user's goal
// Good
describe('User Story: Senior dev onboards junior developer')

// Bad  
describe('Permission system tests')
  2. Use domain language: Write tests in the user's vocabulary
// Good
await reviewPullRequest();
await requestChanges('Please add error handling');

// Bad
await clickElement('[data-test-id="pr-review-btn"]');
await inputText('[data-test-id="comment-box"]', 'Please add error handling');
  3. Test complete journeys: Don't stop at feature boundaries
// Complete journey
it('developer fixes bug from report to deployment', async () => {
  await readBugReport('#456');
  await reproduceLocally();
  await writeFix();
  await addTests();
  await createPR();
  await waitForReview();
  await mergeToDevelop();
  await verifyInStaging();
  await deployToProduction();
  await verifyBugFixed();
});
  4. Capture real-world complexity: Include the messy parts
it('handles flaky test investigation', async () => {
  await runTests();
  await noticeIntermittentFailure('auth.test.ts');
  
  // Real debugging flow
  await runTestsMultipleTimes(10);
  await analyzeFailurePattern(); // Fails 3/10 times
  await addDebugLogging();
  await identifyRaceCondition();
  await fixAsyncTiming();
  await verifyTestStability(20); // Run 20 times to confirm
});
  5. Test failure scenarios: Users encounter errors too
it('developer recovers from corrupted git state', async () => {
  await simulateGitCorruption();
  await attemptNormalOperations();
  await noticeGitErrors();
  
  // Recovery flow
  await searchForSolution();
  await backupCurrentWork();
  await cleanGitState();
  await restoreFromBackup();
  await verifyWorkPreserved();
});

Property-Based Testing

What is Property-Based Testing?

Property-based testing is a testing approach where instead of writing specific test cases, you describe properties that should always be true, and the testing framework generates random inputs to try to find counterexamples.

The shift from example-based to property-based testing is profound. With traditional testing, you're limited by your imagination—you test the cases you think of. With property-based testing, the computer generates cases you never imagined, often finding bugs in edge cases like empty strings, negative numbers, or Unicode characters you forgot existed.

Traditional Testing:

// Test specific cases
expect(add(2, 3)).toBe(5);
expect(add(0, 0)).toBe(0);
expect(add(-1, 1)).toBe(0);

Property-Based Testing:

// Test properties that should ALWAYS be true
property("addition is commutative", (a: number, b: number) => {
    return add(a, b) === add(b, a);
});

The beauty of property-based testing is that when it finds a failing case, it automatically "shrinks" the input to find the minimal failing case. If your function fails on a 100-element array, the framework will systematically reduce it to find that it actually fails on any array with more than 3 elements, making debugging much easier.
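
A minimal fast-check sketch of shrinking in action (the buggy sum function is invented for illustration):

import fc from 'fast-check';

// Buggy on purpose: accidentally drops the first element
function sum(values: number[]): number {
  return values.slice(1).reduce((total, value) => total + value, 0);
}

// Property: summing should not depend on element order
fc.assert(
  fc.property(fc.array(fc.integer()), (values) => {
    return sum(values) === sum([...values].reverse());
  })
);
// The property fails, and fast-check shrinks the random failing array down to a
// minimal counterexample such as [0, 1], which makes the off-by-one obvious.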

Why use it?

  • Finds edge cases you didn't think of: The framework generates hundreds of test cases
  • Better test coverage: Tests properties, not just examples
  • Discovers hidden assumptions: Often reveals bugs in boundary conditions
  • Documents behavior: Properties serve as executable specifications

When to use it:

  • Testing invariants and edge cases
  • Great for race condition detection
  • Focus on specific problematic patterns
  • Generate test cases automatically
  • Test with boundary conditions and edge cases
  • Verify properties hold across all possible inputs
  • Find counterexamples to assumptions
  • Test mathematical properties (commutativity, associativity, etc.)

Property-Based Testing with IO Schedulers for Reproducible Race Conditions

📁 Copy to: PROPERTY_TESTING_IO_SCHEDULERS.md

What are IO Schedulers?

IO Schedulers in property-based testing frameworks allow you to control the execution order of asynchronous operations deterministically. This makes race conditions reproducible and testable.

Imagine you're debugging a race condition that only appears in production once a week. You can't attach a debugger to production, and you can't reproduce it locally no matter how many times you run the test. This is where IO schedulers revolutionize concurrent testing. They turn non-deterministic bugs into deterministic ones by taking control of time itself—at least from your program's perspective.

The Problem:

  • Race conditions depend on timing
  • Traditional testing can't control async execution order
  • Bugs appear randomly and are hard to reproduce

The Solution:

  • IO schedulers intercept all async operations
  • They systematically try different execution orders
  • When a bug is found, they provide a seed to reproduce it

The magic happens through systematic exploration. Where traditional testing might run your concurrent code 1000 times and never hit the race condition, an IO scheduler methodically tries different orderings: What if Promise A resolves before Promise B? What if they resolve simultaneously? What if B completes while A is half-done? By exploring these possibilities systematically rather than randomly, IO schedulers can find race conditions that would take millions of random runs to encounter.

Why This Matters for AI-First Development:

Race conditions are among the hardest bugs to find—they typically surface only when you're deeply familiar with the code and can intuit where timing issues might lurk. This presents a challenge for vibe coding: when AI writes most of your code, you naturally have less intimate knowledge of its internals. IO schedulers fill this familiarity gap with systematic testing.

More importantly, implementing IO scheduler tests is notoriously fiddly and frustrating work. The setup is complex, the mental model is challenging, and debugging failures requires patience that most developers lack. But AI assistants process this kind of meticulous, detail-oriented work without the emotional friction humans experience. They methodically work through test setups requiring 50 lines of boilerplate. They systematically try different approaches without frustration. The cost of implementing comprehensive race condition testing drops dramatically when your AI assistant handles the tedious parts while you focus on identifying what concurrent behaviors need testing.

Fast-Check (JavaScript/TypeScript)

Fast-check provides a powerful scheduler for testing async race conditions:

import fc from 'fast-check';

describe('Race condition testing with fast-check', () => {
    it('detects race in concurrent counter updates', async () => {
        await fc.assert(
            fc.asyncProperty(
                fc.scheduler(),
                async (s) => {
                    // The scheduler controls all async operations
                    let counter = 0;
                    let updateCount = 0;
                    
                    // Define async operations
                    const increment = s.scheduleFunction(async () => {
                        const current = counter;
                        // This Promise resolution is controlled by scheduler
                        await s.schedule(Promise.resolve());
                        counter = current + 1;
                        updateCount++;
                    });
                    
                    // Start operations concurrently, then let the scheduler
                    // release the pending async steps in its chosen order
                    const done = Promise.all([increment(), increment()]);
                    await s.waitAll();
                    await done;
                    
                    // Property: counter should equal number of updates
                    return counter === updateCount;
                }
            ),
            { 
                verbose: true,  // Shows which scheduling caused failure
                seed: 42,       // Can reproduce exact failure
                numRuns: 100    // Try 100 different schedulings
            }
        );
    });
});

Key Features:

  • fc.scheduler() creates a controlled environment
  • scheduleFunction() wraps async functions
  • schedule() controls Promise resolution timing
  • Provides seed for reproducing failures

ScalaCheck with Cats Effect

ScalaCheck combined with Cats Effect provides a powerful approach for testing concurrent code. Cats Effect's IO monad gives you referential transparency and composable concurrency primitives. The example below shows how to use Ref (thread-safe mutable references) and Queue to test for race conditions:

import org.scalacheck.{Gen, Properties}
import org.scalacheck.Prop.forAll
import cats.effect._
import cats.effect.std.{Queue, Ref}
import cats.implicits._
import cats.effect.testing.scalatest.AsyncIOSpec
import scala.concurrent.duration._

object ConcurrentRaceTest extends Properties("Concurrent") {
    // Using Cats Effect IO for deterministic async testing
    implicit val runtime: unsafe.IORuntime = unsafe.IORuntime.global
    
    property("concurrent updates maintain consistency") = forAll { (operations: List[Int]) =>
        val test = for {
            counter <- Ref[IO].of(0)
            inconsistencies <- Ref[IO].of(0)
            queue <- Queue.unbounded[IO, IO[Unit]]
            
            // Create concurrent operations
            _ <- operations.traverse { _ =>
                (for {
                        current <- counter.get
                        _ <- queue.offer(
                            counter.modify { value =>
                                if (value != current) {
                                    (value + 1, true) // Race detected
                                } else {
                                    (value + 1, false)
                                }
                            }.flatMap { raceDetected =>
                                if (raceDetected) inconsistencies.update(_ + 1)
                                else IO.unit
                            }
                        )
                    } yield ()
                }.start // Run concurrently
            }
            
            // Wait until every fiber has enqueued its update, then apply the updates in order
            _ <- queue.size.iterateUntil(_ == operations.length)
            tasks <- queue.tryTakeN(None)
            _ <- tasks.sequence
            
            // Check results
            finalCount <- counter.get
            races <- inconsistencies.get
        } yield finalCount == operations.length && races == 0
        
        test.unsafeRunSync()
    }
}

This queue-based pattern acts as a simple scheduler: the concurrent fibers record the updates they intend to make, and the test then executes them deterministically. By controlling when the queued tasks run, you can systematically explore different interleavings and find race conditions that would be nearly impossible to discover through random testing.

Best Practices for IO Scheduler Testing

  1. Start Simple: Test basic race conditions first
  2. Use Seeds: Always save seeds that find bugs (see the replay sketch after this list)
  3. Limit Scope: Test small units of concurrent code
  4. Vary Timing: Test different delay patterns
  5. Check Invariants: Focus on properties that should always hold
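
To make the seed advice from item 2 concrete, here is a minimal replay sketch with fast-check; the seed and path values are placeholders that a failing run would report:

it('replays the failing schedule', async () => {
    // Re-run only the exact counterexample that exposed the bug
    await fc.assert(
        fc.asyncProperty(fc.scheduler(), async (s) => {
            // ... same property body as the failing test ...
            return true;
        }),
        {
            seed: 1337,         // seed printed by the failing run
            path: "0:1",        // counterexample path printed by the failing run
            endOnFailure: true  // stop as soon as the failure reproduces
        }
    );
});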

Comparison of Tools

| Tool                     | Language | Approach             | Best For                      |
|--------------------------|----------|----------------------|-------------------------------|
| fast-check               | JS/TS    | Scheduler control    | Async/Promise races           |
| ScalaCheck + Cats Effect | Scala    | IO monad + Ref/Queue | Functional concurrent systems |

Example: Finding a Real Bug

// This test found a real race condition in a file system
it('detects file system race condition', async () => {
    await fc.assert(
        fc.asyncProperty(
            fc.scheduler(),
            fc.array(fc.tuple(fc.constant('write'), fc.string()), { minLength: 1 }),
            async (s, operations) => {
                const fs = new ConcurrentFileSystem();
                
                // Schedule all operations
                const promises = operations.map(([op, data]) => 
                    s.scheduleFunction(async () => {
                        if (op === 'write') {
                            await fs.write('test.txt', data);
                        }
                    })()
                );
                
                // Release the scheduled writes, then wait for them all to finish
                await s.waitAll();
                await Promise.all(promises);
                
                // Property: last write should win
                const content = await fs.read('test.txt');
                return content === operations[operations.length - 1][1];
            }
        )
    );
    // Failed with seed: 1337
    // Reproduction: Two writes overlapped, corrupting data
});

The key advantage of property-based testing with IO schedulers is reproducibility. When a race condition is found, you can reproduce it exactly using the seed, making debugging much easier than traditional "Heisenbugs" that disappear when you try to observe them.

Think about the implications: every race condition bug becomes as debuggable as a simple logic error. You can add logging, step through with a debugger, refactor the code, and know with certainty whether you've fixed the issue by running the test with the same seed. This transforms concurrent programming from a dark art into a science.

Visual Testing & Screenshots

📁 Copy to: VISUAL_TESTING.md

What is Visual Regression Testing?

Visual regression testing captures screenshots of your UI and compares them against baseline images to detect unintended visual changes.

Why use it?

  • Catch CSS bugs: Styling changes that break layouts
  • Cross-browser issues: Rendering differences
  • Responsive design: Ensure mobile views work
  • Component changes: Unintended side effects

Visual Regression Testing with AI

Visual regression testing ensures your UI looks correct across changes. In vibe coding, this is particularly valuable—your AI can implement UI changes rapidly, while visual tests catch subtle regressions you might miss. The AI can also review visual diffs and suggest whether changes are intentional improvements or bugs.
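
As one concrete way to wire this up, Playwright ships built-in screenshot assertions; a minimal sketch (the URL and snapshot name are placeholders):

import { test, expect } from '@playwright/test';

test('dashboard matches the approved baseline', async ({ page }) => {
  await page.goto('http://localhost:3000/dashboard'); // placeholder URL

  // Compares the rendered page against a stored baseline image and
  // fails with a visual diff when the page drifts from that baseline
  await expect(page).toHaveScreenshot('dashboard.png', { maxDiffPixelRatio: 0.01 });
});

The first run records the baseline image; later runs compare against it, and an intentional change is approved by re-running with --update-snapshots.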

Working with AI Assistants

  • Give AI access to development tools so it can monitor CI and fix issues
  • Provide clear context about current state and objectives
  • Use structured todo lists to track progress
  • Share failure logs and diagnostics for efficient debugging
  • Iterate in small chunks with frequent testing
  • Always validate AI suggestions through testing

AI-Assisted Debugging

  • Share complete error messages and stack traces
  • Provide relevant code context
  • Explain what was expected vs. actual behavior
  • Use AI to generate test cases for edge cases
  • Have AI suggest multiple solution approaches

⚠️ Critical Warning: AI Behavior When Struggling

When AI assistants encounter difficult problems, they may try to:

  • Delete problematic code instead of fixing it
  • Skip failing tests rather than making them pass
  • Suggest workarounds that avoid the real issue
  • Give up on complex debugging challenges

Always push AI to keep working on the actual fix. Watch for signs like "let's simplify this" or "we can skip this test" - these are red flags that the AI is trying to avoid the hard problem. The correct response is to insist on solving the root cause, not working around it.

Collaborative Development Loop

  1. Define clear objectives and success criteria
  2. Break work into small, testable chunks
  3. Run tests frequently to catch regressions early
  4. Share results (successes and failures) with AI
  5. Iterate based on feedback from tests and CI
  6. Document lessons learned for future reference

📱 TypeScript/JavaScript Excellence

📁 Copy to: TYPESCRIPT_EXCELLENCE.md

TypeScript isn't just JavaScript with types—it's a different way of thinking about code. When used properly, TypeScript transforms runtime errors into compile-time errors, making entire categories of bugs impossible. The practices in this section aren't arbitrary rules; they're battle-tested patterns that maximize TypeScript's ability to catch errors before they reach production.

Package Management (TypeScript/JavaScript Specific)

npm's resolution algorithm can produce different results for the same package.json, leading to "works on my machine" issues. Yarn's deterministic algorithm ensures consistent dependencies across all environments.

  • ALWAYS use yarn, NEVER use npm
    • Use yarn install instead of npm install
    • Use yarn add instead of npm install
    • Use yarn test instead of npm test
    • Use yarn run instead of npm run

Testing Framework Rules (TypeScript/JavaScript Specific)

Vitest offers significant performance improvements over Jest through esbuild transformation and parallel test execution. It also shares configuration with Vite, reducing setup complexity.

  • ABSOLUTELY NO JEST - Use Vitest only (vi not jest)
  • Import test utilities from vitest, not jest
  • Use vi.fn() instead of jest.fn()
  • Use vi.mock() instead of jest.mock()
  • Use vi.spyOn() instead of jest.spyOn()
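
A small sketch of what this looks like in a Vitest test (the ./api and ./profile modules are hypothetical):

import { describe, it, expect, vi } from 'vitest';
import { getUser } from './api';          // hypothetical dependency
import { loadProfile } from './profile';  // hypothetical module under test

// vi.mock replaces the module, exactly where jest.mock would have been used
vi.mock('./api', () => ({
  getUser: vi.fn().mockResolvedValue({ id: '1', name: 'Ada' }),
}));

describe('loadProfile', () => {
  it('returns the user fetched from the API', async () => {
    await expect(loadProfile('1')).resolves.toEqual({ id: '1', name: 'Ada' });
    expect(vi.mocked(getUser)).toHaveBeenCalledWith('1');
  });
});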

Test Commands

yarn test                           # Run all tests
yarn test -- path/to/test.file.tsx # Run specific test
yarn typecheck                     # TypeScript check
yarn lint                          # Lint check
yarn lint:fix                      # Fix lint issues
yarn typecheck && yarn lint        # Run all checks

Functional Programming Rules

Modern JavaScript engines optimize functional methods like map, filter, and reduce effectively. These methods provide clearer intent and reduce common loop-related bugs while maintaining performance.

  • NO function keyword - Use arrow functions only
  • NO for loops - Use .map(), .filter(), .reduce(), .forEach()
  • NO while/do-while loops - Use recursion or functional methods
  • NO for...in/for...of loops - Use Object.keys(), Object.values(), Object.entries()
  • Maximum function complexity: 10
  • Maximum function parameters: 4
  • Maximum function lines: 80
  • Prefer const over let, never use var
  • Use destructuring assignment
  • Use template literals over string concatenation
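
A quick before/after sketch of these rules applied to a routine aggregation (the Order type is illustrative):

// Before: imperative loop with mutable state
// let total = 0;
// for (let i = 0; i < orders.length; i++) {
//   if (orders[i].paid) total += orders[i].amount;
// }

// After: arrow function, no loops, no mutation
type Order = { paid: boolean; amount: number };

const totalPaid = (orders: readonly Order[]): number =>
  orders
    .filter((order) => order.paid)
    .reduce((sum, order) => sum + order.amount, 0);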

Array Safety (Critical for Race Condition Prevention)

📁 Copy to: ARRAY_SAFETY.md

Array access bugs are insidious because they often work fine in development, pass your unit tests, and then crash in production when data doesn't match your assumptions. The most dangerous pattern is assuming an array element exists before accessing its properties. This single assumption causes more production crashes than almost any other JavaScript pattern.

The solution isn't just defensive programming—it's leveraging TypeScript's noUncheckedIndexedAccess flag to make these bugs impossible. With this flag enabled, TypeScript forces you to handle the possibility that any array access might return undefined. It's like having a safety net that catches you before you fall.

  • ALWAYS check array element exists before accessing properties
  • Never do array[index].property without checking array[index] exists first
// Bad: Can crash with "Cannot read properties of undefined"
const item = match[1].length;

// Good: Safe access patterns
const item = match[1]?.length;                    // Optional chaining
const item = match[1]?.length ?? defaultValue;   // With default
const item = match[1] && match[1].length;        // Guard check

// Utility pattern
import { safeGet } from '@/utils/safeArray';
const item = safeGet(array, index, defaultItem).property;
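
The safeGet utility referenced above isn't defined in this document; a minimal version of such a helper might look like this:

// utils/safeArray.ts - return the element when present, otherwise the caller's fallback
export const safeGet = <T>(arr: readonly T[], index: number, fallback: T): T =>
  arr[index] ?? fallback;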

TypeScript Configuration for Safety

Enable strict mode settings in tsconfig.json:

{
  "compilerOptions": {
    "strict": true,
    "noUncheckedIndexedAccess": true,  // KEY: Forces undefined checks on array access
    "strictNullChecks": true,
    "strictPropertyInitialization": true,
    "noImplicitAny": true,
    "noImplicitThis": true,
    "useUnknownInCatchVariables": true
  }
}

Branded Types for ID Safety

// Prevent mixing different ID types
type SessionId = string & { _brand: 'SessionId' };
type RequestId = string & { _brand: 'RequestId' };
type ProcessId = number & { _brand: 'ProcessId' };

// Helper functions
const SessionId = (id: string): SessionId => id as SessionId;
const RequestId = (id: string): RequestId => id as RequestId;
const ProcessId = (id: number): ProcessId => id as ProcessId;
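
What the brand buys you is a compile-time wall between otherwise identical strings; a short usage sketch (sendToSession is a hypothetical function):

declare const sendToSession: (id: SessionId) => void;

const sessionId = SessionId('sess-123');
const requestId = RequestId('req-456');

sendToSession(sessionId);    // ✅ compiles
// sendToSession(requestId); // ❌ type error: RequestId is not assignable to SessionId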

Result Types for Error Handling

type Result<T, E = Error> = 
  | { ok: true; value: T }
  | { ok: false; error: E };

interface IWebSocketService {
  send: (message: string) => Promise<Result<void, WebSocketError>>;
  onMessage: (handler: (message: ClaudeMessage) => void) => void;
}
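
A call site then has to acknowledge failure before it can proceed; a small consumption sketch (ws is assumed to implement the IWebSocketService interface above):

const handleSend = async (ws: IWebSocketService, message: string): Promise<void> => {
  const result = await ws.send(message);

  if (!result.ok) {
    // TypeScript narrows result to { ok: false; error: WebSocketError } here
    console.error('send failed:', result.error);
    return;
  }

  // Only the success branch reaches this point - the error case cannot be silently ignored
};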

Exhaustive Message Handling

const handleMessage = (message: ClaudeMessage): void => {
  switch (message.type) {
    case 'ClaudeOutput':
      handleOutput(message);
      break;
    case 'ClaudeSessionUpdate':
      handleSessionUpdate(message);
      break;
    // ... handle all cases
    default:
      // Ensures all cases are handled at compile time
      const _exhaustive: never = message;
      throw new Error(`Unhandled message type: ${(_exhaustive as any).type}`);
  }
};

🦀 Rust Excellence

📁 Copy to: RUST_EXCELLENCE.md

Rust changes how you think about programming. Its ownership system isn't just about memory safety—it's a new mental model that makes concurrent programming safe by default. The borrow checker isn't an annoyance to work around; it's a teacher that shows you where your design has hidden complexity. Embracing Rust means embracing its philosophy: if it compiles, it probably works correctly.

Functional Programming Principles

Rust's ownership system naturally pushes you toward functional programming. When mutation requires explicit permission and sharing requires careful thought, you naturally write more pure functions. Embrace this—Rust is trying to teach you something.

  • Immutability by default: In Rust, everything is immutable unless you explicitly ask for mut. This isn't a limitation—it's liberation from an entire class of bugs.
  • Result and Option everywhere: Rust doesn't have null or exceptions. Instead, it has types that make failure explicit and impossible to ignore. This transforms runtime errors into compile-time errors.
  • Side effects are visible: Any function that can perform I/O or mutate state shows this in its signature. You can't hide side effects in Rust—they're part of the contract.
  • Composition over inheritance: Rust doesn't have inheritance because it doesn't need it. Traits and generics provide more powerful composition patterns without the fragility of inheritance hierarchies.

Rust-Specific Quality Standards

These aren't arbitrary rules—each one prevents real bugs that have bitten Rust developers. Following these standards is the difference between fighting the borrow checker and dancing with it.

Error Handling Excellence:

  • Never .unwrap() in production code: Every .unwrap() is a potential panic waiting to crash your program. Use .expect() only in truly impossible cases, and even then, consider if the "impossible" might happen.
  • Custom error types tell stories: Don't use generic errors. Create specific error types that explain what went wrong and how to fix it. The thiserror crate makes this painless.

Documentation as Contract:

  • Every public item needs /// docs: If it's pub, it needs documentation. This isn't bureaucracy—it's a contract with your users (including future you).
  • Examples in docs: The best documentation includes examples. Rust even tests these examples, ensuring they stay current.

Performance Without Sacrifice:

  • &str vs String: Accept &str parameters when you just need to read. Only require String when you need ownership. This simple rule eliminates unnecessary allocations.
  • const for compile-time computation: If it can be computed at compile time, make it const. The compiler becomes your calculator.

Type System Mastery:

  • #[derive] liberally: Debug, Clone, PartialEq—these traits make your types useful. Deriving them costs nothing at runtime.
  • impl Trait for ergonomics: Return impl Iterator<Item = T> instead of Box<dyn Iterator<Item = T>>. It's faster and clearer.

Tools That Teach:

  • cargo clippy is your mentor: Clippy doesn't just find bugs—it teaches idiomatic Rust. Its suggestions will make you a better Rust developer.
  • Naming conventions matter: snake_case for functions, PascalCase for types, SCREAMING_SNAKE_CASE for constants. Consistency aids readability.

Rust Commands

These commands form your Rust development rhythm. Run them frequently—they're designed to catch problems early when they're easy to fix.

# Format code - Rust's formatter is opinionated and consistent
cargo fp-format          

# FP-friendly lints - Catches anti-patterns and suggests functional alternatives
cargo fp-check           

# Run tests - Includes doc tests, unit tests, and integration tests
cargo fp-test            

# Security audit - Checks dependencies for known vulnerabilities
cargo audit              

# Run everything - Format, lint, test, audit in one command
make rust-quality        

Pro tip: Set up pre-commit hooks to run these automatically. Rust development is smoothest when you maintain quality continuously rather than fixing issues in bulk.

📋 Code Review Checklist

📁 Copy to: CODE_REVIEW_CHECKLIST.md

Array Access Review

  • All array access uses optional chaining or null checks
  • Array length comparisons before parallel access
  • No assumptions about array synchronization
  • Defensive defaults for undefined values

Async State Review

  • Effect cleanup for async operations
  • State updates check if component is mounted
  • Loading states prevent premature rendering
  • Error boundaries handle unexpected states

Performance Review

  • No unnecessary re-renders from race conditions
  • Debounced/throttled rapid updates
  • Memoization where appropriate

Type Safety Review

  • Branded types for IDs prevent mixing
  • Result types for explicit error handling
  • Exhaustive switch statements with never type
  • No any types without justification

🐛 Bug-Driven Type Safety Improvement

📁 Copy to: BUG_DRIVEN_TYPES.md

When to use this approach: After encountering runtime errors, logic bugs, or unexpected behavior in production. This reactive strategy complements your baseline type safety (like noUncheckedIndexedAccess) by evolving your type system based on real-world failures. While baseline settings prevent common bugs, this approach creates custom type guards and patterns specific to your domain's edge cases.

The Golden Rule: Every Bug is a Type Safety Lesson

When you find a bug, ALWAYS ask: "How could stricter types have caught this?"

This practice turns every debugging session into a learning opportunity that strengthens your entire codebase.

Bug Analysis Framework

Step 1: Categorize the Bug

  • Runtime Error: Crashed with undefined/null access, type mismatch
  • Logic Error: Wrong behavior, incorrect data flow
  • Race Condition: Timing-dependent failure
  • Integration Error: Component interaction failure

Step 2: Trace to Type Weakness

Ask these questions:

  • Could branded types have prevented ID confusion?
  • Would noUncheckedIndexedAccess have caught array access issues?
  • Could discriminated unions have enforced correct state handling?
  • Would Result types have made error handling explicit?
  • Could stricter function signatures have caught this?

Step 3: Implement Type Improvements

Don't just fix the bug - improve the types to prevent similar bugs.

Real-World Examples from This Project

Example 1: Array Index Race Condition

The Bug: TypeError: Cannot read properties of undefined (reading 'length')

// Buggy code in FileViewer.tsx
const lineTokens = hasHighlighting ? highlightedTokens[index] : [];
// Later: lineTokens.length - CRASH when highlightedTokens[index] is undefined

Root Cause: Array access without bounds checking in race condition

Type Safety Analysis:

// How stricter types would have caught it:

// 1. With noUncheckedIndexedAccess: true
const lineTokens = hasHighlighting ? highlightedTokens[index] : [];
//                                  ^^^^^^^^^^^^^^^^^^^^ 
// Type error: Type 'HighlightedToken[] | undefined' is not assignable to type 'HighlightedToken[]'

// 2. Forces defensive programming:
const lineTokens = hasHighlighting ? (highlightedTokens[index] ?? []) : [];

Type Safety Improvements Applied:

  1. ✅ Added noUncheckedIndexedAccess: true to tsconfig.json
  2. ✅ Created SafeArray utility functions
  3. ✅ Added ESLint rules for unsafe array access
  4. ✅ Added fast-check property tests for array synchronization

Example 2: Request-Response Correlation Bug

The Bug: Claude service resolves wrong request when multiple concurrent requests

Root Cause: String-based request matching without correlation IDs

Type Safety Analysis:

// Buggy pattern - requests identified by type only
pendingRequests.get(messageType)?.resolve(response);

// How branded types + proper correlation would catch it:
type RequestId = string & { _brand: 'RequestId' };

interface BaseRequest {
  id: RequestId;
  type: string;
}

interface BaseResponse {
  requestId: RequestId;  // Forces correlation
  type: string;
}

// Now TypeScript forces proper request/response matching
const pendingRequest = pendingRequests.get(response.requestId);

Type Safety Improvements Applied:

  1. ✅ Added branded types for IDs
  2. ✅ Created discriminated unions for requests/responses
  3. ✅ Added request correlation patterns

Example 3: Navigation State Bug

The Bug: Timeline clicks didn't open files due to setTimeout race condition

Root Cause: Loose typing allowed any navigation state

Type Safety Analysis:

// Weak typing allowed bugs
onEntryPress: (entry: TimelineEntry) => void  // No guarantee navigation happens

// Stronger typing would enforce navigation action
type NavigationAction = 
  | { type: 'OPEN_FILE'; filePath: string }
  | { type: 'SHOW_HISTORY'; entry: TimelineEntry };

type TimelineEntryHandler = (entry: TimelineEntry) => NavigationAction;

// TypeScript now enforces that clicking produces a navigation action

Type Safety Improvements Applied:

  1. ✅ Added NavigationAction discriminated unions
  2. ✅ Made callbacks return explicit actions
  3. ✅ Added tests to verify navigation behavior

Bug-Driven Type Safety Checklist

When you find any bug, systematically check:

Array & Object Access

  • Could noUncheckedIndexedAccess have caught this?
  • Should we use optional chaining (?.) everywhere?
  • Are we making assumptions about array lengths?
  • Could SafeArray utilities prevent this class of bugs?

Function Signatures

  • Are parameters too permissive (any, object, string)?
  • Could branded types prevent ID confusion?
  • Should return types be more specific?
  • Would Result types make errors explicit?

State Management

  • Could discriminated unions enforce valid state transitions?
  • Are we using unions where we should use intersections?
  • Would readonly types prevent unintended mutations?
  • Could state machines make invalid states unrepresentable?

Async Operations

  • Are Promise types specific enough?
  • Could we use branded types for different async operations?
  • Would cancellation tokens prevent race conditions? (see the AbortController sketch after this checklist)
  • Are error types explicit and actionable?

Component Props

  • Are callback types specific about what they return?
  • Could we use discriminated unions for different component modes?
  • Are we properly typing children and render props?
  • Would stricter event handler types help?
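
For the cancellation question in the Async Operations list above, a minimal sketch using AbortController (the /search endpoint and response shape are placeholders):

let inFlight: AbortController | undefined;

const search = async (query: string): Promise<string[]> => {
  // Abort the previous request so a slow, stale response can never
  // overwrite the results of a newer query
  inFlight?.abort();
  inFlight = new AbortController();

  // A superseded call rejects here with an AbortError, which callers can treat as "ignore"
  const response = await fetch(`/search?q=${encodeURIComponent(query)}`, {
    signal: inFlight.signal,
  });
  return response.json();
};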

Implementation Pattern: Bug → Type → Test

// 1. Fix the immediate bug
const lineTokens = hasHighlighting ? (highlightedTokens[index] ?? []) : [];

// 2. Add type safety to prevent similar bugs
// Enable noUncheckedIndexedAccess in tsconfig.json

// 3. Create utility to make safe pattern easy
export const safeArrayAccess = <T>(arr: T[], index: number, fallback: T): T => 
  arr[index] ?? fallback;

// 4. Add test that would have caught the original bug
it('should handle token array shorter than lines array', () => {
  const content = 'line1\nline2\nline3';  // 3 lines
  const tokens = [['token1']];            // 1 token
  
  // This should not crash
  expect(() => renderFileViewer(content, tokens)).not.toThrow();
});

// 5. Add property test for this class of bugs
it('should handle mismatched array lengths', () => {
  fc.assert(
    fc.property(
      fc.array(fc.string()),          // lines
      fc.array(fc.array(fc.string())), // tokens
      (lines, tokens) => {
        // Property: Should never crash regardless of array lengths
        expect(() => renderFileViewer(lines, tokens)).not.toThrow();
      }
    )
  );
});

Type Safety Evolution Log

Keep a log of how each bug improved your type safety:

## Bug #47: FileViewer Array Access Crash
- **Date**: 2025-06-20
- **Bug**: `highlightedTokens[index]` was undefined, caused crash
- **Type Fix**: Added `noUncheckedIndexedAccess: true`
- **Tools Added**: SafeArray utilities, ESLint rules
- **Tests Added**: Property tests for array sync
- **Prevention**: All array access now type-safe

## Bug #52: Request Correlation Mix-up  
- **Date**: 2025-06-19
- **Bug**: Wrong request resolved in concurrent scenario
- **Type Fix**: Added branded RequestId type
- **Tools Added**: Discriminated unions for req/res
- **Tests Added**: Concurrent request tests
- **Prevention**: TypeScript now enforces correlation

Advanced Type Safety Patterns

Phantom Types for State Safety

type FileState = 'closed' | 'opening' | 'open' | 'modified';
type File<S extends FileState> = {
  path: string;
  state: S;
  content: S extends 'open' | 'modified' ? string : undefined;
};

// TypeScript enforces you can only read content from open files
const readContent = (file: File<'open' | 'modified'>): string => file.content;

Template Literal Types for String Safety

type FilePath = `/${string}`;
type GitBranch = `refs/heads/${string}`;

// Prevents accidental string mixing
const openFile = (path: FilePath) => { /* ... */ };
openFile('/src/app.ts');        // ✅ OK
openFile('src/app.ts');         // ❌ Type error - missing leading slash

Recursive Types for Complex Validation

type JSONValue = 
  | string 
  | number 
  | boolean 
  | null
  | { [key: string]: JSONValue }
  | JSONValue[];

// Now JSON.parse return can be properly typed
const parseJSON = (str: string): JSONValue => JSON.parse(str);

The Type Safety Mindset

Every bug teaches us about a gap in our type system. By systematically improving types after each bug, we build software that becomes progressively more robust and self-documenting.

How to Apply This Mindset

  1. When You Hit "Cannot read property of undefined":

    // Step 1: Find the crash site
    const name = user.profile.name; // 💥 Crashes if profile is undefined
    
    // Step 2: Update the type to reflect reality
    type User = {
      profile?: {  // Make it optional in the type
        name: string;
      };
    };
    
    // Step 3: TypeScript now forces safe access
    const name = user.profile?.name ?? 'Anonymous';
  2. When You Have Invalid State Combinations:

    // Before: These types allow invalid states
    type Order = {
      status: 'pending' | 'paid' | 'shipped';
      paymentId?: string;
      trackingNumber?: string;
    };
    
    // After: Make invalid states impossible
    type Order = 
      | { status: 'pending' }
      | { status: 'paid'; paymentId: string }
      | { status: 'shipped'; paymentId: string; trackingNumber: string };
  3. When You Catch Exceptions:

    // Before: Exceptions are hidden
    function parseConfig(json: string): Config {
      return JSON.parse(json); // Can throw!
    }
    
    // After: Make errors explicit in types
    function parseConfig(json: string): Result<Config, ParseError> {
      try {
        return { ok: true, value: JSON.parse(json) };
      } catch (e) {
        // With useUnknownInCatchVariables, e is unknown - narrow it before use
        const message = e instanceof Error ? e.message : String(e);
        return { ok: false, error: new ParseError(message) };
      }
    }
  4. When You Add Runtime Validation:

    // Before: Validation separate from types
    function processAge(age: number) {
      if (age < 0 || age > 150) throw new Error('Invalid age');
      // ...
    }
    
    // After: Encode validation in types
    type Age = number & { __brand: 'Age' };
    
    function createAge(value: number): Age | null {
      if (value < 0 || value > 150) return null;
      return value as Age;
    }
    
    function processAge(age: Age) {
      // No validation needed - type guarantees validity
    }

Implementation Checklist

  • Enable noUncheckedIndexedAccess in tsconfig.json
  • Replace optional fields with discriminated unions where possible
  • Convert throwing functions to return Result types
  • Create branded types for validated values
  • Document each type improvement in your TYPE_SAFETY_LOG.md

🔧 Development Workflow

📁 Copy to: DEVELOPMENT_WORKFLOW.md

Standard Development Process

  1. Enter nix environment: nix-shell
  2. Install dependencies: yarn install (frontend) / cargo check (backend)
  3. Start development: yarn start (frontend) / cargo run (backend)
  4. Run tests: yarn test (frontend) / cargo fp-test (backend)
  5. Quality checks: yarn typecheck (frontend) / make rust-quality (backend)
  6. CRITICAL: Push changes and monitor CI until green

Build Commands

# Frontend
yarn start              # Development
yarn build             # Production build
yarn storybook         # Component development
yarn build-storybook   # Storybook build

# Testing
yarn test              # Unit tests
yarn test:integration  # Integration tests
yarn test:fullstack   # Full-stack tests
yarn test:e2e:docker   # Docker E2E tests

# Quality
yarn typecheck         # TypeScript check
yarn lint             # Lint check
yarn lint:fix         # Fix lint issues

🚀 AI Pair Programming Best Practices

📁 Copy to: AI_PROGRAMMING.md

Working with AI Assistants

Effective AI pair programming works as a dialogue. You provide domain knowledge, business context, and architectural decisions. The AI handles implementation details, applies technical patterns, and generates boilerplate code. This division of labor enables higher development velocity than either could achieve alone.

Setting Up for Success:

  • Give AI access to GitHub: Connect your AI to your repository and CI/CD. When it can see your failing tests and read your CI logs, it can fix issues autonomously while you focus on design decisions.
  • Provide clear context: Start each session with "Here's what we're building and why." The AI needs to understand not just the task, but the purpose behind it.
  • Use structured todo lists: AI assistants follow structured instructions effectively. A good todo list transforms AI from a simple code generator into a systematic execution engine.
  • Share failure logs and diagnostics: Don't just say "it's broken"—paste the full error. Your AI can often spot the issue in seconds when given complete information.
  • Iterate in small chunks: Big commits are hard to review and debug. Ask for small, focused changes that you can verify immediately.
  • Always validate AI suggestions: Trust but verify. The AI might solve your problem in a way that creates three new problems. Test everything.

Autonomous AI Prompts for Each Loop

📁 Copy to: AI_PROMPTS.md

Start your AI in autonomous mode with these prompts:

# CI Green Loop
"The CI is failing. Analyze the errors and fix them one by one until all checks pass."
"Monitor our CI and proactively fix any failures that occur."

# User Story Loop  
"Generate user stories for [feature area] and implement them test-first."
"Act as both user and developer to improve [component]'s UX."

# Bug Investigation
"Here's a bug report: [details]. Investigate, find root cause, fix it, and add prevention."
"Analyze our recent bugs and strengthen types to prevent similar issues."

# Performance Loop
"Profile [component/page] and optimize any bottlenecks you find."
"Our build takes too long. Analyze and optimize until it's under 5 minutes."

# UI Improvement  
"Review [component] for accessibility and usability issues. Fix what you find."
"Continuously improve our UI. Start with the most-used components."

# Code Quality
"Run a quality sweep on [folder]. Fix all issues you find."
"Refactor [module] to reduce complexity while maintaining all tests."

# Meta-Loop
"Analyze my development patterns this week and optimize our process."
"Which of our loops are inefficient? Suggest improvements."

AI Autonomy Patterns

Pattern 1: The Daily Standup

Human: "Start our daily standup"

AI: "Checking project status...

CI Status: 🔴 Failed

  • TypeScript error in Button.test.tsx (line 45)
  • Shall I fix this first?

Open PRs: 3

  • #123: Add user authentication (ready for review)
  • #124: Fix memory leak (CI failing)
  • #125: Update dependencies (approved, ready to merge)

Today's Priorities:

  1. Fix CI (est. 15 min)
  2. Merge approved PR
  3. Review authentication PR
  4. Continue with user dashboard feature

Shall I start with fixing CI?"

Pattern 2: The Bug Sweep

Human: "Do a bug sweep on recent issues"

AI: "Analyzing last 10 bug reports...

Pattern Detected: Array Access Errors (3 instances)

All related to unchecked array indexing. I'll:

  1. Enable noUncheckedIndexedAccess
  2. Fix resulting type errors
  3. Add safety utilities

Starting now..."

Pattern 3: The Feature Builder

Human: "Build search functionality for our file explorer"

AI: "I'll build this using TDD. Let me break it down:

  1. Write test for basic filename search
  2. Implement minimal solution
  3. Add test for regex patterns
  4. Enhance implementation
  5. Add test for search history
  6. Implement with localStorage
  7. Create UI with accessibility

Starting with the first test..."

AI-Assisted Debugging

📁 Copy to: AI_DEBUGGING.md

Debugging with AI is like having a powerful pattern-matching system that has seen millions of error messages but needs you to provide the specific context. The more precise details you share, the more accurate the AI's reasoning will be.

The Debugging Dance:

  • Share complete error messages: Not just the error type, but the full stack trace. That line number buried in the stack often holds the key.
  • Provide relevant code context: Include the failing function and its callers. The bug might be in how the function is used, not the function itself.
  • Explain expected vs. actual: "It should return an array of users, but it's returning undefined." This gap analysis helps the AI understand the problem space.
  • Let AI generate edge case tests: Ask "What inputs might break this function?" AI's pattern recognition can identify edge cases across a broader range of possibilities than human intuition typically covers.
  • Request multiple solutions: "Give me three ways to fix this." Often the second or third approach is better than the obvious first solution.

⚠️ Critical Warning - Watch for AI Avoidance Behavior:

When AI encounters genuinely difficult problems, it sometimes tries to escape rather than solve. You'll see this pattern:

  • Suggests deleting the problematic code: "This function is too complex, let's remove it and use a simpler approach"
  • Tries to skip failing tests: "This test seems flaky, we could skip it for now"
  • Proposes workarounds instead of fixes: "Instead of fixing this race condition, we could just add a delay"

How to handle this:

AI: "This component is causing too many issues. We could simplify by removing the concurrent processing..."

You: "No, we need the concurrent processing. Let's debug why it's failing. Show me what's happening step by step."

AI: "The test for race conditions keeps failing. We could mark it as skip..."

You: "No, the test is catching a real bug. Let's use fast-check's scheduler to make it deterministic."

Remember: The AI works for you, not the other way around. When it tries to avoid hard problems, redirect it back to solving the root cause. Some of the best breakthroughs come from pushing through difficult bugs rather than working around them. Your job is to be the technical lead who says "we're going to solve this properly" when the AI wants to take shortcuts.

Collaborative Development Loop

The best AI programming sessions feel like pair programming with a really fast typist. You handle the strategy, the AI handles the tactics, and together you move faster than either could alone.

The Rhythm of AI Collaboration:

  1. Define clear objectives: "We need to add user authentication using JWT tokens" is better than "add login functionality." Specificity unlocks AI potential.
  2. Break work into small, testable chunks: "First, create the user model. Then, add password hashing. Next, implement the login endpoint." Each chunk should be verifiable.
  3. Run tests frequently: After every AI-generated change, run your tests. Catching issues immediately is infinitely easier than debugging a large batch of changes.
  4. Share results transparently: "The login endpoint works, but the test for expired tokens is failing with this error: [paste error]." Good or bad, share what happened.
  5. Iterate based on feedback: Use test results and CI status to guide next steps. Let reality, not plans, drive your development.
  6. Document lessons learned: When you discover something non-obvious, add it to your project's CLAUDE.md. Your AI assistant's effectiveness compounds with better documentation.

🎯 Success Metrics

📁 Copy to: SUCCESS_METRICS.md

Success in modern software development isn't just about shipping features—it's about maintaining velocity while increasing quality. These metrics aren't arbitrary numbers; they're indicators of a healthy codebase and a productive team. When these metrics are green, you can move fast with confidence. When they start slipping, they're early warning signs that technical debt is accumulating.

Code Quality

  • Type Safety: Zero any types, strict TypeScript config
  • Test Coverage: 100% line coverage (excluding unreachable), comprehensive E2E tests
  • Linting: Zero errors, consistent code style
  • Performance: Fast build times, responsive UI

Development Velocity

  • CI Pipeline: <15 minutes total time
  • Test Reliability: <1% flaky test rate
  • Bug Detection: 95% caught before production
  • Developer Experience: Quick feedback loops

Collaboration Quality

  • AI Integration: Efficient pair programming sessions
  • Documentation: Clear, actionable best practices
  • Knowledge Sharing: Lessons learned captured and applied
  • Continuous Improvement: Regular retrospectives and updates

🔄 Continuous Improvement

📁 Copy to: CONTINUOUS_IMPROVEMENT.md

The only constant in software development is change. What works today might be obsolete tomorrow. Continuous improvement isn't just about fixing what's broken—it's about questioning what works and finding ways to make it better. It's the difference between a codebase that gets harder to work with over time and one that becomes more pleasant and productive.

Regular Practices

Software development is like tending a garden—daily attention prevents weekly crises. These practices aren't bureaucracy; they're the habits that keep your codebase healthy and your team productive. Skip them, and you'll spend your time fighting fires instead of building features.

Weekly Rituals That Compound:

  • Review and update best practices weekly: Your understanding evolves with every bug fixed and feature shipped. Capture these learnings in your documentation. Friday afternoons are perfect for this—reflect on the week's lessons while they're fresh.
  • Add new tests for each feature: Not after. Not "when you have time." During. Every feature should arrive with its own test suite, like a product with batteries included. This isn't extra work—it's how you know you're done.
  • Refactor tests to reduce duplication: Test code is code. Duplicate test code is technical debt. When you see the same setup in three tests, extract it. When you copy-paste assertions, create helpers. Clean tests are easier to understand and maintain.
  • Monitor test execution times: A slow test suite is a test suite that doesn't get run. Track your test times weekly. When they creep up, investigate. That 30-second test suite that becomes 5 minutes? It'll kill your development velocity.
  • Archive old artifacts after 30 days: Screenshots, test reports, build artifacts—they accumulate like digital dust. Set up automated cleanup. Your CI shouldn't fail because the disk is full of month-old screenshots nobody will ever look at.

Learning from Failures

  • Always investigate root cause of failures
  • Update practices based on lessons learned
  • Share knowledge across team/project
  • Improve tooling to prevent similar issues
  • Test the fixes to ensure they work

Metrics to Track

  • CI pipeline health and speed
  • Test coverage and reliability
  • Bug discovery rate by testing phase
  • Developer productivity and satisfaction
  • AI pair programming effectiveness

🏆 Checklist

📁 Copy to: PROGRESS_CHECKLIST.md

Use this checklist to track your progress implementing these practices:

The Checklist

  • Zero any types - Every type is explicit and meaningful
  • CI runs in under 15 minutes - Fast feedback on every commit
  • 100% test coverage - Essential for verifying AI-generated code (excluding unreachable)
  • Zero skipped tests - Every test runs and passes
  • All arrays accessed safely - noUncheckedIndexedAccess: true
  • Race condition tests for async operations - Using property-based testing
  • AI has helped review your UI - Fresh eyes on every feature
  • Living specs that evolve - Documentation that stays current
  • You know which loop you're in - Always working with intention

Bonus Achievements

  • Your AI can explain your entire architecture - Because specs are complete
  • New developers productive in < 1 day - Thanks to clear practices
  • Production bugs down 90% - Most bugs now impossible
  • You've contributed back - Share your learnings with others

Share Your Journey

  • Which practice had the biggest impact?
  • What was hardest to implement?
  • What would you add to this guide?

These practices help you build reliable software quickly with AI assistance. Select the ones that fit your workflow and adapt them to your needs.

Remember: You're always in a loop. Choose the right one.


This document represents battle-tested practices from a real-world project using TypeScript, React Native, Rust, and AI pair programming. These practices evolved through iterative development, extensive testing, and continuous improvement based on actual challenges faced during development.
