Define the Test-Driven Development pattern with failure matrix planning. TDD ensures every implementation has an objective “done” definition, and every test validates requirements – not implementation details.
Use this pattern before writing any implementation code, once the spec and design are finalized (SDD Step 4).
Phase 1: Define "what is correct" (Spec)
|
Phase 2: Write tests proving "it is currently wrong" (Red)
|
Phase 3: Write implementation to make tests pass (Green)
|
Phase 4: Clean up (Refactor -- change structure, not behavior)
Phase 2 and Phase 3 must not be reversed.
When implementation exists first, test authors tend to write tests that “pass for this implementation” rather than “validate the requirement.” Such tests only verify “the code does what it does,” not “the code does what it should.”
Without red tests, there are no objective completion criteria. Developers can keep adding features, keep “optimizing,” and never know when to stop.
With red tests: tests go from red to green one by one. “Done” means all reds are green. Clear, objective, unambiguous.
When code already exists, tests must understand implementation details to test it. Tightly coupled tests break when implementation changes, making test maintenance expensive. Expensive tests get deleted.
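A minimal sketch of the coupling problem, using the license activation scenario as a stand-in. The `License` interface, the two classes, the `_state` field, and both check functions are hypothetical names invented for this illustration. The point is only that a check written against the public contract survives an internal rewrite, while a check written against internals does not.

```typescript
// Hypothetical public contract: callers only see activate() and getStatus().
interface License {
  activate(key: string): void;
  getStatus(): string;
}

// First implementation: keeps the status in an internal `_state` field.
class LicenseV1 implements License {
  _state = "inactive"; // internal detail, not part of the contract
  activate(key: string): void {
    if (key.length > 0) this._state = "valid";
  }
  getStatus(): string {
    return this._state;
  }
}

// Rewritten implementation: same behavior, different internals.
class LicenseV2 implements License {
  _activatedKeys: string[] = [];
  activate(key: string): void {
    if (key.length > 0) this._activatedKeys.push(key);
  }
  getStatus(): string {
    return this._activatedKeys.length > 0 ? "valid" : "inactive";
  }
}

// Requirement-based check: only uses the public contract.
function behaviorCheck(license: License): boolean {
  license.activate("ABC-123");
  return license.getStatus() === "valid";
}

// Implementation-coupled check: reaches into the `_state` field.
function coupledCheck(license: License): boolean {
  license.activate("ABC-123");
  return (license as unknown as { _state?: string })._state === "valid";
}

const outcome = {
  behaviorV1: behaviorCheck(new LicenseV1()),
  behaviorV2: behaviorCheck(new LicenseV2()),
  coupledV1: coupledCheck(new LicenseV1()),
  coupledV2: coupledCheck(new LicenseV2()), // breaks although behavior is unchanged
};
console.log(outcome);
```

Here the coupled check fails on the second version even though the requirement (`status === "valid"` after activation) still holds, which is the maintenance cost described above.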
Before writing any test, produce a failure matrix analysis.
| Failure Point | Red Test Name | Expected Error/Result |
|---|---|---|
| Unknown command dispatched | bridge-unknown-cmd-throws | Error: Not implemented: unknown_cmd |
| State reset restores defaults | store-reset-restores-defaults | items.length === 2 |
| Production env without backend | bridge-prod-no-backend-throws | Error: NotInBackendError |
| License activation changes state | license-activate-updates-status | status === "valid" |
Get confirmation that the failure matrix is complete. Missing failure points here means missing tests later.
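One way to make the completeness check mechanical is to keep the failure matrix as data and compare it against the names of the tests that exist. The sketch below is an assumption about how that could look, not a prescribed format: the `FailurePoint` shape, `findMissingTests`, and the `writtenSoFar` list are all hypothetical.

```typescript
// Hypothetical data shape for one row of the failure matrix.
interface FailurePoint {
  failurePoint: string;
  redTestName: string;
  expected: string;
}

// Rows copied from the failure matrix table.
const failureMatrix: FailurePoint[] = [
  { failurePoint: "Unknown command dispatched", redTestName: "bridge-unknown-cmd-throws", expected: "Error: Not implemented: unknown_cmd" },
  { failurePoint: "State reset restores defaults", redTestName: "store-reset-restores-defaults", expected: "items.length === 2" },
  { failurePoint: "Production env without backend", redTestName: "bridge-prod-no-backend-throws", expected: "Error: NotInBackendError" },
  { failurePoint: "License activation changes state", redTestName: "license-activate-updates-status", expected: 'status === "valid"' },
];

// Returns the red test names from the matrix that have no written test yet.
function findMissingTests(matrix: FailurePoint[], writtenTests: string[]): string[] {
  return matrix
    .map(function (row) { return row.redTestName; })
    .filter(function (name) { return writtenTests.indexOf(name) === -1; });
}

// Assumed state of the test suite, for illustration only.
const writtenSoFar = ["bridge-unknown-cmd-throws", "store-reset-restores-defaults"];
const missing = findMissingTests(failureMatrix, writtenSoFar);
console.log(missing);
```

An empty result means every failure point in the matrix has a test. It cannot detect failure points that were never added to the matrix, so the review step is still needed.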
Dispatch test writing with explicit instruction: write tests only, do not write implementation.
// Example: red test for environment routing
test('production environment without native backend throws error', async () => {
  // Setup: NODE_ENV=production, no native backend
  // Execute: safeInvoke('list_items')
  // Verify: throws NotInBackendError
});
test('development mock environment dispatches to mockInvoke', async () => {
  // Setup: NODE_ENV=development, no native backend
  // Execute: safeInvoke('list_items')
  // Verify: mockInvoke is called, returns mock data
});
test('native environment uses real invoke', async () => {
  // Setup: native backend present (mock isNativeEnv = true)
  // Execute: safeInvoke('list_items')
  // Verify: real invoke is called
});
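The three test outlines above could be filled in roughly as follows. This is a sketch under several assumptions: `safeInvoke` takes its dependencies as a `Deps` argument (the original signature is not shown), the error is identified by its `name`, the calls are synchronous for brevity, and the small `runTest` helper stands in for Vitest or Jest so the example runs on its own. The `safeInvoke` here is deliberately a stub, because in the red phase no implementation exists yet.

```typescript
// All names here are hypothetical; `runTest` stands in for a real test runner.
type Invoke = (cmd: string) => string;
interface Deps {
  nodeEnv: string;      // stands in for NODE_ENV
  isNativeEnv: boolean; // whether a native backend is present
  realInvoke: Invoke;
  mockInvoke: Invoke;
}

// Red-phase stub: the signature exists so tests can be written, but there is no behavior yet.
function safeInvoke(cmd: string, deps: Deps): string {
  throw new Error("Not implemented: " + cmd + " (" + deps.nodeEnv + ")");
}

const results: { name: string; passed: boolean }[] = [];

// Minimal synchronous runner: a test passes only if its body does not throw.
function runTest(name: string, body: () => void): void {
  try {
    body();
    results.push({ name: name, passed: true });
  } catch (err) {
    results.push({ name: name, passed: false });
  }
}

function expectEqual(actual: string, expected: string): void {
  if (actual !== expected) {
    throw new Error("expected " + expected + " but got " + actual);
  }
}

function makeDeps(nodeEnv: string, isNativeEnv: boolean): Deps {
  return {
    nodeEnv: nodeEnv,
    isNativeEnv: isNativeEnv,
    realInvoke: function (cmd) { return "real:" + cmd; },
    mockInvoke: function (cmd) { return "mock:" + cmd; },
  };
}

runTest("production environment without native backend throws NotInBackendError", () => {
  let thrownName = "nothing thrown";
  try {
    safeInvoke("list_items", makeDeps("production", false));
  } catch (err) {
    thrownName = (err as Error).name;
  }
  expectEqual(thrownName, "NotInBackendError");
});

runTest("development mock environment dispatches to mockInvoke", () => {
  expectEqual(safeInvoke("list_items", makeDeps("development", false)), "mock:list_items");
});

runTest("native environment uses real invoke", () => {
  expectEqual(safeInvoke("list_items", makeDeps("production", true)), "real:list_items");
});

const redCount = results.filter(function (r) { return !r.passed; }).length;
console.log(redCount + " of " + results.length + " tests are red");
```

Note that the first test asserts on the specific error name, so the stub's generic error does not accidentally turn it green.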
npm test # or: pnpm test / cargo test
All tests should fail. If any test passes without implementation, either:
Now dispatch implementation work. The goal is narrow: make each failing test pass.
npm test
All tests should pass. If any test still fails, fix the implementation (not the test).
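For the environment routing example, a green-phase implementation might be as small as the sketch below. It assumes the same hypothetical shape used for illustration here: dependencies passed in as a `Deps` object, synchronous calls, and an error identified by `name === "NotInBackendError"`. None of these details are confirmed by the original; they are just enough to express the three required behaviors.

```typescript
// Hypothetical dependency shape; the real project may wire these differently.
type Invoke = (cmd: string) => string;
interface Deps {
  nodeEnv: string;      // stands in for NODE_ENV
  isNativeEnv: boolean; // whether a native backend is present
  realInvoke: Invoke;
  mockInvoke: Invoke;
}

// Builds the error the production-without-backend test expects.
function notInBackendError(cmd: string): Error {
  const err = new Error("No native backend for command: " + cmd);
  err.name = "NotInBackendError";
  return err;
}

// Just enough routing to satisfy the three red tests, nothing more.
function safeInvoke(cmd: string, deps: Deps): string {
  if (deps.isNativeEnv) return deps.realInvoke(cmd);
  if (deps.nodeEnv === "production") throw notInBackendError(cmd);
  return deps.mockInvoke(cmd);
}

function makeDeps(nodeEnv: string, isNativeEnv: boolean): Deps {
  return {
    nodeEnv: nodeEnv,
    isNativeEnv: isNativeEnv,
    realInvoke: function (cmd) { return "real:" + cmd; },
    mockInvoke: function (cmd) { return "mock:" + cmd; },
  };
}

// Exercise the three behaviors the red tests describe.
let thrownName = "nothing thrown";
try {
  safeInvoke("list_items", makeDeps("production", false));
} catch (err) {
  thrownName = (err as Error).name;
}

const green = {
  prodNoBackend: thrownName,
  devMock: safeInvoke("list_items", makeDeps("development", false)),
  native: safeInvoke("list_items", makeDeps("production", true)),
};
console.log(green);
```

The implementation adds no caching, retries, or logging, because no red test asks for them.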
Change structure without changing behavior. Tests stay green throughout.
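As an illustration of a behavior-preserving refactor, the sketch below takes the `store-reset-restores-defaults` scenario and runs the same check before and after extracting duplicated defaults into a helper. The `Store` shape, the default item values, and the function names are hypothetical.

```typescript
// Hypothetical store shape for illustration.
interface Store {
  items: string[];
  reset(): void;
}

// Before the refactor: the default items are written out twice.
function createStoreBefore(): Store {
  const store: Store = {
    items: ["alpha", "beta"],
    reset: function () { store.items = ["alpha", "beta"]; },
  };
  return store;
}

// After the refactor: one source of truth for the defaults, same behavior.
function defaultItems(): string[] {
  return ["alpha", "beta"];
}

function createStoreAfter(): Store {
  const store: Store = {
    items: defaultItems(),
    reset: function () { store.items = defaultItems(); },
  };
  return store;
}

// The same check (store-reset-restores-defaults) runs against both versions.
function resetRestoresDefaults(create: () => Store): boolean {
  const store = create();
  store.items.push("extra");
  store.reset();
  return store.items.length === 2;
}

const greenBefore = resetRestoresDefaults(createStoreBefore);
const greenAfter = resetRestoresDefaults(createStoreAfter);
console.log(greenBefore, greenAfter);
```

Because the check only looks at observable behavior, it does not need to change when the structure does.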
Layer 1: Unit Tests
Speed: milliseconds
Coverage: individual functions, modules, components
Tools: Vitest / Jest (frontend), cargo test / pytest (backend)
Layer 2: Integration Tests
Speed: seconds
Coverage: multi-module interaction, API communication simulation
Tools: Vitest + mock backend / Supertest
Layer 3: End-to-End Tests
Speed: minutes (requires full application)
Coverage: complete user flows
Tools: Playwright / Cypress / native E2E
The failure matrix maps to Layer 1 and Layer 2:
Failure Matrix (design phase)
|
Layer 1 red tests (unit -- single functions)
Layer 2 red tests (integration -- multi-module interaction)
|
Write implementation (make reds turn green)
|
Layer 3 (E2E -- final acceptance)
Do not chase a coverage percentage. Even at 100% coverage, tests written after the implementation may merely conform to the code rather than validate requirements.
Measure these instead:
| Indicator | Question |
|---|---|
| Failure matrix coverage | Does every failure scenario have a corresponding test? |
| Independence | Can each test run independently (no dependency on execution order)? |
| Readability | Does the test name clearly state what it validates? |
| Speed | Do Layer 1 + Layer 2 complete within 30 seconds? |
| Stability | Is the test deterministic (same code, same result 10/10 times)? |
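The stability indicator can be checked mechanically by running the same test repeatedly and comparing results. The helper below is a rough sketch with hypothetical names (`isStable`, `sumIsThree`, `passesOnlyOnOddCalls`); a real suite would more likely rely on a repeat option in its test runner.

```typescript
// Runs a check several times and reports whether every run gave the same result.
function isStable(check: () => boolean, runs: number): boolean {
  const first = check();
  for (let i = 1; i < runs; i++) {
    if (check() !== first) return false;
  }
  return true;
}

// Deterministic: depends only on its inputs.
function sumIsThree(): boolean {
  return 1 + 2 === 3;
}

// Flaky by construction: depends on hidden state shared between runs,
// which also makes it dependent on execution order.
let callCount = 0;
function passesOnlyOnOddCalls(): boolean {
  callCount += 1;
  return callCount % 2 === 1;
}

const stableResult = isStable(sumIsThree, 10);
const flakyResult = isStable(passesOnlyOnOddCalls, 10);
console.log(stableResult, flakyResult);
```

The second check shows how shared state breaks both stability and independence at once.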
SDD (Spec) defines "what is correct"
|
TDD Phase 2 (Red) converts "correct" into executable verification
|
TDD Phase 3 (Green) makes implementation satisfy verification
|
Analyze confirms spec and implementation have not diverged
Three tools serving one goal: give “done” an objective definition.
“TDD makes development slower”
Surface: extra time writing tests. Reality:
“Only complex code needs tests”
Simple code can break during refactoring too. Tests exist not because code is hard, but because you want it to stay correct in the future.
“Higher coverage is always better”
Tests written after implementation can raise coverage while merely conforming to the code. What matters is whether tests validate requirements. The failure matrix is a more meaningful measure than coverage percentage.