Test-Driven Development Workflow
Tests are living documentation. They describe what the product does, not how the code is wired internally. A test that breaks on a refactor (while behavior is unchanged) is a bad test -- it was coupled to implementation, not behavior.
Test observable behavior from the perspective of the user or caller. Do not test internal mechanics.
Before writing any test, ask whether it is a behavior test or a surface test:
| Test Type | Example | Verdict |
|---|---|---|
| Behavior | "rejected tool call shows error in red" | Good -- tests what user sees |
| Behavior | "CJK character at buffer edge stays in bounds" | Good -- tests an invariant |
| Surface | "reducer calls replace(state, field=X)" | Bad -- coupled to implementation |
| Surface | "_process_queue is called 3 times" | Bad -- tests internal wiring |
Surface tests verify that the code does what it already says it does. They add no confidence and break on every refactor, because they mirror the current implementation rather than the intended product behavior.
Behavior tests verify what the system promises to its users/callers. They survive refactoring because they are anchored to outcomes, not code paths.
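The contrast can be made concrete with a small sketch. Everything here is hypothetical and exists only for illustration: `EagerWorker`, `LazyWorker`, and the two check functions are not part of any real codebase. `LazyWorker` stands in for a refactor that changes when the queue is drained but not what the caller gets back.

```python
from unittest import mock


class EagerWorker:
    """Hypothetical worker that drains its queue after every submit."""

    def __init__(self):
        self._queue = []
        self._results = []

    def submit(self, n):
        self._queue.append(n)
        self._process_queue()

    def results(self):
        return list(self._results)

    def _process_queue(self):
        # Internal detail: doubles each queued number, in order.
        while self._queue:
            self._results.append(self._queue.pop(0) * 2)


class LazyWorker(EagerWorker):
    """Refactored variant: same results, but drains only when results are read."""

    def submit(self, n):
        self._queue.append(n)

    def results(self):
        self._process_queue()
        return list(self._results)


def behavior_test(worker_cls):
    # Behavior test: asserts only on what a caller can observe.
    worker = worker_cls()
    for n in (1, 2, 3):
        worker.submit(n)
    return worker.results() == [2, 4, 6]


def surface_test(worker_cls):
    # Surface test (anti-pattern): asserts that _process_queue is called 3 times.
    with mock.patch.object(worker_cls, "_process_queue") as spy:
        worker = worker_cls()
        for n in (1, 2, 3):
            worker.submit(n)
    return spy.call_count == 3
```

Under these assumptions, `behavior_test` passes for both classes, while `surface_test` passes for `EagerWorker` and fails for `LazyWorker`, even though callers see identical results from both.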
Understand: What product behavior is this test validating?
Root Cause: Why is the test failing?
Fix Hierarchy (preference order):
Remember: Test failures are signals to investigate, not obstacles to bypass.