Enforces a disciplined 4-phase pipeline for non-trivial coding tasks: Plan (hypothesis) → Code (one fix) → Validate (root cause) → Debug (max 3 tries, escalate). Prevents blind patching, symptom fixes, and retry loops. Activate for any bug fix, feature implementation, refactor, or error investigation that isn't a trivial one-line change.
A disciplined 4-phase workflow for any non-trivial coding task. Each phase has a clear purpose, explicit exit criteria, and a loop-back rule when things go wrong. The phases exist because AI agents' default failure mode is blind iteration: edit → build → edit → build → give up. This skill forces hypothesis-driven work, one-fix-at-a-time discipline, root-cause verification, and bounded debugging.
This is a rigid skill — follow the phases exactly. Do not skip, merge, or reorder.
Every non-trivial task — bug fix, feature, refactor — goes through all 4 phases in order:
┌─────────────┐ ┌──────────┐ ┌───────────────┐ ┌─────────────┐
│ 1 PLANNER │───▶│ 2 CODER │───▶│ 3 VALIDATOR │───▶│ 4 DEBUGGER │
│ hypothesis │ │ one fix │ │ build + root │ │ max 3 tries │
└─────────────┘ └────┬─────┘ └───────┬───────┘ └──────┬──────┘
▲ │ │ │
│ │ unclear cause │ fails │ new hypothesis
└────────────────┴──────────────────┴───────────────────┘
loop back to PLANNER
Skipping a phase or jumping straight to Phase 2 is the failure mode this skill prevents.
| Situation | Active Phase | Exit When |
|---|---|---|
| New task arrives | Phase 1 Planner | Hypothesis written, scope defined, success criteria explicit |
| Hypothesis validated | Phase 2 Coder | One focused change applied, no unrelated edits |
| Change applied | Phase 3 Validator | Build passes AND root cause verified |
| Validator fails | Phase 4 Debugger | Either fix found (→ Phase 2) or 3 attempts exhausted (→ escalate) |
| Unclear cause mid-fix | Back to Phase 1 | New hypothesis written |
| Fix introduces new error | Back to Phase 1 | Hypothesis was wrong |
Goal: Understand the task and formulate an explicit hypothesis before any code change.
Required outputs:
Forbidden in Phase 1:
Exit criteria: Hypothesis is concrete, testable, and you can point to why this is the cause — not just what looks broken.
Loop-back trigger: If during Phase 2 or 3 the hypothesis turns out wrong, return here. Do not patch on top of a broken hypothesis.
See references/phase-1-planner.md for hypothesis patterns and scope breakdown templates.
Goal: Apply exactly one focused change that tests the hypothesis from Phase 1.
Rules:
Definition of "one fix": A single logical change that either proves or disproves the hypothesis. Three unrelated improvements = three separate Planner → Coder → Validator cycles.
Exit criteria: Change is applied, diff contains only the intended work, nothing unrelated.
See references/phase-2-coder.md for scope discipline and loop-back triggers.
Goal: Verify the change fixed the root cause, not just the symptom.
Checklist (adapt to stack):
@ts-ignore / type: ignore needs an explicit written justification commentThe symptom-vs-cause test:
If I rolled back this change, would the symptom return because of the same cause, or because of something else?
If you can't answer confidently, the fix is symptomatic. Go back to Phase 1.
Exit criteria: All checks pass, root cause verified, no regressions.
Failure → Phase 4 Debugger.
See references/phase-3-validator.md for stack-agnostic validation patterns.
Goal: Bounded debugging with documentation. Escalation over thrashing.
Hard rules:
Attempt log template (write to .pipeline-state/attempts-<task>.md or inline in chat):
### Attempt N
- **Hypothesis**: What I now believe is wrong
- **Change**: What I modified (specific files/lines)
- **Result**: What happened (error output, unchanged behavior, new symptom)
- **Why it failed**: The actual root cause of this failure
- **Next direction**: What to try next OR escalate
After 3 failed attempts: STOP. Surface to the user with the full attempt log. Do not continue with a 4th attempt unless the user explicitly authorizes it.
Recovery trigger: If during Phase 4 a fundamentally new hypothesis emerges, return to Phase 1 — not Phase 2. A new hypothesis means a new cycle, not continued debugging.
See references/phase-4-debugger.md for escalation patterns and worked examples.
No phase transition is automatic. Each requires explicit criteria:
| From | To | Required |
|---|---|---|
| 1 → 2 | Planner → Coder | Hypothesis written + scope defined + success criteria |
| 2 → 3 | Coder → Validator | One focused change applied, no unrelated edits |
| 3 → Done | Validator passes | Build ✓ + types ✓ + root cause verified + no regressions |
| 3 → 4 | Validator → Debugger | Any validation check failed |
| 4 → 2 | Debugger → Coder | New hypothesis + change substantively different from previous attempts |
| 4 → 1 | Debugger → Planner | Fundamentally new hypothesis (not incremental) |
| 4 → STOP | Debugger → Escalate | 3 attempts exhausted |
| ANY → 1 | Back to Planner | Hypothesis proven wrong mid-cycle |
Activate this pipeline automatically when the task is:
Skip the pipeline only for:
What this pipeline prevents:
@ts-ignore without a written justification commentSee references/anti-patterns.md for concrete before/after examples of each.
This pipeline works well with — but does not replace — the following:
systematic-debugging — when Phase 4 escalates, hand off to systematic-debugging for the full investigation protocolself-improving-agent — after every failed attempt in Phase 4, log to .learnings/ERRORS.md so the next task starts with that knowledgeroot-cause-analysis — when Phase 3 root-cause verification is ambiguous, escalate to RCAtest-driven-development — Phase 1's "success criteria" naturally aligns with TDD's "write the failing test first"See references/integration.md for detailed pairing patterns.
Platform-specific activation and hook configuration lives in references/:
references/openclaw-integration.md — OpenClaw workspace setup, inter-session coordinationreferences/hooks-setup.md — Claude Code / Codex hook configuration (UserPromptSubmit)references/multi-agent.md — Claude Code, Codex CLI, GitHub Copilot activation patternsThis pipeline is based on a production coding standard used in NestJS/Next.js/PHP production systems. It emerged from repeated observation that AI agents, left to their defaults, retry-loop into incoherence on non-trivial work. The 4-phase structure + max-3-attempts rule + mandatory hypothesis is the minimum structure needed to keep agents disciplined.