Use when an implementation plan or task list is already written and needs to be executed step by step with review gates between tasks (triggers: /planning, "execute this plan", "run the plan", or "work through these tasks in order"). Do not use to create the plan or to run a broader autonomous harness loop.
Arguments: $ARGUMENTS
Sequential plan executor: read plan, implement each task via child context, verify with /critique gates, repeat until complete.
Plan creation is out of scope: use native plan mode to write the plan, or /critique to review it. This skill handles structured execution.
| Mode | When | Planning does | Harness does |
|---|---|---|---|
| Standalone | User or /caffeine calls directly | Drives task loop + runs /critique gates + owns ledger | N/A |
| Embedded | /harness delegates during run phase | Parses plan + dispatches implementer per task | Drives round loop, runs verification gates, owns state + final review |
Standalone: planning owns full lifecycle — task sequencing, implementer dispatch, /critique review gates, ledger.
Embedded: planning is plan adapter + implementer dispatcher only. Harness drives one round per task:
    harness round N (for task N):
      propose:  planning dispatches implementer
      verify:   harness runs verification gate (/critique)
      evaluate: harness decides keep/discard
      record:   harness writes state.jsonl
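The round structure above can be sketched in Python. This is a minimal illustration, not the harness's actual code: `RoundRecord`, `run_rounds`, the record fields, and the rule "keep only verified work" are all assumptions, since the real state.jsonl schema and keep/discard criteria are not specified here.

```python
import io
import json
from dataclasses import asdict, dataclass
from typing import Callable, Iterable, List, TextIO


@dataclass
class RoundRecord:
    # Hypothetical record shape; the real state.jsonl schema is not given.
    round_number: int
    task: str
    verified: bool
    decision: str  # "keep" or "discard"


def run_rounds(
    tasks: Iterable[str],
    propose: Callable[[str], str],
    verify: Callable[[str, str], bool],
    state: TextIO,
) -> List[RoundRecord]:
    """Run one harness round per task: propose, verify, evaluate, record."""
    records = []
    for n, task in enumerate(tasks, start=1):
        change = propose(task)           # propose: planning dispatches implementer
        verified = verify(task, change)  # verify: harness runs verification gate
        # evaluate: assumed rule, keep only work that passed the gate
        decision = "keep" if verified else "discard"
        record = RoundRecord(n, task, verified, decision)
        # record: append one JSON object per line (JSONL)
        state.write(json.dumps(asdict(record)) + "\n")
        records.append(record)
    return records


# Usage with stub callables and an in-memory stand-in for state.jsonl.
state = io.StringIO()
records = run_rounds(
    ["add parser", "wire CLI"],
    propose=lambda task: f"diff for {task}",
    verify=lambda task, change: task != "wire CLI",
    state=state,
)
print(state.getvalue(), end="")
```

Passing `propose` and `verify` as callables mirrors the ownership split: planning supplies only the propose step, while the loop, the gate, and the state file stay with the harness.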
Each task carries an implementation protocol passed to implementer:
- tdd_required -- default for features, bug fixes, refactors, behavior changes
- tdd_preferred -- test-first when practical, justified fallback allowed
- direct -- config-only, docs-only, generated code, or similar non-TDD work

Default tdd_required when unsure. For tdd_required tasks, implementer must provide RED/GREEN evidence.
When task involves mocks, test doubles, or test-only seams, read references/testing-anti-patterns.md and fold guidance into implementer prompt.
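A small sketch of how the three protocols and the tdd_required default might be modeled. The names `Protocol`, `Evidence`, `resolve_protocol`, and `evidence_accepted` are hypothetical, and the acceptance rule for tdd_preferred (RED/GREEN evidence or a stated fallback reason) is one reading of "justified fallback allowed", not a confirmed rule.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Protocol(Enum):
    TDD_REQUIRED = "tdd_required"
    TDD_PREFERRED = "tdd_preferred"
    DIRECT = "direct"


@dataclass
class Evidence:
    red: Optional[str] = None    # failing test output captured before the change
    green: Optional[str] = None  # passing test output captured after the change


def resolve_protocol(value: Optional[str]) -> Protocol:
    """Default to tdd_required when the protocol is missing or unrecognized."""
    try:
        return Protocol(value)
    except ValueError:
        return Protocol.TDD_REQUIRED


def evidence_accepted(
    protocol: Protocol,
    evidence: Evidence,
    fallback_reason: Optional[str] = None,
) -> bool:
    """Check implementer evidence against the task's protocol (assumed rules)."""
    has_red_green = bool(evidence.red and evidence.green)
    if protocol is Protocol.TDD_REQUIRED:
        return has_red_green
    if protocol is Protocol.TDD_PREFERRED:
        # Assumption: a written justification substitutes for RED/GREEN.
        return has_red_green or bool(fallback_reason)
    return True  # direct: no TDD evidence expected


# Usage: an unlabeled task falls back to tdd_required and needs both halves.
protocol = resolve_protocol(None)
print(protocol.value, evidence_accepted(protocol, Evidence(red="1 failed")))
```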
/critique./fanout or /critique./critique gates. Planning does not invoke /critique.
In embedded mode, harness drives round loop and runs verification gates; planning only provides implementer dispatch.
digraph process {
rankdir=TB;
subgraph cluster_per_task {
label="Per Task";
"Launch implementer (./implementer-prompt.md)" [shape=box];
"Questions?" [shape=diamond];
"Answer, re-launch" [shape=box];
"Implementer implements, tests, commits" [shape=box];
"/critique" [shape=box];
"Review pass?" [shape=diamond];
"Fix issues" [shape=box];
"Mark task complete" [shape=box];
}
"Normalize input into task manifest" [shape=box];
"More tasks?" [shape=diamond];
"Done" [shape=box style=filled fillcolor=lightgreen];
"Normalize input into task manifest" -> "Launch implementer (./implementer-prompt.md)";
"Launch implementer (./implementer-prompt.md)" -> "Questions?";
"Questions?" -> "Answer, re-launch" [label="yes"];
"Answer, re-launch" -> "Launch implementer (./implementer-prompt.md)";
"Questions?" -> "Implementer implements, tests, commits" [label="no"];
"Implementer implements, tests, commits" -> "/critique";
"/critique" -> "Review pass?";
"Review pass?" -> "Fix issues" [label="fail"];
"Fix issues" -> "/critique" [label="re-review"];
"Review pass?" -> "Mark task complete" [label="pass"];
"Mark task complete" -> "More tasks?";
"More tasks?" -> "Launch implementer (./implementer-prompt.md)" [label="yes"];
"More tasks?" -> "Done" [label="no"];
}
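The per-task flow in the graph can be read as two nested loops: re-launch until the implementer has no questions, then re-review until /critique passes. The sketch below assumes hypothetical callables (`launch`, `answer`, `critique`, `fix`) and a hypothetical `ImplementerResult` type; like the graph, it has no retry cap, which a real executor would likely want.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ImplementerResult:
    # Hypothetical result shape: open questions block implementation.
    questions: List[str] = field(default_factory=list)


def execute_plan(
    tasks: List[str],
    launch: Callable[[str, List[str]], ImplementerResult],
    answer: Callable[[str, List[str]], List[str]],
    critique: Callable[[str], bool],
    fix: Callable[[str], None],
) -> List[str]:
    """Walk the task manifest in order, gating each task on a passing review."""
    completed = []
    for task in tasks:
        answers: List[str] = []
        result = launch(task, answers)
        # "Questions?" -> "Answer, re-launch" until the implementer can proceed.
        while result.questions:
            answers = answers + answer(task, result.questions)
            result = launch(task, answers)
        # "/critique" -> "Fix issues" -> re-review until the gate passes.
        while not critique(task):
            fix(task)
        completed.append(task)  # "Mark task complete"
    return completed


# Usage with stubs: t1 asks one question, t2 fails its first review.
log = []
reviews = {"count": 0}

def launch(task, answers):
    log.append(("launch", task))
    needs_answer = task == "t1" and not answers
    return ImplementerResult(questions=["which db?"] if needs_answer else [])

def answer(task, questions):
    log.append(("answer", task))
    return ["sqlite"]

def critique(task):
    reviews["count"] += 1
    return not (task == "t2" and reviews["count"] == 2)

def fix(task):
    log.append(("fix", task))

done = execute_plan(["t1", "t2"], launch, answer, critique, fix)
print(done)
```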
Accepts tasks from multiple sources:
All inputs normalized into canonical task manifest: