When the team needs to implement a major feature that spans multiple agents, multiple files, and multiple phases, this skill defines how to write the plan, review it, resolve blockers, and execute it autonomously ("walk away" capable). It was earned during the NPC+Combat implementation plan, the first plan to go through full team review, blocker resolution, and wave-based autonomous execution.
Applies when:
A feature touches 3+ agents' domains
Implementation spans multiple files that must coordinate
Wayne wants to "start it and walk away"
Both design plans and implementation code are involved
Patterns
1. Resolve All Open Questions BEFORE Writing the Plan
Before the architect writes the plan, the coordinator must:
Extract ALL open questions from every source plan (NPC plan, combat plan, etc.)
Categorize: which BLOCK implementation vs. which have clear recommendations
Present blocking questions to Wayne ONE AT A TIME using ask_user with choices
Batch-approve non-blocking questions that have recommendations
Capture all decisions to .squad/decisions/inbox/
Result: ZERO open blockers before the plan is written
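The triage step above can be sketched as a small classifier. This is a minimal illustration, not the coordinator's actual logic: the `OpenQuestion` fields and the function names are hypothetical, and routing a non-blocking question that has no recommendation to the one-at-a-time queue is an assumption made here.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class OpenQuestion:
    # All field names are hypothetical, chosen for this sketch.
    source: str                      # which source plan raised the question
    text: str
    blocks_implementation: bool
    recommendation: Optional[str] = None


def triage(questions: List[OpenQuestion]) -> Tuple[List[OpenQuestion], List[OpenQuestion]]:
    """Split questions into (ask one at a time, batch-approve)."""
    ask_individually: List[OpenQuestion] = []
    batch_approve: List[OpenQuestion] = []
    for q in questions:
        # Assumption: a question with no recommendation cannot be batch-approved,
        # so it joins the one-at-a-time queue even if it is non-blocking.
        if q.blocks_implementation or q.recommendation is None:
            ask_individually.append(q)
        else:
            batch_approve.append(q)
    return ask_individually, batch_approve


def ready_to_write_plan(unresolved: List[OpenQuestion]) -> bool:
    """The plan may be written only when zero questions remain unresolved."""
    return len(unresolved) == 0
```

The architect would be spawned only once `ready_to_write_plan` returns True for the remaining queue.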
2. Plan Structure (11+ Sections)
The implementation plan document (plans/{feature}-implementation-plan.md) must contain:
| Section | Purpose |
| --- | --- |
| Executive Summary | What we're building, in what order, why |
| Quick Reference Table | All waves + gates at a glance |
| Dependency Graph | ASCII/markdown showing what blocks what |
| Implementation Waves | Parallel batches of work (WAVE-0, WAVE-1, ...) |
| Testing Gates | Binary pass/fail checkpoints between waves |
| Feature Breakdown (per system) | Detailed per-module specs |
| Cross-System Integration Points | Where systems connect |
| Nelson LLM Test Scenarios | Specific headless walkthrough scripts |
| TDD Test File Map | Every test file listed with what it covers |
| Risk Register | What could go wrong + mitigations |
| Autonomous Execution Protocol | How coordinator runs without Wayne |
| Gate Failure Protocol | Escalation rules (1x → issue, 2x → Wayne) |
| Wave Checkpoint Protocol | After each wave: verify completion, update plan |
| Documentation Deliverables | Brockman's docs listed per gate |
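A reviewer could check a draft against the section list above with something like the following. It is a rough sketch: the case-insensitive substring match is an assumption made for brevity (a real check would probably match headings), and `missing_sections` is a hypothetical helper name.

```python
from typing import List

# Section names taken from the plan-structure list in this skill.
REQUIRED_SECTIONS = [
    "Executive Summary",
    "Quick Reference Table",
    "Dependency Graph",
    "Implementation Waves",
    "Testing Gates",
    "Feature Breakdown",
    "Cross-System Integration Points",
    "Nelson LLM Test Scenarios",
    "TDD Test File Map",
    "Risk Register",
    "Autonomous Execution Protocol",
    "Gate Failure Protocol",
    "Wave Checkpoint Protocol",
    "Documentation Deliverables",
]


def missing_sections(plan_text: str) -> List[str]:
    """Return required section names that never appear in the plan text."""
    lowered = plan_text.lower()
    # Crude check: a name counts as present if it appears anywhere in the text.
    return [name for name in REQUIRED_SECTIONS if name.lower() not in lowered]
```

An empty result means every required section name appears at least once in the draft.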
2a. Chunked Plan Writing (CRITICAL — Do NOT Write All at Once)
Problem: Full implementation plans are 50-80KB+ documents (1000+ lines). Writing the entire plan in a single agent call causes timeouts, connection drops, and context exhaustion. Phase 2's first attempt crashed opus after 43 minutes.
Solution: The coordinator breaks the plan into chunks and assigns them sequentially or in parallel to the architect. Each chunk writes to its own temp file, then a final assembly step merges them.
Chunk 1 runs first (sync) — establishes wave count and structure
Chunks 2-5 can run in parallel (all reference Chunk 1's skeleton)
Final assembly: coordinator or architect concatenates chunks into the plan file
Git commit the assembled plan
Coordinator prompt pattern for each chunk:
Write ONLY {sections} for the Phase 2 plan.
The skeleton (Chunk 1) established {N} waves: {wave names}.
Write to: plans/{feature}-phase2-chunk-{N}.md
Do NOT write the full plan — just these sections.
Fallback: If the plan is small enough (<30KB estimated), a single agent call is acceptable. Use chunking for any plan expected to exceed 40KB.
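The chunk-or-not decision and the final assembly step might look like this. It is a sketch under stated assumptions: the skill gives 30KB and 40KB as thresholds but does not say what to do in between, so chunking anything at or above 30KB is a conservative choice made here, and both function names are hypothetical.

```python
from pathlib import Path
from typing import Iterable, Union

SINGLE_CALL_MAX_KB = 30  # below this, a single agent call is acceptable

PathLike = Union[str, Path]


def should_chunk(estimated_kb: float) -> bool:
    """Assumption: chunk anything not clearly under the single-call limit."""
    return estimated_kb >= SINGLE_CALL_MAX_KB


def assemble_plan(chunk_paths: Iterable[PathLike], out_path: PathLike) -> Path:
    """Concatenate chunk files, in the order given, into the final plan file."""
    paths = [Path(p) for p in chunk_paths]
    missing = [str(p) for p in paths if not p.is_file()]
    if missing:
        # Refuse to assemble a partial plan: every chunk must exist first.
        raise FileNotFoundError(f"missing chunks: {missing}")
    parts = [p.read_text(encoding="utf-8").strip() for p in paths]
    out = Path(out_path)
    out.write_text("\n\n".join(parts) + "\n", encoding="utf-8")
    return out
```

Failing loudly on a missing chunk keeps a crashed parallel agent from silently producing a plan with a hole in it.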
3. Wave Design Rules
Each wave is a batch of parallel work: all agents in a wave start simultaneously
Hard rule: No two agents in the same wave touch the same file
Multiple instances OK: The same team member CAN be spawned as multiple parallel instances in the same wave IF they're working on different files. E.g., two Nelson instances writing tests for different modules, or two Flanders instances building different objects. The coordinator labels them clearly (e.g., "Nelson (creature tests)" vs "Nelson (material tests)"). The only constraint is file-level: no two instances touch the same file.
Each wave has explicit: agent assignments, exact file paths, TDD requirements, scope estimate
Commit/push after every gate passes
Checkpoint after every wave: verify completion, update plan documentation
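The hard rule that no two agents (or agent instances) in a wave touch the same file is easy to check mechanically before spawning anything. The sketch below assumes a wave is described as a mapping from instance label to file paths; that shape and the example paths in the test are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def find_file_conflicts(assignments: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each file claimed by more than one instance to its claimants.

    `assignments` maps an instance label, e.g. "Nelson (creature tests)",
    to the exact file paths that instance will write.
    """
    owners = defaultdict(set)
    for label, paths in assignments.items():
        for path in paths:
            owners[path].add(label)
    return {path: sorted(labels) for path, labels in owners.items() if len(labels) > 1}
```

An empty result means the wave is safe to launch. Two Nelson instances are fine as long as their path lists are disjoint.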
4. Gate Design Rules
Gates are binary pass/fail — no "mostly works"
Every gate specifies: unit tests that must pass, zero regressions, LLM walkthrough scenarios
Escalate to Wayne after 1x gate failure (Phase 1 threshold — relax to 2x once proven)
Commit/push after every gate
Checkpoint plan doc after every wave (mark completed, note deviations)
Ralph monitors the pipeline if activated
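A binary gate with a configurable escalation threshold could be sketched as follows. The `GateResult` fields and the action strings are hypothetical. The threshold defaults to 1 to match the Phase 1 rule above and can be raised to 2 once the process is proven.

```python
from dataclasses import dataclass


@dataclass
class GateResult:
    # Hypothetical fields summarizing one gate run.
    unit_tests_failed: int
    regressions: int
    walkthroughs_failed: int


def gate_passes(result: GateResult) -> bool:
    """Binary: any failure of any kind fails the gate. No "mostly works"."""
    return (
        result.unit_tests_failed == 0
        and result.regressions == 0
        and result.walkthroughs_failed == 0
    )


def next_action(result: GateResult, prior_failures: int, escalation_threshold: int = 1) -> str:
    """Decide what the coordinator does after a gate run."""
    if gate_passes(result):
        return "commit_and_push"
    failures = prior_failures + 1  # count the failure that just happened
    if failures >= escalation_threshold:
        return "escalate_to_wayne"
    return "file_issue_and_retry"
```

With `escalation_threshold=2` the first failure files an issue and the second one escalates.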
10. Wayne's Decision Capture Pattern
When resolving blocking questions:
Use ask_user with choices (not open-ended)
One question at a time
Include recommendation as first choice with "(Recommended)" label
Capture each decision immediately to .squad/decisions/inbox/
Batch non-blocking questions: "approve all at recommendation?"
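The choice ordering and the decision record can be prepared with plain string handling. This sketch deliberately does not call ask_user, since its signature is not shown in this skill. The record layout is an assumption, and writing the record into .squad/decisions/inbox/ is left to the caller.

```python
from typing import List

RECOMMENDED_SUFFIX = " (Recommended)"


def build_choices(recommended: str, alternatives: List[str]) -> List[str]:
    """Put the recommendation first, labeled, followed by the other options."""
    others = [a for a in alternatives if a != recommended]
    return [recommended + RECOMMENDED_SUFFIX] + others


def decision_record(question: str, chosen: str, decided_by: str = "Wayne") -> str:
    """Format a decision as text; the layout here is hypothetical."""
    answer = chosen
    if answer.endswith(RECOMMENDED_SUFFIX):
        # Strip the display label so the record holds only the decision itself.
        answer = answer[: -len(RECOMMENDED_SUFFIX)]
    return f"Question: {question}\nDecision: {answer}\nDecided by: {decided_by}\n"
```

Capturing the record immediately after each answer keeps the inbox in step with the conversation if the session dies partway through.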
11. Nelson Continuous LLM Testing (Walk-Away Assurance)
For long-running autonomous execution without human intervention, Nelson runs LLM playthroughs continuously, not just at gates:
After every wave: Nelson runs a quick smoke-test walkthrough (~5 commands) to verify the game still boots and basic interaction works. This catches regressions BEFORE the gate.
At every gate: Nelson runs the full scenario suite (all scenarios defined for that gate). This is the formal pass/fail.
Between waves (if idle): Nelson runs exploratory LLM sessions — freeform play in --headless mode, trying edge cases, unusual verb combinations, dark-room interactions. Findings logged as issues.
Frequency: Coordinator decides when to spawn Nelson instances based on wave complexity. Simple data waves (WAVE-1, WAVE-4) get post-wave smoke only. Engine waves (WAVE-2, WAVE-5) get mid-wave checks too.
All runs use --headless mode with deterministic seeds for reproducibility.
Nelson instances can run in parallel with implementation agents (different files — tests vs engine code).
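The frequency rules above reduce to a small lookup the coordinator could consult per wave. The wave kinds "data" and "engine" and the run labels are names invented for this sketch. The actual decision remains the coordinator's judgment call.

```python
from typing import List


def nelson_runs(wave_kind: str, at_gate: bool, agents_idle: bool = False) -> List[str]:
    """List the Nelson runs to schedule for one wave, in order."""
    if wave_kind not in ("data", "engine"):
        raise ValueError(f"unknown wave kind: {wave_kind!r}")
    runs: List[str] = []
    if wave_kind == "engine":
        runs.append("mid-wave check")      # engine waves get checked mid-wave too
    runs.append("post-wave smoke")         # every wave gets the quick smoke test
    if at_gate:
        runs.append("full gate suite")     # the formal pass/fail
    if agents_idle:
        runs.append("exploratory session") # freeform play between waves
    return runs
```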
12. Game Design Review at Gates (CBG)
Beyond code tests, major gates include a player experience check:
CBG reviews: does it FEEL right? Is the gameplay arc discoverable? Does pacing work?
"Subjective pass/fail" scenarios: e.g., "light candle → examine room → pick up item should feel natural in <3 commands"
Design debt captured to .squad/decisions/inbox/cbg-design-debt-WAVE-N.md — doesn't block gates but feeds polish phase
13. Architecture Safeguards (Bart)
Interface contracts: Each wave documents what public APIs it exposes for the next wave. Contracts freeze once dependent wave starts.
Module size guard: If any module exceeds 500 LOC mid-plan, trigger engine-code-review skill before it ships.
Rollback strategy: Git tag per gate. If wave N+2 reveals wave N was wrong, revert to tag, re-plan.
Cross-cutting checklist: Before each gate verify: consistent error handling, debug hooks present, performance baseline measured.
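The module size guard can be automated with a line count. This sketch counts non-blank lines as LOC, which is an assumption (the skill does not define how LOC is measured), and takes source text directly so it stays independent of the project layout.

```python
from typing import Dict, List

MAX_MODULE_LOC = 500


def count_loc(source: str) -> int:
    """Count non-blank lines. Comment lines are counted; that is an assumption."""
    return sum(1 for line in source.splitlines() if line.strip())


def modules_needing_review(sources: Dict[str, str], limit: int = MAX_MODULE_LOC) -> List[str]:
    """Return names of modules whose LOC exceeds the limit, sorted."""
    return sorted(name for name, text in sources.items() if count_loc(text) > limit)
```

Any name returned here would trigger the engine-code-review skill before the module ships.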
14. Plan Lifecycle (Chalmers)
Version tracking: Plan increments version on each review fix pass (v1.0 → v1.1). Reviewers reference version.
Status tracker: Top of plan doc shows wave status at a glance: WAVE-0: ✅ | WAVE-1: 🟡 | WAVE-2: ⏳
Session continuity: If session dies mid-wave, next session checks plan status tracker, resumes from last completed wave.
Post-mortem: After all waves, add "Lessons" section: actual vs estimated, gate failures, new risks, candidate skills.
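Session continuity and version tracking both lend themselves to small helpers. The sketch below assumes the status line uses the exact "WAVE-N: mark" layout shown above, separated by pipes, and treats only ✅ as "completed". The function names are hypothetical.

```python
import re
from typing import List, Optional, Tuple

STATUS_DONE = "✅"


def parse_status(line: str) -> List[Tuple[str, str]]:
    """Parse 'WAVE-0: ✅ | WAVE-1: 🟡' into [('WAVE-0', '✅'), ('WAVE-1', '🟡')]."""
    entries = []
    for part in line.split("|"):
        name, _, mark = part.partition(":")
        if name.strip():
            entries.append((name.strip(), mark.strip()))
    return entries


def resume_point(line: str) -> Optional[str]:
    """Return the first wave not marked complete, or None if all are done."""
    for name, mark in parse_status(line):
        if mark != STATUS_DONE:
            return name
    return None


def bump_minor(version: str) -> str:
    """'v1.0' -> 'v1.1'; rejects anything not shaped like vMAJOR.MINOR."""
    match = re.fullmatch(r"v(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unexpected version format: {version!r}")
    return f"v{match.group(1)}.{int(match.group(2)) + 1}"
```

A new session would call `resume_point` on the tracker line and restart that wave from its beginning, since a wave that died midway cannot be assumed complete.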
Cross-reference analysis that preceded the plan: 13 alignment fixes identified and applied by CBG.
Anti-Patterns
Don't write the plan with open questions — resolve them ALL first
Don't write the entire plan in one agent call; chunk it (Pattern 2a). In Phase 2, the opus attempt crashed after 43 minutes of writing a 70KB+ plan monolithically, and the Sonnet retry took 20+ minutes. Chunk the plan into 5 pieces and assemble them afterward.
Don't skip team review — every reviewer catches different things (CBG found hybrid stance gap, Marge found 4 test blockers, Wayne found docs gap, Chalmers found player file ambiguity)
Don't serialize the review — all reviewers read in parallel
Don't treat docs as optional — Wayne considers missing docs a blocker