Run the full diagnostic pipeline on a foreign system. Intake → Diagnose → Prescribe, with mandatory human gates between each step. Skipping a step is not allowed.
Run Intake → Diagnose → Prescribe on a foreign system. Each step produces a file. Each transition requires human judgment. The pipeline does not skip steps.
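The ordering constraint can be sketched as a small check. This is an illustrative sketch only: the function names and the idea of tracking verified gates as a set are assumptions, not part of the skill spec. The file names are the four outputs the pipeline produces.

```python
from typing import Optional, Set

# Output files in pipeline order, as named in this document.
SOAP_FILES = ["S.md", "O.md", "A.md", "P.md"]


def can_start(step_file: str, verified: Set[str]) -> bool:
    """A step may start only when every earlier file's gate is verified."""
    idx = SOAP_FILES.index(step_file)
    return all(prev in verified for prev in SOAP_FILES[:idx])


def next_step(verified: Set[str]) -> Optional[str]:
    """Return the first file whose gate is not yet verified, or None when done."""
    for name in SOAP_FILES:
        if name not in verified:
            return name
    return None
```

For example, `can_start("A.md", {"S.md"})` is false because the O.md gate has not been verified, so the pipeline would stop there.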
Run /intake <target>. Intake asks development stage first, then reads sources, translates vocabulary, fans out on thin coverage, files claims, and elicits human judgment on ambiguous mappings. Development stage and constraints recorded in S.md metadata propagate automatically to Diagnose and Prescribe — they are not re-elicited at each gate.
Gate: Intake includes its own elicitation. S.md is not complete until the human has answered the mapping questions. Do not proceed to Step 2 until S.md has an ## elicitation section with oracle reliability ≥ 50%.
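A minimal sketch of how this gate could be checked mechanically. It assumes the reliability figure appears inside the elicitation section as text like `oracle reliability: 62%`; that line format and the function name are assumptions, since the actual layout of S.md is not specified here.

```python
import re


def elicitation_gate_passed(s_md_text: str, threshold: float = 50.0) -> bool:
    """Return True if S.md has an '## elicitation' section whose
    reported oracle reliability is at or above the threshold."""
    # Capture the section body: from the heading up to the next '## ' heading or end of file.
    section = re.search(
        r"^##\s+elicitation\b(.*?)(?=^##\s|\Z)",
        s_md_text,
        flags=re.IGNORECASE | re.MULTILINE | re.DOTALL,
    )
    if section is None:
        return False
    # ASSUMPTION: the figure is written as "oracle reliability: NN%".
    reliability = re.search(
        r"oracle reliability\D*?(\d+(?:\.\d+)?)\s*%",
        section.group(1),
        flags=re.IGNORECASE,
    )
    return reliability is not None and float(reliability.group(1)) >= threshold
```

A check like this only confirms that the section exists and the number clears the bar. Whether the mapping questions were actually answered well remains a human judgment.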
Verify before advancing:
- stage: (development stage)

Run /diagnose. Diagnose reads S.md, builds the tower table, draws architecture SVGs, substantiates mappings against code, and identifies gaps. Produces O.md.
Gate: Diagnose stops after writing O.md and presents checkpoint questions to the human. The human's answers determine the causal chain direction, distinguish blind spots from design choices, and identify anything missing from the tower table.
Verify before advancing:
Diagnose writes A.md incorporating the human's answers from Step 2. Role-by-role assessment, causal chain, gap summary.
Gate: Present A.md to the human. Ask: "Does this assessment match your understanding? Is the root cause correctly identified?"
Verify before advancing:
Run /prescribe. Prescribe reads A.md, consults the Parts Bin, evaluates candidates, and triages by urgency (Critical / Structural / Rehabilitative).
Gate: Present P.md to the human. Ask: "Does the triage order match the system's actual constraints? Is the root intervention correct, or does it address a symptom?"
Verify before advancing:
SOAP complete. Four files in soap/: S.md, O.md, A.md, P.md. Each one traces back to the previous. Every human gate is recorded in the files (elicitation in S.md, checkpoint answers in A.md, triage confirmation in P.md).
No skipping. Step N does not start until Step N-1 is verified. If the human is unavailable, the pipeline stops at the current gate.
No backfilling. If Step 2 reveals that S.md is missing something, go back to Step 1 and update S.md. Do not patch O.md to compensate.
One question at a time at every gate. Don't dump the full checkpoint quiz.
Record everything. Every human answer goes into the relevant file. A reader should be able to reconstruct the full decision trail from the SOAP files alone.
Log every skill run. Append a one-liner to soap/surprises.md when each skill completes, even if nothing surprising happened. This grounds the surprises in temporal order.
## Run log
- 2026-04-01 14:00 — /intake on Soar: 90 claims, 6 sources, 7 ambiguous mappings
- 2026-04-01 14:30 — /diagnose on Soar: 5 gaps across 3 stack levels, root cause found
- 2026-04-01 15:00 — /prescribe on Soar: 5 prescriptions, 3 triage tiers
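The run log lines above follow a fixed shape, which a small helper could produce. This is a sketch, not part of the skill spec: `run_log_line` and `append_run_log` are hypothetical names, and the default path simply mirrors soap/surprises.md as named in the text.

```python
from datetime import datetime


def run_log_line(skill: str, target: str, summary: str, when: datetime) -> str:
    """Format one run-log entry in the same shape as the examples above."""
    # \u2014 is the dash separator used in the example log lines.
    return f"- {when:%Y-%m-%d %H:%M} \u2014 /{skill} on {target}: {summary}"


def append_run_log(line: str, path: str = "soap/surprises.md") -> None:
    """Append the entry; the file is opened in append mode so history is never rewritten."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(line + "\n")
```

Appending rather than rewriting keeps the entries in the order the skills actually ran.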
Log surprises. Append to soap/surprises.md whenever anything unexpected happens: a skill spec that needed changing, a mapping that broke assumptions, a codex finding that contradicted the pipeline's output, a human answer that redirected the diagnosis. Format:
### [step] [timestamp] — [one-line summary]
**Expected:** what the skill/pipeline predicted
**Found:** what actually happened
**Action:** what changed (skill spec update, backtrack, new open question)
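As a sketch, the surprise format above could be rendered by a helper like the following. The function name is hypothetical, and it assumes the square brackets in the template are placeholders rather than literal characters.

```python
def surprise_entry(step: str, timestamp: str, summary: str,
                   expected: str, found: str, action: str) -> str:
    """Render one surprise entry using the Expected / Found / Action layout."""
    return "\n".join([
        # \u2014 is the dash separator used in the template heading.
        f"### {step} {timestamp} \u2014 {summary}",
        f"**Expected:** {expected}",
        f"**Found:** {found}",
        f"**Action:** {action}",
    ])
```

Requiring all six fields as arguments makes it harder to log a surprise without stating what was expected and what changed.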
Surprises are the I-frames for a future consolidation pass. The run log provides the P-frames between them. Together they reconstruct the full episode.
Each skill converges individually (self-check loop, hard stop at 10 passes). Codex sniffs before every human gate, fixing obvious issues so the human only Attends ambiguities. If codex is unavailable, try Gemini as the reviewer. If neither is available, the agent performs a self-review pass applying the same criteria (framework leaks, weak provenance, miscalibrated confidence). Log which reviewer was used or skipped so the human knows the Filter strength. The composed pipeline converges because:
Running /soap twice on the same system with the same human answers produces the same four files.
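The per-skill convergence loop and the reviewer fallback described above can be sketched as follows. This is an illustration under assumptions: `self_check` stands in for whatever a skill's self-check pass does, and the reviewer names are plain labels, not calls to any real Codex or Gemini API.

```python
from typing import Callable, Set, Tuple

MAX_PASSES = 10  # hard stop stated in the text


def converge(draft: str, self_check: Callable[[str], str],
             max_passes: int = MAX_PASSES) -> Tuple[str, int]:
    """Re-run self_check until the draft stops changing or the hard stop is hit.

    Returns the final draft and the number of passes performed."""
    for passes in range(1, max_passes + 1):
        revised = self_check(draft)
        if revised == draft:  # fixed point: nothing left to fix
            return draft, passes
        draft = revised
    return draft, max_passes


def pick_reviewer(available: Set[str]) -> str:
    """Prefer codex, then gemini, else fall back to an agent self-review pass."""
    for name in ("codex", "gemini"):
        if name in available:
            return name
    return "self-review"
```

The value returned by `pick_reviewer` is what would be logged, so the human can see which reviewer ran before the gate.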