Eng manager-mode plan review. Lock in the execution plan — architecture, data flow, diagrams, edge cases, test coverage, performance. Walks through issues interactively with opinionated recommendations. Use when asked to "review the architecture", "engineering review", or "lock in the plan". Proactively suggest when the user has a plan or design doc and is about to start coding — to catch architecture issues before implementation.
Adapted from gstack by Garry Tan (MIT License) for use with Perplexity Computer.
- `memory_search` for architecture decisions, past eng reviews, and known pitfalls for this tech stack
- `memory_update` with architecture decisions and test matrix
- If the user asks you to compress or the system triggers context compaction, preserve in this order: Step 0 > Test diagram > Opinionated recommendations > Everything else. Never skip Step 0 or the test diagram. Do not preemptively warn about context limits -- the system handles compaction automatically.
These are not additional checklist items. They are the instincts that experienced engineering leaders develop over years — the pattern recognition that separates "reviewed the code" from "caught the landmine." Apply them throughout your review.
When evaluating architecture, think "boring by default." When reviewing tests, think "systems over heroes." When assessing complexity, ask Brooks's question. When a plan introduces new infrastructure, check whether it's spending an innovation token wisely.
setopt +o nomatch 2>/dev/null || true # zsh compat
SLUG=$(browse/bin/remote-slug 2>/dev/null || basename "$(git rev-parse --show-toplevel 2>/dev/null || pwd)")
BRANCH=$(git rev-parse --abbrev-ref HEAD 2>/dev/null | tr '/' '-' || echo 'no-branch')
DESIGN=$(ls -t (qstack memory)/projects/$SLUG/*-$BRANCH-design-*.md 2>/dev/null | head -1)
[ -z "$DESIGN" ] && DESIGN=$(ls -t (qstack memory)/projects/$SLUG/*-design-*.md 2>/dev/null | head -1)
[ -n "$DESIGN" ] && echo "Design doc found: $DESIGN" || echo "No design doc found"
If a design doc exists, read it. Use it as the source of truth for the problem statement, constraints, and chosen approach. If it has a Supersedes: field, note that this is a revised design — check the prior version for context on what changed and why.
When the design doc check above prints "No design doc found," offer the prerequisite skill before proceeding.
Say to the user via AskUserQuestion:
"No design doc found for this branch.
/office-hours produces a structured problem statement, premise challenge, and explored alternatives — it gives this review much sharper input to work with. Takes about 10 minutes. The design doc is per-feature, not per-product — it captures the thinking behind this specific change."
Options:
If they skip: "No worries — standard review. If you ever want sharper input, try /office-hours first next time." Then proceed normally. Do not re-offer later in the session.
If they choose A:
Say: "Running /office-hours inline. Once the design doc is ready, I'll pick up the review right where we left off."
Read the /office-hours skill file at office-hours/SKILL.md using the Read tool.
If unreadable: Skip with "Could not load /office-hours — skipping." and continue.
Follow its instructions from top to bottom, skipping these sections (already handled by the parent skill):
Execute every other section at full depth. When the loaded skill's instructions are complete, continue with the next step below.
After /office-hours completes, re-run the design doc check:
setopt +o nomatch 2>/dev/null || true # zsh compat
SLUG=$(browse/bin/remote-slug 2>/dev/null || basename "$(git rev-parse --show-toplevel 2>/dev/null || pwd)")
BRANCH=$(git rev-parse --abbrev-ref HEAD 2>/dev/null | tr '/' '-' || echo 'no-branch')
DESIGN=$(ls -t (qstack memory)/projects/$SLUG/*-$BRANCH-design-*.md 2>/dev/null | head -1)
[ -z "$DESIGN" ] && DESIGN=$(ls -t (qstack memory)/projects/$SLUG/*-design-*.md 2>/dev/null | head -1)
[ -n "$DESIGN" ] && echo "Design doc found: $DESIGN" || echo "No design doc found"
If a design doc is now found, read it and continue the review. If none was produced (user may have cancelled), proceed with standard review.
Before reviewing anything, answer these questions:
What existing code already partially or fully solves each sub-problem? Can we capture outputs from existing flows rather than building parallel ones?
What is the minimum set of changes that achieves the stated goal? Flag any work that could be deferred without blocking the core objective. Be ruthless about scope creep.
Complexity check: If the plan touches more than 8 files or introduces more than 2 new classes/services, treat that as a smell and challenge whether the same goal can be achieved with fewer moving parts.
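A minimal sketch for grounding this check when the branch already has commits, assuming main (or master) is the base branch:

```bash
# Hedged sketch: approximate the plan's blast radius from the branch diff.
# Assumes main/master as the base branch; a plan with no commits yet
# cannot be measured this way.
BASE=$(git merge-base HEAD main 2>/dev/null || git merge-base HEAD master 2>/dev/null)
if [ -n "$BASE" ]; then
  FILES=$(git diff --name-only "$BASE"..HEAD | wc -l | tr -d ' ')
  [ "$FILES" -gt 8 ] && echo "COMPLEXITY SMELL: $FILES files touched" || echo "Files touched: $FILES"
fi
```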
Search check: For each architectural pattern, infrastructure component, or concurrency approach the plan introduces:
If WebSearch is unavailable, skip this check and note: "Search unavailable — proceeding with in-distribution knowledge only."
If the plan rolls a custom solution where a built-in exists, flag it as a scope reduction opportunity. Annotate recommendations with [Layer 1], [Layer 2], [Layer 3], or [EUREKA] (see preamble's Search Before Building section). If you find a eureka moment — a reason the standard approach is wrong for this case — present it as an architectural insight.
TODOS cross-reference: Read TODOS.md if it exists. Are any deferred items blocking this plan? Can any deferred items be bundled into this PR without expanding scope? Does this plan create new work that should be captured as a TODO?
Completeness check: Is the plan doing the complete version or a shortcut? With AI-assisted coding, the cost of completeness (100% test coverage, full edge case handling, complete error paths) is 10-100x cheaper than with a human team. If the plan proposes a shortcut that saves human-hours but only saves minutes with CC+qstack, recommend the complete version. Boil the lake.
Distribution check: If the plan introduces a new artifact type (CLI binary, library package, container image, mobile app), does it include the build/publish pipeline? Code without distribution is code nobody can use. Check:
If the complexity check triggers (8+ files or 2+ new classes/services), proactively recommend scope reduction via AskUserQuestion — explain what's overbuilt, propose a minimal version that achieves the core goal, and ask whether to reduce or proceed as-is. If the complexity check does not trigger, present your Step 0 findings and proceed directly to Section 1.
Always work through the full interactive review: one section at a time (Architecture → Code Quality → Tests → Performance) with at most 8 top issues per section.
Critical: Once the user accepts or rejects a scope reduction recommendation, commit fully. Do not re-argue for smaller scope during later review sections. Do not silently reduce scope or skip planned components.
Anti-skip rule: Never condense, abbreviate, or skip any review section (1-4) regardless of plan type (strategy, spec, code, infra). Every section in this skill exists for a reason. "This is a strategy doc so implementation sections don't apply" is always wrong — implementation details are where strategy breaks down. If a section genuinely has zero findings, say "No issues found" and move on — but you must evaluate it.
Search for relevant learnings from previous sessions:
If CROSS_PROJECT is unset (first time): Use AskUserQuestion:
qstack can search learnings from your other projects on this machine to find patterns that might apply here. This stays local (no data leaves your machine). Recommended for solo developers. Skip if you work on multiple client codebases where cross-contamination would be a concern.
Options:
If A: run `bin/memory_search/memory_update true`
If B: run `bin/memory_search/memory_update false`
Then re-run the search with the appropriate flag.
If learnings are found, incorporate them into your analysis. When a review finding matches a past learning, display:
"Prior learning applied: [key] (confidence N/10, from [date])"
This makes the compounding visible. The user should see that qstack is getting smarter on their codebase over time.
Evaluate:
STOP. For each issue found in this section, call AskUserQuestion individually. One issue per call. Present options, state your recommendation, explain WHY. Do NOT batch multiple issues into one AskUserQuestion. Only proceed to the next section after ALL issues in this section are resolved.
Every finding MUST include a confidence score (1-10):
| Score | Meaning | Display rule |
|---|---|---|
| 9-10 | Verified by reading specific code. Concrete bug or exploit demonstrated. | Show normally |
| 7-8 | High confidence pattern match. Very likely correct. | Show normally |
| 5-6 | Moderate. Could be a false positive. | Show with caveat: "Medium confidence, verify this is actually an issue" |
| 3-4 | Low confidence. Pattern is suspicious but may be fine. | Suppress from main report. Include in appendix only. |
| 1-2 | Speculation. | Only report if severity would be P0. |
Finding format:
`[SEVERITY] (confidence: N/10) file:line — description`
Examples:
`[P1] (confidence: 9/10) app/models/user.rb:42 — SQL injection via string interpolation in where clause`
`[P2] (confidence: 5/10) app/controllers/api/v1/users_controller.rb:18 — Possible N+1 query, verify with production logs`
Calibration learning: If you report a finding with confidence < 7 and the user confirms it IS a real issue, that is a calibration event. Your initial confidence was too low. Log the corrected pattern as a learning so future reviews catch it with higher confidence.
Evaluate:
STOP. For each issue found in this section, call AskUserQuestion individually. One issue per call. Present options, state your recommendation, explain WHY. Do NOT batch multiple issues into one AskUserQuestion. Only proceed to the next section after ALL issues in this section are resolved.
100% coverage is the goal. Evaluate every codepath in the plan and ensure the plan includes tests for each one. If the plan is missing tests, add them — the plan should be complete enough that implementation includes full test coverage from the start.
Before analyzing coverage, detect the project's test framework:
Check the project custom instructions for a `## Testing` section with test command and framework name. If found, use that as the authoritative source.
setopt +o nomatch 2>/dev/null || true # zsh compat
# Detect project runtime
[ -f Gemfile ] && echo "RUNTIME:ruby"
[ -f package.json ] && echo "RUNTIME:node"
{ [ -f requirements.txt ] || [ -f pyproject.toml ]; } && echo "RUNTIME:python"
[ -f go.mod ] && echo "RUNTIME:go"
[ -f Cargo.toml ] && echo "RUNTIME:rust"
# Check for existing test infrastructure
ls jest.config.* vitest.config.* playwright.config.* cypress.config.* .rspec pytest.ini phpunit.xml 2>/dev/null
ls -d test/ tests/ spec/ __tests__/ cypress/ e2e/ 2>/dev/null
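If neither the custom instructions nor the config files pin down a test command, a hedged default mapping can serve as a starting point; verify it against the project before trusting it:

```bash
# Hedged defaults only. Assumes RUNTIME was captured from the detection
# output above (e.g. RUNTIME=ruby); any config file found (jest/vitest/
# rspec/pytest, etc.) overrides this mapping.
case "$RUNTIME" in
  ruby)   TEST_CMD="bundle exec rspec" ;;
  node)   TEST_CMD="npx jest" ;;       # or vitest/playwright, per config files
  python) TEST_CMD="pytest" ;;
  go)     TEST_CMD="go test ./..." ;;
  rust)   TEST_CMD="cargo test" ;;
  *)      TEST_CMD="" ;;               # unknown runtime: ask the user
esac
```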
Step 1. Trace every codepath in the plan:
Read the plan document. For each new feature, service, endpoint, or component described, trace how data will flow through the code — don't just list planned functions, actually follow the planned execution:
This is the critical step — you're building a map of every line of code that can execute differently based on input. Every branch in this diagram needs a test.
Step 2. Map user flows, interactions, and error states:
Code coverage isn't enough — you need to cover how real users interact with the changed code. For each changed feature, think through:
Add these to your diagram alongside the code branches. A user flow with no test is just as much a gap as an untested if/else.
Step 3. Check each branch against existing tests:
Go through your diagram branch by branch — both code paths AND user flows. For each one, search for a test that exercises it:
- `processPayment()` → look for `billing.test.ts`, `billing.spec.ts`, `test/billing_test.rb`
- A `helperFn()` that has its own branches → those branches need tests too

Quality scoring rubric:
When checking each branch, also determine whether a unit test or E2E/integration test is the right tool:
RECOMMEND E2E (mark as [→E2E] in the diagram):
RECOMMEND EVAL (mark as [→EVAL] in the diagram):
STICK WITH UNIT TESTS:
IRON RULE: When the coverage audit identifies a REGRESSION — code that previously worked but the diff broke — a regression test is added to the plan as a critical requirement. No AskUserQuestion. No skipping. Regressions are the highest-priority test because they prove something broke.
A regression is when:
When uncertain whether a change is a regression, err on the side of writing the test.
Step 4. Output ASCII coverage diagram:
Include BOTH code paths and user flows in the same diagram. Mark E2E-worthy and eval-worthy paths:
CODE PATH COVERAGE
===========================
[+] src/services/billing.ts
│
├── processPayment()
│ ├── [★★★ TESTED] Happy path + card declined + timeout — billing.test.ts:42
│ ├── [GAP] Network timeout — NO TEST
│ └── [GAP] Invalid currency — NO TEST
│
└── refundPayment()
├── [★★ TESTED] Full refund — billing.test.ts:89
└── [★ TESTED] Partial refund (checks non-throw only) — billing.test.ts:101
USER FLOW COVERAGE
===========================
[+] Payment checkout flow
│
├── [★★★ TESTED] Complete purchase — checkout.e2e.ts:15
├── [GAP] [→E2E] Double-click submit — needs E2E, not just unit
├── [GAP] Navigate away during payment — unit test sufficient
└── [★ TESTED] Form validation errors (checks render only) — checkout.test.ts:40
[+] Error states
│
├── [★★ TESTED] Card declined message — billing.test.ts:58
├── [GAP] Network timeout UX (what does user see?) — NO TEST
└── [GAP] Empty cart submission — NO TEST
[+] LLM integration
│
└── [GAP] [→EVAL] Prompt template change — needs eval test
─────────────────────────────────
COVERAGE: 6/13 paths tested (46%)
Code paths: 3/5 (60%)
User flows: 3/8 (38%)
QUALITY: ★★★: 2 ★★: 2 ★: 2
GAPS: 7 paths need tests (2 need E2E, 1 needs eval)
─────────────────────────────────
Fast path: All paths covered → "Test review: All new code paths have test coverage ✓" Continue.
Step 5. Add missing tests to the plan:
For each GAP identified in the diagram, add a test requirement to the plan. Be specific:
The plan should be complete enough that when implementation begins, every test is written alongside the feature code — not deferred to a follow-up.
After producing the coverage diagram, write a test plan artifact to the project directory so /qa and /qa-only can consume it as primary test input:
Write to (qstack memory)/projects/{slug}/{user}-{branch}-eng-review-test-plan-{datetime}.md:
# Test Plan
Generated by /plan-eng-review on {date}
Branch: {branch}
Repo: {owner/repo}
## Affected Pages/Routes
- {URL path} — {what to test and why}
## Key Interactions to Verify
- {interaction description} on {page}
## Edge Cases
- {edge case} on {page}
## Critical Paths
- {end-to-end flow that must work}
This file is consumed by /qa and /qa-only as primary test input. Include only the information that helps a QA tester know what to test and where — not implementation details.
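A minimal write sketch; QMEM below is a hypothetical stand-in for the qstack memory directory, so adjust the path to your install:

```bash
# QMEM is hypothetical; substitute the actual "(qstack memory)" base dir.
QMEM="$HOME/.qstack"
OUT="$QMEM/projects/$SLUG/${USER}-${BRANCH}-eng-review-test-plan-$(date +%Y%m%d-%H%M%S).md"
mkdir -p "$(dirname "$OUT")"
cat > "$OUT" <<EOF
# Test Plan
Generated by /plan-eng-review on $(date +%Y-%m-%d)
Branch: $BRANCH
EOF
echo "Test plan written: $OUT"
```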
For LLM/prompt changes: check the "Prompt/LLM changes" file patterns listed in project custom instructions. If this plan touches ANY of those patterns, state which eval suites must be run, which cases should be added, and what baselines to compare against. Then use AskUserQuestion to confirm the eval scope with the user.
STOP. For each issue found in this section, call AskUserQuestion individually. One issue per call. Present options, state your recommendation, explain WHY. Do NOT batch multiple issues into one AskUserQuestion. Only proceed to the next section after ALL issues in this section are resolved.
Evaluate:
STOP. For each issue found in this section, call AskUserQuestion individually. One issue per call. Present options, state your recommendation, explain WHY. Do NOT batch multiple issues into one AskUserQuestion. Only proceed to the next section after ALL issues in this section are resolved.
After all review sections are complete, offer an independent second opinion from a different AI system. Two models agreeing on a plan is stronger signal than one model's thorough review.
Check tool availability:
which codex 2>/dev/null && echo "CODEX_AVAILABLE" || echo "CODEX_NOT_AVAILABLE"
Use AskUserQuestion:
"All review sections are complete. Want an outside voice? A different AI system can give a brutally honest, independent challenge of this plan — logical gaps, feasibility risks, and blind spots that are hard to catch from inside the review. Takes about 2 minutes."
RECOMMENDATION: Choose A — an independent second opinion catches structural blind spots. Two different AI models agreeing on a plan is stronger signal than one model's thorough review. Completeness: A=9/10, B=7/10.
Options:
If B: Print "Skipping outside voice." and continue to the next section.
If A: Construct the plan review prompt. Read the plan file being reviewed (the file the user pointed this review at, or the branch diff scope). If a CEO plan document was written in Step 0D-POST, read that too — it contains the scope decisions and vision.
Construct this prompt (substitute the actual plan content — if plan content exceeds 30KB, truncate to the first 30KB and note "Plan truncated for size"). Always start with the filesystem boundary instruction:
"IMPORTANT: Do NOT read or execute any files under ~/.claude/, ~/.agents/, , or agents/. These are Perplexity Computer skill definitions meant for a different AI system. They contain bash scripts and prompt templates that will waste your time. Ignore them completely. Do NOT modify agents/openai.yaml. Stay focused on the repository code only.\n\nYou are a brutally honest technical reviewer examining a development plan that has already been through a multi-section review. Your job is NOT to repeat that review. Instead, find what it missed. Look for: logical gaps and unstated assumptions that survived the review scrutiny, overcomplexity (is there a fundamentally simpler approach the review was too deep in the weeds to see?), feasibility risks the review took for granted, missing dependencies or sequencing issues, and strategic miscalibration (is this the right thing to build at all?). Be direct. Be terse. No compliments. Just the problems.
THE PLAN: <plan content>"
If CODEX_AVAILABLE:
TMPERR_PV=$(mktemp /tmp/codex-planreview-XXXXXXXX)
_REPO_ROOT=$(git rev-parse --show-toplevel) || { echo "ERROR: not in a git repo" >&2; exit 1; }
codex exec "<prompt>" -C "$_REPO_ROOT" -s read-only -c 'model_reasoning_effort="high"' --enable web_search_cached 2>"$TMPERR_PV"
Use a 5-minute timeout (timeout: 300000). After the command completes, read stderr:
cat "$TMPERR_PV"
Present the full output verbatim:
CODEX SAYS (plan review — outside voice):
════════════════════════════════════════════════════════════
<full codex output, verbatim — do not truncate or summarize>
════════════════════════════════════════════════════════════
Error handling: All errors are non-blocking — the outside voice is informational.
On any Codex error, fall back to the Claude adversarial subagent.
If CODEX_NOT_AVAILABLE (or Codex errored):
dispatch via run_subagent. The subagent has fresh context — genuine independence.
Subagent prompt: same plan review prompt as above.
Present findings under an OUTSIDE VOICE (Claude subagent): header.
If the subagent fails or times out: "Outside voice unavailable. Continuing to outputs."
Cross-model tension:
After presenting the outside voice findings, note any points where the outside voice disagrees with the review findings from earlier sections. Flag these as:
CROSS-MODEL TENSION:
[Topic]: Review said X. Outside voice says Y. [Present both perspectives neutrally.
State what context you might be missing that would change the answer.]
User Sovereignty: Do NOT auto-incorporate outside voice recommendations into the plan. Present each tension point to the user. The user decides. Cross-model agreement is a strong signal — present it as such — but it is NOT permission to act. You may state which argument you find more compelling, but you MUST NOT apply the change without explicit user approval.
For each substantive tension point, use AskUserQuestion:
"Cross-model disagreement on [topic]. The review found [X] but the outside voice argues [Y]. [One sentence on what context you might be missing.]"
RECOMMENDATION: Choose [A or B] because [one-line reason explaining which argument is more compelling and why]. Completeness: A=X/10, B=Y/10.
Options:
Wait for the user's response. Do NOT default to accepting because you agree with the outside voice. If the user chooses B, the current approach stands — do not re-argue.
If no tension points exist, note: "No cross-model tension — both reviewers agree."
Persist the result:
bin/memory_update '{"skill":"codex-plan-review","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'","status":"STATUS","source":"SOURCE","commit":"'"$(git rev-parse --short HEAD)"'"}'
Substitute: STATUS = "clean" if no findings, "issues_found" if findings exist. SOURCE = "codex" if Codex ran, "claude" if subagent ran.
Cleanup: Run rm -f "$TMPERR_PV" after processing (if Codex was used).
Outside voice findings are INFORMATIONAL until the user explicitly approves each one. Do NOT incorporate outside voice recommendations into the plan without presenting each finding via AskUserQuestion and getting explicit approval. This applies even when you agree with the outside voice. Cross-model consensus is a strong signal — present it as such — but the user makes the decision.
Follow the AskUserQuestion format from the Preamble above. Additional rules for plan reviews:
Every plan review MUST produce a "NOT in scope" section listing work that was considered and explicitly deferred, with a one-line rationale for each item.
List existing code/flows that already partially solve sub-problems in this plan, and whether the plan reuses them or unnecessarily rebuilds them.
After all review sections are complete, present each potential TODO as its own individual AskUserQuestion. Never batch TODOs — one per question. Never silently skip this step. Follow the format in review/TODOS-format.md.
For each TODO, describe:
Then present options: A) Add to TODOS.md B) Skip — not valuable enough C) Build it now in this PR instead of deferring.
Do NOT just append vague bullet points. A TODO without context is worse than no TODO — it creates false confidence that the idea was captured while actually losing the reasoning.
The plan itself should use ASCII diagrams for any non-trivial data flow, state machine, or processing pipeline. Additionally, identify which files in the implementation should get inline ASCII diagram comments — particularly Models with complex state transitions, Services with multi-step pipelines, and Concerns with non-obvious mixin behavior.
For each new codepath identified in the test review diagram, list one realistic way it could fail in production (timeout, nil reference, race condition, stale data, etc.) and whether:
If any failure mode has no test AND no error handling AND would be silent, flag it as a critical gap.
Analyze the plan's implementation steps for parallel execution opportunities. This helps the user split work across git worktrees (via Perplexity Computer's Agent tool with isolation: "worktree" or parallel workspaces).
Skip if: all steps touch the same primary module, or the plan has fewer than 2 independent workstreams. In that case, write: "Sequential implementation, no parallelization opportunity."
Otherwise, produce:
| Step | Modules touched | Depends on |
|---|---|---|
| (step name) | (directories/modules, NOT specific files) | (other steps, or —) |
Work at the module/directory level, not file level. Plans describe intent ("add API endpoints"), not specific files. Module-level ("controllers/, models/") is reliable; file-level is guesswork.
Format: Lane A: step1 → step2 (sequential, shared models/) / Lane B: step3 (independent)
Execution order — which lanes launch in parallel, which wait. Example: "Launch A + B in parallel worktrees. Merge both. Then C."
Conflict flags — if two parallel lanes touch the same module directory, flag it: "Lanes X and Y both touch module/ — potential merge conflict. Consider sequential execution or careful coordination."
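A sketch of the overlap check, assuming each lane's module list is held as a space-separated string (the lane contents below are hypothetical):

```bash
# Hypothetical lane contents; fill in from the dependency table above.
LANE_A="controllers/ models/"
LANE_B="models/ jobs/"
for mod in $LANE_A; do
  case " $LANE_B " in
    *" $mod "*) echo "CONFLICT: $mod touched by both lanes" ;;
  esac
done
```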
At the end of the review, fill in and display this summary so the user can see all findings at a glance:
Check the git log for this branch. If there are prior commits suggesting a previous review cycle (e.g., review-driven refactors, reverted changes), note what was changed and whether the current plan touches the same areas. Be more aggressive reviewing areas that were previously problematic.
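One hedged way to surface a prior review cycle from the log (the grep terms are keyword heuristics, so expect noise):

```bash
# Assumes main as the base branch; falls back to the last ~20 commits.
BASE=$(git merge-base HEAD main 2>/dev/null || echo "HEAD~20")
git log --oneline "$BASE"..HEAD | grep -iE 'review|revert|refactor' \
  || echo "No prior review cycle detected in commit messages"
```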
After producing the Completion Summary above, persist the review result.
PLAN MODE EXCEPTION — ALWAYS RUN: This command writes review metadata to (qstack memory) (the user config directory, not project files). The skill preamble already writes to (qstack memory)/sessions/ and (qstack memory)/analytics/ — this is the same pattern. The review dashboard depends on this data. Skipping this command breaks the review readiness dashboard in /ship.
bin/memory_update '{"skill":"plan-eng-review","timestamp":"TIMESTAMP","status":"STATUS","unresolved":N,"critical_gaps":N,"issues_found":N,"mode":"MODE","commit":"COMMIT"}'
Substitute values from the Completion Summary:
- COMMIT: output of `git rev-parse --short HEAD`

After completing the review, read the review log and config to display the dashboard.
bin/memory_search
Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, review, plan-design-review, design-review-lite, adversarial-review, codex-review, codex-plan-review). Ignore entries with timestamps older than 7 days.
- Eng Review row: show whichever is more recent between review (diff-scoped pre-landing review) and plan-eng-review (plan-stage architecture review). Append "(DIFF)" or "(PLAN)" to the status to distinguish.
- Adversarial row: show whichever is more recent between adversarial-review (new auto-scaled) and codex-review (legacy).
- Design Review row: show whichever is more recent between plan-design-review (full visual audit) and design-review-lite (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish.
- Outside Voice row: show the most recent codex-plan-review entry — this captures outside voices from both /plan-ceo-review and /plan-eng-review.
Source attribution: If the most recent entry for a skill has a `"via"` field, append it to the status label in parentheses. Examples: plan-eng-review with via:"autoplan" shows as "CLEAR (PLAN via /autoplan)". review with via:"ship" shows as "CLEAR (DIFF via /ship)". Entries without a via field show as "CLEAR (PLAN)" or "CLEAR (DIFF)" as before.
Note: autoplan-voices and design-outside-voices entries are audit-trail-only (forensic data for cross-model consensus analysis). They do not appear in the dashboard and are not checked by any consumer.
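If the log is JSONL and jq is available, the latest entry per skill can be pulled mechanically. A sketch, with a hypothetical log path:

```bash
# REVIEW_LOG is hypothetical; use whatever file bin/memory_search reads from.
REVIEW_LOG="$HOME/.qstack/reviews.jsonl"
# Keep entries from the last 7 days, then take the newest per skill.
jq -s '[ .[] | select(.timestamp > (now - 7*86400 | todate)) ]
       | group_by(.skill) | map(max_by(.timestamp))' "$REVIEW_LOG"
```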
Display:
+====================================================================+
| REVIEW READINESS DASHBOARD |
+====================================================================+
| Review | Runs | Last Run | Status | Required |
|-----------------|------|---------------------|-----------|----------|
| Eng Review | 1 | 2026-03-16 15:00 | CLEAR | YES |
| CEO Review | 0 | — | — | no |
| Design Review | 0 | — | — | no |
| Adversarial | 0 | — | — | no |
| Outside Voice | 0 | — | — | no |
+--------------------------------------------------------------------+
| VERDICT: CLEARED — Eng Review passed |
+====================================================================+
Review tiers:
Verdict logic:
Staleness detection: After displaying the dashboard, check if any existing reviews may be stale:
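As a sketch, staleness can be approximated by counting commits since the review's logged SHA, assuming that SHA exists on the current branch:

```bash
REVIEW_COMMIT="abc1234"   # hypothetical; take it from the review log entry
if git cat-file -e "$REVIEW_COMMIT" 2>/dev/null; then
  SINCE=$(git rev-list --count "$REVIEW_COMMIT"..HEAD)
  [ "$SINCE" -gt 0 ] && echo "Possibly stale: $SINCE commits since review"
else
  echo "Review commit not found on this branch; cannot assess staleness"
fi
```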
After displaying the Review Readiness Dashboard in conversation output, also update the plan file itself so review status is visible to anyone reading the plan.
Read the review log output you already have from the Review Readiness Dashboard step above. Parse each JSONL entry. Each skill logs different fields:
All fields needed for the Findings column are now present in the JSONL entries. For the review you just completed, you may use richer details from your own Completion Summary. For prior reviews, use the JSONL fields directly — they contain all required data.
Produce this markdown table:
If you discovered a non-obvious pattern, pitfall, or architectural insight during this session, log it for future sessions:
Types: pattern (reusable approach), pitfall (what NOT to do), preference (user stated), architecture (structural decision), tool (library/framework insight), operational (project environment/CLI/workflow knowledge).
Sources: observed (you found this in the code), user-stated (user told you), inferred (AI deduction), cross-model (both Claude and Codex agree).
Confidence: 1-10. Be honest. An observed pattern you verified in the code is 8-9. An inference you're not sure about is 4-5. A user preference they explicitly stated is 10.
files: Include the specific file paths this learning references. This enables staleness detection: if those files are later deleted, the learning can be flagged.
Only log genuine discoveries. Don't log obvious things. Don't log things the user already knows. A good test: would this insight save time in a future session? If yes, log it.
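An illustrative entry, assuming bin/memory_update accepts the same inline-JSON shape used elsewhere in this skill; the field names mirror the Types/Sources/Confidence scheme above:

```bash
# Illustrative values only; verify the schema bin/memory_update expects.
bin/memory_update '{"type":"pitfall","source":"observed","confidence":8,
  "key":"billing-retry-race",
  "summary":"Retrying processPayment() without an idempotency key double-charges on timeout",
  "files":["src/services/billing.ts"]}'
```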
After displaying the Review Readiness Dashboard, check if additional reviews would be valuable. Read the dashboard output to see which reviews have already been run and whether they are stale.
Suggest /plan-design-review if UI changes exist and no design review has been run — detect from the test diagram, architecture review, or any section that touched frontend components, CSS, views, or user-facing interaction flows. If an existing design review's commit hash shows it predates significant changes found in this eng review, note that it may be stale.
Mention /plan-ceo-review if this is a significant product change and no CEO review exists — this is a soft suggestion, not a push. CEO review is optional. Only mention it if the plan introduces new user-facing features, changes product direction, or expands scope substantially.
Note staleness of existing CEO or design reviews if this eng review found assumptions that contradict them, or if the commit hash shows significant drift.
If no additional reviews are needed (or skip_eng_review is true in the dashboard config, meaning this eng review was optional): state "All relevant reviews complete. Run /ship when ready."
Use AskUserQuestion with only the applicable options:
If the user does not respond to an AskUserQuestion or interrupts to move on, note which decisions were left unresolved. At the end of the review, list these as "Unresolved decisions that may bite you later" — never silently default to an option.
After eng review locks the architecture, start building — or run design/DX review if the feature has UI or developer-facing components.
Feeds from: office-hours, plan-ceo-review
Next steps: plan-design-review, plan-devex-review