Adaptive mesh refinement for plans — bidirectional verification that catches what single-pass planning misses
Verify plans by tracing them from both ends. Trace forward from the current state. Trace backward from the goal. Compare where they meet. If the gap is too big, pick a midpoint and repeat. Disagreements between the forward and backward traces are the findings.
Works for any plan: software, infrastructure, data pipelines, project plans, anything.
Read `~/.plancheck/projects/<hash>/knowledge.md` if it exists (call get_last_check_id with cwd to confirm the project is known). Use it to:
- check the files it lists under "Always check"
- discount probes it lists under "What doesn't" (known false positives here)
- pay extra attention to its "Risk areas" when tracing

If the file doesn't exist, proceed normally — it will be created after the first reflection.
When to run: The plan touches source files AND the check_plan MCP tool is available. Skip for non-code plans.
If check_plan returns semanticSuggestions, treat them as candidate files the plan may need but doesn't mention:

```json
"semanticSuggestions": [
  {"file": "routes.go", "confidence": 0.7, "reason": "registers the new handler"},
  {"file": "types.go", "confidence": 0.5, "reason": "may need new types"}
]
```
Call check_plan with plan_json and cwd. From its response:
- historyId — hold it for the reflection at the end.
- If projectPatterns contains recurring-miss files, add them to the plan or explain why they're not needed.
- If pFailed > 0.4, the plan has high risk: give it extra scrutiny in the verification passes.
- If the label is "novel" or "exploratory": historical signals are weak here, so rely more on the traces than on the probes.
Use probe signals during verification:
- Simulation is available only when a `.defn/` directory exists.
- If check_plan returns simulation data (the simulation field), read:
  - productionCallers — how many production definitions are in the blast radius
  - testCoverage — how many tests cover the modified definitions
  - confidence — high/moderate/low based on test density
  - highImpactDefs — definitions with high blast radius to focus verification on

Context recovery: if context was compacted and you lost the historyId, call get_last_check_id with the project's cwd to recover it.
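As a sketch, a simulation payload with those fields might look like this (values and definition names are invented for illustration; the real shape may differ):

```json
"simulation": {
  "productionCallers": 12,
  "testCoverage": 4,
  "confidence": "moderate",
  "highImpactDefs": ["Server.Route", "ParseConfig"]
}
```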
Start from the current project state. Walk through the first few steps of the plan.
For each step, state: what it requires (the pre-state), what it changes, and what state it leaves behind (the post-state).
For steps that create new definitions (new functions, types, methods): generate a plausible Go function stub showing the signature and which existing functions it would call. If simulate_plan is available, call it with type: "addition" and the definition's name/receiver to see the structural impact. The stub doesn't need to compile — just show the structural relationships.
Stop when you're no longer confident about what state you're in. State explicitly: "Forward trace reaches [state] after step N."
Start from the completed goal. Work backward: "The goal is done. What were the last few steps that produced it?"
For each step (in reverse), state: what state it must have produced, and what state it required before it could run.
Stop when the required pre-state is no longer obvious from the goal alone. State explicitly: "Backward trace requires [state] before step M."
Compare where the forward trace stopped (state after step N) with where the backward trace needs to start (state before step M).
Three outcomes:
They agree. The states match. The gap between them is empty or obvious. Done.
They disagree but the gap is small. You can bridge it in a few confident steps. Write them. Done.
They disagree and the gap is large. You can't reliably trace the steps between them. Subdivide.
Pick the most important intermediate milestone between the two traces — where the plan's state is most constrained (a deployment, a migration, an API boundary, a test gate).
Trace forward from this midpoint and backward to it. Now you have two smaller gaps. Compare each one. If either is still too large, subdivide again.
The plan is verified when every adjacent trace agrees on the state between them. Each handoff is consistent: one trace's "after" matches the next trace's "before."
Bounds: don't subdivide indefinitely. If a gap still can't be bridged after a couple of subdivisions, stop and report it as an unresolved disagreement.
What disagreements reveal: missing steps, wrong assumptions about the current state, and requirements of the goal the plan never satisfies. These disagreements ARE the findings.
Be terse. The tracing is internal work. The user sees:
```
[Traces]: forward (steps 1-N), backward (steps M-end), [K midpoints if any]
[Handoffs]: X states checked between traces
[Conflicts]: Y disagreements found
- [steps N-M] [disagreement] -> [fix]
[Simulation]: X production callers, Y tests, confidence Z (if available)
[Result]: Plan [verified | updated — re-checking]
```
After verification, present the final plan with all fixes applied. Note which changes came from trace disagreements vs deterministic probes.
When execution completes — all tasks finished, user confirms working, or failure hit:
First, call validate_execution with the original plan JSON and base_ref (the commit before execution started). This closes the prediction loop — plancheck compares its simulation predictions against the actual git diff and records calibration data for future forecasts.
Then, call record_reflection:
- id: historyId from check_plan (if it ran), or omit
- cwd: project root
- passes: number of verification passes completed (count each forward+backward+compare as one pass; minimum 2)
- probe_findings: findings from deterministic probes (check_plan) that changed the plan
- persona_findings: findings from trace disagreements that changed the plan
- missed: what went wrong during execution that no pass caught (empty string if nothing)
- outcome: clean / rework / failed (your assessment)

Call it automatically. Do not ask the user to classify the outcome.
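A record_reflection call might carry arguments shaped like this sketch (field names from the list above; the values, and the exact types of the findings fields, are assumptions):

```json
{
  "id": "a1b2c3",
  "cwd": "/home/dev/myproject",
  "passes": 2,
  "probe_findings": ["routes.go missing from plan (recurring-miss)"],
  "persona_findings": ["backward trace required migration before step 4"],
  "missed": "",
  "outcome": "clean"
}
```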
After reflection, update the project's knowledge.md (stored in ~/.plancheck/projects/<hash>/). Keep it short (10-20 lines). Structure:
```markdown
# Project: <name>

## What works
- <patterns that produce useful findings for this project>

## What doesn't
- <probes/signals that false-positive here>

## Always check
- <files that are always forgotten — from recurring-miss patterns>

## Risk areas
- <places where forward and backward traces tend to disagree>
```
If the file exists, update it incrementally. If it doesn't, create it.
| Condition | What degrades | What still works |
|---|---|---|
| No check_plan MCP tool | No probes, no history | Verification passes run normally |
| No simulate_plan MCP tool | No simulation data | check_plan still runs, co-mod still works |
| No .defn/ directory | No reference graph, no simulation | Git co-mod still works |
| No git history | Co-mod empty, churn/signals empty | File existence checks still run |
| Greenfield project | File existence finds nothing | Validation signals still run |
| PLANCHECK_NOHISTORY=1 set | No history, no patterns | All probes run stateless |
| Remote/headless | History may fail to write | Probes still return results |
| Cowork (multiple agents) | Each agent runs its own fork | History is append-only JSONL |
Verification passes require nothing except the plan text and the goal. Deterministic probes require the MCP tool + a project directory. Each layer fails independently.