Self-learning orchestrator with unified insight memory. Explore, plan, build.
HELIX="$(cat .helix/plugin_root)"
The .helix/plugin_root file (created by the SessionStart hook) contains the plugin root path; that root has lib/ and agents/ subdirectories.
Phases: RECALL → EXPLORE → PLAN → BUILD (loop with stall recovery) → LEARN → COMPLETE
Fast path: If the objective is a single-file change with obvious scope (rename, config tweak, small fix), skip EXPLORE/PLAN. Spawn one builder directly with the objective as its task. LEARN phase still applies.
Goal: Bring accumulated knowledge to bear on orchestration decisions. Exit when: Synthesis blocks ready (empty blocks omitted).
python3 "$HELIX/lib/injection.py" strategic-recall "{objective_summary}"
Parse JSON. Use summary for triage; synthesize insights into blocks:
- _effectiveness >= 0.70: decomposition rules, verification needs, sequencing.
- _effectiveness < 0.40, or derived/failure tags: flag for extra verification, smaller tasks.
- _hop: 1 insights (graph-adjacent, not direct match): treat as exploration targets.
- summary.graph (when graph_too_small is false): source of the triage signals below.
Weight by relevance: An insight with _effectiveness: 0.85 but _relevance: 0.36 (barely above threshold) is weakly connected to this objective — treat as background context, not hard constraint. High-effectiveness + high-relevance = strong constraint.
Triage signals: coverage_ratio > 0.3 = well-mapped, trust constraints. < 0.1 = uncharted, expand exploration. graph_expanded_count > 0 = graph surfacing related context.
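The weighting and triage rules above can be sketched as a small helper. This is a sketch, not the real injection.py logic: the `_effectiveness`/`_relevance`/`_hop` field names come from the text, but the exact relevance cutoffs for "strong constraint" vs "background" are assumptions.

```python
def triage_insight(insight, rel_floor=0.35):
    """Classify a recalled insight for synthesis blocks (sketch)."""
    eff = insight.get("_effectiveness", 0.5)
    rel = insight.get("_relevance", 0.0)
    if insight.get("_hop", 0) >= 1:
        return "exploration_target"  # graph-adjacent, not a direct match
    if eff >= 0.70 and rel >= 0.60:
        return "constraint"          # high effectiveness + strong relevance
    if eff < 0.40:
        return "risk_area"           # historically weak: verify more, shrink tasks
    if rel <= rel_floor + 0.05:
        return "background"          # barely above threshold, weakly connected
    return "context"

print(triage_insight({"_effectiveness": 0.85, "_relevance": 0.36}))  # background
```

Note the ordering: hop distance is checked before effectiveness, since a graph-adjacent insight is an exploration lead regardless of how well it has performed.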
Example:
CONSTRAINTS:
- Keep auth middleware changes atomic (historically blocks when split) [82%]
- Plan explicit mock setup task before OAuth integration tests [75%]
RISK_AREAS:
- Payments module has blocked 3 of 4 attempts — use smaller tasks [35%]
EXPLORATION_TARGETS:
- config/secrets.py (referenced by auth insights but not in objective)
- tests/fixtures/ (multiple insights reference test setup patterns)
Persist synthesis (survives context compression):
cat > .helix/recall_synthesis.json << 'RECALL_EOF'
{
"objective": "{objective_summary}",
"constraints": [{insight_content_and_effectiveness}],
"risk_areas": [{insight_content_and_effectiveness}],
"exploration_targets": ["{paths}"],
"graph_discovered": [{hop_1_insights}],
"triage": {"coverage_ratio": {n}, "well_mapped": {bool}, "graph_expanded": {count}}
}
RECALL_EOF
If you re-read .helix/recall_synthesis.json mid-BUILD, context was compressed — this file preserves your orchestration decisions.
Targeted follow-up: If blind spots are identified, call python3 "$HELIX/lib/memory/core.py" recall "{specific_area}" --limit 3.
If empty: omit blocks, no degradation. Fast path: skip RECALL for single-file changes.
Goal: Map codebase landscape, leveraging recalled insights.
Exit when: Partitioned findings cover files relevant to objective.
Greenfield: If git ls-files returns nothing (git ls-files | wc -l is 0) or only config files, skip to PLAN with EXPLORATION: {}.
git ls-files | head -80 — identify 3-6 natural partitions.
Spawn one explorer per partition: subagent_type="helix:helix-explorer", model=sonnet, max_turns=30. Prompt: CONTEXT: {relevant_insights}\nSCOPE: {partition}\nFOCUS: {focus}\nOBJECTIVE: {objective}. All explorers in ONE message — no run_in_background.
Goal: Decompose objective into executable task DAG. Exit when: Tasks created with valid dependencies and no cycles.
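The partitioning step (file listing into 3-6 natural partitions) can be sketched as grouping by top-level directory. A sketch only: the real partitioning is a judgment call by the orchestrator, and the `misc` fold-over bucket is an assumption.

```python
from collections import defaultdict

def partition_files(paths, max_partitions=6):
    """Group a file listing into partitions by top-level directory (sketch)."""
    groups = defaultdict(list)
    for p in paths:
        top = p.split("/", 1)[0] if "/" in p else "(root)"
        groups[top].append(p)
    # Keep the largest groups as partitions; fold the remainder into "misc".
    ranked = sorted(groups.items(), key=lambda kv: len(kv[1]), reverse=True)
    partitions = dict(ranked[:max_partitions - 1])
    rest = [p for _, fs in ranked[max_partitions - 1:] for p in fs]
    if rest:
        partitions["misc"] = rest
    return partitions

files = ["src/auth/tokens.py", "src/auth/middleware.py", "tests/test_auth.py", "README.md"]
print(sorted(partition_files(files)))
```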
Spawn planner: subagent_type="helix:helix-planner", max_turns=500. Prompt: OBJECTIVE: {objective}\nEXPLORATION: {findings_json}\nCONSTRAINTS: {constraints_from_recall}\nRISK_AREAS: {risk_areas_from_recall}. Omit empty blocks.
For each task spec: TaskCreate(subject="{seq}: {slug}", description=..., activeForm="Building {slug}", metadata={"seq": "{seq}", "relevant_files": [...]}). Track seq_to_id[spec.seq] = task_id.
Wire dependencies: TaskUpdate(taskId=seq_to_id[spec.seq], addBlockedBy=[seq_to_id[b], ...]).
Check for cycles: python3 "$HELIX/lib/build_loop.py" detect-cycles --dependencies '$DEPS_JSON'. Confirm relevant_files reference exploration paths.
If PLAN_SPEC is empty or ERROR: add exploration context, re-run planner.
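The detect-cycles check can be approximated with a depth-first search over the {task: [blockers]} map. This is a sketch of the technique, not the actual build_loop.py implementation, whose interface may differ.

```python
def find_cycle(deps):
    """Return True if the dependency map {task: [blockers]} contains a cycle (sketch)."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {t: WHITE for t in deps}

    def visit(t):
        color[t] = GRAY                      # on the current DFS path
        for b in deps.get(t, []):
            if color.get(b, WHITE) == GRAY:  # back edge: cycle found
                return True
            if color.get(b, WHITE) == WHITE and b in deps and visit(b):
                return True
        color[t] = BLACK                     # fully explored, no cycle through t
        return False

    return any(color[t] == WHITE and visit(t) for t in deps)

assert find_cycle({"001": [], "002": ["001"], "003": ["002"]}) is False
assert find_cycle({"001": ["003"], "002": ["001"], "003": ["002"]}) is True
```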
Context recovery: If context was compressed, re-read .helix/recall_synthesis.json for prior CONSTRAINTS and RISK_AREAS before proceeding.
Goal: Execute all tasks. Exit when: no pending tasks remain.
while pending tasks:
  status → {ready, stalled, stall_info}
  If stalled → recovery (below)
  Batch inject memory for ready tasks:
    python3 "$HELIX/lib/injection.py" batch-inject --tasks '$OBJECTIVES_JSON' --limit 3
  Assemble PARENT_DELIVERIES ("[task_id] summary" per delivered blocker)
  Spawn builders (cap 6/wave): subagent_type="helix:helix-builder", max_turns=250
    — all in ONE message, NO run_in_background
  Parse DELIVERED/BLOCKED/PARTIAL → TaskUpdate outcomes
On PARTIAL: Fold REMAINING into new task next wave. Don't re-dispatch entire original. On crash: Re-dispatch once. Second crash → mark blocked.
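The per-wave outcome handling can be sketched as follows. The DELIVERED/BLOCKED/PARTIAL markers, the fold-REMAINING rule, and the one-retry crash policy are from the text; the report format and the `REMAINING:` parsing are assumptions.

```python
import re

def handle_outcome(report, retries=0):
    """Map a builder report to the next orchestration action (sketch)."""
    status = report.splitlines()[0].split(":", 1)[0].strip()
    if status == "DELIVERED":
        return ("complete", None)
    if status == "PARTIAL":
        # Fold only the REMAINING work into a new task for the next wave;
        # do not re-dispatch the entire original task.
        m = re.search(r"REMAINING:\s*(.+)", report, re.DOTALL)
        return ("new_task", m.group(1).strip() if m else report)
    if status == "BLOCKED":
        return ("stall_recovery", report)
    # Crash / unparseable report: re-dispatch once, then mark blocked.
    return ("redispatch", None) if retries == 0 else ("mark_blocked", None)

action, payload = handle_outcome("PARTIAL: tests pass\nREMAINING: wire CORS layer")
print(action, payload)  # new_task wire CORS layer
```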
If context was compressed, first re-read .helix/recall_synthesis.json for prior CONSTRAINTS and RISK_AREAS.
Recall insights about the blocked area: python3 "$HELIX/lib/memory/core.py" recall "{blocked_task_description}" --limit 5 --graph-hops 1
Then analyze the blockage in light of the recalled insights and choose a recovery action.
Not optional. You see cross-task patterns builders cannot. Exit when: at least one insight stored (or user dismisses).
Review all outcomes. Collect per task: exact outcome text, relevant_files, verify command, retry count, errors. Note cross-task patterns. Formulate hypotheses. Do not store yet.
For BLOCKED tasks, check insight ancestry if insights were injected:
python3 "$HELIX/lib/memory/core.py" neighbors "{insight_name}" --relation led_to --limit 3
If the injected insight has led_to provenance from low-effectiveness ancestors, note this — the insight lineage may be propagating an error pattern.
Present observations to user via AskUserQuestion -- they hold domain knowledge inaccessible to the system.
When to ask: Any BLOCKED/PARTIAL -- yes (highest learning value). All DELIVERED multi-task -- yes (approach insights). Fast-path single DELIVERED -- skip.
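The ask/skip decision above is a small decision table; a sketch, with the outcome strings taken from the BUILD phase and the single-DELIVERED non-fast-path case (not specified in the text) assumed to skip:

```python
def should_ask_user(outcomes, fast_path=False):
    """Decide whether to run AskUserQuestion after BUILD (sketch of the rules above)."""
    if any(o in ("BLOCKED", "PARTIAL") for o in outcomes):
        return True                  # highest learning value
    if fast_path and len(outcomes) == 1:
        return False                 # fast-path single DELIVERED: skip
    return len(outcomes) > 1         # all DELIVERED multi-task: approach insights

assert should_ask_user(["DELIVERED", "BLOCKED"]) is True
assert should_ask_user(["DELIVERED"], fast_path=True) is False
```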
Question construction rules:
Quote exact evidence: task id, outcome text, file paths, and the verify command (e.g. "tests/auth/test_oauth.py timed out.").
BLOCKED/PARTIAL example:
AskUserQuestion([{
question: "Builder for '003: migrate-auth-tokens' was BLOCKED: 'ConnectionTimeout after 30s in tests/auth/test_oauth.py:42 — OAuth provider unreachable'. Files: src/auth/tokens.py, tests/auth/test_oauth.py. Verify was: pytest tests/auth/ -k oauth_migration. Most likely cause?",
header: "Root cause: 003",
options: [
{label: "Missing mock", description: "test_oauth.py hits real OAuth endpoint — ConnectionTimeout suggests no mock configured for this test flow"},
{label: "Network/env config", description: "OAuth provider URL may be wrong in test config — 30s timeout implies connection attempt, not auth failure"},
{label: "Dependency ordering", description: "Token migration requires auth-service running — another task should have set up test fixtures first"}
],
multiSelect: false
}])
All DELIVERED (with friction) example:
AskUserQuestion([{
question: "All 4 tasks delivered. '002: refactor-auth-middleware' needed 2 attempts — first failed on tests/middleware/test_chain.py (assertion: expected 3 middleware layers, got 2). After stall recovery, builder added missing CORS layer. Is this a known constraint?",
header: "Reflection: 002",
options: [
{label: "Document constraint", description: "Middleware chain order matters — CORS must be explicit. The layer-count assertion in test_chain.py is the contract"},
{label: "Test was brittle", description: "test_chain.py counts layers instead of asserting behavior — breaks on any refactor that changes layer count"},
{label: "All good", description: "Stall recovery handled it correctly, nothing to remember"}
],
multiSelect: false
}])
User selects option or types "Other": Combine observation with their answer. Tag user-provided.
python3 "$HELIX/lib/memory/core.py" store \
--content "When modifying auth middleware in src/auth/middleware.py, always include explicit CORS layer — test_chain.py validates 3-layer stack and implicit CORS from Flask-CORS doesn't count" \
--tags '["user-provided", "auth", "middleware"]'
User dismisses: Fall back to your own cross-task observations. Store without user-provided tag.
Skipped ask (fast-path): Store your own observations directly.
Procedure graduation: If stall recovery revealed a multi-step fix sequence, store as a procedure:
python3 "$HELIX/lib/memory/core.py" store \
--content "Check pytest fixtures in conftest.py\nEnsure test DB initialized before migration tests\nRun migrations with --check flag before applying" \
--tags '["procedure", "testing", "database"]'
Procedures render as numbered steps when injected and decay/prune like any other insight.
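Given that procedures are stored as newline-separated steps (as in the store call above), the numbered-step rendering can be sketched as:

```python
def render_procedure(content):
    """Render newline-separated procedure content as numbered steps (sketch)."""
    steps = [s.strip() for s in content.split("\n") if s.strip()]
    return "\n".join(f"{i}. {s}" for i, s in enumerate(steps, 1))

print(render_procedure("Check pytest fixtures in conftest.py\nEnsure test DB initialized"))
```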
Insights auto-link (similarity >= 0.60) and provenance edges form during extraction. Test: would this help 3 months from now? Minimum: one insight per session.
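The auto-link threshold presumably applies to embedding similarity; a minimal cosine-similarity sketch (the 0.60 value is from the text, the embedding source and comparison metric are assumptions):

```python
import math

def should_link(vec_a, vec_b, threshold=0.60):
    """Link two insights when cosine similarity of their embeddings clears the threshold (sketch)."""
    dot = sum(a * b for a, b in zip(vec_a, vec_b))
    norm = math.sqrt(sum(a * a for a in vec_a)) * math.sqrt(sum(b * b for b in vec_b))
    return norm > 0 and dot / norm >= threshold

assert should_link([1.0, 0.0], [1.0, 0.0]) is True   # identical: cosine 1.0
assert should_link([1.0, 0.0], [0.0, 1.0]) is False  # orthogonal: cosine 0.0
```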
Summarize: tasks delivered, tasks blocked, insights stored (noting which were user-informed). If all tasks blocked, surface the pattern.
Agent contracts in agents/*.md.