Long-running autonomous development orchestration for 12+ hour runs. Trigger when user asks to: build/refactor/fix software end-to-end, run autonomously, execute a dependency graph, continue work without constant interaction, use full/standard/quick Tenet execution modes, or steer an ongoing autonomous run. Also triggers on: 'tenet', 'autonomous loop', 'long run', 'keep going', 'run overnight', 'execute the plan', 'start building'.
Execute this file as an operational program. Be decisive, deterministic, and checkpoint-driven.
- tenet_compile_context), never raw file dumps.
- .tenet/ markdown files.
- tenet_continue()), not ad-hoc ID reconstruction.

The MCP server is auto-started by the host platform via project config files (.mcp.json for Claude Code, opencode.json for OpenCode). These are created by npx tenet init. No manual server launch is needed.
Ensure Tenet project state exists:
- tenet_continue().
- tenet_init(project_path=".").

Verify MCP health:
- tenet_health_check().

If health check fails (MCP server unreachable):
- npx tenet init in the project root, then restart your agent."

Read current state summary:
- tenet_get_status().

Detect brownfield project (existing codebase without prior Tenet state):
- .tenet/ was just created (fresh init) AND the project directory contains existing source code (look for src/, lib/, app/, package.json, requirements.txt, go.mod, Cargo.toml, or similar), this is a brownfield project.
- phases/00-brownfield-scan.md and execute the codebase scan before proceeding to any mode selection or crystallization phase.
- .tenet/bootstrap/codebase-scan.md: a structured summary of the existing codebase that feeds into interview and spec phases.
- .tenet/ already existed (resuming), skip the scan.

Detect git repository:
- .git/ exists in the project root.

Check Playwright MCP availability:
- playwright_navigate). If it responds, Playwright MCP is available.

Do not proceed into execution until health is good.
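The brownfield and git checks described above can be sketched as a pair of filesystem probes. This is a minimal illustration, not Tenet's own implementation: the function names and the `tenet_just_created` flag are hypothetical, and the marker list only covers the names mentioned above (the text says "or similar", so a real check may look at more).

```python
from pathlib import Path

# Marker names copied from the startup checklist; illustrative, not exhaustive.
SOURCE_MARKERS = (
    "src", "lib", "app",
    "package.json", "requirements.txt", "go.mod", "Cargo.toml",
)


def is_brownfield(project_root: Path, tenet_just_created: bool) -> bool:
    """True when .tenet/ is fresh AND the directory already holds source code."""
    if not tenet_just_created:
        # Resuming an existing Tenet project: the scan is skipped.
        return False
    return any((project_root / name).exists() for name in SOURCE_MARKERS)


def has_git_repo(project_root: Path) -> bool:
    """True when .git exists in the project root."""
    return (project_root / ".git").exists()
```

Both conditions must hold for the brownfield case, so a resumed project with existing code never triggers a second scan.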
Choose exactly one mode at start; re-evaluate at major scope changes.
- .tenet/ harness/spec quality.

Full mode. Use when: new feature with unclear edges, major refactor, greenfield, broad multi-module change.
Flow:
Standard mode. Use when: medium complexity, known architecture, moderate unknowns.
Flow:
Quick mode. Use when: small isolated bug/config/content tweak with low ambiguity.
Flow:
- tenet_register_jobs (single-entry DAG)
- tenet_start_job → tenet_job_wait → tenet_job_result
- tenet_start_eval (code critic + test critic)
- tenet_retry_job

Quick mode skips interview, spec generation, visual artifacts, and decomposition planning. It does NOT skip the MCP execution pipeline or evaluation gates. All jobs go through the same start_job → wait → result → eval flow regardless of mode.
Run before decomposition when in Full mode. Each step below requires reading a detailed reference doc FIRST. Do NOT skip the read — the reference contains exact file paths, formats, and enforcement rules that this summary omits.
Read phases/01-interview.md before executing.
- .tenet/interview/{date}-{feature}.md (e.g. 2026-04-08-oauth.md). Derive the feature slug from the user's project description early in the interview.
- tenet_validate_clarity() to get an independent clarity score.
- tenet_job_wait + tenet_job_result.

Do NOT self-score the interview. The validation must come from a separate agent context. Do NOT re-validate clarity after the interview; later phases have their own validation.
Read phases/03-visuals.md before executing.
- .tenet/visuals/.
- .tenet/spec/scenarios-{date}-{feature}.md.

Read phases/02-spec-and-harness.md Section 0 before executing.
- tenet_update_knowledge(type="knowledge", title="research-{topic}").

Read phases/02-spec-and-harness.md before executing.
- .tenet/spec/{date}-{feature}.md (NOT .tenet/spec.md or .tenet/spec/spec.md).
- .tenet/harness/current.md (NOT .tenet/harness.md).
- .tenet/spec/scenarios-{date}-{feature}.md.
- tenet_validate_readiness(feature="{feature}") after spec + harness are written.
- tenet_job_wait + tenet_job_result.
- passed: false:
- tenet_validate_readiness until it passes.
- tenet_validate_clarity: clarity validated the user requirements; readiness validates the implementation prerequisites (creds, env, contracts, fixtures, test strategy).

Read phases/04-decomposition.md before executing.
- .tenet/decomposition/{date}-{feature}.md (NOT .tenet/spec/decomposition.md).
- job-queue.md, status.md) are auto-generated from the DB on state transitions.
- tenet_register_jobs to load the DAG into the runtime queue.
- tenet_register_jobs.

Quick mode is "quick" because it skips interview/spec/decomposition overhead, NOT because it bypasses the execution pipeline. The job MUST still go through tenet_start_job, tenet_start_eval, and all eval gates.
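The dated file-naming convention used by the crystallization phases can be summarized in a small helper. This is a sketch only: `artifact_paths` is a hypothetical name, and the feature slug is assumed to be already derived (the slug rules live in the interview reference doc, not here).

```python
from datetime import date


def artifact_paths(day: date, feature: str) -> dict[str, str]:
    """Build the per-feature artifact paths named in the phase summaries."""
    stem = f"{day.isoformat()}-{feature}"  # e.g. 2026-04-08-oauth
    return {
        "interview": f".tenet/interview/{stem}.md",
        "spec": f".tenet/spec/{stem}.md",
        "scenarios": f".tenet/spec/scenarios-{stem}.md",
        "decomposition": f".tenet/decomposition/{stem}.md",
        # The harness file is not dated: there is a single current harness.
        "harness": ".tenet/harness/current.md",
    }


paths = artifact_paths(date(2026, 4, 8), "oauth")
print(paths["interview"])  # .tenet/interview/2026-04-08-oauth.md
```

Keeping every path derived from one `{date}-{feature}` stem makes it harder to fall back to the disallowed flat names such as .tenet/spec.md.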
YOLO mode applies to the crystallization phase (interview → spec → visuals → decomposition), NOT to execution. When enabled, the agent makes all upfront decisions without asking the user questions — it decides feature scope, acceptance criteria, tech choices, and test strategy autonomously.
YOLO mode is triggered when the user says "yolo", "just decide everything", or "don't ask me questions" during or before the interview phase.
What YOLO mode skips: Interview questions, spec confirmation, visual approval, decomposition review. What YOLO mode does NOT skip: Pre-execution confirmation, evaluation gates, steer message processing. These always run.
Before entering the autonomous execution loop, present the user with a summary for confirmation:
Skip this gate ONLY if the user has explicitly said "just do it" / "start building" without wanting oversight.
Read phases/05-execution-loop.md before executing. It contains the exact tool call sequence with concrete examples.
Use this control flow exactly. Worker execution is performed by MCP-dispatched agents; orchestrator only uses tenet_* tools. Do NOT call subagents directly — use tenet_start_job to dispatch all work.
tenet_continue() returns the next actionable job from the DAG and current session state. The server tracks what's done, what's blocked, and what's ready.
CRITICAL: Non-blocking execution. tenet_job_wait must be dispatched as a background task (not foreground). This keeps the orchestrator available for user interaction and steer messages while jobs execute.
jobs_completed_since_last_health = 0
while True:
    # 1. Steering checkpoint
    steer = tenet_process_steer()
    IF steer.has_emergency:
        HALT: cancel active jobs, process emergency, wait for user
    IF steer.has_directive:
        apply directive (reorder queue, add/remove jobs, update spec)

    # 2. Get next job from server-managed DAG
    continuation = tenet_continue()
    IF continuation.all_done:
        BREAK: run complete
    IF continuation.all_blocked:
        BREAK: report blocked jobs, wait for user steer
    job = continuation.next_job

    # 3. Compile bootstrap context for this job
    compiled_context = tenet_compile_context(job_id=job.id)

    # 4. Dispatch registered job for execution
    run = tenet_start_job(job_id=job.id)

    # 5. Brief user and start background status check
    TELL USER: "Dispatched: {job.name}. I'll monitor in the background."
    TELL USER: "You can send messages or steer directives while this runs."
    check = BACKGROUND tenet_job_wait(job_id=run.job_id)

    # 6. When background check returns (instant, no blocking):
    #    - If is_terminal=false: check steer, brief user, wait, then re-check
    #    - If is_terminal=true: proceed to result collection
    #    Wait strategy: start at 30s, increase by 1.5x each cycle, cap at 120s
    poll_delay = 30
    WHILE check result is not terminal:
        result = COLLECT check
        tenet_process_steer()
        TELL USER: "{job.name}: {result.progress_line}"
        SLEEP poll_delay seconds
        poll_delay = min(poll_delay * 1.5, 120)
        check = BACKGROUND tenet_job_wait(job_id=run.job_id, cursor=result.cursor)

    # 7. Retrieve full output
    output = tenet_job_result(job_id=run.job_id)

    # 8. Dispatch evaluation (code critic + test critic + Playwright eval)
    eval = tenet_start_eval(job_id=job.id, output=output)
    # This dispatches THREE jobs: code_critic, test_critic, playwright_eval
    code_check = BACKGROUND tenet_job_wait(job_id=eval.code_critic_job_id)
    test_check = BACKGROUND tenet_job_wait(job_id=eval.test_critic_job_id)
    playwright_check = BACKGROUND tenet_job_wait(job_id=eval.playwright_eval_job_id)

    # Wait for all three eval jobs
    eval_delay = 30
    WHILE any of (code_check, test_check, playwright_check) not terminal:
        SLEEP eval_delay seconds
        eval_delay = min(eval_delay * 1.5, 120)
        IF code_check not terminal:
            code_check = BACKGROUND tenet_job_wait(job_id=eval.code_critic_job_id)
        IF test_check not terminal:
            test_check = BACKGROUND tenet_job_wait(job_id=eval.test_critic_job_id)
        IF playwright_check not terminal:
            playwright_check = BACKGROUND tenet_job_wait(job_id=eval.playwright_eval_job_id)
    code_output = tenet_job_result(job_id=eval.code_critic_job_id)
    test_output = tenet_job_result(job_id=eval.test_critic_job_id)
    playwright_output = tenet_job_result(job_id=eval.playwright_eval_job_id)

    # 9. Act on eval results: ALL THREE must pass
    # ⛔ EVAL IS A HARD BLOCKING GATE. DO NOT PROCEED TO THE NEXT JOB IF EVAL FAILS
    # "The next job will fix it" is NEVER acceptable. Retry THIS job until it passes.
    IF code_output.passed AND test_output.passed AND playwright_output.passed:
        tenet_update_knowledge(type="journal", job_id=job.id, findings=output.findings)
    ELIF NOT playwright_output.passed:
        # Playwright e2e failed: actual app behavior is broken
        # Create fix job with the screenshots and findings as evidence
        create_fix_job(job, playwright_output.exploratory_findings)
        # DO NOT continue to next job; wait for fix job to complete, then re-eval
    ELIF NOT test_output.passed:
        # Test critic failed: tests are insufficient, create fix job to strengthen tests
        create_test_fix_job(job, test_output.missing_tests)
        # DO NOT continue to next job; wait for fix job to complete, then re-eval
    ELSE:
        # Code critic failed: retry the job (preferred) or create new job if approach is wrong
        tenet_retry_job(job_id=job.id)  # preferred over creating new job
        # DO NOT continue to next job; wait for retry to complete, then re-eval

    # 11. Post-job steering checkpoint
    tenet_process_steer()

    # 12. Periodic health audit (every 3 completed jobs)
    jobs_completed_since_last_health += 1
    IF jobs_completed_since_last_health >= 3:
        tenet_health_check()
        jobs_completed_since_last_health = 0
Key difference from a blocking loop: Each tenet_job_wait is dispatched as a background task. When it returns, the host fires a notification. Between notifications, the user can interact with the orchestrator. The orchestrator checks steer messages on each notification cycle.
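The wait strategy in step 6 of the loop (start at 30s, multiply by 1.5 each cycle, cap at 120s) produces a short, predictable schedule. The generator below is a sketch of that arithmetic only; `poll_delays` is a hypothetical helper name and not part of the Tenet tool set.

```python
from itertools import islice
from typing import Iterator


def poll_delays(start: float = 30, factor: float = 1.5, cap: float = 120) -> Iterator[float]:
    """Yield successive sleep intervals in seconds: start, then x factor, capped."""
    delay = start
    while True:
        yield delay
        delay = min(delay * factor, cap)


print(list(islice(poll_delays(), 6)))
# [30, 45.0, 67.5, 101.25, 120, 120]
```

The cap is reached on the fifth poll, after 243.75 seconds of cumulative waiting, and every later poll waits the full 120 seconds.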
Before every job, tenet_compile_context(job_id) must produce a compiled view pipeline:
Never bypass compiled context.
Read phases/06-evaluation.md before executing. It contains exact stage definitions, output format, and the author/critic separation rules.
Evaluate every completed job using staged gates:
- playwright_eval job alongside code critic and test critic.
- npx playwright test).

tenet_update_knowledge supports two entry types stored in separate directories:

Knowledge (.tenet/knowledge/)

Reusable technical wisdom that helps future agents working on similar features. Examples:
Use type: "knowledge" and tag with confidence:
| Tag | Meaning |
|---|---|
| [implemented-and-tested] | Code exists and passes tests |
| [implemented-not-tested] | Code exists but tests are missing or incomplete |
| [decision-only] | Agreed approach, not yet coded |
| [scanned-not-verified] | Extracted from existing code during brownfield scan, not validated |
Journal (.tenet/journal/)

Activity logs, job completion summaries, and session progress notes. Written after every job completion to track what happened. Use type: "journal" (default).
Rule of thumb: If a future agent working on a different feature would benefit from this information, it's knowledge. If it's only useful for tracking what happened in this session, it's journal.
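The rule of thumb and the confidence tags above can be expressed as two small checks. This is an illustrative sketch: the helper names are hypothetical, and how a tag is actually attached to a tenet_update_knowledge call is not specified here, so the code only chooses the entry type and validates the tag string.

```python
# Confidence tags copied from the table above.
CONFIDENCE_TAGS = frozenset({
    "[implemented-and-tested]",
    "[implemented-not-tested]",
    "[decision-only]",
    "[scanned-not-verified]",
})


def entry_type(useful_to_other_features: bool) -> str:
    """Apply the rule of thumb: reusable insight is knowledge, the rest is journal."""
    return "knowledge" if useful_to_other_features else "journal"


def validate_confidence_tag(tag: str) -> str:
    """Return the tag unchanged if it is one of the four known tags."""
    if tag not in CONFIDENCE_TAGS:
        raise ValueError(f"unknown confidence tag: {tag!r}")
    return tag
```

Rejecting unknown tags early keeps the knowledge directory searchable by a fixed vocabulary.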
Run cascade checks when upstream state changes:
EVAL FAILURE = HARD BLOCK. Do NOT move to the next job. Do NOT say "this will be addressed later." The current job must pass eval before the DAG advances.
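The branch order in step 9 of the execution loop can be read as a priority rule: a Playwright failure is handled first, then a test-critic failure, then a code-critic failure. The sketch below encodes only that ordering; the `Action` enum and `gate` function are hypothetical names, and the enum values simply echo the calls used in the loop pseudocode.

```python
from enum import Enum


class Action(Enum):
    ADVANCE = "advance to next job"
    FIX_APP = "create_fix_job"            # Playwright failed
    FIX_TESTS = "create_test_fix_job"     # test critic failed
    RETRY = "tenet_retry_job"             # code critic failed


def gate(code_passed: bool, test_passed: bool, playwright_passed: bool) -> Action:
    """Return the single action allowed after an evaluation round."""
    if code_passed and test_passed and playwright_passed:
        return Action.ADVANCE
    if not playwright_passed:
        return Action.FIX_APP
    if not test_passed:
        return Action.FIX_TESTS
    return Action.RETRY
```

Only one of the eight possible outcomes returns `Action.ADVANCE`, which is what makes the gate a hard block.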
On eval fail:
- tenet_update_knowledge(type="journal", title="failure-{job_name}-trial-{N}") with details of what was tried and why it failed.

Prefer tenet_retry_job over creating a new job. Retry preserves job lineage and retry count tracking. Only create a new job when the approach is fundamentally wrong and a different scope/strategy is needed.
On max retries exhausted (3 failures):
On eventual success after failures: Gather all failure journals for this job, extract the lesson learned, and write it to knowledge (not journal) via tenet_update_knowledge(type="knowledge").
Then retry under stagnation and safety gates.
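The failure-journal naming and the retry limit described above amount to simple bookkeeping. A minimal sketch, assuming the failure count is tracked by the caller; both function names are hypothetical, and the default of 3 mirrors the stated retry limit.

```python
def failure_journal_title(job_name: str, trial: int) -> str:
    """Format the journal title used when recording a failed trial."""
    return f"failure-{job_name}-trial-{trial}"


def retries_exhausted(failures: int, max_retries: int = 3) -> bool:
    """True once the job has failed max_retries times."""
    return failures >= max_retries


print(failure_journal_title("auth-middleware", 2))  # failure-auth-middleware-trial-2
```

Using one title pattern per trial is what lets the failure journals be gathered later when a lesson is promoted to knowledge.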
Detect stagnation signals:
If stagnating, rotate persona in order:
After full rotation, allow at most 2 additional attempts. If still blocked, halt job and require steer input.
When the user sends a message during autonomous execution, the orchestrator must:
- context (informational), directive (priority/scope change), emergency (halt)
- tenet_add_steer(content, class, affected_job_ids) to persist it in the runtime queue
- affected_job_ids; otherwise leave empty for broadcast

Do NOT write steer messages to markdown files. Use tenet_add_steer exclusively; this ensures proper lifecycle tracking and job targeting.
Process steer at every checkpoint via tenet_process_steer().
Message classes:
- context: informational, no action required, absorbed as context
- directive: priority/order/scope changes, act on it
- emergency: immediate halt and containment

Message lifecycle: received → acknowledged → acted_on → resolved
Never leave messages silently unacknowledged. Call tenet_process_steer() at every loop iteration to pick up new messages.
Every steer message has a source field: user or agent.
- source: "user".
- source: "agent".

When a steer message targets specific jobs (via affected_job_ids), only those jobs see it in their compiled context. Broadcast messages (empty affected_job_ids) are visible to all jobs.
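The steer message model described above (three classes, a four-state lifecycle, and job targeting) can be sketched as a small data structure. This is an illustration under assumptions, not the runtime's schema: the `SteerMessage` class and its field names are hypothetical, and the lifecycle is assumed to advance strictly one state at a time.

```python
from dataclasses import dataclass, field

CLASSES = ("context", "directive", "emergency")
LIFECYCLE = ("received", "acknowledged", "acted_on", "resolved")


@dataclass
class SteerMessage:
    content: str
    message_class: str                      # "class" is a reserved word in Python
    source: str = "user"                    # "user" or "agent"
    affected_job_ids: list[str] = field(default_factory=list)
    state: str = "received"

    def __post_init__(self) -> None:
        if self.message_class not in CLASSES:
            raise ValueError(f"unknown steer class: {self.message_class!r}")

    def visible_to(self, job_id: str) -> bool:
        """Broadcast (empty target list) reaches every job; otherwise only targets."""
        return not self.affected_job_ids or job_id in self.affected_job_ids

    def advance(self) -> str:
        """Move to the next lifecycle state; stays at 'resolved' once reached."""
        index = LIFECYCLE.index(self.state)
        if index < len(LIFECYCLE) - 1:
            self.state = LIFECYCLE[index + 1]
        return self.state
```

For example, `SteerMessage("prioritize auth", "directive", affected_job_ids=["job-2"])` is visible to job-2 only, while a message with an empty target list is seen by every job.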
If tenet MCP tools stop responding (connection errors, timeouts, tool not found):
- tenet_start_job / tenet_complete_job.
- tenet serve --project . via Bash, then retry the MCP tool call.
- tenet diagnose (or invoke the tenet:diagnose skill) to identify the issue.

The MCP pipeline is the source of truth for job orchestration. Working outside it creates desync between git state and job state that is difficult to recover from.
Always enforce:
- tenet config --max-retries <n>, default 3), then mark blocked, move to next independent job.

If an emergency safety breach occurs, cancel active jobs via tenet_cancel_job and process steer.
- .tenet/ markdown = persistent project memory and management layer.

Agent switching is managed via the CLI (tenet config set default_agent <name>), not via MCP tools. This prevents agents from switching their own runtime mid-execution (e.g., Codex switching to OpenCode due to capacity limits).
Prefer continuity for active jobs unless explicit rerouting is required by the user.
Minimum cadence:
- tenet_get_status() + tenet_health_check()
- tenet_health_check()
- tenet_get_status()

If health check reports inconsistency, pause dispatch and repair state before continuing.
Stop loop only when one is true:
On stop:
- tenet_get_status() reflects final state
- tenet_update_knowledge