Autonomous evolutionary code improvement engine with tournament selection. Activate when user says: self-improve, self improve, evolve code, improve iteratively, tournament, benchmark loop, optimize code.
Autonomous loop controller for evolutionary code improvement. Manages the full lifecycle: setup, research, planning, execution, tournament selection, history recording, and stop-condition evaluation.
/omg-autopilot/ralph
NEVER stop or pause to ask the user during the improvement loop. Once the gate check passes and the loop begins, run fully autonomously until a stop condition is met.
All state lives under .omc/self-improve/:
.omc/self-improve/
├── config/
│ ├── settings.json # agents, benchmark, thresholds, sealed_files
│ ├── goal.md # Improvement objective + target metric
│ ├── harness.md # Guardrail rules (H001/H002/H003)
│ └── idea.md # User experiment ideas
├── state/
│ ├── agent-settings.json # iterations, best_score, status, counters
│ ├── iteration_state.json # Within-iteration progress (resumability)
│ ├── research_briefs/ # Research output per round
│ ├── iteration_history/ # Full history per round
│ ├── merge_reports/ # Tournament results
│ └── plan_archive/ # Archived plans (permanent)
├── plans/ # Active plans (current round)
└── tracking/
  ├── raw_data.json # All candidate scores
  ├── baseline.json # Initial benchmark score
  └── events.json # Config changes
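The layout above can be scaffolded with a few lines of Python. This is a minimal sketch, not the skill's own setup code: the `scaffold` helper and the `SUBDIRS` constant are hypothetical names, and only the directory names come from the tree above. The JSON and Markdown files are left for the setup step to write.

```python
from pathlib import Path

# Subdirectories copied from the layout above, relative to .omc/self-improve/.
SUBDIRS = [
    "config",
    "state/research_briefs",
    "state/iteration_history",
    "state/merge_reports",
    "state/plan_archive",
    "plans",
    "tracking",
]


def scaffold(project_root: str) -> Path:
    """Create the .omc/self-improve/ tree under project_root.

    Hypothetical helper: safe to call repeatedly because existing
    directories are left untouched.
    """
    root = Path(project_root) / ".omc" / "self-improve"
    for sub in SUBDIRS:
        (root / sub).mkdir(parents=True, exist_ok=True)
    return root
```

Calling `scaffold(".")` from the repository root would create the directories only; no state files are written.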
| Step | Role | Agent | Purpose |
|---|---|---|---|
| Research | Codebase analysis | @explore + @architect | Hypothesis generation |
| Planning | Hypothesis → plan | @planner | Structured plan per agent |
| Architecture Review | 6-point review | @architect | Advisory review |
| Critic Review | Harness enforcement | @critic | Approve/reject plans |
| Execution | Implement + benchmark | @executor | Implement plan faithfully |
| Git Operations | Merge/tag/PR | @git-master | Atomic merge operations |
- .omc/self-improve/ directory structure.
- agent-settings.json. Check setup flags.
- trust_confirmed: true
- goal.md.
- improve/{goal_slug} from target branch.
- omg_write_state.
Gate: All settings must be true. Execute continuously without stopping.
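The gate rule ("all settings must be true") could be checked roughly as follows. This is a sketch under assumptions: the source names only `trust_confirmed`, so the `setup` key that groups the flags and the `gate_passes` function are hypothetical, and the real schema of agent-settings.json may differ.

```python
def gate_passes(agent_settings: dict) -> tuple[bool, list[str]]:
    """Return (ok, failing_flags) for the setup gate.

    Assumption: setup flags live under a hypothetical "setup" object and
    each must be exactly True. An empty or missing object fails the gate.
    """
    flags = agent_settings.get("setup", {})
    failing = [name for name, value in flags.items() if value is not True]
    return (bool(flags) and not failing, failing)
```

Returning the failing flag names alongside the boolean lets the caller report which setup step still needs attention before the loop starts.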
Remove orphaned worktrees from prior iterations.
Update state to reset TTL.
If state is cleared or status is user_stopped: exit gracefully.
Read idea.md. If non-empty, pass to planners.
Spawn @explore + @architect to analyze codebase and generate hypotheses based on goal, history, and prior briefs.
Spawn N @planner agents in parallel (N = number_of_agents). Each produces a plan with one testable hypothesis, approach_family tag, and history_reference.
For each plan:
- critic_approved: true/false.
For each approved plan, spawn @executor in parallel. Each executor works in a git worktree, implements the plan, runs validation, and benchmarks.
- status: "success"
- benchmark_score (respecting direction)
- best_score
- --no-ff
Write iteration history, update agent-settings (scores, plateau count, circuit breaker), and append tracking data.
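The tournament criteria above (successful status, ranking by benchmark_score in the configured direction, comparison against best_score) can be sketched as a pure function. The `pick_winner` name, the candidate dictionary shape, and the `"maximize"`/`"minimize"` direction values are assumptions for illustration; the merge itself (with --no-ff) is left to @git-master and is not shown.

```python
from typing import Optional


def pick_winner(
    candidates: list[dict],
    best_score: Optional[float],
    direction: str = "maximize",
) -> Optional[dict]:
    """Select the candidate to merge, or None if nothing beats best_score.

    Assumption: each candidate is a dict with "status" and
    "benchmark_score" keys; direction is "maximize" or "minimize".
    """
    higher_is_better = direction == "maximize"
    # Only candidates that finished successfully and reported a score compete.
    ok = [
        c for c in candidates
        if c.get("status") == "success" and c.get("benchmark_score") is not None
    ]
    if not ok:
        return None
    top = (max if higher_is_better else min)(ok, key=lambda c: c["benchmark_score"])
    if best_score is None:
        return top  # no prior best, so the top candidate wins by default
    score = top["benchmark_score"]
    improved = score > best_score if higher_is_better else score < best_score
    return top if improved else None
```

A `None` result corresponds to a round with no improvement, which is the case the plateau counter is meant to track.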
Remove worktrees, update iteration state to completed.
| Condition | Check |
|---|---|
| User stop | status == "user_stopped" |
| Target reached | best_score meets/exceeds target_value |
| Plateau | plateau_consecutive_count >= plateau_window |
| Max iterations | iterations >= max_iterations |
| Circuit breaker | circuit_breaker_count >= circuit_breaker_threshold |
If NO stop condition: immediately go back to Step 1.
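The stop-condition table can be read as a single function evaluated at the end of each iteration. This is a sketch only: the field names come from the table, but which file each field lives in, the `direction` key, and the `stop_reason` function itself are assumptions.

```python
from typing import Optional


def stop_reason(state: dict, cfg: dict) -> Optional[str]:
    """Return the first matching stop condition, or None to keep looping.

    Assumption: `state` mirrors agent-settings.json and `cfg` mirrors
    settings.json; "direction" is a hypothetical key, defaulting to maximize.
    """
    if state.get("status") == "user_stopped":
        return "user_stop"
    best, target = state.get("best_score"), cfg.get("target_value")
    if best is not None and target is not None:
        higher_is_better = cfg.get("direction", "maximize") == "maximize"
        if (best >= target) if higher_is_better else (best <= target):
            return "target_reached"
    if state.get("plateau_consecutive_count", 0) >= cfg["plateau_window"]:
        return "plateau"
    if state.get("iterations", 0) >= cfg["max_iterations"]:
        return "max_iterations"
    if state.get("circuit_breaker_count", 0) >= cfg["circuit_breaker_threshold"]:
        return "circuit_breaker"
    return None
```

The checks run in table order, so a user stop takes precedence over every other condition in this sketch.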
On invocation:
agent-settings.json:
- user_stopped: ask to resume
- running: crashed; resume automatically
- idle: fresh start
iteration_state.json: resume from last step if in-progress
/cancel for clean state cleanup
Every plan must be tagged with exactly one:
| Tag | Description |
|---|---|
| architecture | Model/component structure changes |
| training_config | Optimizer, LR, scheduler, batch size |
| data | Data loading, augmentation, preprocessing |
| infrastructure | Mixed precision, distributed training |
| optimization | Algorithmic/numerical optimizations |
| testing | Evaluation methodology changes |
| documentation | Documentation-only changes |
| other | Does not fit above |
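The "exactly one tag" rule could be enforced when a plan is loaded, for example as below. The tag values come from the table above; the `validate_approach_family` function and the assumption that a plan is a dict with a single string `approach_family` field are illustrative.

```python
# Allowed approach_family values, copied from the table above.
APPROACH_FAMILIES = frozenset({
    "architecture",
    "training_config",
    "data",
    "infrastructure",
    "optimization",
    "testing",
    "documentation",
    "other",
})


def validate_approach_family(plan: dict) -> str:
    """Return the plan's tag, or raise ValueError if it is missing or unknown.

    Assumption: a plan stores its tag as one string under "approach_family",
    so a list of tags or a missing key is rejected.
    """
    tag = plan.get("approach_family")
    if not isinstance(tag, str) or tag not in APPROACH_FAMILIES:
        raise ValueError(f"invalid approach_family: {tag!r}")
    return tag
```

Rejecting lists outright keeps the one-tag-per-plan invariant explicit rather than silently taking the first entry.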