Autonomous, iterative improvement inspired by Karpathy's autoresearch methodology. Define a metric, set a target, and let the loop run until the target is met or the iteration limit is reached.

The 5-Step Loop

1. HYPOTHESIZE  -> Form a specific, falsifiable improvement hypothesis
2. MODIFY       -> Apply the minimal code/config/prompt change
3. TEST         -> Run the measurement suite (benchmarks, tests, evals)
4. EVALUATE     -> Compare result against baseline and previous best
5. DECIDE       -> KEEP if better, DISCARD (e.g. git stash to shelve the change) if worse
      |
   Repeat until target met OR max_iterations reached

Each iteration is atomic: one hypothesis, one change, one measurement, one decision.
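The five steps above can be sketched in Python as follows. This is a minimal illustration rather than the tool's actual implementation: `run_loop`, `Iteration`, the `proposals` list, and the `measure` callback are hypothetical names, higher scores are assumed to be better, and a change is "discarded" by dropping a copied state instead of touching git.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Iteration:
    hypothesis: str
    score: float
    kept: bool


def run_loop(
    state: dict,
    proposals: List[Tuple[str, Callable[[dict], dict]]],
    measure: Callable[[dict], float],
    target: float,
    max_iterations: int,
) -> Tuple[dict, float, List[Iteration]]:
    """Run one hypothesis per iteration; assumes higher scores are better."""
    best_state = dict(state)
    best_score = measure(best_state)  # baseline measurement
    history: List[Iteration] = []

    for hypothesis, modify in proposals[:max_iterations]:  # 1. HYPOTHESIZE
        if best_score >= target:
            break  # target met, stop early
        candidate = modify(dict(best_state))  # 2. MODIFY (on a copy)
        score = measure(candidate)            # 3. TEST
        kept = score > best_score             # 4. EVALUATE against previous best
        if kept:                              # 5. DECIDE: keep or discard
            best_state, best_score = candidate, score
        history.append(Iteration(hypothesis, score, kept))

    return best_state, best_score, history


if __name__ == "__main__":
    # Toy metric and proposals, invented purely for illustration.
    proposals = [
        ("a larger batch helps", lambda s: {**s, "batch": 64}),
        ("fewer workers help", lambda s: {**s, "workers": 1}),
        ("more workers help", lambda s: {**s, "workers": 8}),
    ]
    toy_metric = lambda s: s["batch"] / 16 + s["workers"]
    final_state, final_score, log = run_loop(
        {"batch": 32, "workers": 4}, proposals, toy_metric,
        target=10.0, max_iterations=5,
    )
    for entry in log:
        print(entry)
```

Because each candidate is built from a copy of the best state so far, a rejected hypothesis leaves no trace in later iterations, which is what keeps every iteration atomic.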

Experiment Definition

Define the experiment in your task or in thoughts/EXPERIMENTS.md, stating the metric, the target, and the iteration limit.
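The definition format itself is not reproduced in this section. As a hedged sketch, the three ingredients named in the introduction (a metric, a target, and an iteration limit) could be modelled like this. All field names, the `higher_is_better` flag, and the example values are assumptions made for illustration, not the schema that thoughts/EXPERIMENTS.md actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Experiment:
    """Hypothetical experiment definition; field names are illustrative."""
    name: str
    metric: str                 # what is measured, e.g. a benchmark name
    target: float               # value at which the loop may stop
    max_iterations: int         # hard cap on the number of iterations
    higher_is_better: bool = True

    def target_met(self, value: float) -> bool:
        # Direction-aware comparison against the target.
        if self.higher_is_better:
            return value >= self.target
        return value <= self.target

    def should_stop(self, value: float, iteration: int) -> bool:
        # Stop when the target is met OR the iteration limit is reached.
        return self.target_met(value) or iteration >= self.max_iterations


# Example values are invented for illustration only.
latency = Experiment(
    name="reduce-p95-latency",
    metric="p95_latency_ms",
    target=200.0,
    max_iterations=10,
    higher_is_better=False,
)
```

Recording the direction of improvement alongside the target avoids ambiguity for metrics such as latency or error rate, where a lower value is the better one.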
