Domain-agnostic autonomous optimization loop. Try ideas, measure results, keep what works, discard what doesn't, repeat forever. Works for any quantifiable metric — ML training, build speed, bundle size, test performance, etc. Use when asked to "optimize X in a loop", "autoresearch for X", or "autonomous optimization".
Adapted from davebcn87/pi-autoresearch. A domain-agnostic optimization loop that works for any quantifiable metric.
Try an idea → measure it → keep what works → discard what doesn't → repeat forever.
Unlike the ML-specific autoresearch skill, this works for any optimization target: build times, bundle size, test speed, inference latency, memory usage, code quality metrics, etc.
Ask or infer from the user: the optimization goal, the primary metric (unit, and whether lower or higher is better), and the command that produces the measurement (e.g. `npm run build`).

Create a branch: `git checkout -b autoresearch/<goal>-<date>`.
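The branch-naming step can be sketched as follows (the goal slug is illustrative, not prescribed by this skill):

```bash
# Sketch: derive the session branch name.
# "build-speed" is an illustrative goal slug.
goal="build-speed"
branch="autoresearch/${goal}-$(date +%Y-%m-%d)"
echo "git checkout -b $branch"   # the command the agent would run
```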
Read source files deeply before writing anything.
Create `autoresearch.md` — the heart of the session:

```markdown
# Autoresearch: <goal>

## Objective
<Specific description of what we're optimizing and the workload.>

## Metrics
- **Primary**: <name> (<unit>, lower/higher is better)
- **Secondary**: <name>, <name>, ...

## How to Run
`./autoresearch.sh` — outputs `METRIC name=number` lines.

## Files in Scope
<Every file the agent may modify, with a brief note on what it does.>

## Off Limits
<What must NOT be touched.>

## Constraints
<Hard rules: tests must pass, no new deps, etc.>

## What's Been Tried
<Update this section as experiments accumulate. Note key wins, dead ends,
and architectural insights so the agent doesn't repeat failed approaches.>
```
Create `autoresearch.sh` — the benchmark script:

```bash
#!/bin/bash
set -euo pipefail

# Pre-check (fast, <1s)
<syntax/lint check>

# Run benchmark
<actual benchmark command>

# Output metrics
echo "METRIC build_time=12.3"
echo "METRIC bundle_size=450"
```
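A minimal sketch of how the loop can consume those `METRIC` lines (the metric name, values, and baseline are illustrative):

```bash
# Sketch: parse a "METRIC name=number" line and decide keep vs. discard.
# build_time and the baseline value are illustrative.
out="METRIC build_time=11.1
METRIC bundle_size=450"
current=$(printf '%s\n' "$out" | awk -F'=' '/^METRIC build_time/ {print $2}')
baseline=12.3
# Lower is better for build_time; awk handles the float comparison.
if awk -v c="$current" -v b="$baseline" 'BEGIN {exit !(c < b)}'; then
  echo "keep ($current < $baseline)"
else
  echo "discard ($current >= $baseline)"
fi
```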
Optional: Create `autoresearch.checks.sh` — correctness validation:

```bash
#!/bin/bash
set -euo pipefail

# Only needed if constraints require correctness checks
npm test -- --run 2>&1 | tail -50
```
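The gating logic around the checks script can be sketched like this (here `run_checks` stands in for `./autoresearch.checks.sh` and is simulated to fail, so the snippet is self-contained):

```bash
# Sketch: gate an experiment on the optional checks script.
# run_checks stands in for ./autoresearch.checks.sh; it is stubbed
# here to fail so the snippet runs without the real script.
run_checks() { return 1; }

if run_checks; then
  status="keep"
else
  status="checks_failed"
  # In the real loop this is where the agent reverts:
  # git reset --hard HEAD~1
fi
echo "$status"   # → checks_failed
```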
Run baseline → log results → start looping immediately.
LOOP FOREVER:

1. Read `autoresearch.md` and past results.
2. Pick an idea, apply the change, `git commit` the change.
3. Run `./autoresearch.sh` and capture output.
4. Parse the `METRIC` lines from the output.
5. If `autoresearch.checks.sh` exists, run it. Failed → log as `checks_failed`, revert.
6. If the primary metric regressed, discard (`git reset --hard HEAD~1`); otherwise keep.
7. Update the `autoresearch.md` "What's Been Tried" section periodically.
8. On restart: if `autoresearch.md` exists, read it + `git log`, continue looping.

| File | Purpose | Committed? |
|---|---|---|
| `autoresearch.md` | Living document: goals, constraints, what's been tried | Yes |
| `autoresearch.sh` | Benchmark script | Yes |
| `autoresearch.checks.sh` | Correctness checks (optional) | Yes |
| `autoresearch.jsonl` | Append-only experiment log (one JSON object per run) | No |
Each line in `autoresearch.jsonl`:

```jsonl
{"run": 1, "commit": "a1b2c3d", "status": "keep", "metrics": {"build_time": 12.3}, "description": "baseline"}
{"run": 2, "commit": "b2c3d4e", "status": "keep", "metrics": {"build_time": 11.1}, "description": "enable parallel compilation"}
{"run": 3, "commit": "c3d4e5f", "status": "discard", "metrics": {"build_time": 13.0}, "description": "switch to esbuild"}
```
For the original implementation, see `third-party/pi-autoresearch/skills/autoresearch-create/SKILL.md`.