For executing well-defined research experiments specified in a task spec. Use this skill when the task spec defines exactly what experiments to run — hyperparameter sweeps, probe training on specific layers, activation collection, evaluation on specific benchmarks, or any research task where the experimental design is already determined. The user has decided what to do; your job is to execute it correctly, efficiently, and thoroughly. Always use alongside the autonomous-execution skill.
For tasks where the experiment design is already defined. The task spec tells you what to run — your job is execution, not design.
Before starting, read:
../autonomous-execution/SKILL.md (general task discipline)
../research-common/references/experiment-discipline.md (experiment rigour)

The value you add here is in efficiency, correctness, and thoroughness, not creativity. Get the runs done cleanly, save results properly, report clearly.
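As an illustration of "get the runs done cleanly, save results properly", here is a minimal sketch of a sweep runner that writes one JSON file per run and skips runs that already finished. It is not part of this skill or the referenced files: `expand_grid`, `run_sweep`, `run_fn`, and the `run_000.json` naming are hypothetical, and the real grid and metrics would come from the task spec.

```python
import itertools
import json
from pathlib import Path


def expand_grid(grid):
    """Yield one config dict per combination of the grid's values."""
    keys = sorted(grid)  # fixed key order keeps run indices stable
    for values in itertools.product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


def run_sweep(grid, run_fn, out_dir):
    """Call run_fn(config) for every config and save each result to disk.

    run_fn is a hypothetical callable that returns a JSON-serialisable
    dict of metrics for one config.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    records = []
    for i, config in enumerate(expand_grid(grid)):
        path = out_dir / f"run_{i:03d}.json"
        if path.exists():
            # Resume: reuse a finished run rather than repeating it.
            # Assumes the grid has not changed since the file was written.
            record = json.loads(path.read_text())
        else:
            record = {"config": config, "metrics": run_fn(config)}
            path.write_text(json.dumps(record, indent=2))
        records.append(record)
    return records
```

Writing each result as soon as its run finishes means an interrupted sweep loses at most one run, and the per-run files double as the raw data for the final report.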
Sometimes a spec is mostly well-defined but leaves some details unspecified (batch size, learning rate, number of epochs). In these cases: