Name: Research: Specified Experiments
Author: xycoord

For tasks where the experiment design is already defined. The task spec tells you what to run — your job is execution, not design.

Before starting, read:

../autonomous-execution/SKILL.md (general task discipline)
../research-common/references/experiment-discipline.md (experiment rigour)

Your role

The value you add here is in efficiency, correctness, and thoroughness — not creativity. Get the runs done cleanly, save results properly, report clearly.

Follow the spec precisely. If it says "train probes on layers 1, 6, 12, 24", do exactly that. Don't add layers because you think they'd be interesting.
If you spot an issue with the spec, ask. A hyperparameter that seems wrong, a missing baseline that would be easy to add, an ambiguity in the instructions — note it in the journal and notify the user. Don't silently "improve" the spec.
Flag unclear or missing metrics early. Metrics determine what you can learn from an experiment. If the spec doesn't define metrics, or the specified metrics might not capture what matters, raise this with the user in the first-minutes phase. Some metrics can be computed post-hoc from saved results, but others require specific data to be collected during the run — better to clarify upfront.
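
A minimal sketch of two habits described above: checking the planned runs against the spec instead of silently changing them, and saving per-example outputs so that some metrics can still be computed post-hoc. Every name here (check_against_spec, save_raw_results, accuracy_from_file, the JSON Lines layout) is hypothetical and chosen for illustration; none of it comes from the spec or the referenced skills.

```python
import json
from pathlib import Path


def check_against_spec(spec_layers, planned_layers):
    """Return human-readable deviations between the spec and the plan.

    An empty list means the plan matches the spec exactly. Anything else
    should be noted in the journal and raised with the user, not "fixed".
    """
    spec, planned = set(spec_layers), set(planned_layers)
    issues = []
    for layer in sorted(spec - planned):
        issues.append(f"layer {layer} is in the spec but not planned")
    for layer in sorted(planned - spec):
        issues.append(f"layer {layer} is planned but not in the spec")
    return issues


def save_raw_results(path, layer, labels, predictions):
    """Append one JSON line of per-example outputs for a single run."""
    record = {
        "layer": layer,
        "labels": list(labels),
        "predictions": list(predictions),
    }
    with Path(path).open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def accuracy_from_file(path, layer):
    """Compute accuracy after the fact from the saved per-example outputs."""
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            if record["layer"] == layer:
                pairs = zip(record["labels"], record["predictions"])
                correct = sum(y == p for y, p in pairs)
                return correct / len(record["labels"])
    raise KeyError(f"no saved results for layer {layer}")
```

Saving labels and predictions rather than a single accuracy number is what makes later metrics (per-class accuracy, confusion matrices) recoverable without rerunning. Metrics that depend on data never written to disk, such as activations or per-step losses, still need to be agreed with the user before the run starts.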

