Name: Ablation Planner
Author: wanshuiyin

docs/research_contract.md

spawn_agent:
  model: REVIEWER_MODEL
  reasoning_effort: xhigh
  message: |
    You are a rigorous ML reviewer planning ablation studies.
    Given this method and results, design ablations that:

    1. Isolate the contribution of each novel component
    2. Answer questions reviewers will definitely ask
    3. Test sensitivity to key hyperparameters
    4. Compare against natural alternative design choices

    Method: [description from project files]
    Components: [list of removable or replaceable components]
    Current results: [key metrics from experiments]
    Claims: [what we claim and current evidence]

    For each ablation, specify:
    - name: what to change (for example, "remove module X", "replace Y with Z")
    - what_it_tests: the specific question this answers
    - expected_if_component_matters: what we predict if the component is important
    - priority: 1 (must-run) to 5 (nice-to-have)

    Also provide:
    - coverage_assessment: what reviewer questions these ablations answer
    - unnecessary_ablations: experiments that seem useful but will not add insight
    - suggested_order: run order optimized for maximum early information
    - estimated_compute: total GPU-hours estimate
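
The per-ablation fields requested in the prompt map naturally onto a small record type. The following is a minimal Python sketch, assuming the reviewer's reply has already been parsed into plain values; the `Ablation` class and `sort_by_priority` helper are hypothetical names used for illustration and are not part of the skill itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ablation:
    """One ablation entry, mirroring the fields requested in the prompt."""
    name: str
    what_it_tests: str
    expected_if_component_matters: str
    priority: int  # 1 (must-run) to 5 (nice-to-have)

    def __post_init__(self) -> None:
        # Reject priorities outside the 1..5 scale used by the prompt.
        if not 1 <= self.priority <= 5:
            raise ValueError(f"priority must be in 1..5, got {self.priority}")


def sort_by_priority(ablations: list[Ablation]) -> list[Ablation]:
    """Return ablations with must-run entries first (stable for ties)."""
    return sorted(ablations, key=lambda a: a.priority)


# Example entries taken from the plan template's placeholder rows.
plan = [
    Ablation("replace X with simpler Z", "value of learned vs fixed",
             "drops, especially on dataset A", 2),
    Ablation("remove module X", "contribution of X",
             "performance drops on metric Y", 1),
]
ordered = sort_by_priority(plan)
print([a.name for a in ordered])
```

Validating the priority at construction time is one possible design choice; it surfaces a malformed reviewer reply early rather than during scheduling.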

## Ablation Plan

### Component Ablations (highest priority)
| # | Name | What It Tests | Expected If Matters | Priority |
|---|------|---------------|---------------------|----------|
| 1 | remove module X | contribution of X | performance drops on metric Y | 1 |
| 2 | replace X with simpler Z | value of learned vs fixed component | performance drops, especially on dataset A | 2 |

### Hyperparameter Sensitivity
| # | Parameter | Values to Test | What It Tests | Priority |
|---|-----------|----------------|---------------|----------|
| 3 | lambda | [0.01, 0.1, 1.0] | sensitivity to regularization | 3 |
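
A sensitivity row like the one above can be expanded into concrete run configurations by varying one parameter while holding the rest at their baseline. The sketch below assumes configurations are flat dictionaries; `expand_sweep` and the baseline values (including `lr`) are hypothetical and only illustrate the idea.

```python
def expand_sweep(base_config: dict, parameter: str, values: list) -> list[dict]:
    """One-at-a-time sweep: vary a single parameter, keep everything else fixed."""
    runs = []
    for value in values:
        config = dict(base_config)  # shallow copy so the baseline stays untouched
        config[parameter] = value
        runs.append(config)
    return runs


base = {"lambda": 0.1, "lr": 3e-4}  # hypothetical baseline values
sweep = expand_sweep(base, "lambda", [0.01, 0.1, 1.0])
for run in sweep:
    print(run)
```

Each resulting dictionary corresponds to one run, so the number of runs for a sensitivity row equals the number of values listed in its "Values to Test" column.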

### Design Choice Comparisons
| # | Name | What It Tests | Priority |
|---|------|---------------|----------|
| 4 | joint vs separate matching | whether joint adds value | 4 |

### Coverage Assessment
[What reviewer questions these ablations answer]

### Unnecessary Ablations
[Experiments that seem useful but will not add insight - skip these]

### Run Order
[Optimized for maximum early information]

### Estimated Compute
[Total GPU-hours]
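
Once a plan is written in the template above, its tables can be read back into structured rows for scheduling. This is a rough sketch under the assumption that the tables are simple pipe-delimited markdown with no `|` characters inside cells; `parse_markdown_table` is a hypothetical helper, not something the skill defines.

```python
def parse_markdown_table(lines: list[str]) -> list[dict[str, str]]:
    """Parse a pipe-delimited markdown table into one dict per data row."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in lines
        if line.strip().startswith("|")
    ]
    if len(rows) < 2:
        return []
    header, body = rows[0], rows[2:]  # rows[1] is the |---|---| separator
    return [dict(zip(header, row)) for row in body]


table = """\
| # | Name | What It Tests | Expected If Matters | Priority |
|---|------|---------------|---------------------|----------|
| 1 | remove module X | contribution of X | performance drops on metric Y | 1 |
| 2 | replace X with simpler Z | value of learned vs fixed | drops, especially on dataset A | 2 |
""".splitlines()

ablations = parse_markdown_table(table)
must_run = [row["Name"] for row in ablations if int(row["Priority"]) == 1]
print(must_run)  # prints ['remove module X']
```

Filtering on the `Priority` column gives the must-run subset, which is a reasonable starting point for the run order.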

Ablation Planner

Context: $ARGUMENTS

Constants

When to Use

Workflow

Step 1: Prepare Context

Step 2: Secondary Codex Designs Ablations

Step 3: Parse Ablation Plan

Step 4: Review Feasibility

Step 5: Implement and Run

Rules
