Skill File

Skill: Experiment Designer

Name: Skill: Experiment Designer
Author: indirected

Design a concrete experiment protocol for any research hypothesis. Reads dataset paths, metric definitions, and run command template from project/experiment-config.md. Trigger when the user says things like: "design experiments", "plan experiments for [hypothesis]", "what should I test", "help me design an experiment", "experimental setup", "which baselines should I include", "design ablation study", "what models to compare", "how do I test this idea", "I want to study [variable] effect", "what conditions should I run", "structure my research around [hypothesis]", "help me think through the experimental design", "I have a hypothesis, what do I run", "what datasets and models should I use", "give me a research plan", "what's the right experimental protocol".

indirected · 0 stars · 2026/04/10

Occupation
Category: Education

Skill Content

Design a complete, concrete experiment protocol for a research hypothesis. Given a hypothesis (from the user or from an existing hypothesis file), this skill decomposes it into conditions, baselines, metrics, compute estimates, and an ablation plan, then writes a structured plan file.

Reads all project-specific configuration from project/experiment-config.md.

Step 0 — Check Prerequisites

Read project config:

Read: project/experiment-config.md
Read: project/research-focus.md
Read: project/system-design.md    (if exists — for ablation design)
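As a rough illustration of the prerequisite reads listed above, the following Python sketch loads the three files and refuses to continue when project/experiment-config.md is missing. The function name `load_project_config` is hypothetical, and treating project/research-focus.md as optional is an assumption; the skill only marks project/system-design.md as "if exists".

```python
from pathlib import Path

# Only experiment-config.md is treated as mandatory here (assumption).
REQUIRED = ["project/experiment-config.md"]
OPTIONAL = ["project/research-focus.md", "project/system-design.md"]


def load_project_config(root="."):
    """Return {relative_path: text} for every config file that exists.

    Raises FileNotFoundError when a required file is missing, which is
    the point where the skill would stop and message the user.
    """
    root = Path(root)
    loaded = {}
    for rel in REQUIRED:
        path = root / rel
        if not path.is_file():
            raise FileNotFoundError(f"I need {rel} to design experiments.")
        loaded[rel] = path.read_text(encoding="utf-8")
    for rel in OPTIONAL:
        path = root / rel
        if path.is_file():
            loaded[rel] = path.read_text(encoding="utf-8")
    return loaded
```

Raising an exception keeps the stop condition explicit, so a caller can catch it and show the full message together with the minimal file template.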

If project/experiment-config.md does not exist, stop and tell the user:

"I need project/experiment-config.md to design experiments. Please run the project-init skill first, or create this file with at minimum:

## Metrics
### Primary Metric
name: [metric_name]
definition: [what it measures]

## Datasets
### [dataset name]
path: [path or TODO]

## Run Command Template
```bash
[your run command]
```

Step 1 — Discover or Elicit the Hypothesis

Glob("experiments/hypothesis*.md")  # sorted by modification time; show the 5 most recent
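The glob above is a tool call rather than runnable code. A minimal Python equivalent, assuming the files live under the current project root, might look like this; `recent_hypothesis_files` is a hypothetical helper name.

```python
from pathlib import Path


def recent_hypothesis_files(root=".", limit=5):
    """Return up to `limit` hypothesis files, newest modification time first."""
    files = Path(root).glob("experiments/hypothesis*.md")
    return sorted(files, key=lambda p: p.stat().st_mtime, reverse=True)[:limit]
```

If the list comes back empty, there is no existing hypothesis file to offer, and the hypothesis has to come from the user.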

Step 2 — Decompose the Hypothesis

## Experimental Decomposition

**Independent variable(s):**
  - [What you are changing between conditions]

**Dependent variable(s):**
  - Primary: [from project/experiment-config.md primary metric]
  - Secondary: [from project/experiment-config.md secondary metrics]
  - Exploratory: [any other measurable outputs from the run]

**Controls (held constant across all conditions):**
  - Dataset: [which dataset from project/experiment-config.md]
  - Model: [if not the IV]
  - [Other settings from run command template that are held fixed]

Step 3 — Scan Available Assets

Available datasets:
  [from project/experiment-config.md Datasets section]

Available LLM providers / models:
  [ask user, or list options from the run command template]

Existing run results:

Glob("experiments/runs/*/stats.json")  # count the results to see how many runs exist
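Counting existing runs can be sketched in Python as follows. This assumes each run directory under experiments/runs/ holds exactly one stats.json, as the glob pattern suggests; `count_existing_runs` is a hypothetical name.

```python
from pathlib import Path


def count_existing_runs(root="."):
    """Count run directories under experiments/runs/ that contain a stats.json."""
    return sum(1 for _ in Path(root).glob("experiments/runs/*/stats.json"))
```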

Step 4 — Propose Experimental Conditions

## Proposed Conditions

| Condition | Description | Key Change from Baseline | Dataset | Model |
|---|---|---|---|---|
| C1 (Baseline) | Default settings | none — establishes baseline | [dataset] | [model] |
| C2 (Treatment) | [hypothesis treatment] | [specific change] | [dataset] | [model] |
| C3 | [variant] | [specific change] | [dataset] | [model] |
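One way to keep the conditions table consistent is to hold each condition as a small record and render the table from it. The sketch below is illustrative only: the `Condition` class and `conditions_table` function are hypothetical names, and the field names simply mirror the column headers above.

```python
from dataclasses import dataclass


@dataclass
class Condition:
    label: str        # e.g. "C1 (Baseline)"
    description: str
    key_change: str   # difference from the baseline condition
    dataset: str
    model: str


def conditions_table(conditions):
    """Render conditions as a markdown table with the columns used in the plan."""
    header = "| Condition | Description | Key Change from Baseline | Dataset | Model |"
    separator = "|---|---|---|---|---|"
    rows = [
        f"| {c.label} | {c.description} | {c.key_change} | {c.dataset} | {c.model} |"
        for c in conditions
    ]
    return "\n".join([header, separator, *rows])
```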

Step 6 — Define Success Metrics

## Metrics Plan

### Primary (report in main table)
- [primary_metric_name]: [definition from config]
  Denominator: [from config]

### Secondary (report in analysis)
- [secondary metric 1]: [definition]
- [secondary metric 2]: [definition]

### Exploratory (report in appendix or supplemental)
- [any timing, iteration count, or quality signals available from the run]

### Statistical Considerations
- With [N] cases: minimum detectable effect ~[X]pp at α=0.05, power=0.80
  (Use Wilson or Clopper-Pearson confidence intervals for proportions)
  (Use Fisher's exact test for pairwise condition comparisons)
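The statistical considerations above can be sketched with the Python standard library alone. This is a minimal illustration, not the skill's own implementation. The minimum detectable effect uses the usual normal approximation for comparing two proportions with equal cases per condition, and `baseline_rate` is an assumed input, since the template does not say where the baseline proportion comes from. All function names are hypothetical.

```python
import math
from statistics import NormalDist


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        return math.comb(row1, x) * math.comb(row2, col1 - x) / math.comb(total, col1)

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    tail = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)  # tolerance for float ties
    )
    return min(1.0, tail)


def minimum_detectable_effect(n_per_condition, baseline_rate, alpha=0.05, power=0.80):
    """Approximate detectable difference in proportions (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    se = math.sqrt(2 * baseline_rate * (1 - baseline_rate) / n_per_condition)
    return (z_alpha + z_beta) * se
```

Multiplying the result of `minimum_detectable_effect` by 100 gives the value in percentage points for the `[X]pp` slot. For small N the approximation is loose, so the exact test remains the safer basis for pairwise comparisons.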

Step 7 — Estimate Compute

## Compute Estimate

Parameters:
  Cases per condition:    [N from dataset in config]
  Timing estimate:        [from project/experiment-config.md Timing Estimate field]
  Conditions:             [K]

Total (all conditions sequential):
  [K × timing_per_condition]

API cost estimate:
  [If LLM API is used, estimate tokens per case × price per token × N × K]
  [Otherwise note compute resource needed]

Recommendation: Run C1 (baseline) on the smallest available dataset first
to validate infrastructure before committing to full runs.
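The arithmetic in the compute estimate can be written out as a short Python sketch. It assumes the Timing Estimate field can be expressed as hours per condition, which the template does not state explicitly. The function name and every parameter name are hypothetical.

```python
def estimate_compute(n_cases, n_conditions, hours_per_condition,
                     tokens_per_case=0, price_per_token=0.0):
    """Return total sequential wall-clock hours and the API cost estimate.

    total_hours = K x timing_per_condition
    api_cost    = tokens per case x price per token x N x K
    """
    total_hours = n_conditions * hours_per_condition
    api_cost = tokens_per_case * price_per_token * n_cases * n_conditions
    return {"total_hours": total_hours, "api_cost": api_cost}
```

Leaving `tokens_per_case` at zero covers the case where no LLM API is used and only the compute resource needs to be noted.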

Step 8 — Propose Ablation Conditions

## Ablation Study Design

For each major component in project/system-design.md, propose an ablation:

| Component | Default | Ablation | Expected effect if component matters |
|---|---|---|---|
| [component 1] | [default setting] | [removed/degraded] | [expected metric change] |
| [component 2] | [default setting] | [removed/degraded] | [expected metric change] |
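The table above implies one-at-a-time ablation: each ablation condition changes a single component and leaves the rest at their defaults. A minimal sketch, assuming components can be represented as a flat dictionary of settings (the format of project/system-design.md is not specified in the source, and `one_at_a_time_ablations` is a hypothetical name):

```python
def one_at_a_time_ablations(defaults, ablated):
    """Build one settings dict per ablated component.

    defaults: {component: default_setting}
    ablated:  {component: removed_or_degraded_setting}
    Returns a list of (component, settings) pairs in which only that
    component differs from the defaults.
    """
    plans = []
    for component, ablated_value in ablated.items():
        if component not in defaults:
            raise KeyError(f"unknown component: {component}")
        settings = dict(defaults)  # copy so the defaults stay untouched
        settings[component] = ablated_value
        plans.append((component, settings))
    return plans
```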

Step 9 — Write the Plan File

date +%Y%m%d
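For reference, the same YYYYMMDD stamp produced by the shell command above can be obtained in Python. The helper name `date_stamp` is hypothetical.

```python
from datetime import date


def date_stamp(today=None):
    """Return the date as YYYYMMDD, matching the output of `date +%Y%m%d`."""
    return (today or date.today()).strftime("%Y%m%d")
```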

# Experiment Plan — YYYYMMDD

## Research Hypothesis
[One-paragraph statement]

## Experimental Decomposition
**Independent Variable(s):** [list]
**Dependent Variables:** [primary metric] (primary), [secondary metrics] (secondary)
**Controls:** [held-constant settings]

## Conditions

### C1 — Baseline
**Description:** Default settings
**Run command:**
```bash
[from project/experiment-config.md run command template, with substitutions]
```


---

## Error Handling

If the run command template in `project/experiment-config.md` has "TODO" placeholders,
note them in the plan but do not block on them — the plan can still be written.
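A simple way to surface those placeholders without blocking is to scan the template text and collect the affected lines as notes for the plan. The sketch below assumes placeholders appear as the literal word TODO; `find_todo_placeholders` is a hypothetical helper.

```python
import re


def find_todo_placeholders(template_text):
    """Return (line_number, stripped_line) pairs for lines containing TODO."""
    return [
        (number, line.strip())
        for number, line in enumerate(template_text.splitlines(), start=1)
        if re.search(r"\bTODO\b", line)
    ]
```

An empty result means the template is ready to use. A non-empty result becomes a list of notes in the plan, and writing the plan still proceeds.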

If the user's hypothesis references a metric name not in `project/experiment-config.md`,

Skill: Experiment Designer

Step 0 — Check Prerequisites

Step 1 — Discover or Elicit the Hypothesis

Step 2 — Decompose the Hypothesis

Step 3 — Scan Available Assets

Step 4 — Propose Experimental Conditions

Step 5 — Suggest Baselines from Literature

Step 6 — Define Success Metrics

Step 7 — Estimate Compute

Step 8 — Propose Ablation Conditions

Step 9 — Write the Plan File

C2 — [Treatment name]

Baselines

Metrics

Statistical Analysis Plan

Compute Estimate

Ablation Conditions

Run Order

Notes
