Maintain a scientific experiment journal for ML trials. Use when: recording experiment results and annotations; adding config entries to configuration_suggestions.md; managing trial history and scientific notes; tracking which changes produced which results. Sub-skill of ml-experiment-loop.
Manages the scientific experiment journal and configuration tracking for ML experimentation. Ensures all changes are documented with hypotheses, results, and flags.
Invoked by the ml-experiment-loop skill during Phase 4.

| File | Purpose |
|---|---|
| `{model_folder}/experiment_journal.md` | Scientific journal with dated entries |
| `configuration_suggestions.md` (project root) | Config entries with machine-parseable flags |
| `checkpoints/summary_epoch_*.txt` | Training run summaries (read-only reference) |
Before writing, read:
- `experiment_journal.md`, if it exists
- `configuration_suggestions.md`, to know the existing config IDs

For each completed trial, add a dated journal entry:
## Trial {ID} — {Date} — {Config Name}
### Setup
- **Config**: {Config ID from configuration_suggestions.md}
- **Changed params**: {param1}: {old}→{new}, {param2}: {old}→{new}
- **Run command**: `{exact command used}`
- **Duration**: {training time}
### Observations
- {observation 1 with specific numbers}
- {observation 2}
- Loss curves: {description of behavior}
- Convergence: {did it converge? at what step?}
### Metrics
| Metric | Baseline | This Trial | Delta |
|--------|----------|------------|-------|
| {metric1} | {val} | {val} | {+/-} |
| {metric2} | {val} | {val} | {+/-} |
### Analysis
{Scientific analysis of WHY the results are what they are.
Reference specific metrics, loss curves, and parameter interactions.
Compare to predictions from the hypothesis.}
### Conclusion
- **Hypothesis confirmed/refuted**: {brief statement}
- **Key insight**: {what we learned}
- **Next step**: {what to try next and why}
### Investigation Level
- [x] Parameter tuning
- [ ] Loss/metric analysis
- [ ] Architecture investigation
If new configs are proposed, add them to configuration_suggestions.md using the project's existing format:
Find the next config ID: look at the existing `<!-- CONFIG_START id: X -->` blocks and use the next letter.
Write the CONFIG block with all required fields:
- `id`: Next available letter (A, B, C, ... Z, AA, AB, ...)
- `name`: Descriptive name
- `priority`: P0 (critical) to P5 (speculative)
- `status`: `new` for fresh proposals
- `hypothesis`: Scientific hypothesis
- `risk`: What could go wrong
- `metrics_to_watch`: List of metrics to monitor
- `params`: Dict of parameter changes

YAML safety for `metrics_to_watch`: list items that contain YAML special characters (`>`, `<`, `:`, `#`, `'`, `"`, `{`, `}`, `[`, `]`, `%`, `@`, `` ` ``) MUST be wrapped in double quotes. Always quote to be safe: unquoted values containing `>` are parsed as YAML block scalars and will silently break parsing.
Example:

```yaml
metrics_to_watch:
  - "IC Ret 30s (target: > 0.03, match v14's +0.059)"
  - "NaN eval loss (should NOT occur with detach_above=3)"
  - "BSS(live) (target: >= v17's +0.056)"
```
Rule: If in doubt, quote every metrics_to_watch item.
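The always-quote rule can be enforced mechanically when emitting the block. A small sketch; the helper name is hypothetical:

```python
def quote_yaml_item(item: str) -> str:
    """Wrap a metrics_to_watch item in double quotes,
    escaping backslashes and embedded double quotes."""
    escaped = item.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'

# Emit an always-quoted metrics_to_watch block:
metrics = ["IC Ret 30s (target: > 0.03)", "NaN eval loss"]
block = "metrics_to_watch:\n" + "\n".join(
    f"  - {quote_yaml_item(m)}" for m in metrics
)
```

Double-quoted style is the safest choice here because it neutralizes every special character in the list above, including `>` and `:`.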
Write the human-readable section below the CONFIG block.
If a priority table exists at the bottom of configuration_suggestions.md, update it.
For architecture changes (Phase 3 findings), use an extended format:
## Architecture Change {ID} — {Date}
### Motivation
{Why parameter/loss changes were insufficient — reference specific trials}
### Change Description
{Detailed technical description of the architecture change}
### Implementation
- **Files modified**: {list}
- **Flag**: `--{flag_name}` (default: off)
- **Backward compatible**: {yes/no}
### Literature Reference
- {Paper title, authors, year}
- {Key finding that applies to our case}
### Before/After Diagram
```
Before: Input → [A] → [B] → [C] → Output
After:  Input → [A] → [B'] → [B_new] → [C] → Output
```
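One common way to implement the default-off flag from the Implementation section is an `argparse` boolean flag; the flag name below is hypothetical:

```python
import argparse

parser = argparse.ArgumentParser()
# store_true with default=False means omitting the flag reproduces the
# old behavior, so existing run commands stay valid (backward compatible).
parser.add_argument("--use-new-block", action="store_true", default=False,
                    help="enable the new [B_new] block (default: off)")

old_style = parser.parse_args([])                 # no flag: architecture unchanged
new_style = parser.parse_args(["--use-new-block"])
```

Keeping the default off is what makes the before/after comparison in the diagram a controlled experiment: the same code path serves as its own baseline.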
Follow the existing convention in configuration_suggestions.md: always read the existing `<!-- CONFIG_START id: X -->` blocks before assigning a new ID.
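A sketch of deriving the next ID from those markers, assuming the letter scheme described above (A..Z, then AA, AB, ...); the regex mirrors the `<!-- CONFIG_START id: X -->` format:

```python
import re

def next_config_id(text: str) -> str:
    """Scan CONFIG_START markers and return the next letter ID."""
    ids = re.findall(r"<!--\s*CONFIG_START\s+id:\s*([A-Z]+)\s*-->", text)
    if not ids:
        return "A"

    def to_num(s: str) -> int:
        # Excel-column style: A=1 .. Z=26, AA=27, ...
        n = 0
        for ch in s:
            n = n * 26 + (ord(ch) - ord("A") + 1)
        return n

    def to_letters(n: int) -> str:
        out = ""
        while n > 0:
            n, r = divmod(n - 1, 26)
            out = chr(ord("A") + r) + out
        return out

    return to_letters(max(map(to_num, ids)) + 1)
```

Scanning the whole file rather than trusting the last block guards against IDs that were added out of order.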