Data Loading Traceability
Use when implementing or updating data loading so every dataset source, transform, split, and filtering decision is documented and reproducible.
thomas-deboer0, 2026/04/04
Data Loading Traceability Skill
Use When
- Creating or modifying loaders in scratchpad, src, or pipeline workflows.
- Integrating new datasets or changing dataset versions.
- Debugging provenance or reproducibility issues.
Required Information
- Dataset location(s) and ownership.
- Version/date snapshot or retrieval timestamp.
- Split/filter/transformation rules.
- Output schema expectations.
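One way to keep the four items listed above together is a small structured record that is serialized next to the experiment. The sketch below is only an illustration: the class name `DatasetRecord`, every field name, and the example values (path, owner, dates, rules) are hypothetical and not part of the skill itself.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DatasetRecord:
    """Hypothetical record of a dataset's provenance; all names are illustrative."""

    location: str       # dataset path or URI
    owner: str          # team or person responsible for the data
    version: str        # version tag or date snapshot
    retrieved_at: str   # retrieval timestamp (ISO 8601)
    # Split/filter/transformation rules, in the order they are applied.
    rules: list[str] = field(default_factory=list)
    # Expected output schema: column name -> type name.
    expected_schema: dict[str, str] = field(default_factory=dict)


# Example values are made up for demonstration only.
record = DatasetRecord(
    location="data/raw/example.csv",
    owner="data-team",
    version="2024-01-15",
    retrieved_at="2024-01-15T09:30:00+00:00",
    rules=["drop rows with a missing label", "80/10/10 split by id hash"],
    expected_schema={"id": "int", "text": "str", "label": "int"},
)

# Sorted keys keep the serialized record stable across runs, which makes diffs readable.
record_json = json.dumps(asdict(record), indent=2, sort_keys=True)
print(record_json)
```

Serializing with sorted keys is a design choice for stable diffs when the record is committed alongside the loader; any format that can be reviewed in version control would serve the same purpose.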
Workflow
- Define source-of-truth path and acquisition metadata.
- Record raw-to-processed transformation steps with deterministic ordering.
- Document split logic (train/val/test) and random seed policy.
- Validate loaded schema and count summaries.
- Save trace artifacts in experiment folder (data_lineage.md + quick stats).
- If promoted to , mirror traceability notes in module docs.
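The workflow steps above could be sketched roughly as follows, using only the Python standard library. This is a minimal illustration under stated assumptions, not a prescribed implementation: the helper names (`assign_split`, `validate_schema`, `write_lineage`), the 80/10/10 ratio, the seed value, the example source path, and the layout of `data_lineage.md` are all hypothetical. Hashing the row id with the seed is one possible way to make the split independent of row order; a seeded shuffle over a sorted id list would be another.

```python
import hashlib
import tempfile
from collections import Counter
from pathlib import Path

SEED = 13  # assumed policy: one fixed seed, recorded in the lineage file


def assign_split(row_id: str, seed: int = SEED) -> str:
    """Assign train/val/test from a hash of seed and id, so row order does not matter."""
    digest = hashlib.sha256(f"{seed}:{row_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    if bucket < 80:
        return "train"
    return "val" if bucket < 90 else "test"


def validate_schema(rows, expected):
    """Raise ValueError if a row's columns or value types differ from the expected schema."""
    for i, row in enumerate(rows):
        if set(row) != set(expected):
            raise ValueError(f"row {i}: columns {sorted(row)} != {sorted(expected)}")
        for col, typ in expected.items():
            if not isinstance(row[col], typ):
                raise ValueError(f"row {i}: column {col!r} is not {typ.__name__}")


def write_lineage(folder: Path, source: str, steps, counts) -> Path:
    """Write source, seed, ordered steps, and split counts to data_lineage.md."""
    lines = ["# Data lineage", "", f"Source: {source}", f"Seed: {SEED}", "", "## Steps"]
    lines += [f"{n}. {step}" for n, step in enumerate(steps, 1)]
    lines += ["", "## Split counts"]
    lines += [f"- {name}: {counts[name]}" for name in ("train", "val", "test")]
    path = folder / "data_lineage.md"
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path


# Toy in-memory data standing in for a loaded dataset.
rows = [{"id": i, "label": i % 2} for i in range(1000)]
expected_schema = {"id": int, "label": int}
validate_schema(rows, expected_schema)

splits = {row["id"]: assign_split(str(row["id"])) for row in rows}
counts = Counter(splits.values())
steps = ["load raw rows", "validate schema", "assign split by seeded id hash"]

# A temporary directory stands in for the experiment folder.
with tempfile.TemporaryDirectory() as tmp:
    lineage_path = write_lineage(Path(tmp), "data/raw/example.csv", steps, counts)
    lineage_text = lineage_path.read_text(encoding="utf-8")
print(lineage_text)
```

Because the split depends only on the seed and the row id, adding or reordering rows does not move existing rows between splits, which keeps earlier results comparable.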