Record and maintain Concept Encoder training and evaluation metadata. Use when preparing a run, creating train or eval git tags, monitoring training, syncing evaluation reports, updating docs/2_Experiments_Registry/master_experiment_log.md, writing run reports, or checking WandB and git linkage. Not for changelog updates or deciding experiment priorities.
Use this skill for experiment metadata and results.
Do not use this skill for generic refactors, architecture cleanup, or standalone CHANGELOG.md updates. Use engineering-change-tracking for code-change traceability.
Do not use this skill to decide which hypothesis to test next. Use research-methodology for experiment selection and interpretation.
Each tracked run should capture:
- `run_id`
- `git_commit`, `git_tag`, `git_branch`. Training scripts should call `get_git_info()` from `training/utils_training.py` and pass the returned values into `wandb.init(config=...)`.
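A minimal sketch of the git metadata capture described above. The real helper lives in `training/utils_training.py`; its exact signature is assumed here, and the `wandb.init` call is shown as a comment because the project name and `run_id` are illustrative.

```python
import subprocess

def get_git_info():
    """Sketch of a helper like get_git_info() in training/utils_training.py.

    Returns commit, tag, and branch, falling back to "unknown" when git is
    unavailable or the value does not exist (e.g. no tag on HEAD)."""
    def run(*args):
        try:
            out = subprocess.check_output(
                ["git", *args], stderr=subprocess.DEVNULL, text=True
            ).strip()
            return out or "unknown"
        except (subprocess.CalledProcessError, FileNotFoundError):
            return "unknown"

    return {
        "git_commit": run("rev-parse", "HEAD"),
        "git_tag": run("describe", "--tags", "--exact-match"),
        "git_branch": run("rev-parse", "--abbrev-ref", "HEAD"),
    }

# Pass the metadata into the run config so the WandB run links back to the
# exact code state (project and run_id below are placeholders):
# wandb.init(project="concept-encoder", config={"run_id": "ce_042", **get_git_info()})
```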
Training workflow:
- Create a git tag of the form `train/{run_id}_{YYYYMMDD}`.
- Record `git_commit`, `git_tag`, and `git_branch`.
- Add an entry to `docs/2_Experiments_Registry/master_experiment_log.md`.
- Run `analysis/run_concept_analysis.py` on intermediate checkpoints when useful.
- Update `docs/2_Experiments_Registry/master_experiment_log.md` after the run.
- Write a report in `docs/2_Experiments_Registry/run_reports/` when the run is non-trivial or teaches something important.

Evaluation workflow:
- Create a git tag of the form `eval/{benchmark}_{YYYYMMDD}`.
- Run `scripts/sync_evaluation_reports.ps1` when needed.
- Update `docs/2_Experiments_Registry/master_experiment_log.md` with scores, report links, and WandB reference.
- Add a note under `docs/4_Research_Notes/` when the run reveals a new failure mode or research insight.

When summarizing a run, include:
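The tag naming used in the workflows above can be sketched as a small helper. This assumes the local date is acceptable for the `{YYYYMMDD}` suffix; the function name `make_run_tag` is hypothetical, not part of the repository.

```python
from datetime import date

def make_run_tag(kind: str, name: str) -> str:
    """Build a tag like train/{run_id}_{YYYYMMDD} or eval/{benchmark}_{YYYYMMDD}.

    kind: "train" or "eval"; name: the run_id or benchmark name."""
    if kind not in ("train", "eval"):
        raise ValueError(f"unexpected tag kind: {kind!r}")
    # date.__format__ accepts strftime codes, so this yields e.g. 20250101
    return f"{kind}/{name}_{date.today():%Y%m%d}"

print(make_run_tag("train", "ce_baseline"))  # e.g. train/ce_baseline_20250101
```

The resulting string can then be passed to `git tag` (for example via `subprocess`) before training starts, so the tag and the WandB config reference the same commit.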
Hand off to other skills:
- Use `engineering-change-tracking` when the task is about code refactors, architecture edits, CHANGELOG.md, or direction shifts.
- Use `huggingface-project` when a promising checkpoint is ready for upload.