Comprehensive skill for ML/AI research experiments and finetuning. Use when: (1) Setting up new ML research project ("create ML project", "init experiment") (2) Finetuning models ("finetune LLM", "adapt pretrained model", "LoRA", "QLoRA") (3) Training from scratch ("train model", "run experiment") (4) Debugging ML issues ("model not converging", "loss exploding", "GPU OOM") (5) Setting up experiment tracking ("add W&B", "setup MLflow") (6) Optimizing GPU usage ("batch size tuning", "memory optimization") (7) Creating visualizations ("plot training curves", "confusion matrix") (8) Auditing ML code ("check reproducibility", "review experiment") Triggers: "ML", "machine learning", "deep learning", "training", "finetuning", "PyTorch", "TensorFlow", "experiment", "GPU", "CUDA", "model", "neural network", "W&B", "MLflow", "reproducibility", "learning rate", "checkpoint", "epoch"
This skill provides comprehensive support for ML/AI research experiments, finetuning, and training — from compute-environment detection and project setup through experiment tracking, GPU optimization, and debugging.
Before any ML work, detect the compute environment:
1. GPU Detection

```bash
# Check for NVIDIA GPU
nvidia-smi --query-gpu=name,memory.total,driver_version --format=csv,noheader 2>/dev/null || echo "No NVIDIA GPU detected"

# Check CUDA version
nvcc --version 2>/dev/null | grep "release" || echo "CUDA not found"

# Check cuDNN (if accessible)
grep -A 2 CUDNN_MAJOR /usr/local/cuda/include/cudnn_version.h 2>/dev/null || echo "cuDNN version not directly accessible"
```
2. Python Environment

```bash
# Python version
python3 --version

# Virtual environment detection
echo "$VIRTUAL_ENV" "$CONDA_DEFAULT_ENV"

# Check for common ML frameworks
python3 -c "import torch; print(f'PyTorch {torch.__version__}, CUDA: {torch.cuda.is_available()}')" 2>/dev/null
python3 -c "import tensorflow as tf; print(f'TensorFlow {tf.__version__}')" 2>/dev/null
python3 -c "import jax; print(f'JAX {jax.__version__}')" 2>/dev/null
```
3. Memory Detection

```bash
# System RAM
free -h 2>/dev/null || sysctl hw.memsize 2>/dev/null

# GPU memory (via torch, if available)
python3 -c "import torch; print(f'GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB')" 2>/dev/null
```
4. ML Tools Detection

```bash
# Check for experiment tracking
python3 -c "import wandb; print(f'W&B {wandb.__version__}')" 2>/dev/null
python3 -c "import mlflow; print(f'MLflow {mlflow.__version__}')" 2>/dev/null

# Check for config management
python3 -c "import hydra; print(f'Hydra {hydra.__version__}')" 2>/dev/null
```
Run comprehensive detection:

```bash
python3 scripts/detect_system.py
```
See references/gpu-management.md for compute optimization.
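The detection steps above can be sketched as a single Python probe. This is a minimal illustration of what a script like `scripts/detect_system.py` might do (the actual script in this skill may differ): import each framework defensively and shell out to `nvidia-smi` only if it exists.

```python
import importlib
import shutil
import subprocess


def detect_frameworks() -> dict:
    """Return {name: version} for whichever ML libraries import cleanly."""
    found = {}
    for name in ("torch", "tensorflow", "jax", "wandb", "mlflow", "hydra"):
        try:
            mod = importlib.import_module(name)
            found[name] = getattr(mod, "__version__", "unknown")
        except ImportError:
            pass  # library absent; skip rather than fail
    return found


def detect_gpus() -> list:
    """Query nvidia-smi if present; return a list of (name, memory) tuples."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True, text=True,
    )
    if result.returncode != 0:
        return []
    return [tuple(line.split(", ")) for line in result.stdout.strip().splitlines()]


if __name__ == "__main__":
    print("Frameworks:", detect_frameworks() or "none found")
    print("GPUs:", detect_gpus() or "none detected")
```

The graceful fallbacks matter: the same script should run unmodified on a CPU-only laptop and a multi-GPU cluster node.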
Identify the ML task type:
| Task Type | Key Indicators | Critical Checks |
|---|---|---|
| Training from scratch | New model, random init | Data size, compute budget |
| Finetuning | Pretrained model, adaptation | Base model selection, LR schedule |
| Evaluation | Metrics, benchmarking | No data leakage, proper splits |
| Inference | Deployment, serving | Batch size, latency requirements |
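The decision table above can be approximated with a simple keyword heuristic. This is a hypothetical helper for illustration only (not part of the skill's scripts); checking finetuning indicators before generic training words avoids misclassifying "finetune a pretrained model" as training from scratch.

```python
# Indicators drawn from the task-type table; order matters (most specific first).
TASK_INDICATORS = {
    "finetuning": ("pretrained", "lora", "qlora", "adapt"),
    "evaluation": ("benchmark", "metric", "eval"),
    "inference": ("deploy", "serving", "latency"),
    "training_from_scratch": ("from scratch", "random init", "train"),
}


def classify_task(description: str) -> str:
    """Return the first task type whose indicator appears in the description."""
    text = description.lower()
    for task, keywords in TASK_INDICATORS.items():
        if any(k in text for k in keywords):
            return task
    return "unknown"
```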
Also identify the model domain (e.g. vision, NLP, speech, tabular) and assess the scale of the run (parameter count, dataset size, compute budget), since both drive framework, batch-size, and hardware choices.
Check the project against ML best practices, then run validation:

```bash
python3 scripts/validate_experiment.py
```
See references/common-mistakes.md for detailed issues.
When setting up an ML project, discuss with the user what structure fits their needs. A typical ML research project includes:
```
my_experiment/
├── src/
│   ├── __init__.py
│   ├── data/
│   │   ├── __init__.py
│   │   ├── dataset.py           # Dataset classes
│   │   └── preprocessing.py     # Data transforms
│   ├── models/
│   │   ├── __init__.py
│   │   └── model.py             # Model definitions
│   ├── training/
│   │   ├── __init__.py
│   │   ├── trainer.py           # Training loop
│   │   └── losses.py            # Custom losses
│   └── utils/
│       ├── __init__.py
│       ├── logging.py           # Logging utilities
│       └── reproducibility.py   # Seed setting
├── configs/
│   ├── config.yaml              # Main Hydra config
│   ├── model/
│   │   └── default.yaml
│   ├── data/
│   │   └── default.yaml
│   └── training/
│       └── default.yaml
├── scripts/
│   ├── train.py                 # Main training entry
│   ├── evaluate.py              # Evaluation script
│   └── inference.py             # Inference script
├── tests/
│   ├── __init__.py
│   ├── test_data.py
│   ├── test_model.py
│   └── conftest.py              # pytest fixtures
├── notebooks/
│   └── exploration.ipynb
├── data/                        # .gitignored
├── outputs/                     # .gitignored
├── experiments/                 # .gitignored
├── CLAUDE.md
├── AGENTS.md
├── README.md
├── pyproject.toml
├── .gitignore
└── .env.example
```
See references/project-structure.md for templates.
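The `src/utils/reproducibility.py` module in the tree above might look like the following minimal sketch, assuming PyTorch as the framework; the torch and numpy calls are skipped gracefully when those libraries are absent.

```python
import os
import random


def set_seed(seed: int = 42) -> None:
    """Seed every RNG the experiment may touch, for reproducible runs."""
    random.seed(seed)
    os.environ["PYTHONHASHSEED"] = str(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
        # Trade kernel-selection speed for deterministic cuDNN behavior
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass
```

Call `set_seed(cfg.seed)` once at the top of the training entry point, before any data loading or model construction.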
Hydra Config Template:
# configs/config.yaml