Monitor running experiments, check progress, and collect results. Use when the user says "check results", "is it done", "monitor", or wants experiment output.
Monitor: $ARGUMENTS
SSH server:
ssh <server> "screen -ls"
Vast.ai instance (read ssh_host, ssh_port from vast-instances.json):
ssh -p <PORT> root@<HOST> "screen -ls"
Also check vast.ai instance status:
vastai show instances
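As a rough sketch of the lookup described above, the snippet below builds the ssh command from one instance record. It assumes vast-instances.json holds a top-level list of objects that each carry ssh_host and ssh_port; the file's actual layout is not shown here, and load_instances and build_screen_ls_command are hypothetical helper names.

```python
import json


def load_instances(text: str) -> list[dict]:
    """Parse the JSON text; assumes a top-level list of instance objects."""
    data = json.loads(text)
    if not isinstance(data, list):
        raise ValueError("expected a JSON list of instances")
    return data


def build_screen_ls_command(instance: dict) -> list[str]:
    """Return the argv that lists screen sessions on one Vast.ai instance."""
    host = instance["ssh_host"]
    port = int(instance["ssh_port"])  # accept either "41022" or 41022
    return ["ssh", "-p", str(port), f"root@{host}", "screen -ls"]
```

The returned list can be handed to subprocess.run directly, which avoids a second layer of local shell quoting.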
Modal (when gpu: modal in CLAUDE.md):
modal app list # List running/recent apps
modal app logs <app> # Stream logs from a running app
Modal apps terminate automatically when they finish, so an app that is missing from the list has already completed. Check its results with modal volume ls <volume> or in the local output.
For each screen session, capture the last N lines (hardcopy -h includes the scrollback buffer rather than only the visible window):
ssh <server> "screen -S <name> -X hardcopy -h /tmp/screen_<name>.txt && tail -50 /tmp/screen_<name>.txt"
If hardcopy fails, check for log files or tee output.
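The per-session loop could be driven by something like the following sketch, which parses the text printed by screen -ls and builds one capture command per session. The helper names are hypothetical, the regular expression assumes the usual "<pid>.<name> ... (State)" line layout, and sessions reported in other states (for example dead ones) are simply skipped. It passes -h to hardcopy so the scrollback buffer is included.

```python
import re
import shlex

# Matches lines such as "\t12345.train_run\t(Detached)"; some screen
# versions print a date column between the name and the state.
_SESSION_RE = re.compile(r"^\s+(\d+)\.(\S+)\s+.*\((\w+)\)\s*$")


def parse_screen_ls(output: str) -> list[dict]:
    """Extract pid, name, and state for each session line in the output."""
    sessions = []
    for line in output.splitlines():
        match = _SESSION_RE.match(line)
        if match:
            pid, name, state = match.groups()
            sessions.append({"pid": int(pid), "name": name, "state": state})
    return sessions


def hardcopy_command(server: str, name: str, lines: int = 50) -> list[str]:
    """Build the ssh argv that dumps a session and tails the dump."""
    path = shlex.quote(f"/tmp/screen_{name}.txt")
    remote = (
        f"screen -S {shlex.quote(name)} -X hardcopy -h {path} "
        f"&& tail -{int(lines)} {path}"
    )
    return ["ssh", server, remote]
```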
List the most recent JSON result files:
ssh <server> "ls -lt <results_dir>/*.json 2>/dev/null | head -20"
If JSON results exist, fetch and parse them:
ssh <server> "cat <results_dir>/<latest>.json"
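Once the JSON text has been fetched, a small helper can reduce it to the numbers worth reporting. The sketch below is one possible approach and assumes nothing about the result schema beyond it being nested JSON objects; flatten_metrics is a hypothetical name, and lists and non-numeric values are deliberately ignored.

```python
import json


def flatten_metrics(obj, prefix=""):
    """Collect numeric leaves of nested JSON objects as {"a.b": value}."""
    flat = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            flat.update(flatten_metrics(value, f"{prefix}{key}."))
    elif isinstance(obj, (int, float)) and not isinstance(obj, bool):
        # Drop the trailing "." that the parent level appended.
        flat[prefix[:-1]] = obj
    return flat


if __name__ == "__main__":
    text = '{"eval": {"loss": 1.5, "ppl": 4.48}, "step": 1000, "name": "run-a"}'
    for key, value in flatten_metrics(json.loads(text)).items():
        print(f"{key}: {value}")
```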
W&B metrics (when wandb: true in CLAUDE.md):
Skip this step entirely if wandb is not set or is false in CLAUDE.md.
Pull training curves and metrics from Weights & Biases via Python API:
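# List the 10 most recent runs in the project
ssh <server> "python3 -c \"
import itertools
import wandb
api = wandb.Api()
# per_page is only the fetch page size, so cap the listing with islice
runs = api.runs('<entity>/<project>', order='-created_at', per_page=10)
for r in itertools.islice(runs, 10):
    # single quotes only: double quotes would end the remote -c string
    print(r.id, r.state, r.name, r.summary.get('eval/loss', 'N/A'))
\""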
# Pull specific metrics from a run (last 10 logged rows)
ssh <server> "python3 -c \"
import wandb, json
api = wandb.Api()
run = api.run('<entity>/<project>/<run_id>')
# scan_history yields only rows in which every listed key was logged
history = list(run.scan_history(keys=['train/loss', 'eval/loss', 'eval/ppl', 'train/lr'], page_size=50))
print(json.dumps(history[-10:], indent=2))
\""
# Pull run summary (final metrics)
ssh <server> "python3 -c \"
import wandb, json
api = wandb.Api()
run = api.run('<entity>/<project>/<run_id>')
print(json.dumps(dict(run.summary), indent=2, default=str))
\""
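The history rows printed above are plain dictionaries keyed by metric name, so they can be condensed before being reported. The following is a minimal sketch, assuming the rows have already been decoded from the JSON output; latest_values is a hypothetical helper, not part of the wandb API.

```python
def latest_values(rows, keys):
    """Return the most recent non-missing value for each requested key."""
    latest = {}
    for row in rows:  # rows are assumed to be in logging order
        for key in keys:
            value = row.get(key)
            if value is not None:
                latest[key] = value
    return latest


if __name__ == "__main__":
    rows = [
        {"train/loss": 2.0, "eval/loss": None},
        {"train/loss": 1.5, "eval/loss": 1.8},
        {"train/loss": 1.2},
    ]
    print(latest_values(rows, ["train/loss", "eval/loss"]))
```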
What to extract:
W&B dashboard link (include in summary for user):