Managing Wandb

Use when working with W&B (Weights & Biases) for experiment tracking and MLOps management. Covers run tracking, sweep management, artifact versioning, report analysis, model registry, and team collaboration. Typical tasks: managing ML experiments, comparing training runs, analyzing hyperparameter sweeps, and auditing W&B project resources.

Occupation
Categories: Lab Tools

Weights & Biases Management Skill

Manage and monitor W&B experiments, sweeps, artifacts, and model registry.

MANDATORY: Discovery-First Pattern

Always list projects and recent runs before querying specific resources.

Phase 1: Discovery

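#!/bin/bash
# Phase 1 discovery: list the entity's projects and their most recent runs.
# Requires curl, jq, and column.

# Fail early with a clear message if the API key is missing.
: "${WANDB_API_KEY:?WANDB_API_KEY must be set}"

# Fall back to the CLI's reported identity when WANDB_ENTITY is unset.
WANDB_ENTITY="${WANDB_ENTITY:-$(wandb whoami 2>/dev/null | head -1)}"

# GET an endpoint relative to the API root.
# NOTE: the official wandb client authenticates with HTTP Basic auth
# (user "api", password = API key). If the Bearer header is rejected,
# try: curl -s -u "api:$WANDB_API_KEY" ...
wandb_api() {
    local endpoint="$1"
    curl -s -H "Authorization: Bearer $WANDB_API_KEY" \
        "https://api.wandb.ai/api/v1/${endpoint}"
}

# POST a GraphQL query. The query is spliced into the JSON body as-is,
# so callers must pre-escape inner double quotes (\\\" in a double-quoted string).
wandb_gql() {
    local query="$1"
    curl -s -X POST -H "Authorization: Bearer $WANDB_API_KEY" \
        -H "Content-Type: application/json" \
        "https://api.wandb.ai/graphql" \
        -d "{\"query\": \"$query\"}"
}

echo "=== W&B Entity: $WANDB_ENTITY ==="

echo ""
echo "=== Projects ==="
# Split columns on tabs only, so values containing spaces stay in one column.
wandb_gql "{ entity(name: \\\"$WANDB_ENTITY\\\") { projects(first: 20) { edges { node { name, totalRuns, createdAt } } } } }" \
    | jq -r '.data.entity.projects.edges[].node | "\(.name)\t\(.totalRuns) runs\t\(.createdAt[0:10])"' | column -t -s $'\t'

echo ""
echo "=== Recent Runs ==="
# Up to 3 newest runs for each of the first 5 projects, capped at 15 rows.
wandb_gql "{ entity(name: \\\"$WANDB_ENTITY\\\") { projects(first: 5) { edges { node { name, runs(first: 3, order: \\\"-created_at\\\") { edges { node { name, state, createdAt } } } } } } } }" \
    | jq -r '.data.entity.projects.edges[].node | .name as $proj | .runs.edges[].node | "\($proj)\t\(.name)\t\(.state)\t\(.createdAt[0:16])"' | column -t -s $'\t' | head -15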
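The wandb_gql helper above builds its JSON body by string splicing, which is why every call has to escape quotes as \\\". A minimal sketch of an alternative, assuming jq is installed (the discovery script already depends on it): let jq do the JSON escaping. The function name gql_payload and the entity value are hypothetical, chosen only for illustration.

```shell
# Hypothetical helper: wrap a GraphQL query string in a JSON request body.
# jq performs the JSON string escaping, so the query may contain plain quotes.
gql_payload() {
    jq -cn --arg q "$1" '{query: $q}'
}

# Example: only ordinary shell escaping (\") is needed around the entity name.
entity="my-team"   # placeholder entity name, not a real account
gql_payload "{ entity(name: \"$entity\") { projects(first: 20) { edges { node { name } } } } }"
```

The resulting body can be piped into curl with `-d @-` in place of the hand-built `-d` string, which also keeps entity names containing quotes or backslashes from breaking the request.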

Shortcut	Counter	Why
"I'll skip discovery and check known resources"	Always run Phase 1 discovery first	Resource names change, new resources appear — assumed names cause errors
"The user only asked for a quick check"	Follow the full discovery → analysis flow	Quick checks miss critical issues; structured analysis catches silent failures
"Default configuration is probably fine"	Audit configuration explicitly	Defaults often leave logging, security, and optimization features disabled
"Metrics aren't needed for this"	Always check relevant metrics when available	API/CLI responses show current state; metrics reveal trends and intermittent issues
"I don't have access to that"	Try the command and report the actual error	Assumed permission failures prevent useful investigation; actual errors are informative

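The last row of the table ("try the command and report the actual error") can be sketched as a small response filter. This assumes the API follows the usual GraphQL convention of reporting failures in a top-level `errors` array; the function name gql_report is hypothetical and the sample response is canned, not real W&B output.

```shell
# Hypothetical filter: read a GraphQL response on stdin, print any error
# messages to stderr, and return non-zero so the caller can stop early.
gql_report() {
    local body
    body="$(cat)"
    # Anything that is not a JSON object (empty body, HTML error page)
    # is reported with the first 200 bytes of the raw response.
    if ! printf '%s' "$body" | jq -e 'type == "object"' >/dev/null 2>&1; then
        echo "ERROR: unexpected response: $(printf '%s' "$body" | head -c 200)" >&2
        return 1
    fi
    # GraphQL conventionally reports failures in a top-level "errors" array.
    if printf '%s' "$body" | jq -e '(.errors // []) | length > 0' >/dev/null; then
        printf '%s' "$body" | jq -r '.errors[] | "ERROR: \(.message)"' >&2
        return 1
    fi
    # No errors: pass the response through unchanged for further jq processing.
    printf '%s\n' "$body"
}

# Example with a canned failure response (no network needed).
echo '{"errors":[{"message":"permission denied"}]}' | gql_report || echo "query failed"
```

Placed between the GraphQL call and the formatting jq step, a filter like this surfaces the server's own message instead of an empty table.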