Name: Diagnose System
Author: ZhangHanbo

Diagnose System | Skills Pool

# Training / policy run (placeholder)
python scripts/run_policy.py --config configs/minimal.yaml \
    --seeds 0 1 2 --n_trials 20

# Simulation-only (placeholder)
mjpython scripts/eval_sim.py --config configs/minimal.yaml --n_trials 20

# Real-robot (placeholder)
roslaunch alpha_research eval_real.launch config:=configs/minimal.yaml

# From wandb
PYTHONPATH=src python -c "
import wandb, json
api = wandb.Api()
runs = api.runs('alpha_research/minimal', filters={'config.config_name': 'minimal.yaml'})
results = [{
    'run_id': r.id,
    'success_rate': r.summary.get('success_rate'),
    'n_trials': r.summary.get('n_trials'),
    'failure_reasons': r.summary.get('failure_reasons'),
    'seed': r.config.get('seed'),
} for r in runs[:6]]
print(json.dumps(results, indent=2))
"

python scripts/audit_stats.py logs/minimal_run/ --venue RSS
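The per-run summaries pulled from wandb above can be pooled across seeds before they are audited. The following is a minimal sketch, not the behavior of `scripts/audit_stats.py` (whose internals are not shown here). It assumes each result dict carries the `success_rate` and `n_trials` fields from the wandb query; `wilson_interval` and `pool_runs` are hypothetical helper names, and the Wilson score interval is one reasonable choice for small trial counts rather than a method the source prescribes.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Two-sided Wilson score interval for a binomial proportion."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z ** 2 / trials
    centre = (p + z ** 2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z ** 2 / (4 * trials ** 2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def pool_runs(results):
    """Pool per-run success rates, weighting each run by its trial count.

    Runs with a missing success_rate or n_trials are skipped, not guessed.
    """
    usable = [
        r for r in results
        if r.get("success_rate") is not None and r.get("n_trials")
    ]
    trials = sum(r["n_trials"] for r in usable)
    if trials == 0:
        return None
    # Recover integer success counts from the logged rates.
    successes = sum(round(r["success_rate"] * r["n_trials"]) for r in usable)
    low, high = wilson_interval(successes, trials)
    return {
        "n_runs": len(usable),
        "n_trials": trials,
        "success_rate": successes / trials,
        "ci95": (low, high),
    }
```

Passing the `results` list from the wandb snippet to `pool_runs` gives one pooled rate with an interval, which makes it easier to see whether a difference between configs is larger than the sampling noise.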

BAD (reject)	GOOD (accept)
"grasping fails"	"grasping fails on objects <2mm thick because the depth camera has 3mm resolution at working distance, so the gripper closes on empty space"
"the policy doesn't generalize"	"the visual encoder maps objects of similar color to nearby features despite different shapes, so the policy executes the mean action and fails on asymmetric objects"
"planning is too slow"	"collision checking dominates (78% of wall-clock time); each check requires full forward kinematics on a 7-DOF arm (~2ms per call, i.e. ~500 FK calls/s); total plan time is 1.8s"

# Read the most recent formalization_check record for the project
PYTHONPATH=src python -c "
from alpha_research.records.jsonl import read_records
from pathlib import Path
import json
recs = read_records(Path('<project_dir>'), 'formalization_check')
print(json.dumps(recs[-1] if recs else None, indent=2))
"

# Append the diagnosis record (JSON on stdin) and print its record id
PYTHONPATH=src python -c "
from alpha_research.records.jsonl import append_record
from pathlib import Path
import json, sys
rid = append_record(Path(sys.argv[1]), 'diagnosis', json.loads(sys.stdin.read()))
print(rid)
" "<project_dir>" <<< '<diagnosis_json>'

{
  "system_config": "configs/minimal.yaml",
  "n_trials": 60,
  "success_rate": 0.35,
  "failure_taxonomy": {
    "perception": 18,
    "planning": 12,
    "execution": 9,
    "physics": 0,
    "spec": 0
  },
  "specific_failures": [
    {
      "trial": 3,
      "type": "perception",
      "description": "depth camera could not resolve the 1.5mm bolt head at 40cm working distance; detected object center 4mm offset from truth; grasp aimed at empty space"
    },
    {
      "trial": 7,
      "type": "execution",
      "description": "motor saturation during 0.2N target force; actual force peaked at 1.1N before safety abort; PID tuned for free-space motion, not contact"
    }
  ],
  "failure_to_formalism_map": {
    "depth_resolution": "observation model P(z|s) insufficient for state dim h (object height)",
    "motor_saturation": "dynamics model assumed free-space; contact-regime dynamics missing"
  },
  "unmapped_failures": [],
  "dominant_failure_mode": "perception — insufficient depth resolution",
  "suggested_next_stage": "CHALLENGE",
  "backward_trigger": null,
  "human_review_required": ["physical_intuition_on_edge_cases"]
}
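Before a record like the one above is persisted, it can be checked for internal consistency. The sketch below is illustrative only: `check_diagnosis` is a hypothetical helper, not part of `alpha_research`, and it assumes that every failed trial is counted in exactly one taxonomy category, so the counts should sum to `n_trials * (1 - success_rate)`.

```python
# The five taxonomy keys used in the record's "failure_taxonomy" field.
TAXONOMY = ("perception", "planning", "execution", "physics", "spec")


def check_diagnosis(rec):
    """Return a list of consistency problems; an empty list means none found."""
    problems = []
    counts = rec.get("failure_taxonomy", {})

    unknown = sorted(set(counts) - set(TAXONOMY))
    if unknown:
        problems.append(f"unknown taxonomy keys: {unknown}")

    # Assumption: each failed trial is counted in exactly one category.
    expected = round(rec["n_trials"] * (1 - rec["success_rate"]))
    total = sum(counts.values())
    if total != expected:
        problems.append(
            f"taxonomy counts sum to {total}, expected {expected} failures"
        )

    # Every specific failure should use a known type with a nonzero count.
    for failure in rec.get("specific_failures", []):
        kind = failure.get("type")
        trial = failure.get("trial")
        if kind not in TAXONOMY:
            problems.append(f"trial {trial}: unknown type {kind!r}")
        elif counts.get(kind, 0) == 0:
            problems.append(f"trial {trial}: type {kind!r} has zero count")
    return problems
```

Calling `check_diagnosis(json.loads(text))` on the record before `append_record` catches arithmetic slips early, for example a taxonomy that no longer sums to the number of failed trials after a manual edit.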

Diagnose System

When to use

Process

Step 1 — Locate the minimal system config

Step 2 — Run the experiment (LAB-SPECIFIC — customize)

Step 3 — Collect results

Step 4 — Classify failures into the taxonomy

Step 5 — Write SPECIFIC failure descriptions (CRITICAL)

Step 6 — Map failures to the formal structure

Step 7 — Persist

Output format

Honesty protocol

References
