Use when experimental results are unexpected, confusing, or need interpretation
You are diagnosing empirical results — finding patterns in errors, generating hypotheses about root causes, and assessing whether the results mean what they appear to mean. This is the analytical complement to /synthesize (which works across accumulated findings); /diagnose works within one result set.
The argument is a path to results (CSV, log entry, analysis output) or a description of what to examine. Read the data first.
- Use /diagnose when you have empirical results (CSVs, metrics, error logs) and want to understand what they mean: error patterns, root-cause hypotheses, validity assessment.
- Use /postmortem when the problem is not "what do the results mean?" but "why did an agent report flawed results as correct?" Postmortem analyzes reasoning failures; diagnose analyzes data.
- Use /review metrics when you suspect the metrics themselves may be degenerate or misleading before interpreting the results. /review checks whether results are interpretable; /diagnose interprets them.

Do not start with individual examples. Start with the distribution:
If the data is in CSV or another structured format, use Python to compute breakdowns rather than eyeballing rows.
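A minimal sketch of that first step, using only the standard library. The column names (`status`, `error_type`) and the inline sample are hypothetical; substitute whatever schema the result set actually uses.

```python
# Compute an error-rate and per-type breakdown from a results CSV
# instead of eyeballing individual rows.
import csv
import io
from collections import Counter

def error_breakdown(csv_text: str) -> dict:
    """Return total rows, overall error rate, and counts by error type."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    total = len(rows)
    errors = [r for r in rows if r["status"] != "ok"]  # "status" is an assumed column
    by_type = Counter(r["error_type"] for r in errors)
    return {
        "total": total,
        "error_rate": len(errors) / total if total else 0.0,
        "by_type": dict(by_type.most_common()),  # sorted, most frequent first
    }

sample = """status,error_type
ok,
error,timeout
error,timeout
error,parse
ok,
"""
print(error_breakdown(sample))
# -> {'total': 5, 'error_rate': 0.6, 'by_type': {'timeout': 2, 'parse': 1}}
```

Sorting the breakdown by frequency (via `most_common()`) makes the dominant error pattern visible immediately, which is exactly what the next step's pattern search needs.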
For each systematic error pattern found in Step 2, generate candidate explanations. For each hypothesis:
Resist the temptation to attribute everything to the model (L1). Most errors in automated systems come from workflow (L2), interface (L3), or methodology (L4).
Before interpreting the results as meaningful, check:
Based on the diagnosis, recommend concrete actions:
Where experiments are needed, see /design for designing them.

If any root-cause hypothesis attributed to L1 (Model) is rated "high" plausibility, record it in the same turn.
Openakari does not ship a shared model capability registry. Record model-specific limits in one of:
If the L1 hypothesis is only "medium" plausibility, note it in the diagnosis output but avoid turning it into a hard rule until confirmed.
Skip this step if no root-cause hypothesis involves L1, or if all L1 hypotheses are low plausibility.
## Diagnosis: <what was examined>
CI layers involved: <L1-L5>
Date: YYYY-MM-DD
### Error distribution
<rates, breakdowns, error type categorization — with specific numbers>
### Systematic patterns
<numbered list of patterns with evidence>
### Root-cause hypotheses
#### Hypothesis 1: <testable claim>
Layer: <L1-L5>
Evidence for: <what supports this>
Evidence against: <what contradicts this>
Test: <what experiment would confirm/refute>
Plausibility: high | medium | low
[repeat for each hypothesis]
### Validity assessment
- Construct: <assessment>
- Statistical: <assessment>
- External: <assessment>
- Ground truth: <assessment>
### Recommended actions
- Quick wins: <bulleted>
- Experiments needed: <bulleted, with enough detail to feed into /design>
- Validity concerns: <bulleted>
- Avoid: <what not to do and why>
### Model-limit notes
<"Recorded model-specific limit: <what was added/changed>" or "No confirmed L1 root cause — skip">
Prioritize depth over breadth. One well-grounded hypothesis with clear evidence is worth more than five speculative ones.
Write the diagnosis to projects/<project>/diagnosis/diagnosis-<brief-slug>-YYYY-MM-DD.md. Create the diagnosis/ directory if it doesn't exist yet. Use the project whose results are being diagnosed.
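The save step can be sketched as follows. The project name and slug here are hypothetical placeholders; use the project whose results are being diagnosed and a brief slug describing what was examined.

```python
# Build the diagnosis path and create diagnosis/ if it doesn't exist yet.
from datetime import date
from pathlib import Path

project = "demo"          # placeholder: the project being diagnosed
slug = "tool-timeouts"    # placeholder: brief slug for this diagnosis

out_dir = Path("projects") / project / "diagnosis"
out_dir.mkdir(parents=True, exist_ok=True)  # no-op if the directory exists

# date.today().isoformat() yields YYYY-MM-DD, matching the naming convention.
out_path = out_dir / f"diagnosis-{slug}-{date.today().isoformat()}.md"
out_path.write_text(f"## Diagnosis: {slug}\n")
print(out_path)
```

`exist_ok=True` makes the step idempotent, so repeated diagnoses in the same project never fail on directory creation.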
After saving the diagnosis to disk, convert actionable recommendations to tasks:
- [fleet-eligible] or [requires-opus], per the fleet-eligibility checklist
- a [skill: ...] tag matching the work type
- Done when: taken from the recommendation's verification criteria
- Why: referencing this diagnosis file path
- /design for methodology

This ensures diagnosis insights enter the task pipeline for fleet consumption rather than dying as ephemeral session output.
Follow docs/sops/commit-workflow.md. Commit message: diagnosis: <brief summary of what was diagnosed>