Collects, stores, and visualizes code quality metrics over time. Supports collect, dashboard, trend, and compare modes.
!${CLAUDE_PLUGIN_ROOT}/scripts/detect-stack.sh
Collect code quality signals, store snapshots over time, and produce dashboards with trend analysis. Execute every phase in order. Do NOT skip phases.
Follow the session protocol from session-protocol.md and the verbose-progress.md protocol. Generate a SESSION_ID, create session directory, set SESSION_TMP_DIR=".cc-sessions/${SESSION_ID}/tmp/", check for conflicting sessions, read the activity feed for recent cross-instance activity, and log skill_start to the activity feed. Print verbose progress at every phase transition, decision point, and substep per verbose-progress.md.
Extract mode from $ARGUMENTS. Default to collect if not specified.
| Mode | Description | Requirements |
|---|---|---|
| collect | Gather all metrics and store snapshot | None |
| dashboard | Generate markdown dashboard from latest snapshot | At least 1 snapshot |
| trend | Analyze metrics over time | At least 2 snapshots |
| compare <date1> <date2> | Side-by-side comparison of two snapshots | Both dates must have snapshots |
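The mode dispatch above can be sketched as a small shell fragment; `ARGUMENTS`, `MODE`, `DATE1`, and `DATE2` are illustrative names for this sketch, not required identifiers:

```shell
# Parse mode (and compare dates) from the raw arguments string.
# With no arguments, MODE falls back to "collect".
set -- $ARGUMENTS
MODE="${1:-collect}"
case "$MODE" in
  collect|dashboard|trend) ;;
  compare) DATE1="$2"; DATE2="$3" ;;
  *) echo "Unknown mode: $MODE" >&2; MODE="collect" ;;
esac
```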
Metric collection is delegated to 5 parallel collector agents, each running one independent tool. Spawn all 5 in a single assistant message so they execute concurrently. Wall-clock drops from 2-3 min sequential to ~45 sec parallel.
| Agent | Tool | Output File | Score Formula |
|---|---|---|---|
collect-typescript | npx tsc --noEmit | ${SESSION_TMP_DIR}/metric-typescript.json | max(0, 100 - errors * 2) |
collect-lint | npx eslint . --format json | ${SESSION_TMP_DIR}/metric-lint.json | max(0, 100 - errors * 5 - warnings) |
collect-tests | npx vitest run --reporter=json (fallback: jest) | ${SESSION_TMP_DIR}/metric-tests.json | (passed / total) * 100; null if no runner |
collect-build | npm run build | ${SESSION_TMP_DIR}/metric-build.json | 100 on exit 0, else 0 |
collect-completeness | inline completeness-gate lookup | ${SESSION_TMP_DIR}/metric-completeness.json | from latest completeness snapshot; null if none |
The 2 lightweight metrics (codebase size, dependency count) stay in the orchestrator — they're simple file reads that don't warrant agent overhead.
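The score formulas in the table can be sketched with plain shell arithmetic; the function names here are illustrative, not part of the skill:

```shell
# Clamp-at-zero score formulas from the table above.
ts_score()   { s=$(( 100 - $1 * 2 ));      [ "$s" -lt 0 ] && s=0; echo "$s"; }   # errors
lint_score() { s=$(( 100 - $1 * 5 - $2 )); [ "$s" -lt 0 ] && s=0; echo "$s"; }   # errors warnings
test_score() { [ "$2" -eq 0 ] && echo null || echo $(( $1 * 100 / $2 )); }       # passed total
```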
For each collector, call the Agent tool with:
- subagent_type: general-purpose (must Write JSON output; never Explore)
- model: sonnet (explicit)
- description: quality-metrics <tool> collector
- prompt: the collector prompt template from reference.md
- run_in_background: false

Weight class: Light (per spawn-protocol.md). Each collector prompt declares: max 1 bash command, max 5 file reads (for parsing output), max 8 tool calls, 3-min wall-clock (typescript, tests, and build may be slow on large projects; bump those specifically to 5 min), and an output-file existence check.
{"score": null, "error": "..."}.Before Phase 2, verify all collector outputs:
MISSING_COUNT=0
for tool in typescript lint tests build completeness; do
  f="${SESSION_TMP_DIR}/metric-${tool}.json"
  if [ ! -s "$f" ]; then
    echo "MISSING: $f" >&2
    MISSING_COUNT=$((MISSING_COUNT+1))
  fi
done
A missing collector file is treated as score: null in Phase 2 (not an abort condition — we want the snapshot written even with partial coverage). Log each missing collector to the activity feed for user visibility.
While collectors run, the orchestrator computes these in parallel (they're pure file reads):
Codebase size:
# Source files and lines
find src/ -type f \( -name '*.ts' -o -name '*.vue' \) -not -name '*.test.*' -not -name '*.spec.*' \
  > "${SESSION_TMP_DIR}/source-files.txt"
# Concatenate via xargs so long or empty file lists are handled safely;
# wc -l on the combined stream yields a single total (no per-file lines to tail)
xargs -r cat < "${SESSION_TMP_DIR}/source-files.txt" 2>/dev/null | wc -l > "${SESSION_TMP_DIR}/source-lines.txt"
# Test files and lines
find src/ -type f \( -name '*.test.*' -o -name '*.spec.*' \) \
  > "${SESSION_TMP_DIR}/test-files.txt"
xargs -r cat < "${SESSION_TMP_DIR}/test-files.txt" 2>/dev/null | wc -l > "${SESSION_TMP_DIR}/test-lines.txt"
Calculate test-to-code ratio = test_lines / source_lines.
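Since shell arithmetic is integer-only, the ratio can be computed with awk; the helper name is illustrative:

```shell
# Ratio of test lines to source lines, guarded against a zero denominator.
ratio() {  # usage: ratio <test_lines> <source_lines>
  awk -v t="$1" -v s="$2" 'BEGIN { printf "%.2f\n", (s > 0 ? t / s : 0) }'
}
```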
Dependency count:
Read package.json and count keys in dependencies (production) and devDependencies (dev).
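A minimal sketch of that count, assuming python3 is available (jq would also work); the helper name is illustrative:

```shell
# Count keys under dependencies and devDependencies in a package.json.
count_deps() {
  python3 - "$1" <<'EOF'
import json, sys
pkg = json.load(open(sys.argv[1]))
print("production", len(pkg.get("dependencies", {})))
print("dev", len(pkg.get("devDependencies", {})))
EOF
}
```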
mkdir -p docs/metrics
Write to docs/metrics/YYYY-MM-DD.json using today's date:
{
"date": "YYYY-MM-DD",
"timestamp": "<ISO-8601>",
"scores": {
"typescript": null,
"lint": null,
"tests": null,
"build": null,
"completeness": null
},
"details": {
"typescript": { "errors": 0 },
"lint": { "errors": 0, "warnings": 0 },
"tests": { "total": 0, "passed": 0, "failed": 0, "skipped": 0 },
"build": { "success": true },
"codebase": {
"source_files": 0,
"source_lines": 0,
"test_files": 0,
"test_lines": 0,
"test_ratio": 0.0
},
"dependencies": { "production": 0, "dev": 0 }
},
"overall_score": null
}
Overall score = average of all non-null scores. If all scores are null, overall_score = null.
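The averaging rule can be sketched as follows (python3 assumed; helper name illustrative):

```shell
# Mean of all non-null scores in a snapshot file; "null" if every score is null.
overall_score() {
  python3 - "$1" <<'EOF'
import json, sys
scores = json.load(open(sys.argv[1]))["scores"]
vals = [v for v in scores.values() if v is not None]
print(round(sum(vals) / len(vals), 1) if vals else "null")
EOF
}
```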
Re-read the written file and validate it parses as valid JSON:
python3 -m json.tool docs/metrics/YYYY-MM-DD.json > /dev/null 2>&1 && echo "VALID" || echo "INVALID"
Trigger: Run this phase if mode is dashboard or collect.
Read the most recent docs/metrics/*.json by sorting filenames lexicographically (they are date-based):
ls -1 docs/metrics/*.json 2>/dev/null | sort | tail -1
If more than one snapshot exists, load the second-most-recent to compute deltas:
ls -1 docs/metrics/*.json 2>/dev/null | sort | tail -2 | head -1
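The per-metric delta between the two snapshots can be sketched like this (python3 assumed; helper name and argument order are illustrative):

```shell
# Delta string (+N / -N / =) for one metric between latest and previous snapshots.
# Prints "=" when either score is null or the scores are equal.
metric_delta() {  # usage: metric_delta <metric> <latest.json> <previous.json>
  python3 - "$@" <<'EOF'
import json, sys
metric, latest, prev = sys.argv[1], sys.argv[2], sys.argv[3]
a = json.load(open(latest))["scores"].get(metric)
b = json.load(open(prev))["scores"].get(metric)
if a is None or b is None or a == b:
    print("=")
else:
    print(f"{a - b:+d}")
EOF
}
```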
Write docs/metrics/dashboard.md with the following structure:
# Quality Dashboard
**Date**: YYYY-MM-DD
**Overall Score**: XX/100
## Scorecard
| Metric | Score | Status | Delta |
|--------|-------|--------|-------|
| TypeScript | XX | PASS/WARN/FAIL | +N/-N/= |
| Lint | XX | PASS/WARN/FAIL | +N/-N/= |
| Tests | XX | PASS/WARN/FAIL | +N/-N/= |
| Build | XX | PASS/WARN/FAIL | +N/-N/= |
| Completeness | XX | PASS/WARN/FAIL | +N/-N/= |
## Status Key
- PASS: score >= 80
- WARN: score >= 50 and < 80
- FAIL: score < 50
## Details
### TypeScript
- Errors: N
### Lint
- Errors: N
- Warnings: N
### Tests
- Total: N | Passed: N | Failed: N | Skipped: N
### Build
- Status: Success/Failure
### Codebase
- Source files: N (N lines)
- Test files: N (N lines)
- Test-to-code ratio: X.XX
### Dependencies
- Production: N
- Dev: N
Status thresholds: PASS for score >= 80, WARN for 50-79, FAIL below 50, matching the Status Key above.
Delta column: difference from the previous snapshot's score, formatted +N / -N / =; use = when the score is unchanged or no prior snapshot exists.
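The threshold mapping can be sketched as a small helper (the function name is illustrative):

```shell
# Map a numeric score to its dashboard status.
status_for() {
  if   [ "$1" -ge 80 ]; then echo PASS
  elif [ "$1" -ge 50 ]; then echo WARN
  else                       echo FAIL
  fi
}
```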
Trigger: Run this phase if mode is trend or compare.
Read all docs/metrics/*.json files, sorted by date ascending:
ls -1 docs/metrics/*.json 2>/dev/null | sort
Parse each file and collect into a time series.
- trend mode: require at least 2 snapshots. If fewer, report "Insufficient data — need at least 2 snapshots for trend analysis."
- compare mode: require snapshots matching both requested dates. If an exact match is not found, use the nearest available date and note the substitution.

For each metric, compare the last 3 snapshots (or all available if fewer than 3):
| Direction | Condition |
|---|---|
| Improving | Latest score > average of prior scores by > 5 points |
| Declining | Latest score < average of prior scores by > 5 points |
| Stable | Change is within 5 points |
Calculate velocity = (latest - earliest) / number_of_intervals.
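As a worked example, scores of 60, 75, 90 across three snapshots give a velocity of (90 - 60) / 2 = +15 per interval. A sketch of the calculation (python3 assumed; helper name illustrative):

```shell
# Velocity across an ordered series of scores: (latest - earliest) / intervals.
velocity() {
  python3 -c 'import sys; s = [float(x) for x in sys.argv[1:]]; print((s[-1] - s[0]) / (len(s) - 1))' "$@"
}
```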
Flag any metric whose direction is Declining; each flagged metric becomes an ALERT line in the report.
Write docs/metrics/trend-report.md:
# Quality Trend Report
**Period**: YYYY-MM-DD to YYYY-MM-DD
**Snapshots**: N
## Trend Summary
| Metric | Current | Direction | Velocity | Alert |
|--------|---------|-----------|----------|-------|
| TypeScript | XX | Improving/Declining/Stable | +X.X/snapshot | — or ALERT |
| Lint | XX | ... | ... | ... |
| Tests | XX | ... | ... | ... |
| Build | XX | ... | ... | ... |
| Completeness | XX | ... | ... | ... |
| Overall | XX | ... | ... | ... |
## Alerts
- [ALERT] <metric> declined by N points between YYYY-MM-DD and YYYY-MM-DD
- ...
## Recommendations
- <Prioritized suggestions based on trends>
If mode is compare <date1> <date2>, write a side-by-side comparison:
# Quality Comparison: YYYY-MM-DD vs YYYY-MM-DD
| Metric | Date 1 | Date 2 | Delta | Direction |
|--------|--------|--------|-------|-----------|
| TypeScript | XX | XX | +/-N | Improved/Declined/Same |
| ... | ... | ... | ... | ... |
| Overall | XX | XX | +/-N | Improved/Declined/Same |
## Notable Changes
- <List of metrics with significant movement>
Print a summary to the user based on the mode that was run:
Quality Metrics Collected
=========================
Date: YYYY-MM-DD
Overall Score: XX/100
TypeScript: XX/100 PASS/WARN/FAIL
Lint: XX/100 PASS/WARN/FAIL
Tests: XX/100 PASS/WARN/FAIL
Build: XX/100 PASS/WARN/FAIL
Completeness: XX/100 PASS/WARN/FAIL
Snapshot: docs/metrics/YYYY-MM-DD.json
Dashboard: docs/metrics/dashboard.md
Session end: update .cc-sessions/${SESSION_ID}.json to set status to completed, and log session_end to the operation log.

Safety: tsc --noEmit and eslint are read-only analysis. The npm run build command may produce build artifacts; this is expected.

Edge cases:
- If docs/metrics/YYYY-MM-DD.json already exists, append a counter suffix (e.g., YYYY-MM-DD-2.json).
- If a collector's tool fails, set its score to null, record the error in details, and continue with the remaining collectors.
- If dashboard, trend, or compare mode lacks the required snapshots, report the shortfall and suggest running collect mode first.
- No test runner found: tests score is null. Note "No test runner found (checked Vitest, Jest)."
- No tsconfig.json: typescript score is null. Note "No tsconfig.json found."
- No build script in package.json: build score is null. Note "No build script in package.json."