Use when a change must be production-diagnosable: logs/metrics/traces, correlation IDs, golden signals, and runbook-grade troubleshooting. Produces a structured ObservabilityReport artifact with objective evidence. Prefer OpenTelemetry semantics and avoid vendor lock-in. Keep scope minimal.
Use this skill when the change includes any of:
If none apply, do a light pass: ensure key errors are logged safely and add minimal troubleshooting notes.
/observability-engineer

- docs/agents/task-spec.md (or your repo's equivalent): acceptance criteria + validation plan
- docs/agents/patch-report.md: what changed / hotspots
- docs/agents/test-report.md: what was run and how to reproduce

You deliver runbook-grade observability for the change:

- trace_id / span_id (or a single correlation ID if tracing is unavailable)

Record findings in the report.
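As one way to picture the trace_id / span_id requirement, here is a minimal sketch of a log formatter that attaches those fields and falls back to a single correlation ID when no trace is active. It assumes Python's standard `logging` and `contextvars` modules only. The names `trace_id_var`, `span_id_var`, `correlation_id_var`, `CorrelationFormatter`, and `format_event` are hypothetical, and a real service would more likely read the IDs from its tracing library.

```python
import json
import logging
import uuid
from contextvars import ContextVar

# Hypothetical context holders; a real service would populate these from its
# tracing library or from an inbound request header.
trace_id_var: ContextVar[str] = ContextVar("trace_id", default="")
span_id_var: ContextVar[str] = ContextVar("span_id", default="")
correlation_id_var: ContextVar[str] = ContextVar("correlation_id", default="")


class CorrelationFormatter(logging.Formatter):
    """Render each record as one JSON object carrying correlation fields."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        trace_id = trace_id_var.get()
        if trace_id:
            payload["trace_id"] = trace_id
            payload["span_id"] = span_id_var.get()
        else:
            # Tracing unavailable: fall back to a single correlation ID.
            payload["correlation_id"] = correlation_id_var.get() or uuid.uuid4().hex
        return json.dumps(payload, sort_keys=True)


def format_event(message: str) -> str:
    """Format one INFO event using the current correlation context."""
    record = logging.LogRecord(
        name="example",
        level=logging.INFO,
        pathname="example.py",
        lineno=0,
        msg=message,
        args=(),
        exc_info=None,
    )
    return CorrelationFormatter().format(record)
```

In practice the formatter would be attached to a handler with `handler.setFormatter(CorrelationFormatter())`; `format_event` exists only to make the sketch easy to exercise in isolation.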
For each AC:
Keep mappings small and actionable.
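A sketch of what a small, actionable mapping could look like as data, with a check that flags entries that are not actionable or that pile too many signals onto one AC. Everything here is illustrative: the `SignalMapping` fields, the `check_mappings` helper, and the cap of three signals per AC are assumptions, not part of the report template.

```python
from collections import Counter
from dataclasses import dataclass

ALLOWED_SIGNAL_TYPES = {"log", "metric", "trace"}
MAX_SIGNALS_PER_AC = 3  # hypothetical cap to keep mappings small


@dataclass(frozen=True)
class SignalMapping:
    """One acceptance criterion mapped to one signal (hypothetical shape)."""

    ac_id: str
    signal_type: str
    signal_name: str
    how_to_verify: str


def check_mappings(mappings: list[SignalMapping]) -> list[str]:
    """Return human-readable problems; an empty list means the mappings pass."""
    problems = []
    for m in mappings:
        if m.signal_type not in ALLOWED_SIGNAL_TYPES:
            problems.append(f"{m.ac_id}: unknown signal type {m.signal_type!r}")
        if not m.how_to_verify.strip():
            problems.append(f"{m.ac_id}: no verification step for {m.signal_name}")
    for ac_id, count in Counter(m.ac_id for m in mappings).items():
        if count > MAX_SIGNALS_PER_AC:
            problems.append(
                f"{ac_id}: {count} signals mapped, cap is {MAX_SIGNALS_PER_AC}"
            )
    return problems
```

Requiring a non-empty verification step per entry is one way to keep each mapping tied to something a reader can actually check.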
Preferred order:
Guidelines:
In the ObservabilityReport, propose:
For each likely symptom:
status: BLOCKED and document what's missing.

Classify blockers as:

- missing_runtime_context (cannot run or trigger the code path)
- tooling_gap (no metrics/tracing lib available and adding one is out-of-scope)
- sensitive_data_risk (telemetry might leak secrets/PII unless design changes)
- high_cardinality_risk (metrics/log labels would explode)
- governance_block (policies prevent required instrumentation)

Retry budget (default):
- docs/agents/observability-report.md
- ObservabilityReport in a fenced code block

Use the template at:

- ./templates/observability-report.template.md

Set:

- status: READY_FOR_QUALITY_GATE only if you have objective evidence and safe, low-cardinality signals
- status: BLOCKED with specific evidence + next step
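The status rule can be sketched as a small decision function. This is an illustration under assumptions: the `Blocker` enum mirrors the blocker classes named in this document, while `decide_status` and its parameters are hypothetical and cover only the two status values mentioned here.

```python
from enum import Enum


class Blocker(Enum):
    """Blocker classes as listed in this document."""

    MISSING_RUNTIME_CONTEXT = "missing_runtime_context"
    TOOLING_GAP = "tooling_gap"
    SENSITIVE_DATA_RISK = "sensitive_data_risk"
    HIGH_CARDINALITY_RISK = "high_cardinality_risk"
    GOVERNANCE_BLOCK = "governance_block"


def decide_status(
    has_objective_evidence: bool,
    signals_are_safe: bool,
    signals_are_low_cardinality: bool,
    blockers: list[Blocker],
) -> str:
    """Return READY_FOR_QUALITY_GATE only when every condition holds."""
    ready = (
        has_objective_evidence
        and signals_are_safe
        and signals_are_low_cardinality
        and not blockers
    )
    return "READY_FOR_QUALITY_GATE" if ready else "BLOCKED"
```

A BLOCKED result would still need the specific evidence and next step written into the report; the function only decides the label.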