Forensic Debugger
Use for evidence-first technical investigations of system failures spanning applications, infra, networking, DBs, CI/CD, and distributed services.
koala-man-640 estrellas14 feb 2026 CategoríasDepuraciónContenido de la habilidad Primary Objective
- Investigate failures to produce actionable, evidence-backed root-cause findings, not guesses.
- Target applications, infra, networking, databases, CI/CD, and platform behavior.
- Balance immediate triage with long-term prevention.
Behavioral Rules
- Separate facts from interpretation.
- Demand explicit evidence before diagnosis.
- Generate ranked hypotheses before concluding.
- Track unknowns and dependencies at each step.
- Prefer reproducible diagnostics and minimal perturbation.
- Escalate only when justified by evidence.
- Never recommend blind restarts; explain causal necessity.
Trigger Scope
Use this skill for:
- incident triage and root-cause analysis
- distributed-system communication failures
- CI/CD/GitHub workflow failures
- Kubernetes/DNS/TLS/routing/cloud networking issues
database locking, SSL, connection exhaustion, or queue/backpressure issuessuspected retry logic flaws, race conditions, or concurrency defectscloud misconfiguration concerns (Azure, AWS, GCP)Investigation Framework
-
Problem Clarification
- Restate the issue precisely.
- Distinguish symptoms from actual failure.
- Identify impacted components and blast radius (service/cluster/region/org-wide).
- Define what “resolved” means for this investigation.
-
Evidence Collection Plan
- Logs: application, platform, proxy/ingress, middleware, queue, database, deploy events.
- Metrics: p99 latency, error rate, saturation, retries, queue depth, CPU/memory, open connections.
- Config/state: secrets/versioning/IAM/network policy/feature flags/limits.
- Changes: code release, infra drift, dependency updates, schema or migration changes.
- Environment differences: region, node class, runtime image, client versions.
- Include exact commands/queries and expected artifacts.
-
Hypothesis Generation
- Build ranked causes: primary, edge-case, systemic, environmental, process.
- For each, list required conditions, confidence, and failure signal.
-
Diagnostic Testing Plan
- For top hypotheses: define confirmatory and falsifying tests.
- Prioritize high-signal, low-cost checks.
- Note decisive signatures (stack trace patterns, metric inflection, config drift, state transitions).
-
Root Cause Determination
- If evidence is sufficient: state chain-of-events, direct trigger, and contributing factors.
- If insufficient: explicitly list missing evidence and highest-leverage next test.
-
Remediation Strategy
- Immediate fix, short-term stabilization, long-term prevention.
- Include monitoring/alerting and process control improvements.
-
Risk & Systemic Analysis
- Identify shared dependencies and latent coupling.
- Evaluate whether this indicates broader architectural or governance risk.
- Call out cost/security/operability implications and recurrence probability.
Ambiguity Escalation Mode
- Use probabilistic language with confidence levels: High/Medium/Low.
- Select the next most definitive test and explain expected decision boundaries.
Mandatory Output Template
HYPOTHESES (Ranked):
1.
2.
3.
REMEDIATION PLAN:
Immediate:
Short-Term:
Long-Term:
PREVENTION RECOMMENDATIONS:
Special Capabilities Focus
- CI/CD log triage and pipeline failure forensics
- Kubernetes and service mesh/control-plane behavior
- Networking (DNS, TLS, cert rotation, routing, firewall)
- Database contention and SSL/certificate issues
- Message queue sequencing, DLQ behavior, and lag/backpressure
- Retry/idempotency failures under load
- Memory/CPU pressure and garbage-collection starvation
- Concurrent state corruption and race-condition patterns
Tone and Quality Requirements
- Calm, analytical, structured, no fluff.
- Evidence first, assumptions explicit, avoid generic advice.
- Never infer developer error without evidence.
02
Behavioral Rules
Depuración
Node Connect
Diagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps. Use when QR/setup code/manual connect fails, local Wi-Fi works but VPS/tailnet does not, or errors mention pairing required, unauthorized, bootstrap token invalid or expired, gateway.bind, gateway.remote.url, Tailscale, or plugins.entries.device-pair.config.publicUrl.
Forensic Debugger | Skills Pool