Run a structured incident post-mortem using the Incident Response Commander methodology. Use when a production incident, bug, or operational failure occurs in the Hermes trading system. Produces severity classification, full timeline, 5 Whys, root cause analysis, action items, and lessons learned.
- SEV1: Full outage, data loss risk, or security breach. Respond within 5 minutes.
- SEV2: Financial discrepancy, degraded service, or key feature down. Respond within 15 minutes.
- SEV3: Minor feature broken, workaround available. Respond within 1 hour.
- SEV4: Cosmetic issue. Respond by the next business day.
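
The severity levels above can be sketched as a small lookup plus a classifier. This is only an illustration: the names `Severity`, `RESPONSE_TARGET`, and `classify`, and the boolean flags it takes, are hypothetical and not part of the Hermes codebase.

```python
from datetime import timedelta
from enum import Enum


class Severity(Enum):
    SEV1 = 1
    SEV2 = 2
    SEV3 = 3
    SEV4 = 4


# Response targets taken from the severity list. SEV4 is "next business day",
# which is not a fixed duration, so it is stored as None.
RESPONSE_TARGET = {
    Severity.SEV1: timedelta(minutes=5),
    Severity.SEV2: timedelta(minutes=15),
    Severity.SEV3: timedelta(hours=1),
    Severity.SEV4: None,
}


def classify(full_outage=False, data_loss_risk=False, security_breach=False,
             financial_discrepancy=False, degraded_service=False,
             key_feature_down=False, minor_feature_broken=False):
    """Return the highest severity whose criteria match (hypothetical flags)."""
    if full_outage or data_loss_risk or security_breach:
        return Severity.SEV1
    if financial_discrepancy or degraded_service or key_feature_down:
        return Severity.SEV2
    if minor_feature_broken:
        return Severity.SEV3
    return Severity.SEV4
```

Checking the most severe criteria first means an incident that matches several levels is always classified at the highest one.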
Before calling this skill, collect as much of the following as is available:
- Exact timestamps (UTC) of: discovery, first symptom, root cause trigger
- Financial impact: PnL, fees, position sizes
- Systems affected: which DBs, exchanges, tokens
- What was observed vs. what was expected
- What actions were taken to resolve
- Any pre-existing issues that contributed
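
One way to keep the checklist above honest is to gather the data into a single structure and list what is still missing before invoking the skill. The sketch below assumes a hypothetical `IncidentContext` container; its field names are invented for illustration and mirror the bullets one to one.

```python
from dataclasses import dataclass, field, fields
from datetime import datetime
from typing import List, Optional


@dataclass
class IncidentContext:
    # Timestamps should be timezone-aware UTC datetimes.
    discovered_at: Optional[datetime] = None
    first_symptom_at: Optional[datetime] = None
    root_cause_trigger_at: Optional[datetime] = None
    financial_impact: Optional[str] = None       # PnL, fees, position sizes
    systems_affected: List[str] = field(default_factory=list)
    observed_vs_expected: Optional[str] = None
    resolution_actions: List[str] = field(default_factory=list)
    preexisting_issues: List[str] = field(default_factory=list)

    def missing(self) -> List[str]:
        """Names of fields that are still unset or empty."""
        result = []
        for f in fields(self):
            value = getattr(self, f.name)
            if value is None or value == []:
                result.append(f.name)
        return result
```

Because the checklist asks only for what is available, `missing()` reports gaps rather than raising an error.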
Use skill: incident-report
Provide: all gathered incident data as context
Output location: /root/.hermes/reports/

If the skill system is down, use this markdown template directly:
# POST-MORTEM: [Incident Title]
**Incident ID:** INC-[YEAR]-[###]
**Date:** YYYY-MM-DD
**Severity:** SEV[1-4]
**Status:** [Open/Resolved]
**Author:** [Name]
## Executive Summary
[2-3 sentences: what happened, impact, resolution]
## Timeline (UTC)
| Time | Event |
|------|-------|
| HH:MM | [First symptom detected] |
| HH:MM | [Root cause identified] |
| HH:MM | [Mitigation applied] |
| HH:MM | [Resolved] |
## Impact
- Financial: [$X lost / $0]
- Data Integrity: [description]
- Decision Quality: [description]
- Duration: [X hours]
## 5 Whys
1. Why did [symptom]? → [answer]
2. Why did [answer 1]? → [answer]
3. Why did [answer 2]? → [answer]
4. Why did [answer 3]? → [answer]
5. Why did [answer 4]? → [root systemic issue]
## Root Cause
[Technical explanation of failure chain]
## Action Items
| ID | Action | Owner | Priority | Due | Status |
|----|--------|-------|----------|-----|--------|
| 1 | ... | TBD | P1 | ... | Open |
## Lessons Learned
1. [Key takeaway]
2. [Key takeaway]
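
The Timeline table and the Duration field in the template above can be filled in from a list of timestamped events. The following is a minimal sketch: `render_timeline` and `duration_hours` are hypothetical helpers, and they assume every timestamp is timezone-aware.

```python
from datetime import datetime, timezone


def render_timeline(events):
    """Render (timestamp, description) pairs as the template's UTC table.

    Timestamps must be timezone-aware; they are converted to UTC and sorted.
    """
    rows = ["| Time | Event |", "|------|-------|"]
    for ts, text in sorted(events, key=lambda e: e[0]):
        rows.append(f"| {ts.astimezone(timezone.utc):%H:%M} | {text} |")
    return "\n".join(rows)


def duration_hours(events):
    """Hours between the earliest and the latest event."""
    times = [ts for ts, _ in events]
    return (max(times) - min(times)).total_seconds() / 3600


# Made-up example data, for illustration only.
example = [
    (datetime(2024, 1, 1, 12, 30, tzinfo=timezone.utc), "Resolved"),
    (datetime(2024, 1, 1, 10, 0, tzinfo=timezone.utc), "First symptom detected"),
]
print(render_timeline(example))
print(duration_hours(example))
```

Sorting inside the renderer lets events be recorded in the order they are discovered during the investigation, which is rarely chronological.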
- `get_open_positions()` returns fewer positions than HL shows → phantom
- `compact_rounds` counter always 0 for a token that should be compacting → field name bug
- `last_move_at` keeps resetting without price progress → stale timer design issue

All incident reports go to: /root/.hermes/reports/
Naming convention: INC-[YEAR]-[###]-[short-title].md
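
The naming convention can be applied mechanically. The sketch below rests on two assumptions the text does not confirm: that `###` means a zero-padded three-digit sequence number, and that the short title is lowercased and hyphenated. The helper names are hypothetical.

```python
import re
from pathlib import PurePosixPath

REPORTS_DIR = PurePosixPath("/root/.hermes/reports")


def slugify(title):
    """Lowercase the title and collapse non-alphanumeric runs into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def report_path(year, seq, title):
    """Build the report path following INC-[YEAR]-[###]-[short-title].md."""
    return REPORTS_DIR / f"INC-{year}-{seq:03d}-{slugify(title)}.md"


def next_sequence(existing_names, year):
    """Next unused sequence number for the year, given existing file names."""
    pattern = re.compile(rf"^INC-{year}-(\d{{3}})-")
    used = [int(m.group(1)) for n in existing_names if (m := pattern.match(n))]
    return max(used, default=0) + 1
```

Deriving the next number from the file names already in the reports directory avoids keeping a separate counter, at the cost of assuming no reports are ever deleted or renamed.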