Post-trade forensics and structural edge decay detection for intraday strategies. Compares expected vs realized slippage, hit rate, and net alpha. Monitors parameter drift, regime dependency stability, and fill rate deterioration. Detects microstructure regime change, edge crowding, and latency disadvantage emergence. Triggers strategy quarantine, risk scaling reduction, and hypothesis revalidation. Use when analyzing post-trade performance, diagnosing alpha decay, investigating execution degradation, auditing strategy health, or reasoning about edge longevity, crowding dynamics, or structural regime shifts.
No strategy runs on autopilot. This layer continuously validates that the structural edge a strategy exploits still exists, still converts to PnL after costs, and has not been arbitraged away or rendered obsolete by microstructure regime change. Every deployed strategy is guilty of decay until proven otherwise.
Inherits platform invariants 3 (evidence over intuition) and 4 (decay is the default). Additionally:
Continuous comparison of live execution outcomes against backtest model predictions. Divergence is the primary decay signal.
Ownership boundary: The live-execution skill monitors slippage in real-time (rolling 20-trade windows) and triggers immediate safety responses (throttle, circuit breaker). This skill performs deeper forensic analysis over longer windows (50–200 trades), detects structural trends, and determines whether slippage drift indicates edge decay vs. transient execution degradation. The same distinction applies to fill rate monitoring.
expected_slippage = backtest_slippage_model(size, spread, volatility)
realized_slippage = side_sign * (fill_price - reference_price_at_signal_time)  # side_sign: +1 buy, -1 sell; positive = adverse
slippage_residual = realized_slippage - expected_slippage
| Metric | Window | Alert | Escalation |
|---|---|---|---|
| Mean slippage residual | Rolling 50 trades | > 1.5 bps | Investigate fill model calibration |
| Mean slippage residual | Rolling 200 trades | > 2.5 bps | Recalibrate fill model; reduce sizing |
| Slippage residual trend | 5-day OLS slope | Positive slope, p < 0.05 | Structural cost increase; re-evaluate edge |
| Slippage asymmetry | Entry vs exit split | Systematic one-side bias | Adverse selection investigation |
| Slippage by spread regime | Stratified by tight/normal/wide | Regime-dependent degradation | Update regime-conditional cost model |
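
The rolling residual checks in the table above can be sketched roughly as follows. This is a minimal illustration, not the platform's implementation: the names `realized_slippage_bps` and `RollingResidual` are hypothetical, slippage is assumed to be signed by side so that a positive value always means a worse fill, and values are assumed to be in basis points to match the alert thresholds.

```python
from collections import deque
from statistics import fmean


def realized_slippage_bps(side: str, fill_price: float, reference_price: float) -> float:
    """Side-adjusted slippage in bps; positive means the fill was worse than the reference."""
    if side not in ("buy", "sell"):
        raise ValueError(f"unknown side: {side!r}")
    sign = 1.0 if side == "buy" else -1.0
    return sign * (fill_price - reference_price) / reference_price * 1e4


class RollingResidual:
    """Rolling mean of (realized - expected) slippage over the last `window` trades."""

    def __init__(self, window: int, alert_bps: float) -> None:
        self.window = window
        self.alert_bps = alert_bps
        self.residuals = deque(maxlen=window)  # oldest residual drops out automatically

    def update(self, realized_bps: float, expected_bps: float) -> None:
        self.residuals.append(realized_bps - expected_bps)

    def mean(self) -> float:
        return fmean(self.residuals) if self.residuals else 0.0

    def alert(self) -> bool:
        # Assumption: only alert once the window is full, to limit small-sample noise.
        return len(self.residuals) == self.window and self.mean() > self.alert_bps
```

Under these assumptions, the 50-trade row would correspond to `RollingResidual(window=50, alert_bps=1.5)` and the 200-trade row to `RollingResidual(window=200, alert_bps=2.5)`.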
expected_hit_rate = backtest_hit_rate(regime, signal_type, spread_bucket)
realized_hit_rate = winning_trades / total_trades (rolling window)
hit_rate_residual = realized_hit_rate - expected_hit_rate
| Metric | Window | Alert | Escalation |
|---|---|---|---|
| Hit rate residual | Rolling 100 trades | < -5pp | Warning; log |
| Hit rate residual | Rolling 100 trades | < -10pp | Reduce allocation 50% |
| Hit rate trend | 5-day OLS slope | Negative slope, p < 0.05 | Edge decay investigation |
| Hit rate by regime | Stratified by vol/spread regime | Regime-specific collapse | Regime dependency audit |
| Win/loss asymmetry shift | Avg win / avg loss ratio drift | > 20% relative change | Payoff structure degradation |
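
As a rough illustration of the two hit rate residual rows above, the sketch below converts a rolling win count into a residual in percentage points and maps it to the listed response. The function names and the returned action strings are hypothetical labels chosen for this example.

```python
def hit_rate_residual_pp(wins: int, total: int, expected_hit_rate: float) -> float:
    """Realized minus expected hit rate, in percentage points."""
    if total <= 0:
        raise ValueError("total must be positive")
    return (wins / total - expected_hit_rate) * 100.0


def hit_rate_action(residual_pp: float) -> str:
    """Map a rolling 100-trade residual to the response listed in the table."""
    if residual_pp < -10.0:
        return "reduce_allocation_50pct"
    if residual_pp < -5.0:
        return "warning"
    return "ok"
```

For example, 46 wins in 100 trades against an expected hit rate of 0.54 gives a residual of about -8pp, which falls in the warning band.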
gross_alpha = raw_return - benchmark_return
net_alpha = gross_alpha - transaction_costs - slippage - market_impact
alpha_erosion = gross_alpha_backtest - net_alpha_live
| Metric | Condition | Action |
|---|---|---|
| Net alpha (rolling 5 days) | < 0 after costs | Warning; begin decay investigation |
| Net alpha (rolling 10 days) | < 0 after costs | Quarantine evaluation |
| Gross-to-net conversion ratio | < 0.5 (live) vs backtest ratio | Cost model recalibration |
| Alpha half-life | Decreasing across successive deployment periods | Structural decay confirmed |
| Alpha by time-of-day | Concentration in narrowing windows | Edge fragility increasing |
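
The gross and net alpha definitions, together with the gross-to-net conversion ratio from the table, might be computed along these lines. `AlphaBreakdown` is a hypothetical container, all inputs are assumed to share one unit (for example bps per period), and returning `None` when gross alpha is not positive is a choice made for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AlphaBreakdown:
    raw_return: float
    benchmark_return: float
    transaction_costs: float
    slippage: float
    market_impact: float

    @property
    def gross_alpha(self) -> float:
        return self.raw_return - self.benchmark_return

    @property
    def net_alpha(self) -> float:
        return self.gross_alpha - self.transaction_costs - self.slippage - self.market_impact

    @property
    def conversion_ratio(self) -> Optional[float]:
        # The ratio is undefined when there is no positive gross alpha to convert.
        if self.gross_alpha <= 0:
            return None
        return self.net_alpha / self.gross_alpha
```

With a raw return of 12, a benchmark return of 2, and costs of 3, 2, and 1, gross alpha is 10, net alpha is 4, and the conversion ratio is 0.4.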
Continuous tracking of strategy internals for drift from calibrated baselines.
Track stability of optimized parameters relative to their calibration window.
| Parameter Class | Detection Method | Threshold |
|---|---|---|
| Signal coefficients | Rolling re-estimation vs deployed values | > 2 sigma shift from calibration mean |
| Optimal holding period | Rolling exit-timing analysis | > 30% change from calibrated value |
| Entry threshold | Rolling ROC curve, optimal cutoff drift | AUC degradation > 5% |
| Position sizing scalar | Realized vol vs assumed vol divergence | > 25% persistent divergence |
| Cost model parameters | Realized vs modeled cost distribution | KS test p < 0.05 |
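
One way to express the "2 sigma shift from calibration mean" check for signal coefficients is sketched below. It assumes the calibration window yields a list of re-estimated values for the parameter; `sigma_shift` and `has_drifted` are hypothetical names, and the sample standard deviation is used as the scale.

```python
from statistics import fmean, stdev
from typing import Sequence


def sigma_shift(calibration_estimates: Sequence[float], current_estimate: float) -> float:
    """Distance of the current estimate from the calibration mean, in calibration sigmas."""
    if len(calibration_estimates) < 2:
        raise ValueError("need at least two calibration estimates")
    mu = fmean(calibration_estimates)
    sd = stdev(calibration_estimates)
    if sd == 0:
        raise ValueError("calibration estimates have zero variance")
    return (current_estimate - mu) / sd


def has_drifted(calibration_estimates: Sequence[float], current_estimate: float,
                threshold_sigma: float = 2.0) -> bool:
    """True when the shift exceeds the threshold in either direction."""
    return abs(sigma_shift(calibration_estimates, current_estimate)) > threshold_sigma
```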
Parameter drift detection protocol:
Ownership boundary: The risk-engine skill owns real-time regime detection and immediate risk responses to transitions. This skill audits whether regime classifications remain accurate over time and whether alpha is concentrated in, or collapsing across, regimes.
Verify that the strategy's performance is not collapsing into a single regime or losing effectiveness as regimes transition.
| Check | Method | Failure Condition |
|---|---|---|
| Cross-regime alpha | Stratify PnL by spread/vol/liquidity regime | Alpha concentrated in a single regime (was distributed) |
| Regime transition PnL | Measure PnL during regime transitions | Systematic losses on transitions (model lag) |
| Regime dwell time sensitivity | Compare PnL in short vs long regime episodes | Strategy requires unrealistic regime persistence |
| Regime frequency shift | Track regime transition rate vs historical | Transition rate change > 50% (market structure shift) |
| Regime misclassification rate | Compare predicted vs realized regime labels | Classification error > 20% |
expected_fill_rate = fill_model.predicted_rate(order_type, spread_regime, queue_model)
realized_fill_rate = fills / submissions (rolling window, by order type)
fill_rate_drift = realized_fill_rate - expected_fill_rate
| Metric | Window | Alert | Escalation |
|---|---|---|---|
| Passive fill rate drift | Rolling 200 orders | < -10% relative | Warning; log |
| Passive fill rate drift | Rolling 200 orders | < -20% relative | Shift to more aggressive order types |
| Aggressive fill rate drift | Rolling 200 orders | < -5% relative | Broker/venue investigation |
| Fill rate by time-of-day | Hourly buckets | Systematic degradation in key windows | Edge timing shift |
| Partial fill rate increase | Rolling 200 orders | > 30% relative increase | Liquidity withdrawal detection |
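
The passive fill rate rows can be read as a relative shortfall of the realized rate against the expected rate. The sketch below works under that reading (an alert fires when the realized rate is more than 10% or 20% below expected, in relative terms); the function names and action strings are hypothetical.

```python
def relative_fill_rate_drift(fills: int, submissions: int, expected_fill_rate: float) -> float:
    """(realized - expected) / expected; negative values mean fewer fills than modeled."""
    if submissions <= 0:
        raise ValueError("submissions must be positive")
    if expected_fill_rate <= 0:
        raise ValueError("expected_fill_rate must be positive")
    realized = fills / submissions
    return (realized - expected_fill_rate) / expected_fill_rate


def passive_fill_action(relative_drift: float) -> str:
    """Map relative drift over the rolling 200-order window to the listed response."""
    if relative_drift < -0.20:
        return "shift_to_more_aggressive_orders"
    if relative_drift < -0.10:
        return "warning"
    return "ok"
```

For instance, 102 fills from 200 passive submissions against an expected rate of 0.60 is a relative drift of roughly -15%, which lands in the warning band.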
Identify environmental shifts that threaten edge viability at a structural level — beyond parameter drift.
Detect shifts in market microstructure that invalidate strategy assumptions.
| Signal | Observable | Detection |
|---|---|---|
| Spread regime shift | Median quoted spread (rolling 5 days) | > 30% change from calibration period |
| Quote update frequency change | Quotes per second distribution | KS test p < 0.01 vs calibration |
| Trade size distribution shift | Mean/median trade size | > 25% persistent change |
| Tick-to-trade ratio change | Quotes per trade | > 30% change from calibration |
| Venue composition shift | Proportion of trades by exchange | Herfindahl index change > 0.1 |
| Intraday volume profile shift | Volume-by-minute curve | Correlation with historical profile < 0.8 |
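
For the venue composition row, the Herfindahl index is the sum of squared venue shares. A minimal sketch follows, assuming trade counts per venue are available as a mapping; the venue labels and function names are illustrative only.

```python
from typing import Mapping


def herfindahl(trade_counts: Mapping[str, float]) -> float:
    """Sum of squared venue shares; 1.0 means all trades on a single venue."""
    total = sum(trade_counts.values())
    if total <= 0:
        raise ValueError("no trades to compute shares from")
    return sum((count / total) ** 2 for count in trade_counts.values())


def venue_shift_detected(baseline: Mapping[str, float], current: Mapping[str, float],
                         threshold: float = 0.1) -> bool:
    """True when the index moved by more than `threshold` since calibration."""
    return abs(herfindahl(current) - herfindahl(baseline)) > threshold
```

A move from an even two-venue split (index 0.50) to an 80/20 split (index 0.68) is a change of 0.18 and would trip the 0.1 threshold.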
Microstructure change is the most dangerous decay vector because it invalidates the causal mechanism, not just the parameters.
Detect when the same structural edge is being exploited by competing participants, compressing returns.
| Symptom | Observable | Interpretation |
|---|---|---|
| Alpha decay with stable signal quality | Hit rate stable but profit-per-trade declining | Others trading same pattern; capturing spread faster |
| Adverse selection increase | More fills on losing trades; fewer on winning | Informed flow front-running your entry |
| Quote anticipation | NBBO moves against you between signal and fill more frequently | Faster participants reacting to same signal |
| Correlation with known factors | Strategy returns correlating with published microstructure factors | Academic/industry crowding |
| Entry timing compression | Profitable window after signal shrinking | Competing execution at same entry point |
| Execution shortfall growth | Implementation shortfall increasing while signal alpha stable | Speed disadvantage relative to crowd |
Crowding is confirmed when signal quality (pre-cost) remains stable but post-execution alpha erodes. If signal quality itself degrades, the mechanism may be structurally exhausted rather than crowded.
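
The distinction drawn above between crowding and structural exhaustion could be encoded as a simple first-pass triage. This is only a sketch: both inputs are assumed to be relative changes against the calibration baseline, the 10% tolerance is a hypothetical placeholder rather than a documented threshold, and the function returns a suspicion to investigate, not a confirmation.

```python
def classify_decay(signal_quality_change: float, net_alpha_change: float,
                   tolerance: float = 0.10) -> str:
    """Inputs are relative changes vs. baseline, e.g. -0.25 means down 25%."""
    if signal_quality_change < -tolerance:
        # Pre-cost signal quality itself has degraded.
        return "possible_structural_exhaustion"
    if net_alpha_change < -tolerance:
        # Signal quality is stable but post-execution alpha has eroded.
        return "crowding_suspected"
    return "no_decay_detected"
```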
Detect when execution latency moves from irrelevant to alpha-destructive.
| Metric | Baseline | Alert |
|---|---|---|
| Signal-to-fill alpha decay curve | Alpha(t) function calibrated at deployment | Slope steepening > 2x baseline |
| Latency-stratified PnL | PnL binned by fill latency | Profitable only in fastest quintile |
| Market move during order flight | NBBO displacement between submit and fill | Systematic adverse displacement |
| Queue position deterioration | Inferred queue position at fill time | Consistently back-of-queue |
| Cancel-replace race losses | Modify attempts filled at stale price | Increasing rate of modification failures |
If alpha becomes latency-dependent when it was previously latency-insensitive, the edge has migrated to a speed game. This is a structural disqualification for L1-latency infrastructure.
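
The latency-stratified PnL check can be illustrated by sorting trades by fill latency, splitting them into five equal-count bins, and summing PnL per bin. The sketch assumes each trade is a `(latency_ns, pnl)` pair, for example derived from `fill_timestamp_ns - submit_timestamp_ns` and `realized_pnl`; the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def pnl_by_latency_quintile(trades: Sequence[Tuple[int, float]]) -> List[float]:
    """Total PnL per latency quintile, fastest quintile first."""
    n = len(trades)
    if n < 5:
        raise ValueError("need at least five trades to form quintiles")
    ordered = sorted(trades, key=lambda t: t[0])
    bins = []
    for q in range(5):
        lo, hi = q * n // 5, (q + 1) * n // 5
        bins.append(sum(pnl for _, pnl in ordered[lo:hi]))
    return bins


def profitable_only_in_fastest(bins: Sequence[float]) -> bool:
    """Alert condition: positive PnL in the fastest quintile and none elsewhere."""
    return bins[0] > 0 and all(b <= 0 for b in bins[1:])
```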
Forensic findings feed into concrete interventions. No finding is informational-only — each maps to an action with defined thresholds.
Temporary removal from live capital. The strategy continues to receive signals and generate paper trades for comparison but executes no real orders.
| Trigger | Evidence Required | Duration |
|---|---|---|
| Net alpha < 0 for 10 consecutive trading days | PnL attribution showing cost > gross alpha | Until root cause identified and remediated |
| Hit rate collapse (< -15pp from expected) | Statistical significance, p < 0.01 | Until hit rate recovers on paper or hypothesis updated |
| Structural microstructure change detected | 2+ microstructure metrics past alert threshold | Until strategy re-validated on new regime data |
| Edge crowding confirmed | Crowding scorecard (3+ symptoms present) | Until differentiation re-established or strategy retired |
| Unexplained PnL divergence (live vs paper) | PnL compression ratio < 0.3 for 5 days | Until execution path audited and divergence explained |
Quarantine protocol:
Gradual reduction of capital allocation without full quarantine.
| Condition | Scaling Action |
|---|---|
| 1 decay metric at alert level | Reduce to 75% allocation |
| 2 decay metrics at alert level | Reduce to 50% allocation |
| 3+ decay metrics at alert level | Reduce to 25% allocation |
| Any metric at escalation level | Reduce to 25% or quarantine |
| Slippage + hit rate + fill rate all degraded | Quarantine (triple failure) |
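
The scaling table can be sketched as a single lookup. The function name is hypothetical, an allocation of 0.0 stands for quarantine, and for the "25% or quarantine" row this sketch picks 25%; the choice between the two is left to whoever applies the policy.

```python
def scaling_level(alert_count: int, escalation_count: int = 0,
                  triple_failure: bool = False) -> float:
    """Target allocation as a fraction of full size; 0.0 means quarantine."""
    if triple_failure:
        # Slippage, hit rate, and fill rate all degraded.
        return 0.0
    if escalation_count > 0 or alert_count >= 3:
        return 0.25
    if alert_count == 2:
        return 0.50
    if alert_count == 1:
        return 0.75
    return 1.0
```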
Scaling changes are:
When forensic evidence challenges the original strategy hypothesis, trigger a structured re-evaluation.
| Revalidation Trigger | Required Analysis |
|---|---|
| Alpha source shift (time-of-day, regime) | Re-run research protocol on recent data; compare to original |
| Microstructure mechanism change | Re-derive signal from first principles on current market structure |
| Parameter drift beyond 2-sigma | Re-optimize on walk-forward window; compare to deployed params |
| Crowding confirmed | Assess whether differentiation possible; if not, retire |
| Cost structure change | Re-evaluate minimum alpha threshold; sensitivity analysis |
Revalidation follows the research protocol from microstructure-alpha:
Generated daily and on-demand. Contains:
{
  "strategy_id": str,
  "report_date": date,
  "deployment_age_days": int,
  "health_status": "healthy" | "warning" | "degraded" | "quarantined",
  "compare": {
    "slippage": { expected, realized, residual, trend, p_value },
    "hit_rate": { expected, realized, residual, trend, p_value },
    "alpha": { gross, net, conversion_ratio, half_life_estimate }
  },
  "monitor": {
    "parameter_drift": { param: { deployed, current, sigma_shift } },
    "regime_stability": { regime: alpha_contribution },
    "fill_rate": { expected, realized, drift, by_order_type }
  },
  "detect": {
    "microstructure_change": { metric: { baseline, current, alert_level } },
    "crowding_score": { symptom_count, symptoms_present[], confidence },
    "latency_disadvantage": { alpha_decay_slope, latency_pnl_correlation }
  },
  "trigger": {
    "active_interventions": [],
    "scaling_level": float,
    "quarantine_status": bool,
    "revalidation_pending": bool
  }
}
Maintain a longitudinal record of edge quality for each strategy:
This timeline is the primary artifact for strategy lifecycle decisions (scale, maintain, reduce, retire).
Forensic analysis consumes data from existing typed artifacts:
| Source | Type | Location |
|---|---|---|
| Trade lifecycle records | TradeRecord | storage/trade_journal.py — TradeJournal.query() |
| Position changes | PositionUpdate event | core/events.py — published on bus at M9 |
| Execution acks | OrderAck with OrderAckStatus | core/events.py — fill_price, filled_quantity |
| Risk decisions | RiskVerdict with RiskAction | core/events.py — published at M5 and M6 |
| System state changes | StateTransition event | core/events.py — SM audit trail |
TradeRecord carries the full decision chain: order_id, symbol,
strategy_id, side, requested_quantity, filled_quantity,
fill_price, signal_timestamp_ns, submit_timestamp_ns,
fill_timestamp_ns, slippage_bps, fees, realized_pnl,
and correlation_id — linking each trade to the signal that caused it.
Forensic findings are delivered via Alert events (core/events.py)
with AlertSeverity levels. The AlertManager protocol
(monitoring/alerting.py) routes them based on severity.
The following forensic-specific event types are NOT YET IMPLEMENTED.
When built, they must extend the Event base class (core/events.py)
to inherit timestamp_ns, correlation_id, and sequence provenance:
| Future Event | Payload |
|---|---|
| FORENSIC_ALERT | strategy_id, metric, category, current_value, threshold, severity |
| DECAY_DETECTED | strategy_id, decay_type, evidence, confidence, recommended_action |
| QUARANTINE_INITIATED | strategy_id, trigger_reasons[], positions_flattened, paper_mode_active |
| QUARANTINE_LIFTED | strategy_id, revalidation_evidence, new_parameters |
| SCALING_ADJUSTED | strategy_id, old_level, new_level, triggering_metrics[] |
| REVALIDATION_REQUESTED | strategy_id, trigger, original_hypothesis, required_analysis[] |
| HEALTH_REPORT | strategy_id, full_report_payload |
| STRATEGY_RETIRED | strategy_id, retirement_reason, forensic_record_id |
Every event carries a timestamp from the injectable Clock protocol
(never raw datetime.now()).
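
To illustrate why an injectable clock matters for forensic events, here is a minimal sketch. The real Clock protocol and Event base class live in the codebase and are not reproduced here: the method name `now_ns`, the `SystemClock` and `FixedClock` classes, and the standalone `ForensicAlert` dataclass (which does not extend the real Event base) are all assumptions made for this example.

```python
import time
from dataclasses import dataclass
from typing import Protocol


class Clock(Protocol):
    """Hypothetical shape of the injectable clock; the real protocol may differ."""

    def now_ns(self) -> int: ...


class SystemClock:
    def now_ns(self) -> int:
        return time.time_ns()


@dataclass
class FixedClock:
    """Deterministic clock for tests and replay."""

    t_ns: int

    def now_ns(self) -> int:
        return self.t_ns

    def advance(self, delta_ns: int) -> None:
        self.t_ns += delta_ns


@dataclass(frozen=True)
class ForensicAlert:
    """Illustrative payload only; a real event would extend the Event base class."""

    timestamp_ns: int
    strategy_id: str
    metric: str
    current_value: float
    threshold: float


def make_alert(clock: Clock, strategy_id: str, metric: str,
               current_value: float, threshold: float) -> ForensicAlert:
    # The timestamp comes from the injected clock, never from datetime.now().
    return ForensicAlert(clock.now_ns(), strategy_id, metric, current_value, threshold)
```

Because the clock is passed in, a replay or unit test can use `FixedClock` and obtain identical timestamps on every run, while production code passes `SystemClock`.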
| Failure | Detection | Response |
|---|---|---|
| Stale forensic data (no new trades) | Trade count below minimum per window | Use wider window; flag low-sample alert |
| Backtest baseline outdated | Calibration date > configured max age | Force recalibration before next assessment |
| False positive decay signal | Single metric spike without supporting evidence | Require 2+ corroborating metrics for escalation |
| False negative (missed decay) | Post-mortem reveals undetected degradation | Add detection rule; tighten thresholds |
| Forensic engine unavailable | Heartbeat monitor | Continue trading with last-known health status; alert ops |
| Regime classifier disagreement | Forensic vs risk engine regime labels diverge | Use more conservative classification; alert |
| Dependency | Interface |
|---|---|
| Live Execution (live-execution skill) | OrderAck with OrderAckStatus for fill analysis; MetricEvent for latency |
| Risk Engine (risk-engine skill) | RiskVerdict/RiskAction for constraint context; RiskLevel SM state |
| Backtest Engine (backtest-engine skill) | TradeRecord baselines from TradeJournal.query() |
| Microstructure Alpha (microstructure-alpha skill) | Signal schema for hypothesis revalidation; FeatureVector quality flags |
| Testing & Validation (testing-validation skill) | Sim-vs-live divergence metrics; promotion/demotion pipeline |
| Data Engineering (data-engineering skill) | NBBOQuote events via EventLog.replay() for historical analysis |
The forensic layer sits downstream of execution and upstream of strategy
lifecycle decisions. It consumes TradeRecord entries and Alert events,
compares against backtest baselines, and emits forensic alerts via the
Alert/AlertSeverity mechanism that feed into risk scaling and strategy
promotion/demotion decisions.