Improve forecast accuracy by applying Philip Tetlock's 10 evidence-based principles, distilled from research on superforecasters, the top 2% of predictors.
Philip Tetlock's multi-year Good Judgment Project identified "superforecasters": the top 2% of predictors, who consistently beat experts, pundits, and prediction markets. His 10 Commandments distill their practices into actionable principles: think probabilistically, update incrementally, beware biases, aggregate perspectives, track accuracy, embrace uncertainty.
Don't waste time on unpredictable questions ("Will aliens visit?") or trivially predictable ones ("Will sun rise tomorrow?"). Focus on questions in the "Goldilocks zone": difficult but not impossible.
Example: Skip: "Bitcoin's price in 2030" (too noisy). Focus: "Will a Bitcoin ETF be approved by the SEC within 18 months?" (resolvable, valuable).
Decompose vague questions into answerable sub-questions. Aggregate sub-forecasts to avoid "holistic" guessing.
Example: "Will fusion energy be commercially viable by 2040?" → Break into: "Will ITER succeed?" "Will private fusion reach breakeven?" "Will costs drop below $0.10/kWh?" "Will regulations allow deployment?"
Start with base rate (outside view): "What happened in similar cases?" Then adjust for unique details (inside view). Don't skip base rate, don't ignore specifics.
Example: "Will our startup succeed?" Outside view: 90% startups fail. Inside view: Strong team, proven traction. Adjusted forecast: 30% success (better than base, but not ignoring statistics).
Update beliefs when new evidence arrives, but don't whipsaw. Bayesian updating: Weight new evidence proportional to its quality and relevance.
Example: Forecast: 60% chance of product launch success. New evidence: One beta tester complains. Don't drop to 20% (overreacting). Update to 55% (slight adjustment for weak signal).
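The same odds-form update shows why a weak signal should only nudge the number; the likelihood ratio assigned to a single complaint is an assumed value:

```python
# Weight new evidence by its strength: weak evidence, small update.

prior = 0.60                  # current forecast: 60% launch success
lr_single_complaint = 0.8     # assumed likelihood ratio for one weak negative signal

posterior_odds = (prior / (1 - prior)) * lr_single_complaint
posterior = posterior_odds / (1 + posterior_odds)

print(f"Updated forecast: {posterior:.0%}")  # ~55%, not 20%
```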
Map competing forces pushing in opposite directions. Strong forecasters consider both bullish and bearish factors simultaneously.
Example: EV adoption forecast: Bullish forces (battery costs falling, climate policy). Bearish forces (charging infrastructure gaps, consumer range anxiety). Weigh both, don't cherry-pick.
Avoid round numbers (50%, 75%). Use granular probabilities (63%, 72%) when evidence supports it. Precision forces you to process information deeply.
Example: Don't say "probably" or default to a round "75%." Say "67% confident, based on 3 comparable precedents showing a 2/3 success rate."
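A small sketch of deriving a granular number from precedent counts; the Beta-prior adjustment is an optional extra (an assumption here, not part of the commandments) that keeps a 3-case sample from sounding more precise than it is:

```python
# Granularity from evidence: compute the number, don't round to 50/75.

successes, trials = 2, 3
raw_rate = successes / trials                 # 0.667 -> "67%"
tempered = (successes + 1) / (trials + 2)     # 0.600 with a uniform Beta(1,1) prior

print(f"Raw precedent rate: {raw_rate:.0%}")               # 67%
print(f"Small-sample tempered estimate: {tempered:.0%}")   # 60%
```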
Humans are overconfident on hard questions, underconfident on easy ones. Calibrate: Track "70% forecasts"—do they happen 70% of the time?
Example: "90% sure this feature ships on time." Track 50 such forecasts. Only 60% shipped on time. You're overconfident. Recalibrate "90%" to "60%."
When wrong, diagnose: Was it poor process (didn't check base rate), bad luck (correctly assessed 80% odds, hit the 20%), or missing information?
Example: Predicted 80% chance competitor wouldn't launch in Q1. They did. Diagnosis: Missed insider information (process error), not bad luck. Fix: Build better intelligence gathering.
Form diverse teams. Encourage dissent. Reward accuracy over ego. Practice intellectual humility.
Example: Amazon's "disagree and commit" norm: contrary views must be voiced before the team aligns, which surfaces blind spots.
Iterate: Forecast → Track → Analyze errors → Adjust process → Repeat. Superforecasters spend 50% more time than average forecasters on post-mortems.
Example: Quarterly review: "I forecasted 8 events at 70%, only 4 happened (50%). I'm overconfident. Next quarter, my 70% threshold requires stronger evidence."
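A small post-mortem sketch scoring that quarter with the Brier score (lower is better); the eight forecasts are the hypothetical ones from the example above:

```python
# Quarterly post-mortem: hit rate plus Brier score for eight 70% forecasts.

forecasts = [0.7] * 8                  # eight events forecast at 70%
outcomes = [1, 1, 1, 1, 0, 0, 0, 0]    # only four happened

brier = sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)
hit_rate = sum(outcomes) / len(outcomes)

print(f"Hit rate at 70%: {hit_rate:.0%}")  # 50% -> overconfident
print(f"Brier score: {brier:.3f}")         # 0.290, worse than 0.250 for a flat 50% guess
```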
Situation: A product manager is forecasting whether a new feature will increase retention by >10% within 3 months.
Application: Triage (answerable, resolves in 3 months); decompose into sub-questions (Will it ship on time? Will adoption hit target? Will engagement translate to retention?); anchor on the base rate of past launches that moved retention >10%; adjust for beta feedback; state a granular 55%; update incrementally as weekly cohort data arrives; log the forecast for the quarterly post-mortem.
Outcome: The feature increased retention by 12%, so the 55% "yes" forecast resolved yes (a close call). One outcome validates the process only weakly; judge calibration across ~100 tracked forecasts.