A comprehensive two-phase skill for designing multi-LLM research strategies (Phase 1) and consolidating multi-model outputs into actionable intelligence (Phase 2).
1. Purpose
This skill provides 9 core capabilities:
| # | Capability | Phase | Description |
|---|------------|-------|-------------|
| 1 | Decompose | 1 | Break research questions into MECE structures |
| 2 | Assign | 1 | Map question categories to optimal LLMs |
| 3 | Assess | 1 | Evaluate research risks at appropriate depth |
| 4 | Generate | 1 | Produce model-specific optimized prompts |
| 5 | Consolidate | 2 | Synthesize multi-model outputs into unified findings |
| 6 | Resolve | 2 | Handle conflicting information with WWHTBT protocol |
| 7 | Classify | 2 | Score evidence quality and tag uncertainty types |
| 8 | Detect | 2 | Identify coverage gaps and unknown unknowns |
| 9 | Produce | 2 | Generate tiered, decision-ready research reports |
Checkpoints
This skill uses interactive checkpoints (see references/checkpoints.yaml) to resolve ambiguity:
- research_type_classification — When the research type is ambiguous
- risk_depth_selection — When the risk assessment depth is not specified
- model_mode_selection — When the model execution mode is not specified
- hypothesis_priors_required — When multi_hypothesis is enabled but priors are missing
- conflict_resolution_approach — When model outputs have significant conflicts (Phase 2)
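A checkpoint entry in references/checkpoints.yaml might be structured like the sketch below; the field names (id, trigger, prompt, options, default) are illustrative assumptions, not the file's confirmed schema:

```yaml
# Hypothetical shape of one checkpoint entry; consult
# references/checkpoints.yaml for the actual schema.
- id: risk_depth_selection
  trigger: "Risk assessment depth not specified"
  prompt: "Which risk assessment depth should this research use?"
  options: [quick, standard, comprehensive]
  default: standard
```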
Model Strengths

| Model | Strengths | Best For | Weaknesses |
|---|---|---|---|
| — | Factual lookup, comprehensive sourcing, current data | — | Less depth on complex reasoning |
| GPT-5.2 Deep | Recency, depth, exhaustiveness | Technical details, narrow deep-dives, edge cases | Can miss broader context |
Default Category Assignments
| Research Type | Claude | Gemini | GPT |
|---|---|---|---|
| Market | Demand, Trends | Size, Structure, Supply | — |
| Competitive | Positioning, Strategy | Product, GTM, Org | Deep Dive |
| Technology | Fit, Risk | Maturity, Cost | Capability |
| Strategic | Options, Stakeholders | Environment | Implementation |
5. Risk Assessment Depths
Quick (5 Factors)
Basic risk identification for time-sensitive research:
- Top 3 risks with likelihood/impact
- No mitigations or scenarios
Standard (+ Bias Audit)
Adds mitigation planning and cognitive bias check:
- Mitigations and contingencies per risk
- Early warning signals
- Bias audit: confirmation, availability, anchoring
Comprehensive (+ Base Rates)
Full risk analysis with historical grounding:
- Risk scenarios with trigger conditions
- Risk dependencies and cascades
- Base rate comparison from similar research
- Pre-mortem analysis
6. MECE Decomposition Patterns
Pattern 1: Market Research
| Category | Focus | Model |
|---|---|---|
| Market Size & Dynamics | TAM/SAM/SOM, growth rates | Gemini |
| Market Structure | Segmentation, value chain | Gemini |
| Demand Characteristics | Buyers, use cases, criteria | Claude |
| Supply & Competition | Players, barriers, substitutes | Gemini |
| Market Evolution | Trends, regulatory, disruption | Claude |
Pattern 2: Competitive Intelligence
| Category | Focus | Model |
|---|---|---|
| Product & Offering | Features, pricing, roadmap | GPT |
| Customers & Positioning | Segments, win/loss, messaging | Claude |
| Go-to-Market | Sales, marketing, partnerships | Gemini |
| Organization & Operations | Team, tech stack, cost structure | Gemini |
| Strategy & Trajectory | Direction, investments, SWOT | Claude |
Pattern 3: Technology Evaluation
| Category | Focus | Model |
|---|---|---|
| Capability & Performance | Features, benchmarks, limits | GPT |
| Maturity & Ecosystem | Stability, community, tools | Gemini |
| Fit & Integration | Use case alignment, migration | Claude |
| Cost & Investment | TCO, licensing, infrastructure | Gemini |
| Risk & Governance | Technical, vendor, compliance | Claude |
Pattern 4: Strategic Research
| Category | Focus | Model |
|---|---|---|
| Current State | Position, strengths, weaknesses | Claude |
| External Environment | Industry, macro, technology | Gemini |
| Strategic Options | Directions, trade-offs, requirements | Claude |
| Stakeholder Considerations | Customer, competitor, employee | Claude |
| Implementation Requirements | Capabilities, investments, timeline | GPT |
7. Multi-Hypothesis Framing
When to Enable
- Testing predictions or forecasts
- Evaluating competing theories
- Decision involves a binary or multi-way choice
- Need to avoid confirmation bias
Process
1. Define the core question as a testable prediction
2. Generate 2-4 MECE hypotheses covering all outcomes
3. Assign prior probabilities (must sum to 100%)
4. Define supporting and refuting evidence for each
5. Research gathers evidence against the criteria
6. Update posteriors based on evidence strength
Example
```xml
<hypotheses question="Will enterprise adopt GenAI for customer service by 2027?">
  <hypothesis id="H1" position="broad" prior="30%">
    &gt;50% enterprise adoption
  </hypothesis>
  <hypothesis id="H2" position="selective" prior="50%">
    10-50% adoption in specific use cases
  </hypothesis>
  <hypothesis id="H3" position="limited" prior="20%">
    &lt;10% adoption due to barriers
  </hypothesis>
</hypotheses>
```
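The final Process step above (updating posteriors from evidence strength) can be sketched as a normalized Bayesian update. The likelihood numbers below are hypothetical, chosen only to illustrate the mechanics:

```python
# Minimal sketch of a posterior update, assuming priors are expressed as
# fractions and each hypothesis is given a likelihood P(evidence | hypothesis).

def update_posteriors(priors: dict[str, float],
                      likelihoods: dict[str, float]) -> dict[str, float]:
    """Bayes' rule: posterior is proportional to prior * likelihood,
    renormalized so the posteriors sum to 1."""
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalized.values())
    return {h: v / total for h, v in unnormalized.items()}

# Priors from the example above (30% / 50% / 20%).
priors = {"H1": 0.30, "H2": 0.50, "H3": 0.20}
# Hypothetical evidence strongly favoring selective adoption (H2).
likelihoods = {"H1": 0.2, "H2": 0.7, "H3": 0.1}
posteriors = update_posteriors(priors, likelihoods)
```

With these illustrative likelihoods the selective-adoption hypothesis strengthens while the other two weaken, which is exactly the behavior the update step calls for.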
8. Evidence Strength Tribunal
5-point scale for evaluating source quality:
| Score | Name | Definition | Examples |
|---|---|---|---|
| 5 | Primary | Direct from entity being researched | SEC filings, earnings calls, official docs |
| 4 | Auth. Secondary | Major analysts with citations | Gartner, Forrester, WSJ investigative |
| 3 | Credible Secondary | Reputable sources, some sourcing | TechCrunch, industry publications |
| 2 | Weak Secondary | Unsourced, outdated, anonymous | LinkedIn self-reports, old reports |
| 1 | Speculative | No verifiable basis | Rumors, predictions, fabrications |
Time Decay: Apply -1 for technology data older than 6 months or market data older than 1 year.
Reference: See references/evidence-strength-rubric.md for full scoring guidelines.
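The time-decay adjustment can be sketched as a small helper; the thresholds are the ones stated above, while the function name and the floor at a score of 1 are assumptions for illustration:

```python
# Sketch of time decay on evidence scores: subtract 1 for technology data
# older than 6 months or market data older than 12 months. Flooring the
# result at 1 is an assumption (1 is the bottom of the 5-point scale).

def decayed_score(base_score: int, data_type: str, age_months: int) -> int:
    stale = (data_type == "technology" and age_months > 6) or \
            (data_type == "market" and age_months > 12)
    return max(1, base_score - 1) if stale else base_score
```

For example, an authoritative-secondary technology benchmark (score 4) that is 8 months old would be reported at score 3.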
9. Conflict Resolution: WWHTBT
When models or sources disagree and resolution isn't clear, apply What Would Have To Be True analysis:
```xml
<conflict claim="Market size for X">
  <position holder="Gartner" value="$50B">
    <evidence score="4">2024 market report with methodology</evidence>
  </position>
  <position holder="IDC" value="$35B">
    <evidence score="4">Different scope definition</evidence>
  </position>
  <wwhtbt>
    <for_gartner>
      <condition>Adjacent markets included in scope</condition>
      <condition>Projected vs. realized revenue counted</condition>
    </for_gartner>
    <for_idc>
      <condition>Only core product category</condition>
      <condition>Realized revenue only</condition>
    </for_idc>
  </wwhtbt>
  <recommendation>
    Report range ($35-50B) with scope dependency noted.
    For our purposes, the IDC definition is more aligned.
  </recommendation>
</conflict>
```
10. Uncertainty Decomposition
| Type | Definition | Can Reduce? | Action |
|---|---|---|---|
| Epistemic | Knowledge gaps that COULD be closed | YES | Research further |
| Aleatory | Inherent randomness that CANNOT be predicted | NO | Quantify range, build scenarios |
| Model | Framework/definition dependencies | DEPENDS | Make choices explicit |
Classification Questions
- Epistemic: "Does someone, somewhere know this?"
- Aleatory: "Even with perfect info, would this still be uncertain?"
- Model: "Would a different definition change the answer?"
Reference: See references/uncertainty-taxonomy.md for full classification protocol.
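The three classification questions can be read as a small decision procedure. The precedence used here (model first, then aleatory, then epistemic) is an assumption about how to break ties, not part of the documented taxonomy:

```python
# Sketch: map answers to the classification questions onto an uncertainty
# type. The ordering of the checks is an illustrative assumption.

def classify_uncertainty(definition_dependent: bool,
                         irreducible_with_perfect_info: bool) -> str:
    if definition_dependent:            # "Would a different definition change the answer?"
        return "model"
    if irreducible_with_perfect_info:   # "Even with perfect info, still uncertain?"
        return "aleatory"
    return "epistemic"                  # "Does someone, somewhere know this?"
```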
11. Gap Analysis
Part 1: MECE Coverage Audit
Compare findings against expected coverage matrix for research type. Flag:
- Critical gaps: Core dimensions missing or Score ≤2
- Significant gaps: Supporting dimensions weak
- Minor gaps: Context items missing
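The three flag levels can be sketched as a scoring pass over the expected coverage matrix. The tier names and the "weak" threshold for supporting dimensions (score ≤ 3) are assumptions; the critical rule (core dimension missing or score ≤ 2) follows the text above:

```python
# Sketch of the MECE coverage audit. `expected` maps each dimension to a
# tier ("core" | "supporting" | "context"); `scores` holds 1-5 coverage
# scores for the dimensions the research actually covered.

def audit_gaps(expected: dict[str, str],
               scores: dict[str, int]) -> dict[str, list[str]]:
    gaps: dict[str, list[str]] = {"critical": [], "significant": [], "minor": []}
    for dim, tier in expected.items():
        score = scores.get(dim)
        if tier == "core" and (score is None or score <= 2):
            gaps["critical"].append(dim)
        elif tier == "supporting" and (score is None or score <= 3):
            gaps["significant"].append(dim)
        elif tier == "context" and score is None:
            gaps["minor"].append(dim)
    return gaps
```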
Part 2: Unknown Unknowns Probes
| Probe | Question |
|---|---|
| Adjacent Domain | What lessons from related industries apply? |
| Stakeholder Blind Spot | Whose voice is missing from sources? |
| Time Horizon | What historical precedents or future implications are ignored? |
| Failure Mode | What would have to be true for conclusions to be wrong? |
| Second-Order Effects | If findings are true, what else must follow? |
Reference: See references/gap-analysis-protocol.md for full audit process.