Generates complete two-sample Mendelian randomization (MR) research designs from a user-provided research direction. Use when users want to design, plan, or build a study using two-sample MR to test causal relationships. Triggers: "design a two-sample MR study", "build a publishable MR paper", "test whether this biomarker causally affects this disease", "generate Lite/Standard/Advanced MR plans", "screen multiple exposures with MR", "bidirectional MR design", "causal inference using GWAS summary statistics", or "I want to study X and Y using MR". Always outputs four workload configurations (Lite / Standard / Advanced / Publication+) with a recommended primary plan, step-by-step workflow, figure plan, validation strategy, minimal executable version, and publication upgrade path.
Generates a complete two-sample MR study design from a user-provided research direction. Always outputs four workload configurations and a recommended primary plan.
| Style | Description | Example |
|---|---|---|
| A. Single Exposure → Single Outcome | One biomarker or trait to one disease | Serum uric acid → gout; vitamin D → osteoporosis |
| B. Multi-Exposure Screening | Panel of exposures to one outcome | Dietary factors → endometriosis; cytokine panel → RA |
| C. Bidirectional MR | Reciprocal causal testing | Inflammation ↔ depression; BMI ↔ osteoarthritis |
| D. Lifestyle / Diet / Behavioral | Self-reported behavioral exposures | Coffee intake → hypertension; sleep duration → stroke |
| E. Biomarker / Molecular Trait | Circulating proteins, metabolites | Cytokines → autoimmune disease; plasma proteins → Alzheimer's |
| F. Publication-Oriented | Comprehensive sensitivity-rich design | Full estimator suite with complete figure set |
Identify:
Always generate all four. For each configuration describe: goal, required data, major modules, expected workload, figure set, strengths, and weaknesses.
| Config | Goal | Timeframe | Best For |
|---|---|---|---|
| Lite | Fast minimal causal test | 2–4 weeks | Quick launch, 1 exposure × 1 outcome |
| Standard | Publication-ready core MR | 4–8 weeks | Single or small panel + sensitivity suite |
| Advanced | Robust multi-extension design | 8–14 weeks | Bidirectional, MVMR, replication GWAS |
| Publication+ | High-impact comprehensive paper | 12–20 weeks | Full sensitivity, MVMR, colocalization, power |
Select the best-fit configuration and explain why, given the exposure type, outcome, and any stated user constraints (time, data access, publication goal).
For each step include: step name, purpose, input, method, key parameters/thresholds, expected output, failure points, and alternative approaches.
Core modules to address when relevant:
Exposure-class IV count benchmarks — state expected IV count and flag weak-instrument risk accordingly:
→ Full benchmarks by exposure class: references/iv_benchmarks.md
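The weak-instrument flag mentioned above can be sketched in base R. This is a minimal illustration, not the benchmark logic from references/iv_benchmarks.md: the SNP names and effect sizes are made-up placeholders, the per-instrument F-statistic is approximated as (beta / se)^2, and the F < 10 cutoff is the conventional rule of thumb rather than a threshold stated in this document.

```r
# Hypothetical per-SNP exposure summary statistics (made-up values for illustration).
instruments <- data.frame(
  SNP  = c("rs_a", "rs_b", "rs_c", "rs_d"),
  beta = c(0.080, 0.045, 0.030, 0.012),
  se   = c(0.010, 0.009, 0.008, 0.006)
)

# Approximate per-instrument F-statistic: (beta / se)^2.
instruments$F_stat <- (instruments$beta / instruments$se)^2

# Flag instruments below the conventional F < 10 rule of thumb (assumed cutoff).
instruments$weak <- instruments$F_stat < 10

n_iv   <- nrow(instruments)
n_weak <- sum(instruments$weak)
mean_F <- mean(instruments$F_stat)

print(instruments)
cat("IV count:", n_iv, "| weak IVs:", n_weak, "| mean F:", round(mean_F, 1), "\n")
```

Reporting the IV count alongside the number of flagged instruments makes it easy to compare a planned exposure against the expected range for its class.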
GWAS data sources by exposure class:
→ Recommended databases and last-verified dates: references/gwas_databases.md
Fault tolerance guidelines:
Always list:
State what each layer proves and what it does not prove. Distinguish:
Always include a self-critical section addressing:
Slim version using only publicly available GWAS: 1 exposure (or small set), 1 outcome, IVW + 1–2 complementary estimators, heterogeneity/pleiotropy/leave-one-out, concise interpretation. Confirm this fits within any stated time constraints before recommending.
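To make the core of the minimal version concrete, the following base-R sketch computes a fixed-effect IVW estimate, Cochran's Q, and a leave-one-out series directly from harmonised summary statistics. All numbers are invented placeholders, the variable names are hypothetical, and the first-order weights used here are a simplification; packaged implementations may differ in detail (for example, by using random-effects standard errors).

```r
# Hypothetical harmonised summary statistics (made-up values for illustration).
bx  <- c(0.10, 0.08, 0.12, 0.05)      # SNP-exposure effects
by  <- c(0.020, 0.018, 0.022, 0.011)  # SNP-outcome effects
sey <- c(0.005, 0.004, 0.006, 0.004)  # SNP-outcome standard errors

# Per-SNP Wald ratios with first-order standard errors.
ratio    <- by / bx
ratio_se <- sey / abs(bx)
w        <- 1 / ratio_se^2

# Fixed-effect inverse-variance weighted estimate.
ivw <- function(ratio, w) sum(w * ratio) / sum(w)
beta_ivw <- ivw(ratio, w)
se_ivw   <- sqrt(1 / sum(w))

# Cochran's Q heterogeneity statistic across instruments.
q_stat <- sum(w * (ratio - beta_ivw)^2)
q_df   <- length(ratio) - 1
q_p    <- pchisq(q_stat, df = q_df, lower.tail = FALSE)

# Leave-one-out: recompute the IVW estimate dropping each SNP in turn.
loo <- sapply(seq_along(ratio), function(j) ivw(ratio[-j], w[-j]))

cat("IVW beta:", beta_ivw, "SE:", se_ivw, "\n")
cat("Cochran's Q:", q_stat, "df:", q_df, "p:", q_p, "\n")
print(loo)
```

A leave-one-out series that stays close to the full estimate suggests no single instrument is driving the result, which is the main robustness claim a Lite design can support.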
Explain what to add beyond Standard, which additions most improve publication strength, and which modules add rigor versus complexity. For molecular trait MR (proteins, metabolites), always include colocalization as a required upgrade for high-impact journals.
When providing R code examples or frameworks:
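Use the TwoSampleMR package as the primary tool (it is distributed through the MRCIEU GitHub repository and R-universe rather than CRAN).
Mark every placeholder GWAS ID with an "# EXAMPLE ID" comment telling the user to replace it with their target phenotype ID.
Standard R framework template: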
library(TwoSampleMR)
library(MRPRESSO)
# Step 1: Extract instruments for exposure
# EXAMPLE ID below — replace with your target exposure GWAS ID
exposure <- extract_instruments(outcomes = "ukb-b-XXXXX") # EXAMPLE ID
# Step 2: Extract outcome data
# EXAMPLE ID below — replace with your target outcome GWAS ID
outcome <- extract_outcome_data(
  snps = exposure$SNP,
  outcomes = "ieu-b-XXXXX" # EXAMPLE ID
)
# Step 3: Harmonise
harmonized <- harmonise_data(exposure, outcome)
# Step 4: Primary and sensitivity analyses
res <- mr(harmonized, method_list = c(
  "mr_ivw",
  "mr_egger_regression",
  "mr_weighted_median",
  "mr_weighted_mode"
))
# Step 5: Heterogeneity and pleiotropy
het <- mr_heterogeneity(harmonized)
plt <- mr_pleiotropy_test(harmonized)
loo <- mr_leaveoneout(harmonized)
To find valid GWAS IDs: ao <- available_outcomes(); View(ao)
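The pleiotropy check in Step 5 of the template rests on the MR-Egger intercept. As a rough illustration of what that test estimates, the base-R sketch below fits a weighted regression of SNP-outcome effects on SNP-exposure effects after orienting each SNP to a positive exposure effect. The data are invented placeholders, the variable names are hypothetical, and the standard errors come straight from `lm`, which simplifies what dedicated MR packages do.

```r
# Hypothetical harmonised summary statistics (made-up values for illustration).
bx  <- c(0.10, -0.08, 0.12, 0.05, 0.07)      # SNP-exposure effects
by  <- c(0.020, -0.018, 0.022, 0.011, 0.016) # SNP-outcome effects
sey <- c(0.005, 0.004, 0.006, 0.004, 0.005)  # SNP-outcome standard errors

# Orient every SNP so that its exposure effect is positive.
flip <- sign(bx)
bx_o <- bx * flip
by_o <- by * flip

# Weighted regression with an intercept; weights are inverse outcome variances.
fit   <- lm(by_o ~ bx_o, weights = 1 / sey^2)
coefs <- summary(fit)$coefficients

egger_intercept <- coefs["(Intercept)", "Estimate"]
egger_slope     <- coefs["bx_o", "Estimate"]
intercept_p     <- coefs["(Intercept)", "Pr(>|t|)"]

cat("Egger intercept:", egger_intercept, "p:", intercept_p, "\n")
cat("Egger slope (causal estimate):", egger_slope, "\n")
```

An intercept close to zero is consistent with no directional pleiotropy, while a clearly non-zero intercept suggests the IVW estimate may be biased; with few instruments the test has low power, so a null result is weak evidence.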
This skill accepts: a research direction involving a causal question between an exposure (biomarker, dietary factor, behavioral trait, molecular trait, or disease) and an outcome, where the user wants to design a two-sample Mendelian randomization study.
If the user's request does not involve MR study design — for example, asking to design an RCT, conduct a systematic review, write a manuscript introduction, perform propensity score analysis, or answer a general epidemiology question — do not proceed with the MR planning workflow. Instead respond:
"Two-Sample MR Research Planner is designed to generate Mendelian randomization study designs using GWAS summary statistics. Your request appears to be outside this scope. Please provide an exposure–outcome pair you want to test using MR, or use a more appropriate skill for your task (e.g., a systematic review skill for literature synthesis, or an experimental design skill for RCTs)."
| File | Content | Used In |
|---|---|---|
| references/gwas_databases.md | Recommended GWAS sources by exposure class with last-verified dates | Step 4 — GWAS selection |
| references/iv_benchmarks.md | Typical IV count ranges and weak-instrument risk flags by exposure class | Step 4 — instrument extraction |