Assess and interpret between-study heterogeneity in meta-analysis using I², Q statistic, tau², and prediction intervals. Use when users need to evaluate consistency across studies, understand sources of variation, or decide if pooling is appropriate.
This skill teaches assessment and interpretation of between-study heterogeneity, a critical component of meta-analysis quality.
Heterogeneity refers to variation in true effects across studies beyond what we would expect from sampling error alone. High heterogeneity calls into question whether a single pooled estimate is meaningful.
Activate this skill when users:
What it is: Tests the null hypothesis that all studies share a common true effect.
Interpretation:
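Limitation: Low power to detect heterogeneity when there are few studies; with many studies, it can flag clinically trivial heterogeneity as statistically significant.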
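To make the test concrete, the sketch below computes Cochran's Q by hand in base R, assuming inverse-variance weights and a common-effect pooled estimate. The vectors `yi` and `vi` are hypothetical log odds ratios and sampling variances, not data from any real review.

```r
# Hypothetical log odds ratios and sampling variances for five studies
yi <- c(-0.60, -0.10, -0.45, 0.05, -0.80)
vi <- c(0.04, 0.03, 0.05, 0.02, 0.06)

wi <- 1 / vi                             # inverse-variance weights
theta_fe <- sum(wi * yi) / sum(wi)       # common-effect pooled estimate
Q <- sum(wi * (yi - theta_fe)^2)         # Cochran's Q: weighted squared deviations
df <- length(yi) - 1                     # degrees of freedom, k - 1
p_Q <- pchisq(Q, df = df, lower.tail = FALSE)  # upper-tail chi-squared p-value

round(c(Q = Q, df = df, p = p_Q), 3)
```

Under the null hypothesis Q is expected to be close to its degrees of freedom, so the size of Q relative to k - 1 is often more informative than the p-value alone.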
What it is: Percentage of the total variability in effect estimates that is due to between-study heterogeneity rather than sampling error (chance).
Interpretation Guidelines (Cochrane):
| I² Value | Interpretation |
|---|---|
| 0-40% | Might not be important |
| 30-60% | May represent moderate heterogeneity |
| 50-90% | May represent substantial heterogeneity |
| 75-100% | Considerable heterogeneity |
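The overlapping ranges above are easier to reason about once I² is seen as a function of Q and the number of studies. A minimal sketch, assuming the Q-based definition I² = max(0, (Q - df) / Q) × 100; the helper `i_squared()` and its inputs are hypothetical. Software such as metafor may instead derive I² from tau² for random-effects models, so reported values can differ somewhat.

```r
# I-squared (%) from Cochran's Q and its degrees of freedom (k - 1)
i_squared <- function(Q, df) {
  if (Q <= 0) return(0)            # no observed dispersion at all
  max(0, (Q - df) / Q) * 100       # negative values are truncated to 0
}

# The same Q implies very different I-squared depending on the number of studies
i_squared(Q = 20, df = 9)    # 10 studies
i_squared(Q = 20, df = 19)   # 20 studies
i_squared(Q = 5,  df = 9)    # Q below its df, so I-squared is truncated to 0
```

Because I² depends on the precision of the included studies as well as on the spread of true effects, it describes a proportion of variability, not its absolute size.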
Key Teaching Points:
Socratic Questions:
What it is: Estimated variance of true effects across studies.
Interpretation:
Advantage: Absolute measure on the (squared) scale of the effect size, unlike I², which is relative to the within-study sampling variance.
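As an illustration of what tau² estimates, here is a sketch of the DerSimonian-Laird moment estimator in base R. This is a simpler estimator swapped in for transparency, not the REML method used elsewhere in this skill, and the function `dl_tau2()` and the data are hypothetical.

```r
# DerSimonian-Laird (method-of-moments) estimate of tau-squared
dl_tau2 <- function(yi, vi) {
  wi <- 1 / vi                                # inverse-variance weights
  theta_fe <- sum(wi * yi) / sum(wi)          # common-effect pooled estimate
  Q <- sum(wi * (yi - theta_fe)^2)            # Cochran's Q
  df <- length(yi) - 1
  C <- sum(wi) - sum(wi^2) / sum(wi)          # scaling constant
  max(0, (Q - df) / C)                        # truncated at zero
}

# Hypothetical log odds ratios and sampling variances
yi <- c(-0.60, -0.10, -0.45, 0.05, -0.80)
vi <- c(0.04, 0.03, 0.05, 0.02, 0.06)

tau2 <- dl_tau2(yi, vi)
tau <- sqrt(tau2)   # standard deviation of true effects, on the log OR scale
c(tau2 = tau2, tau = tau)
```

Reporting tau alongside tau² is often helpful because tau is on the same scale as the effect size itself.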
What it is: Range within which the true effect in a new, comparable study is expected to fall (typically with 95% probability).
Why it matters:
Example:
Pooled effect: OR = 0.70, 95% CI [0.55, 0.89]
Prediction interval: [0.35, 1.40]
Interpretation: While the average effect favors treatment,
the true effect in a new study could range from strongly
beneficial (OR = 0.35) to harmful (OR = 1.40).
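The arithmetic behind an interval like this can be sketched in base R. The helper `prediction_interval()` is hypothetical and uses the t-based formula with k - 2 degrees of freedom on the log scale. The example above does not state tau² or the number of studies, so `tau2 = 0.075` and `k = 10` are assumed values chosen only because they roughly reproduce the interval shown.

```r
# Prediction interval on the analysis scale (here: log odds ratio)
prediction_interval <- function(est, se, tau2, k, level = 0.95) {
  stopifnot(k >= 3)                              # needs at least 3 studies
  crit <- qt(1 - (1 - level) / 2, df = k - 2)    # t quantile with k - 2 df
  half <- crit * sqrt(tau2 + se^2)               # combines tau^2 and SE of the mean
  c(lower = est - half, upper = est + half)
}

log_or <- log(0.70)
# Standard error backed out of the reported 95% CI [0.55, 0.89]
se <- (log(0.89) - log(0.55)) / (2 * qnorm(0.975))

# tau2 and k are ASSUMED for illustration; they are not given in the example
pi_log <- prediction_interval(log_or, se, tau2 = 0.075, k = 10)
exp(pi_log)   # back-transform to the odds ratio scale
```

The interval is symmetric around the pooled estimate on the log scale, which is why 0.35 and 1.40 are equally far from 0.70 in ratio terms.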
library(metafor)
# Fit random-effects model
# (dat: data frame with effect sizes yi, standard errors sei, and study labels)
res <- rma(yi = yi, sei = sei, data = dat, method = "REML")
# View heterogeneity statistics
print(res)
# Look for: tau², I², H², Q, p-value
# Extract specific values
res$tau2 # tau-squared
res$I2 # I-squared (as a percentage, 0-100)
res$QE # Q statistic
res$QEp # p-value for Q test
# Get confidence intervals for the heterogeneity statistics (tau², tau, I², H²)
confint(res)
# Output includes (illustrative values):
#         estimate    ci.lb    ci.ub
# tau^2     0.0234   0.0012   0.1456
# I^2(%)   62.4000  12.3000  89.2000
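# Calculate prediction interval (on the analysis scale, e.g. log OR)
predict(res)
# Back-transform to the OR scale if yi are log odds ratios
predict(res, transf = exp)
# Or manually, using the t-based formula with k - 2 degrees of freedom.
# This can differ slightly from predict(res), which by default uses a
# normal quantile when the model is fitted with test = "z".
crit <- qt(0.975, df = res$k - 2)
pi_lower <- c(res$beta) - crit * sqrt(res$tau2 + res$se^2)
pi_upper <- c(res$beta) + crit * sqrt(res$tau2 + res$se^2)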
# Forest plot with prediction interval
forest(res,
       slab = dat$study,
       addpred = TRUE,   # Adds prediction interval
       header = TRUE)
# Baujat plot (shows each study's contribution to Q and its influence on the pooled estimate)
baujat(res)
# GOSH plot (sensitivity to study inclusion)
gosh_res <- gosh(res)
plot(gosh_res)
"Let's look at your heterogeneity results:
"This suggests substantial heterogeneity. About 67% of the variation we see is due to real differences between studies, not just chance."
"With this level of heterogeneity, we should:
"Let's think about what might cause these differences:
I² Assessment
│
├── I² < 40%
│   └── Heterogeneity likely unimportant
│       → Proceed with pooled estimate
│
├── I² 40-75%
│   └── Moderate to substantial heterogeneity
│       → Report pooled estimate
│       → Explore sources (subgroups)
│       → Report prediction interval
│
└── I² > 75%
    └── Considerable heterogeneity
        → Question if pooling is meaningful
        → Mandatory exploration of sources
        → Consider narrative synthesis
        → Always report prediction interval
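The decision tree can be expressed as a small lookup function, which may be convenient when summarizing several outcomes at once. This is only a sketch: `heterogeneity_actions()` is a hypothetical helper, and the cut-offs at 40% and 75% are taken from the tree, not from any library default.

```r
# Map an I-squared value (in %) to the branch and actions of the decision tree
heterogeneity_actions <- function(i2) {
  stopifnot(is.numeric(i2), length(i2) == 1, i2 >= 0, i2 <= 100)
  if (i2 < 40) {
    list(branch = "I2 < 40%",
         actions = c("Proceed with pooled estimate"))
  } else if (i2 <= 75) {
    list(branch = "I2 40-75%",
         actions = c("Report pooled estimate",
                     "Explore sources (subgroups)",
                     "Report prediction interval"))
  } else {
    list(branch = "I2 > 75%",
         actions = c("Question if pooling is meaningful",
                     "Mandatory exploration of sources",
                     "Consider narrative synthesis",
                     "Always report prediction interval"))
  }
}

# Example: an I-squared of 67%
heterogeneity_actions(67)$actions
```

Thresholds like these are rules of thumb, so the output is a prompt for judgment and should be read together with tau² and the prediction interval.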
"High I² means we can't do meta-analysis"
"Non-significant Q means no heterogeneity"
"I² tells us about clinical importance"
Basic: "What does I² = 50% mean?"
Intermediate: "Q test is non-significant but I² = 45%. How do you interpret this?"
Advanced: "Pooled OR = 0.6 [0.4, 0.9] but prediction interval is [0.3, 1.2]. What's the clinical implication?"
- meta-analysis-fundamentals - Understanding pooled effects
- forest-plot-creation - Visualizing heterogeneity
- publication-bias-detection - Another source of concern

Glass (the teaching agent) MUST adapt this content to the learner:
Example Adaptations: