Extract and prepare study data for meta-analysis including effect size calculation, variance estimation, and handling missing data. Use when users need to convert reported statistics into analyzable format or calculate effect sizes from raw data.
This skill teaches how to extract, convert, and prepare study data for meta-analysis.
Before running a meta-analysis, you need to extract effect sizes and their variances from each study. Studies report results in different formats, requiring conversion to a common metric.
Activate this skill when users:
| Outcome Type | Required Data |
|---|---|
| Binary | Events and totals for each group, OR 2x2 table |
| Continuous | Means, SDs, and sample sizes for each group |
| Correlation | Correlation coefficient (r) and sample size |
| Pre-calculated | Effect size and SE (or CI or variance) |
| | Treatment | Control |
|---|---|---|
| Event | a | b |
| No Event | c | d |
| Total | n1 | n2 |
Odds Ratio:
OR = (a/c) / (b/d) = (a*d) / (b*c)
log_OR = log(OR)
SE_log_OR = sqrt(1/a + 1/b + 1/c + 1/d)
Risk Ratio:
RR = (a/n1) / (b/n2)
log_RR = log(RR)
SE_log_RR = sqrt(1/a - 1/n1 + 1/b - 1/n2)
Risk Difference:
RD = (a/n1) - (b/n2)
SE_RD = sqrt((a*c/n1^3) + (b*d/n2^3))
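The three binary formulas above can be sketched in base R as follows. This is a minimal illustration rather than a reference implementation: the function name `binary_effects` and the example counts are hypothetical, and the cell labels follow the 2x2 layout above (a and b are events, c and d are non-events).

```r
# Hypothetical helper: effect sizes from a 2x2 table.
# a, b = events in treatment and control; c, d = non-events in treatment and control.
binary_effects <- function(a, b, c, d) {
  n1 <- a + c  # treatment total
  n2 <- b + d  # control total
  list(
    log_or    = log((a * d) / (b * c)),
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d),
    log_rr    = log((a / n1) / (b / n2)),
    se_log_rr = sqrt(1 / a - 1 / n1 + 1 / b - 1 / n2),
    rd        = a / n1 - b / n2,
    se_rd     = sqrt(a * c / n1^3 + b * d / n2^3)
  )
}

# Made-up counts: 15 of 100 events under treatment, 30 of 100 under control
res <- binary_effects(a = 15, b = 30, c = 85, d = 70)
exp(res$log_or)  # odds ratio
exp(res$log_rr)  # risk ratio
res$rd           # risk difference
```

Odds ratios and risk ratios are analyzed on the log scale, so the values to carry forward are `log_or` and `log_rr` with their standard errors.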
# Pooled SD
s_pooled = sqrt(((n1-1)*sd1^2 + (n2-1)*sd2^2) / (n1+n2-2))
# Cohen's d
d = (mean1 - mean2) / s_pooled
# Hedges' g (bias-corrected)
J = 1 - (3 / (4*(n1+n2-2) - 1))
g = J * d
# Variance
var_g = (n1+n2)/(n1*n2) + g^2/(2*(n1+n2))
MD = mean1 - mean2
SE_MD = sqrt(sd1^2/n1 + sd2^2/n2)
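As a rough sketch, the continuous-outcome formulas above translate to base R like this. The function name `hedges_g` and the summary statistics are invented for illustration; the arithmetic mirrors the pooled SD, Cohen's d, correction factor J, and variance expressions given above.

```r
# Hypothetical helper: standardized mean difference from group summaries
hedges_g <- function(mean1, sd1, n1, mean2, sd2, n2) {
  df <- n1 + n2 - 2
  s_pooled <- sqrt(((n1 - 1) * sd1^2 + (n2 - 1) * sd2^2) / df)
  d <- (mean1 - mean2) / s_pooled          # Cohen's d
  J <- 1 - 3 / (4 * df - 1)                # small-sample correction factor
  g <- J * d                               # Hedges' g
  var_g <- (n1 + n2) / (n1 * n2) + g^2 / (2 * (n1 + n2))
  c(d = d, g = g, var_g = var_g, se_g = sqrt(var_g))
}

# Made-up study: treatment mean 24, control mean 20, SD 10 and n = 50 in both groups
out <- hedges_g(mean1 = 24, sd1 = 10, n1 = 50, mean2 = 20, sd2 = 10, n2 = 50)
out

# Raw mean difference and its SE for the same study
md    <- 24 - 20
se_md <- sqrt(10^2 / 50 + 10^2 / 50)
```

Because J is always below 1, g is slightly smaller in magnitude than d, and the gap shrinks as the sample size grows.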
library(metafor)
# Binary outcomes - Odds Ratio
# (escalc's ai/bi are treatment events/non-events and ci/di are control events/non-events)
dat_binary <- escalc(measure = "OR",
ai = events_treat, bi = nonevents_treat,
ci = events_ctrl, di = nonevents_ctrl,
data = mydata)
# Binary outcomes - Risk Ratio
dat_rr <- escalc(measure = "RR",
ai = events_treat, bi = nonevents_treat,
ci = events_ctrl, di = nonevents_ctrl,
data = mydata)
# Continuous outcomes - SMD (Hedges' g)
dat_smd <- escalc(measure = "SMD",
m1i = mean_treat, sd1i = sd_treat, n1i = n_treat,
m2i = mean_ctrl, sd2i = sd_ctrl, n2i = n_ctrl,
data = mydata)
# Continuous outcomes - Mean Difference
dat_md <- escalc(measure = "MD",
m1i = mean_treat, sd1i = sd_treat, n1i = n_treat,
m2i = mean_ctrl, sd2i = sd_ctrl, n2i = n_ctrl,
data = mydata)
# Correlations
dat_cor <- escalc(measure = "ZCOR", # Fisher's z
ri = correlation, ni = sample_size,
data = mydata)
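For correlations, the Fisher's z transformation requested with `measure = "ZCOR"` can be reproduced by hand, which may help when checking a single study. The sketch below uses base R only, and the correlations and sample sizes are made up.

```r
# Made-up correlations and sample sizes for three studies
r <- c(0.30, 0.45, 0.10)
n <- c(50, 120, 80)

z     <- atanh(r)        # Fisher's z, equal to 0.5 * log((1 + r) / (1 - r))
var_z <- 1 / (n - 3)     # sampling variance of z depends only on n
se_z  <- sqrt(var_z)

# Pooled results on the z scale are back-transformed to r for reporting
r_back <- tanh(z)
data.frame(r, n, z, var_z, r_back)
```

The transformation is used because the variance of z does not depend on the unknown true correlation, whereas the variance of r does.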
# From OR and 95% CI
log_or <- log(OR)
se_log_or <- (log(CI_upper) - log(CI_lower)) / (2 * 1.96)
# From SMD and 95% CI
se_smd <- (CI_upper - CI_lower) / (2 * 1.96)
# From a two-sided p-value and sample sizes (approximate)
# For a t-test; this recovers the magnitude only, so take the sign from the reported direction
t_value <- qt(1 - p_value/2, df = n1 + n2 - 2)
d <- t_value * sqrt(1/n1 + 1/n2)
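The conversions above can be wrapped into small helpers so the same logic is applied to every study. This is a sketch under stated assumptions: the names `se_from_ci` and `d_from_p` are hypothetical, the interval is assumed to be symmetric on the analysis scale (log scale for ratios), and the p-value is assumed to be two-sided from an independent-samples t-test.

```r
# Hypothetical helper: standard error from a confidence interval.
# Set log_scale = TRUE for ratio measures (OR, RR), whose CIs are symmetric on the log scale.
se_from_ci <- function(lower, upper, log_scale = FALSE, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)  # about 1.96 for a 95% interval
  if (log_scale) {
    (log(upper) - log(lower)) / (2 * z)
  } else {
    (upper - lower) / (2 * z)
  }
}

# Hypothetical helper: magnitude of d from a two-sided t-test p-value
d_from_p <- function(p, n1, n2) {
  t_value <- qt(1 - p / 2, df = n1 + n2 - 2)
  t_value * sqrt(1 / n1 + 1 / n2)
}

# Made-up inputs
se_log_or <- se_from_ci(0.60, 0.94, log_scale = TRUE)  # OR reported with 95% CI 0.60 to 0.94
se_smd    <- se_from_ci(0.10, 0.70)                    # SMD reported with 95% CI 0.10 to 0.70
d_abs     <- d_from_p(p = 0.04, n1 = 30, n2 = 30)
```

A quick sanity check is to rebuild the interval from the recovered SE and confirm it matches the reported limits to rounding error.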
Option 1: Impute from other studies
# Use median SD from studies that report it
median_sd <- median(dat$sd, na.rm = TRUE)
dat$sd[is.na(dat$sd)] <- median_sd
Option 2: Calculate from CI or SE
# From 95% CI for mean
SD = sqrt(n) * (CI_upper - CI_lower) / (2 * 1.96)
# From SE
SD = SE * sqrt(n)
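Option 3: Estimate from IQR (when only the median and IQR are reported)
# Large-sample approximation that assumes roughly normal data; Wan et al. (2014) give a sample-size-adjusted version
SD = IQR / 1.35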
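Options 2 and 3 above can be sketched as three small base R helpers. The function names are hypothetical and the numbers are made up. The CI version uses a normal quantile, which is an assumption that works best for larger samples; for small samples a t quantile is usually more appropriate.

```r
# Hypothetical helpers for recovering a missing SD

# From the standard error of a single group's mean
sd_from_se <- function(se, n) se * sqrt(n)

# From a confidence interval around a single group's mean
sd_from_ci <- function(lower, upper, n, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)
  sqrt(n) * (upper - lower) / (2 * z)
}

# From the interquartile range (large-sample, approximately normal data)
sd_from_iqr <- function(q1, q3) (q3 - q1) / 1.35

# Made-up examples
sd_from_se(se = 2, n = 25)
sd_from_ci(lower = 46.1, upper = 53.9, n = 25)
sd_from_iqr(q1 = 10, q3 = 23.5)
```

Whichever route is used, it is worth recording in the extraction sheet that the SD was derived rather than reported, so a sensitivity analysis can exclude those studies.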
Option 1: Use reported total N
# If only total N given, assume equal groups
n1 = n2 = N / 2
Option 2: Contact authors
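# Add a continuity correction of 0.5 to every cell of studies that have a zero count
# (to = "only0" corrects only those studies; use to = "all" to correct every study)
dat_corrected <- escalc(measure = "OR",
ai = events_treat, bi = nonevents_treat,
ci = events_ctrl, di = nonevents_ctrl,
add = 1/2, to = "only0",
data = mydata)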
# Or use Peto OR (handles zeros better)
dat_peto <- escalc(measure = "PETO",
ai = events_treat, bi = nonevents_treat,
ci = events_ctrl, di = nonevents_ctrl,
data = mydata)
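To see what a continuity correction does to the numbers, the calculation can be written out by hand. The sketch below is base R only and is not metafor's implementation; the function name `or_with_correction` is hypothetical, and it assumes the correction is applied only to studies with at least one zero cell. Arguments follow metafor's order: a and b are treatment events and non-events, c and d are control events and non-events.

```r
# Hypothetical helper: log odds ratio with a conditional continuity correction
or_with_correction <- function(a, b, c, d, add = 0.5) {
  has_zero <- (a == 0) | (b == 0) | (c == 0) | (d == 0)
  k <- ifelse(has_zero, add, 0)  # add 0.5 to all four cells, but only where needed
  a <- a + k; b <- b + k; c <- c + k; d <- d + k
  data.frame(
    log_or    = log((a * d) / (b * c)),
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d),
    corrected = has_zero
  )
}

# Made-up data: study 1 has no events in the treatment arm, study 2 has no zero cells
zero_demo <- or_with_correction(a = c(0, 10), b = c(50, 40),
                                c = c(5, 20), d = c(45, 30))
zero_demo
```

Without the correction, study 1 would give log(0) and an infinite standard error, so it could not be weighted in the pooled estimate.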
□ Study identifier (author, year)
□ Sample sizes (treatment and control)
□ Outcome data:
  □ Binary: events in each group
  □ Continuous: means and SDs
□ Effect size (if pre-calculated)
□ Confidence interval or SE
□ Follow-up duration
□ Subgroup information
□ Risk of bias assessment
# Convert p-value to effect size (approximate)
# Requires sample sizes
z <- qnorm(1 - p_value/2)
d <- z * sqrt(1/n1 + 1/n2)
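# Estimate mean and SD from the median and quartiles (Wan et al. 2014)
# For sample size n:
mean_est <- (q1 + median + q3) / 3
sd_est <- (q3 - q1) / (2 * qnorm((0.75 * n - 0.125) / (n + 0.25)))
# For large n the denominator approaches 1.35, giving sd_est <- (q3 - q1) / 1.35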
# Use SMD to standardize
# This puts all studies on same scale
dat <- escalc(measure = "SMD", ...)
"What statistics does the study report?
"What effect size is appropriate for your research question?
"Now let's calculate the effect size and its variance..."
"Let's double-check:
Basic: "What data do you need to calculate an odds ratio?"
Intermediate: "A study reports mean difference = 5, p = 0.03, n = 50 per group. How do you get the SE?"
Advanced: "Studies use different depression scales (BDI, HDRS). How do you combine them?"
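One way to work through the intermediate question is sketched below in base R. It assumes the reported p-value is two-sided and comes from an independent-samples t-test with equal group sizes; if the study used a different test, the recovered SE is only approximate.

```r
# Reported in the study: mean difference = 5, p = 0.03, n = 50 per group
md <- 5
p  <- 0.03
n1 <- 50
n2 <- 50

df      <- n1 + n2 - 2
t_value <- qt(1 - p / 2, df = df)  # t statistic implied by the two-sided p-value
se_md   <- abs(md) / t_value       # because t = MD / SE

# Optional: implied pooled SD and standardized mean difference
sd_pooled <- se_md / sqrt(1 / n1 + 1 / n2)
d         <- md / sd_pooled

c(t = t_value, se_md = se_md, sd_pooled = sd_pooled, d = d)
```

The check is to plug the recovered SE back in: `2 * pt(-abs(md / se_md), df)` should return the reported p-value.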
meta-analysis-fundamentals - Understanding effect sizes
r-code-generation - Automating calculations
grade-assessment - Evaluating certainty of evidence

Glass (the teaching agent) MUST adapt this content to the learner:
Example Adaptations: