Study-type module for observational studies (cohort, case-control, cross-sectional). Provides reporting guidance aligned with STROBE, including methods subsection templates, results conventions for crude and adjusted estimates, discussion paragraph frameworks, and software citation standards.
Guideline: STROBE (Strengthening the Reporting of Observational Studies in Epidemiology)
Key reference: von Elm E, Altman DG, Egger M, et al. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies. Lancet. 2007;370(9596):1453-1457. doi:10.1016/S0140-6736(07)61602-X
STROBE has three design-specific versions: cohort, case-control, and cross-sectional. Items marked with an asterisk (*) differ between designs. Announce guideline selection and the specific design version before writing; allow the user to override.
See references/strobe-checklist.md for the full item-level checklist.
Write Methods in past tense. Passive voice is conventional; active voice is acceptable when it improves clarity. Never invent procedural details; flag unknowns with [PLEASE VERIFY: ___].
State the study design (cohort, case-control, or cross-sectional), the rationale for choosing that design, the setting (institution, geography, healthcare system), and the dates of data collection, recruitment, exposure, and follow-up.
Template sentences:
"We conducted a [retrospective / prospective] [cohort / case-control / cross-sectional] study using data from [DATA SOURCE] between [START DATE] and [END DATE]."
"The study was conducted at [SETTING] in [CITY/COUNTRY]. [DATA SOURCE] captures [DESCRIPTION OF POPULATION COVERED]."
Checklist:
For cohort studies: eligibility criteria, sources and methods of selection, and methods of follow-up. For case-control studies: the case definition and source of cases, and the control selection method with its rationale, described separately for cases and controls. For cross-sectional studies: eligibility criteria and sources and methods of selection.
Template sentences (cohort):
"We included all [ADULTS / PATIENTS] aged [RANGE] who [CRITERIA] between [DATES]. Participants were followed from [INDEX DATE] until [OUTCOME / CENSORING EVENT / END OF FOLLOW-UP], whichever occurred first."
Template sentences (case-control):
"Cases were defined as [OUTCOME DEFINITION] identified from [SOURCE]. For each case, [N] controls were randomly selected from [SOURCE], matched on [VARIABLES], and confirmed to be free of [OUTCOME] at the index date."
Checklist:
Define all outcomes, exposures, predictors, potential confounders, and effect modifiers. Describe measurement methods and diagnostic criteria.
Template sentence:
"The primary outcome was [OUTCOME], defined as [DEFINITION, including ICD codes or clinical criteria]. The primary exposure was [EXPOSURE], defined as [DEFINITION]. Potential confounders included [LIST], selected a priori based on [clinical knowledge / directed acyclic graph (DAG) / published literature]."
Checklist:
Describe each data source, validation methods, and measurement reliability.
Template sentence:
"[EXPOSURE / OUTCOME] data were obtained from [SOURCE]. [Validation statement, e.g., 'Diagnoses were validated against medical records with a positive predictive value of [X]%']. Laboratory assays were performed at [LABORATORY] using [METHOD]."
Checklist:
Describe efforts to address potential sources of bias. State known biases and the direction of their expected effect.
Template sentence:
"To address [BIAS TYPE], we [METHOD]. [E.g., 'To reduce confounding by indication, we used propensity score matching on [N] pre-treatment covariates.' / 'To minimise healthy user bias, we restricted the comparator group to active users of [ALTERNATIVE].']"
Target trial emulation framework (v1.0.0):
"The study was designed to emulate a hypothetical target trial of [INTERVENTION] vs [COMPARATOR] in [POPULATION], following the framework of Hernan and Robins. The target trial protocol specifying eligibility criteria, treatment strategies, assignment, follow-up, outcome, and causal contrast is detailed in [Table / Supplement]."
E-value for unmeasured confounding (v1.0.0):
"To assess the robustness of findings to unmeasured confounding, we calculated the E-value, defined as the minimum strength of association on the risk ratio scale that an unmeasured confounder would need to have with both the exposure and the outcome to fully explain away the observed association."
Checklist:
Explain how the study size was arrived at. For studies with primary data collection, report a formal sample size calculation. For database studies, state whether the available sample was used in its entirety.
Template sentence:
"A minimum sample size of [N] participants ([N] events) was required to detect a [EFFECT SIZE] with [POWER]% power at a two-sided alpha of [ALPHA], assuming [ASSUMPTIONS]. [OR: 'All eligible patients in the database during the study period were included; no a priori sample size calculation was performed.']"
Checklist:
Describe all statistical methods, including regression model type, adjustment strategy, missing data handling, and sensitivity analyses.
Template sentences:
"The association between [EXPOSURE] and [OUTCOME] was estimated using [Cox proportional hazards / multivariable logistic / Poisson] regression, adjusting for [CONFOUNDERS]. The proportional hazards assumption was assessed using Schoenfeld residual tests."
"Propensity scores for [EXPOSURE] were estimated using logistic regression including [N] pre-treatment covariates. [Matching: Participants were matched 1:1 on the logit of the propensity score using a caliper of 0.2 standard deviations.] [Weighting: Inverse probability of treatment weighting (IPTW) was applied, with weights stabilised and truncated at the 1st and 99th percentiles.] Balance was assessed using standardised mean differences (SMD), with SMD < 0.1 indicating adequate balance."
"Missing data were handled using [complete case analysis / multiple imputation by chained equations (MICE) with m = [N] imputations / last observation carried forward]."
Checklist:
Report the total eligible population, exclusions (with reasons), and the final analytic sample. For cohort studies, report median follow-up time and total person-years. Reference a flow diagram if applicable.
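The follow-up quantities for cohort studies are simple to compute from each participant's start and end dates. The sketch below is a minimal example. The function name, the input layout (a list of start and end date pairs), and the 365.25-day year are assumptions.

```python
from datetime import date
from statistics import median


def follow_up_summary(intervals):
    """Median follow-up (years) and total person-years from (start, end) dates."""
    years = [(end - start).days / 365.25 for start, end in intervals]
    return {"median_years": median(years), "person_years": sum(years)}
```

This gives the simple median of observed follow-up times; if another method is used (for example, one based on censoring), say so in the text.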
Report by exposure group. Include: age (mean ± SD or median [IQR]), sex, ethnicity, and key clinical variables. Report standardised mean differences for propensity score analyses. Do not use p-values to test baseline balance in observational studies: with large samples, trivial differences reach statistical significance, so the SMD is preferred.
For each primary outcome:
Effect estimate by design:
Report each alternative estimate with its own CI and interpretation. State the direction of change relative to the base case.
Report stratified estimates with interaction p-values. Do not claim effect modification without a significant interaction test.
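When only stratum-specific estimates are available, an interaction p-value can be approximated by comparing the two log-scale estimates with a z-test. The sketch below shows that calculation, assuming the two subgroups are independent and that each estimate comes with a log-scale standard error. The function name is hypothetical. A product term fitted in the regression model is the more usual source of the interaction p-value, and this sketch does not replace it.

```python
import math
from statistics import NormalDist


def interaction_p_value(log_est_a, se_a, log_est_b, se_b):
    """Two-sided p-value for the difference between two subgroup estimates."""
    z = (log_est_a - log_est_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    return 2 * (1 - NormalDist().cdf(abs(z)))
```

Pass `math.log(estimate)` for ratio measures such as OR, HR, or RR.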
"In this [cohort / case-control / nested case-control] study of [N] [adults / patients / participants], [exposure] was associated with [outcome] (adjusted [OR/HR/RR]: [estimate]; 95% CI: [lower], [upper]), after adjustment for [key confounders]. [Sensitivity analysis result]. [Subgroup finding with interaction p-value]."
Consistent:
"These findings are consistent with [Author] et al., who reported [estimate] in a [study design] of [N] [population]..."
Divergent:
"In contrast, [Author] et al. found [estimate], which may reflect differences in [population / exposure definition / confounding control / follow-up duration]..."
Extending:
"The present study extends prior work by [novel contribution -- e.g., using target trial emulation / applying E-values / examining a previously unstudied population]..."
For each limitation, state: (1) the limitation precisely, (2) likely direction of bias, (3) impact on conclusions.
R:
R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. [Year]. Available from: https://www.R-project.org/
Stata:
StataCorp. Stata Statistical Software: Release [XX]. StataCorp LLC, College Station, TX. [Year].
SAS:
SAS Institute Inc. SAS/STAT Software, Version [XX]. SAS Institute Inc., Cary, NC. [Year].
R packages (common for observational studies):
survival:
Therneau TM. A Package for Survival Analysis in R. R package version [X.X]. Available from: https://CRAN.R-project.org/package=survival
MatchIt (propensity score matching):
Ho DE, Imai K, King G, Stuart EA. MatchIt: Nonparametric Preprocessing for Parametric Causal Inference. J Stat Softw. 2011;42(8):1-28. doi:10.18637/jss.v042.i08
mice (multiple imputation):
van Buuren S, Groothuis-Oudshoorn K. mice: Multivariate Imputation by Chained Equations in R. J Stat Softw. 2011;45(3):1-67. doi:10.18637/jss.v045.i03
EValue (E-value calculation):
Mathur MB, Ding P, Riddell CA, VanderWeele TJ. Web Site and R Package for Computing E-values. Epidemiology. 2018;29(5):e45-e47. doi:10.1097/EDE.0000000000000864
WeightIt (propensity score weighting):
Greifer N. WeightIt: Weighting for Covariate Balance in Observational Studies. R package version [X.X]. Available from: https://CRAN.R-project.org/package=WeightIt