Guide for researching epidemiological parameters, identifying calibration targets, and determining model structure for spatial disease transmission models. Use when the user needs to find R0, infectious/latent periods, birth/death rates, vaccination coverage, observed data for calibration, or guidance on model component selection (compartment type, seasonal forcing, spatial coupling, vaccination strategy). Trigger phrases include "research parameters", "what parameters do I need", "find R0 for", "epidemiological data sources", "disease parameters", "parametrize model", "calibration targets", "model structure for", "parameter space", "what data do I need".
This skill produces three structured outputs for downstream use:

1. Parameter Space → calabaria ParameterSpace + ConfigurationSpace
2. Calibration Targets → observed data + loss function design
3. Model Structure Guidance → LASER component selection

| Parameter | Typical Range | Red Flags |
|---|---|---|
| R0 | 1.5–18 depending on disease | <1 means no sustained transmission |
| Latent period | 2–21 days | >30 days unusual for acute infections |
| Infectious period | 3–14 days | >30 days → chronic, different model needed |
| CBR | 8–50 per 1000/yr | <1 or >60 likely unit error |
| CDR | 3–20 per 1000/yr | Same unit check as CBR |
| Vaccination coverage | 0.0–1.0 | >1.0 is a unit error (not percentage) |
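| Generation interval | 5–25 days | Should ≈ latent + infectious/2 for a fixed-length infectious period, up to latent + infectious for exponentially distributed periods |
| Case-to-infection ratio | 1:1–1:2000 | Disease-specific; ratios more extreme than 1:100 are common for enteric pathogens |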
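
The ranges above can be turned into an automated sanity check. The sketch below is a minimal, hypothetical helper: `TYPICAL_RANGES` and `check_parameters` are illustrative names, not part of calabaria or LASER, and the bounds are simply copied from the "Typical Range" column.

```python
# Hypothetical sanity-check helper; names and structure are illustrative only.
# Bounds are copied from the "Typical Range" column of the table above.
TYPICAL_RANGES = {
    "r0": (1.5, 18.0),                      # dimensionless
    "latent_period_days": (2.0, 21.0),
    "infectious_period_days": (3.0, 14.0),
    "cbr_per_1000_per_year": (8.0, 50.0),
    "cdr_per_1000_per_year": (3.0, 20.0),
    "vaccination_coverage": (0.0, 1.0),     # fraction, not percent
}


def check_parameters(params: dict[str, float]) -> list[str]:
    """Return one warning string for every value outside its typical range."""
    warnings = []
    for name, value in params.items():
        if name not in TYPICAL_RANGES:
            continue  # no reference range for this parameter
        lower, upper = TYPICAL_RANGES[name]
        if not lower <= value <= upper:
            warnings.append(f"{name}={value} outside typical range [{lower}, {upper}]")
    return warnings


# Coverage entered as a percentage and CBR entered per capita are both flagged.
flags = check_parameters({
    "r0": 14.0,
    "vaccination_coverage": 85.0,
    "cbr_per_1000_per_year": 0.029,
})
for flag in flags:
    print(flag)
```

A flagged value is not necessarily wrong; it is a prompt to re-check the unit and the source before the value goes into PARAMS or CONFIG.
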
Structure the research as calabaria-compatible parameter spaces:

```python
from calabaria.parameters import ParameterSpace, ParameterSpec
from calabaria.parameters import ConfigurationSpace, ConfigSpec

# Template: replace every <...> placeholder with a researched value and its source.

# Calibration parameters (uncertain → ranges for Optuna)
PARAMS = ParameterSpace([
    ParameterSpec("beta", lower=<lower>, upper=<upper>, kind="float",
                  doc="<source: Author et al. YYYY, range justification>"),
    ParameterSpec("gravity_k", lower=<lower>, upper=<upper>, kind="float",
                  doc="<source>"),
    # ... additional uncertain parameters
])

# Fixed parameters (well-known → point estimates)
CONFIG = ConfigurationSpace([
    ConfigSpec("latent_period_mean", default=<value>,
               doc="<source: WHO/PubMed systematic review>"),
    ConfigSpec("infectious_period_mean", default=<value>,
               doc="<source>"),
    ConfigSpec("cbr", default=<value>,
               doc="<source: UN World Population Prospects YYYY>"),
    # ... additional fixed parameters
])
```

Key principle: A parameter is uncertain (PARAMS) if the literature shows a wide range or strong context-dependence. It is fixed (CONFIG) if there is consensus from multiple systematic reviews.
The calibration target is the observed data you want your model to reproduce. Finding the right target is as important as finding the right parameters.
| Data Type | Example | Loss Function | Alignment |
|---|---|---|---|
| Case incidence time series | Weekly reported cases by district | MSE on log-transformed counts | Temporal (week) × spatial (district) |
| Annual/total case counts | Total cases per province per year | Poisson or negative binomial likelihood | Spatial × annual |
| Seroprevalence | Fraction immune by age group | Beta-binomial likelihood | Age × spatial |
| Extinction/persistence | Fraction of weeks with zero cases | Logistic curve fit (CCS similarity) | Spatial (by population size) |
| Spatial spread patterns | Phase lags in wavelet analysis | Phase difference similarity | Spatial (distance from epicenter) |
| Vaccination impact | Pre/post-campaign case reduction | Relative reduction ratio | Temporal (before/after) |
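
As an illustration of the "Annual/total case counts" row, the sketch below computes a Poisson negative log-likelihood using only the standard library. The function name `poisson_nll` and the example numbers are assumptions made for illustration; a negative binomial likelihood would add a dispersion parameter that is not shown here.

```python
import math


def poisson_nll(observed: list[int], expected: list[float]) -> float:
    """Poisson negative log-likelihood of observed counts given model means."""
    total = 0.0
    for k, mu in zip(observed, expected):
        mu = max(mu, 1e-9)  # guard against log(0) when the model predicts zero cases
        # -log P(k | mu) = mu - k*log(mu) + log(k!)
        total += mu - k * math.log(mu) + math.lgamma(k + 1)
    return total


observed_cases = [45, 12, 38, 8]        # illustrative counts only
good_fit = [44.0, 13.0, 37.0, 9.0]      # hypothetical model output close to the data
poor_fit = [10.0, 40.0, 5.0, 30.0]      # hypothetical model output far from the data

# A lower value means the model means explain the observed counts better.
print(poisson_nll(observed_cases, good_fit))
print(poisson_nll(observed_cases, poor_fit))
```

Unlike MSE on log counts, this likelihood weights large and small counts according to their expected sampling noise, which is why it suits aggregated totals.
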
| Source | What It Provides | Format | URL/Access |
|---|---|---|---|
| WHO GHO | National incidence, mortality by year | CSV/API | gho.who.int |
| GPEI / POLIS | Subnational polio case counts, AFP surveillance | CSV | polioeradication.org |
| DHS | Seroprevalence, vaccination coverage surveys | Stata/CSV | dhsprogram.com |
| UN WPP | Population, CBR, CDR, age pyramids by country | CSV/Excel | population.un.org |
| GBD / IHME | Disease burden estimates by country/year | CSV | ghdx.healthdata.org |
| Ministry of Health reports | Subnational case data, outbreak reports | PDF/tables | Country-specific |
| Published studies | Age-stratified seroprevalence, outbreak curves | Extract from papers | PubMed |

```python
import polars as pl

# Observed data as a polars DataFrame.
# The schema should match what your @model_output methods produce.
observed = pl.DataFrame({
    "year": [2015, 2015, 2016, 2016],   # ... further rows elided
    "patch": [0, 1, 0, 1],
    "cases": [45, 12, 38, 8],
})


# Loss function for the calibration loop
def compute_loss(model_output_df: pl.DataFrame, observed_df: pl.DataFrame) -> float:
    """Compare modeled case counts to observed data on matching (year, patch) rows.

    Returns a single float loss for TrialResult.
    """
    # Inner join keeps rows present in both frames; observed columns get the "_obs" suffix.
    joined = model_output_df.join(observed_df, on=["year", "patch"], how="inner", suffix="_obs")
    # log(x + 1) keeps zero counts finite.
    log_model = (joined["cases"] + 1).log()
    log_obs = (joined["cases_obs"] + 1).log()
    return float(((log_model - log_obs) ** 2).mean())
```

Most surveillance data captures only a fraction of true infections:
| Disease | Typical Detection Ratio | Source |
|---|---|---|
| Measles | 1:3–1:10 (varies by surveillance quality) | WHO |
| Polio (WPV1) | 1:200 (paralysis:infection) | CDC |
| Cholera | 1:4–1:25 | WHO |
| Influenza | 1:10–1:100 | CDC |
Approach: Either multiply observed cases by the inverse ratio to estimate true incidence, or multiply model infections by the detection ratio to estimate reported cases. Document which direction you chose.
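
Both directions of the adjustment can be sketched in a few lines. The function names below are hypothetical, and `cases_per_infection` is the detection ratio expressed as a fraction (for example 1/200 for the polio row above).

```python
def reported_from_infections(model_infections: float, cases_per_infection: float) -> float:
    """Scale modeled infections down to the number of cases surveillance would report."""
    return model_infections * cases_per_infection


def infections_from_reported(observed_cases: float, cases_per_infection: float) -> float:
    """Scale observed cases up to an estimate of true infections."""
    return observed_cases / cases_per_infection


ratio = 1 / 200  # polio row of the table: 1 paralytic case per 200 infections

# Direction A: compare on the "reported cases" scale.
print(reported_from_infections(10_000, ratio))

# Direction B: compare on the "true infections" scale.
print(infections_from_reported(5, ratio))
```

Scaling the model output (direction A) keeps the observed counts untouched, which is usually the safer choice when the loss is a count likelihood such as Poisson.
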
Research the disease biology and transmission ecology to answer these questions:

```
Does infection confer immunity?
├── Yes
│   ├── Is there a latent (non-infectious) period?
│   │   ├── Yes → SEIR
│   │   └── No → SIR
│   └── Does immunity wane on simulation timescale?
│       ├── Yes → SEIRS or SIRS
│       └── No → SEIR or SIR
└── No → SIS
```

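
The decision tree can be written as a small function. This is a sketch only: `choose_compartments` and its argument names are hypothetical, and the returned string is a label, not a LASER component name.

```python
def choose_compartments(confers_immunity: bool, has_latent_period: bool,
                        immunity_wanes: bool) -> str:
    """Map the three biology questions onto a compartment structure label."""
    if not confers_immunity:
        return "SIS"  # recovered individuals return directly to susceptible
    base = "SEIR" if has_latent_period else "SIR"
    # Waning immunity moves recovered individuals back to S, adding the trailing "S".
    return base + "S" if immunity_wanes else base


# Immunity, a latent period, and no waning on the simulation timescale.
print(choose_compartments(True, True, False))

# Immunity that wanes, with no latent period.
print(choose_compartments(True, False, True))
```
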
| Driver | Profile Shape | Examples |
|---|---|---|
| School terms | Biweekly step function (Bjornstad) | Measles, influenza (temperate) |
| Climate/monsoon | Cosine peaking in wet/warm season | Cholera, dengue, malaria |
| Behavioral | Holiday/pilgrimage calendar | Meningitis belt (dry season gatherings) |
| None apparent | Flat (ValuesMap.from_scalar(1.0, ...)) | Some chronic infections |
| Unknown | Include seasonal_amplitude in PARAMS | Let calibration decide |
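
For the climate-driven row, a cosine profile can be sketched as a multiplier on the transmission rate. The function and its parameters (`amplitude`, `peak_day`) are illustrative assumptions; how the resulting array is passed to LASER (for example through a ValuesMap) is not shown here.

```python
import math


def cosine_seasonality(day_of_year: int, amplitude: float, peak_day: int) -> float:
    """Transmission multiplier: 1 + amplitude * cos(2*pi*(t - peak_day)/365)."""
    return 1.0 + amplitude * math.cos(2.0 * math.pi * (day_of_year - peak_day) / 365.0)


# One year of daily multipliers peaking on day 200 with +/-30% variation.
profile = [cosine_seasonality(d, amplitude=0.3, peak_day=200) for d in range(365)]

peak_index = profile.index(max(profile))
mean_multiplier = sum(profile) / len(profile)
print(peak_index, round(mean_multiplier, 6))
```

Because the cosine averages to zero over a full year, the profile has mean 1, so the annual average transmission rate is unchanged and only its timing shifts.
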
| Model | When to Use | LASER Function |
|---|---|---|
| Gravity | Default for human diseases; well-studied | gravity() |
| Radiation | When intervening populations create barriers | radiation() |
| Competing destinations | When destinations compete for travelers | competing_destinations() |
| None | Single-patch or well-mixed | Set gravity_k = 0 |
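
To make the gravity row concrete, the sketch below implements the textbook gravity form, flow from i to j = k · N_i^a · N_j^b / d_ij^c, in plain Python. It is not LASER's `gravity()` function, whose signature and normalization may differ; the name `gravity_weights` and the exponent defaults are assumptions.

```python
def gravity_weights(populations: list[float], distances: list[list[float]],
                    k: float, a: float = 1.0, b: float = 1.0,
                    c: float = 2.0) -> list[list[float]]:
    """Pairwise coupling weights k * N_i^a * N_j^b / d_ij^c with a zero diagonal."""
    n = len(populations)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:  # no self-coupling
                weights[i][j] = (k * populations[i] ** a * populations[j] ** b
                                 / distances[i][j] ** c)
    return weights


pops = [1000.0, 500.0]                 # illustrative patch populations
dist_km = [[0.0, 10.0], [10.0, 0.0]]   # illustrative distances

coupled = gravity_weights(pops, dist_km, k=1e-3)
uncoupled = gravity_weights(pops, dist_km, k=0.0)  # same effect as gravity_k = 0
print(coupled)
print(uncoupled)
```

Setting `k` to zero removes all off-diagonal coupling, which corresponds to the "None" row of the table.
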
| Strategy | Component | Key Parameters |
|---|---|---|
| Routine only | RoutineImmunizationEx | Age at vaccination, coverage |
| Campaign only | VaccinationCampaign | Period, coverage, age band |
| Both | Both components | Consider correlated missedness |
| None | Omit vaccination components | Pre-vaccine era modeling |
Correlated missedness: If hard-to-reach populations consistently miss both routine and campaign vaccination, use the reachable flag pattern from custom_components.py.
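
The effect of correlated missedness can be quantified with simple arithmetic. The sketch below is not the implementation in custom_components.py; `never_vaccinated_fraction` is a hypothetical helper that assumes coverage is stated population-wide and that misses are independent within the reachable group.

```python
def never_vaccinated_fraction(routine_cov: float, campaign_cov: float,
                              unreachable: float = 0.0) -> float:
    """Fraction of the population missed by both routine and campaign vaccination.

    `unreachable` is the share that no delivery channel reaches (assumption).
    """
    reachable = 1.0 - unreachable
    if routine_cov > reachable + 1e-12 or campaign_cov > reachable + 1e-12:
        raise ValueError("coverage cannot exceed the reachable fraction")
    # Probability that a reachable person is missed by each channel.
    miss_routine = max(0.0, 1.0 - routine_cov / reachable)
    miss_campaign = max(0.0, 1.0 - campaign_cov / reachable)
    return unreachable + reachable * miss_routine * miss_campaign


independent = never_vaccinated_fraction(0.8, 0.8)                    # misses are independent
correlated = never_vaccinated_fraction(0.8, 0.8, unreachable=0.2)    # same people missed twice
print(independent, correlated)
```

With 80% coverage in both channels, independent misses leave about 4% unvaccinated, while fully correlated misses leave about 20%, a five-fold difference in the susceptible pool at identical reported coverage.
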
| Pattern | When to Use | Configuration |
|---|---|---|
| Endemic corridor | Continuous cross-border transmission | end_tick = nticks, moderate count |
| Stochastic reintroduction | Sustain sub-CCS patches | Moderate period, low count |
| Initial seeding only | No ongoing importation | Seed I > 0 in scenario GeoDataFrame only |
| None | Fully closed system | Omit importation component |
Define the modeling question before researching parameters.
Search strategy by parameter type:
| Parameter Type | Primary Sources | Search Terms |
|---|---|---|
| R0, generation interval | PubMed systematic reviews | "{disease} R0 systematic review" |
| Latent/infectious period | WHO disease factsheets, PubMed | "{disease} incubation period", "{disease} serial interval" |
| Demographics (CBR, CDR) | UN World Population Prospects | "{country} crude birth rate" |
| Vaccination coverage | DHS, MICS, WHO/UNICEF estimates | "{country} {disease} vaccine coverage subnational" |
| Population by patch | National census, WorldPop | "{country} district population" |
| Seasonal forcing | Published modeling studies | "{disease} seasonal transmission {region}" |
Cross-reference rule: Every parameter should have at least two independent sources. Flag parameters with only one source or wide disagreement as calibration candidates.
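
The rule can be expressed as a small triage helper. This is a sketch under stated assumptions: `classify_parameter` is a hypothetical name, and the 25% relative-spread threshold for "wide disagreement" is an arbitrary illustrative choice, not a value from the literature.

```python
def classify_parameter(estimates: list[float], max_relative_spread: float = 0.25) -> str:
    """Return "calibrate" or "fixed" from a list of independent literature estimates."""
    if len(estimates) < 2:
        return "calibrate"  # single-source parameters are calibration candidates
    low, high = min(estimates), max(estimates)
    midpoint = (low + high) / 2.0
    # Relative spread of the estimates around their midpoint.
    spread = (high - low) / midpoint if midpoint else float("inf")
    return "calibrate" if spread > max_relative_spread else "fixed"


print(classify_parameter([12.0, 18.0]))   # two sources that disagree widely
print(classify_parameter([29.2, 29.5]))   # two sources in close agreement
print(classify_parameter([10.0]))         # only one source found
```

Parameters returned as "calibrate" belong in PARAMS with a range; those returned as "fixed" belong in CONFIG with a point estimate.
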
laser-spatial-disease-modeling skill

Unit checks (critical):

Documentation:

ParameterSpec.doc and ConfigSpec.doc fields

| Parameter | Value/Range | Unit | Source | Uncertainty |
|---|---|---|---|---|
| R0 | 12–18 | dimensionless | Systematic review (Author YYYY) | Calibrate |
| Latent period | 8–13 | days | WHO factsheet | Fixed at 10 |
| CBR | 29.2 | per 1000/yr | UN WPP 2024 | Fixed |
| ... | ... | ... | ... | ... |
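
The Unit column matters most when a rate has to be converted for a model time step. As a minimal sketch, assuming a component that expects a per-capita daily rate (an assumption, not a statement about any LASER API), a crude rate per 1000 per year converts as follows; the helper name is hypothetical.

```python
def per_1000_per_year_to_daily(rate_per_1000_per_year: float,
                               days_per_year: float = 365.0) -> float:
    """Convert a crude rate (per 1000 population per year) to a per-capita daily rate."""
    return rate_per_1000_per_year / 1000.0 / days_per_year


# CBR from the example table row: 29.2 births per 1000 population per year.
daily_birth_rate = per_1000_per_year_to_daily(29.2)
print(daily_birth_rate)
```

Keeping the conversion in one named function makes the unit assumption explicit and easy to audit.
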
When parametrization is complete, hand off to:

- laser-spatial-disease-modeling: Use Output 3 (model structure) to select LASER components, and Output 1 (parameter space) to configure the model
- modelops-calabaria: Use Output 1 (ParameterSpace/ConfigurationSpace) directly as BaseModel class attributes, and Output 2 (calibration targets) to design the loss function