Use this Skill for political survey analysis: complex sampling with weights, ANES/CCES/ESS data loading, weighted logit/ordered logit, and cross-national equivalence testing.
Use this skill when you need to:
This skill is not a replacement for dedicated survey software (Stata svy, R survey package).
It provides Python implementations suitable for reproducible research workflows.
Political surveys rarely use simple random sampling. The ANES, for example, uses a stratified multi-stage area probability sample. Ignoring the complex design can bias point estimates and typically understates standard errors, which invalidates inference. Three design features matter:
| Feature | Effect if ignored |
|---|---|
| Probability weights | Biased point estimates |
| Stratification | Overestimated standard errors |
| Clustering (PSU) | Underestimated standard errors |
Design-based vs. model-based SE: Design-based inference treats the finite population as fixed and the sample selection as random. Model-based inference conditions on the sample and assumes a data-generating process. For descriptive inference about populations, design-based SE is preferred.
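To make the design-based approach concrete, here is a minimal sketch of a weighted mean with a Taylor-linearized standard error for a stratified, clustered sample. It assumes PSUs are sampled with replacement within strata (the usual approximation) and at least two PSUs per stratum. The function name `weighted_mean_se` and the toy data are hypothetical; for production work, dedicated survey software remains the safer choice.

```python
import numpy as np

def weighted_mean_se(y, w, strata, psu):
    """Weighted mean and linearized design-based SE (hypothetical helper).

    Assumes with-replacement sampling of PSUs within strata.
    """
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    strata = np.asarray(strata)
    psu = np.asarray(psu)

    total_w = w.sum()
    mean = np.sum(w * y) / total_w

    # Linearized (influence) values of the ratio estimator
    z = w * (y - mean) / total_w

    var = 0.0
    for h in np.unique(strata):
        in_h = strata == h
        psus = np.unique(psu[in_h])
        n_h = len(psus)
        if n_h < 2:
            raise ValueError(f"stratum {h!r} has fewer than two PSUs")
        # Sum linearized values within each PSU, then take the
        # between-PSU variance inside the stratum
        totals = np.array([z[in_h & (psu == c)].sum() for c in psus])
        var += n_h / (n_h - 1) * np.sum((totals - totals.mean()) ** 2)

    return mean, np.sqrt(var)

# Toy data: 2 strata, 2 PSUs each, 2 respondents per PSU
y = [3, 4, 5, 5, 1, 2, 2, 3]
w = [1.0, 1.5, 0.8, 1.2, 2.0, 1.0, 1.0, 0.5]
strata = [1, 1, 1, 1, 2, 2, 2, 2]
psu = [1, 1, 2, 2, 1, 1, 2, 2]
mean, se = weighted_mean_se(y, w, strata, psu)
print(f"weighted mean = {mean:.3f}, design-based SE = {se:.3f}")
```

With equal weights, a single stratum, and every respondent as their own PSU, this reduces to the familiar simple-random-sampling standard error of the mean.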
ANES structure: Each respondent has a weight variable (e.g., V200010a, the full-sample
pre-election weight in the 2020 ANES). The pre-election and post-election waves have separate
weights. The weights are constructed so that weighted estimates represent the target population
(eligible voters or adult citizens).
ESS structure: Multi-country survey with a design weight (dweight) correcting for unequal
selection probabilities within countries, and a post-stratification weight (pspwght) that
builds on the design weight. For pooled cross-national analysis, multiply pspwght (or dweight)
by pweight (population size weight) so that each country contributes in proportion to its
national population.
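When pooling countries, the population size weight is applied together with the within-country weight rather than alone. The sketch below illustrates this with invented toy values (not real ESS data). The helper `pooled_mean` and the variable `trust` are hypothetical; only the column names `pspwght` and `pweight` come from the ESS.

```python
import numpy as np

def pooled_mean(y, pspwght, pweight):
    """Pooled cross-national mean using the combined weight pspwght * pweight."""
    w = np.asarray(pspwght, dtype=float) * np.asarray(pweight, dtype=float)
    return np.sum(w * np.asarray(y, dtype=float)) / np.sum(w)

# Toy data: country A is three times as populous as country B.
# pweight is constant within a country; pspwght varies by respondent.
country = np.array(["A", "A", "B", "B"])
trust = np.array([4.0, 6.0, 8.0, 8.0])
pspwght = np.array([1.0, 1.0, 0.5, 1.5])
pweight = np.array([3.0, 3.0, 1.0, 1.0])

print("unweighted mean:", trust.mean())
print("pooled weighted mean:", pooled_mean(trust, pspwght, pweight))
```

The pooled estimate moves toward country A's values because A represents a larger share of the combined population.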
Raking (iterative proportional fitting): When post-stratification requires simultaneous calibration on multiple marginal distributions (e.g., age, gender, and education) but the full cross-classification is unavailable or too sparse, raking adjusts the weights to each margin in turn and repeats until convergence. The resulting weights reproduce all marginal totals simultaneously, up to the convergence tolerance.
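The raking loop can be sketched in a few lines of NumPy. This is an illustrative implementation under simple assumptions: every target category is observed in the sample, and the targets for each margin sum to the same population total. The function name `rake`, its arguments, and the toy targets are hypothetical.

```python
import numpy as np

def rake(base_weights, groups, targets, tol=1e-8, max_iter=100):
    """Rake weights to marginal totals (hypothetical helper).

    groups:  dict mapping margin name -> array of category labels per respondent
    targets: dict mapping margin name -> dict of category -> population total
    """
    w = np.asarray(base_weights, dtype=float).copy()
    groups = {name: np.asarray(labels) for name, labels in groups.items()}

    for _ in range(max_iter):
        max_change = 0.0
        for name, labels in groups.items():
            for category, target in targets[name].items():
                mask = labels == category
                current = w[mask].sum()
                if current == 0:
                    raise ValueError(f"no respondents in {name}={category!r}")
                # Scale this category so its weighted total hits the target
                factor = target / current
                w[mask] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        # Stop once a full pass over all margins leaves weights unchanged
        if max_change < tol:
            return w
    raise RuntimeError("raking did not converge")

# Toy example: six respondents, two margins (invented totals)
groups = {
    "gender": ["m", "m", "m", "m", "f", "f"],
    "age": ["young", "old", "young", "old", "young", "old"],
}
targets = {
    "gender": {"m": 50, "f": 50},
    "age": {"young": 40, "old": 60},
}
weights = rake(np.ones(6), groups, targets)
print(weights.round(2))
```

Raking can fail to converge when a cell combination is empty or the margins are inconsistent, which is why the sketch raises an error instead of returning the last iterate.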
Ordered logit for Likert outcomes: Survey items often use 5- or 7-point scales. OLS treats the ordinal scale as metric; ordered logit respects the ordinal nature and estimates cut-points between categories.
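To show what the cut-points do, the following sketch converts a linear predictor and a set of cut-points into category probabilities, using the common parameterization P(y ≤ k) = logistic(c_k − xβ). It illustrates the model's structure only and is not an estimation routine. The function name and the example values are hypothetical.

```python
import numpy as np

def ordered_logit_probs(xb, cutpoints):
    """Category probabilities implied by an ordered logit (hypothetical helper).

    xb:        linear predictor(s), scalar or 1-D array
    cutpoints: strictly increasing thresholds; K-1 cut-points give K categories
    """
    xb = np.atleast_1d(np.asarray(xb, dtype=float))
    cuts = np.asarray(cutpoints, dtype=float)
    if np.any(np.diff(cuts) <= 0):
        raise ValueError("cut-points must be strictly increasing")

    # Cumulative probabilities P(y <= k) for each respondent and cut-point
    cum = 1.0 / (1.0 + np.exp(-(cuts[None, :] - xb[:, None])))

    # Pad with 0 and 1, then difference to get per-category probabilities
    n = len(xb)
    cum = np.hstack([np.zeros((n, 1)), cum, np.ones((n, 1))])
    return np.diff(cum, axis=1)

# Three cut-points define a 4-point scale; compare two respondents
probs = ordered_logit_probs([0.0, 2.0], cutpoints=[-1.0, 0.0, 1.0])
print(probs.round(3))
```

A larger linear predictor shifts probability mass toward the higher categories without assuming equal spacing between them. For estimation, statsmodels provides `OrderedModel` in `statsmodels.miscmodels.ordinal_model`.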
Measurement equivalence across countries proceeds in steps:
pip install "pandas>=1.5" "statsmodels>=0.14" "scipy>=1.9" "numpy>=1.23" "matplotlib>=3.6"
For ANES data, download the .dta (Stata) or .sav (SPSS) file from