This skill encodes expert methodological knowledge for conducting parameter recovery studies -- a critical validation step before interpreting fitted model parameters. Parameter recovery determines whether a model's parameters are identifiable given the experimental design and sample size. A general-purpose programmer unfamiliar with computational modeling would not know that fitting a model is insufficient validation, or how to diagnose parameter tradeoffs and non-identifiability.
## When to Use This Skill

- Before trusting fitted parameter values from any computational cognitive model
- When developing a new model and assessing whether parameters can be distinguished from data
- When planning an experiment and determining the minimum trial count for reliable parameter estimation
- When a reviewer asks for evidence of model identifiability
- When comparing models and needing to ensure each model can be distinguished (model recovery)
- When fitted parameters produce suspiciously extreme values or hit bounds
## Research Planning Protocol
Before executing the domain-specific steps below, you MUST:

1. **State the research question** -- What specific question is this analysis/paradigm addressing?
2. **Justify the method choice** -- Why is this approach appropriate? What alternatives were considered?
3. **Declare expected outcomes** -- What results would support vs. refute the hypothesis?
4. **Note assumptions and limitations** -- What does this method assume? Where could it mislead?

Present the plan to the user and WAIT for confirmation before proceeding.

For detailed methodology guidance, see the research-literacy skill.
## ⚠️ Verification Notice
This skill was generated by AI from academic literature. All parameters, thresholds, and citations require independent verification before use in research. If you find errors, please open an issue.
## Why Parameter Recovery Matters
Fitting a model to data and obtaining parameter estimates does NOT guarantee those estimates are meaningful (Wilson & Collins, 2019; Navarro, 2019). Common failure modes:

- **Non-identifiability:** Multiple parameter combinations produce identical model predictions (e.g., drift rate and boundary in the DDM trade off; Ratcliff & Tuerlinckx, 2002)
- **Insufficient data:** Too few trials for the fitting procedure to recover true values
- **Local minima:** Optimization converges to wrong parameter values
- **Model misspecification:** The fitting procedure returns parameters that do not reflect the assumed cognitive process

Parameter recovery is the standard diagnostic for these problems (Heathcote et al., 2015; Wilson & Collins, 2019).
## Step-by-Step Recovery Procedure

### Step 1: Define the Parameter Space
Choose ground-truth parameter values that span the plausible range for each parameter.
**How many parameter sets to simulate?**

- Minimum: 100 parameter sets (Wilson & Collins, 2019)
- Recommended: 500-1000 parameter sets for smooth recovery landscapes
- For publication: 1000+ parameter sets (Heathcote et al., 2015)
**Sampling strategy:**

| Strategy | When to Use | Source |
|----------|-------------|--------|
| Uniform grid | Few parameters (1-2), want complete coverage | Standard practice |
| Latin hypercube | 3+ parameters, want space-filling coverage without excessive samples | McKay et al., 1979 |
| Random uniform | Simple, adequate for many parameters | Wilson & Collins, 2019 |
| Prior-based sampling | Have informative priors on parameter ranges | Palestro et al., 2018 |
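As a concrete sketch of the Latin hypercube strategy, the stdlib-only helper below draws ground-truth parameter sets with one sample per stratum of each parameter's range. The parameter names and ranges (`alpha`, `beta`) are hypothetical placeholders, not values prescribed by this skill.

```python
import random

def latin_hypercube(n_sets, param_ranges, seed=0):
    """Latin hypercube sample: each parameter's range is split into
    n_sets equal strata, and each stratum is used exactly once."""
    rng = random.Random(seed)
    names = list(param_ranges)
    columns = {}
    for name in names:
        lo, hi = param_ranges[name]
        strata = list(range(n_sets))
        rng.shuffle(strata)  # random pairing of strata across parameters
        width = (hi - lo) / n_sets
        # one uniform draw inside each stratum
        columns[name] = [lo + (s + rng.random()) * width for s in strata]
    return [{name: columns[name][i] for name in names} for i in range(n_sets)]

# Example: 500 ground-truth sets for a 2-parameter model (hypothetical ranges)
truths = latin_hypercube(500, {"alpha": (0.05, 0.95), "beta": (0.5, 10.0)})
```

For 3+ parameters this gives far better coverage per sample than a uniform grid, which is why the table above recommends it.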
**Range selection:** Use ranges from published parameter estimates in the domain.

### Step 2: Simulate Data from Ground-Truth Parameters

Generate a synthetic dataset from the model at each ground-truth parameter set:

- **Match the experimental design exactly** -- same number of trials, conditions, and structure as the real experiment
- **Use the same model** -- the generative model must be identical to the model you will fit
- **Include realistic noise** -- use the model's own noise mechanism (do not add external noise)
- **Store the ground-truth parameters** for later comparison

**Critical:** The number of simulated trials per participant must match the actual experiment. Recovery with 10,000 trials tells you nothing about recovery with 100 trials (Wilson & Collins, 2019).
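The simulation requirements above can be illustrated with a minimal sketch, assuming a two-armed bandit task and a softmax Q-learning model; the reward probabilities and parameter values are hypothetical. Note that the only stochasticity is the model's own choice and reward noise, per the "no external noise" rule.

```python
import math
import random

def simulate_bandit(alpha, beta, n_trials, reward_probs=(0.7, 0.3), seed=0):
    """Simulate choices from a softmax Q-learning agent. The noise comes
    entirely from the model's own choice/reward mechanism."""
    rng = random.Random(seed)
    q = [0.0, 0.0]
    choices, rewards = [], []
    for _ in range(n_trials):
        # softmax over two arms reduces to a logistic of the Q difference
        p0 = 1.0 / (1.0 + math.exp(-beta * (q[0] - q[1])))
        c = 0 if rng.random() < p0 else 1
        r = 1.0 if rng.random() < reward_probs[c] else 0.0
        q[c] += alpha * (r - q[c])  # delta-rule update
        choices.append(c)
        rewards.append(r)
    return choices, rewards

# One synthetic dataset at the experiment's actual trial count (here 150)
choices, rewards = simulate_bandit(alpha=0.3, beta=4.0, n_trials=150)
```

In a full recovery study this function is called once per ground-truth parameter set, with `n_trials` fixed to the real experiment's trial count.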
### Step 3: Fit the Model to Simulated Data

Apply the exact same fitting procedure you use for real data:

- Same optimization algorithm (e.g., MLE, Bayesian, chi-square minimization)
- Same parameter bounds and constraints
- Same starting values or initialization strategy
- Same convergence criteria

**Multiple starting points:** Run the optimizer from at least 5-10 random starting points per simulated dataset to avoid local minima (Heathcote et al., 2015).
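The multi-start logic can be sketched as below. This is a hedged illustration: the crude stochastic local search stands in for a real optimizer (e.g., Nelder-Mead), and the bandit model, bounds, and simulator are hypothetical choices, not part of this skill's prescriptions.

```python
import math
import random

def simulate(alpha, beta, n_trials, probs=(0.7, 0.3), seed=1):
    """Minimal softmax Q-learning simulator for a two-armed bandit."""
    rng = random.Random(seed)
    q, data = [0.0, 0.0], []
    for _ in range(n_trials):
        p0 = 1.0 / (1.0 + math.exp(-beta * (q[0] - q[1])))
        c = 0 if rng.random() < p0 else 1
        r = 1.0 if rng.random() < probs[c] else 0.0
        data.append((c, r))
        q[c] += alpha * (r - q[c])
    return data

def neg_log_lik(params, data):
    """Negative log-likelihood of bandit choices under softmax Q-learning."""
    alpha, beta = params
    q, nll = [0.0, 0.0], 0.0
    for c, r in data:
        p0 = 1.0 / (1.0 + math.exp(-beta * (q[0] - q[1])))
        p = p0 if c == 0 else 1.0 - p0
        nll -= math.log(max(p, 1e-12))
        q[c] += alpha * (r - q[c])
    return nll

def fit_multistart(data, n_starts=8, n_iters=300, seed=2):
    """Best-of-n_starts fitting: each start runs a crude stochastic local
    search (stand-in for a real optimizer); keep the lowest NLL found."""
    rng = random.Random(seed)
    bounds = [(0.01, 0.99), (0.1, 15.0)]  # hypothetical alpha/beta bounds
    best_x, best_f = None, float("inf")
    for _ in range(n_starts):
        x = [rng.uniform(lo, hi) for lo, hi in bounds]
        f = neg_log_lik(x, data)
        for _ in range(n_iters):
            cand = [min(hi, max(lo, xi + rng.gauss(0.0, 0.1 * (hi - lo))))
                    for xi, (lo, hi) in zip(x, bounds)]
            fc = neg_log_lik(cand, data)
            if fc < f:
                x, f = cand, fc
        if f < best_f:
            best_x, best_f = x, f
    return best_x, best_f

data = simulate(alpha=0.3, beta=4.0, n_trials=150)
(est_alpha, est_beta), nll = fit_multistart(data)
```

The point of the sketch is the outer loop: the same bounded fit is repeated from several random starts, and only the best solution is retained.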
### Step 4: Evaluate Recovery Quality

Compare recovered parameters to true (ground-truth) parameters using multiple metrics.

#### Primary Metrics
| Metric | Formula | Good | Acceptable | Concerning | Source |
|--------|---------|------|------------|------------|--------|
| Pearson correlation (r) | cor(true, recovered) | r > 0.9 | r > 0.8 | r < 0.7 | Heathcote et al., 2015 (rough benchmarks) |
| Bias | mean(recovered - true) | Near 0 | < 10% of range | > 20% of range | Wilson & Collins, 2019 |
| RMSE | sqrt(mean((recovered - true)^2)) | Small relative to range | -- | Large relative to range | Standard |
| Coverage | % of 95% CIs containing true value | ~95% | 85-100% | < 80% | Bayesian recovery |
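The first three metrics follow directly from their formulas; a small stdlib-only helper makes the computation explicit. The toy input (recovered = true + constant offset) is contrived to show why r alone is insufficient: correlation is perfect while bias is nonzero.

```python
import math

def recovery_metrics(true_vals, recovered_vals):
    """Pearson r, mean bias, and RMSE between true and recovered values."""
    n = len(true_vals)
    mt = sum(true_vals) / n
    mr = sum(recovered_vals) / n
    cov = sum((t - mt) * (r - mr) for t, r in zip(true_vals, recovered_vals))
    st = math.sqrt(sum((t - mt) ** 2 for t in true_vals))
    sr = math.sqrt(sum((r - mr) ** 2 for r in recovered_vals))
    bias = sum(r - t for t, r in zip(true_vals, recovered_vals)) / n
    rmse = math.sqrt(sum((r - t) ** 2
                         for t, r in zip(true_vals, recovered_vals)) / n)
    return {"r": cov / (st * sr), "bias": bias, "rmse": rmse}

# Toy check: recovered = true + 0.1 -> perfect r, but systematic bias
m = recovery_metrics([0.1, 0.3, 0.5, 0.7], [0.2, 0.4, 0.6, 0.8])
# m["r"] == 1.0, m["bias"] == 0.1, m["rmse"] == 0.1
```

Run once per parameter, across all simulated datasets.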
#### Visualization (essential)

- **Scatter plot:** Recovered vs. true for each parameter (identity line = perfect recovery)
- **Bland-Altman plot:** Difference vs. mean (detects range-dependent bias)
See references/recovery-diagnostics.md for visualization templates.
### Step 5: Check Parameter Tradeoffs

Compute the correlations between recovered parameters. Are any pairs of recovered parameters correlated at |r| > 0.5?

- **YES** --> These parameters trade off. Consider:
  - Fixing one to a theoretically motivated value
  - Reparameterizing the model
  - Collecting more data to improve identifiability
  - Reporting the tradeoff and interpreting cautiously
- **NO** --> Parameters are identifiable given this design
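A minimal sketch of the tradeoff check: compute pairwise correlations between columns of recovered estimates and flag pairs past the |r| > 0.5 threshold. The recovered values below are fabricated for illustration, with `alpha` and `beta` deliberately anti-correlated.

```python
import math

def corr(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((v - mx) ** 2 for v in x))
    sy = math.sqrt(sum((v - my) ** 2 for v in y))
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (sx * sy)

def flag_tradeoffs(recovered, threshold=0.5):
    """Return (param_a, param_b, r) for every pair with |r| > threshold."""
    names = list(recovered)
    flags = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = corr(recovered[a], recovered[b])
            if abs(r) > threshold:
                flags.append((a, b, r))
    return flags

# Hypothetical recovered estimates: alpha and beta trade off, ter does not
rec = {"alpha": [0.1, 0.2, 0.4, 0.6, 0.8],
       "beta":  [9.0, 8.0, 5.5, 3.0, 1.2],
       "ter":   [0.30, 0.22, 0.31, 0.25, 0.28]}
flagged = flag_tradeoffs(rec)  # only the (alpha, beta) pair is flagged
```

Any flagged pair should trigger one of the remedies listed above (fixing, reparameterizing, more data, or cautious reporting).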
**Common parameter tradeoffs in cognitive models:**

| Model | Correlated Parameters | Nature of Tradeoff | Source |
|-------|-----------------------|--------------------|--------|
| DDM | Drift rate (v) and boundary (a) | Speed-accuracy tradeoff | Ratcliff & Tuerlinckx, 2002 |
| DDM | Non-decision time (Ter) and boundary (a) | Boundary absorbs timing variance | Ratcliff & Tuerlinckx, 2002 |
| ACT-R | Noise (s) and threshold (tau) | Both affect retrieval probability | Anderson, 2007 |
| RL models | Learning rate (alpha) and inverse temperature (beta) | Both control exploitation | Daw, 2011 |
| Signal detection | d-prime and criterion (c) | Criterion shift mimics sensitivity change | Macmillan & Creelman, 2005 |
## Model Recovery (Confusion Matrix)

Model recovery extends parameter recovery to test whether the correct model can be identified from data (Wagenmakers et al., 2004).

### Procedure

1. For each candidate model M_k (k = 1, ..., K):
   - Simulate data from M_k with representative parameters
   - Fit ALL candidate models to the simulated data
   - Select the best-fitting model using your comparison metric (AIC, BIC, Bayes factor)
2. Construct a K x K confusion matrix: rows = generating model, columns = selected model
3. Check that diagonal entries dominate (the correct model is selected)
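Step 2 of the procedure can be sketched as follows; the `(generating, selected)` pairs are fabricated toy results standing in for the output of real simulate-and-fit runs.

```python
def confusion_matrix(selections, model_names):
    """Row-normalized K x K matrix: rows = generating model,
    columns = selected model, entries = proportion of datasets."""
    k = len(model_names)
    counts = [[0] * k for _ in range(k)]
    for gen, sel in selections:
        counts[model_names.index(gen)][model_names.index(sel)] += 1
    return [[c / sum(row) for c in row] for row in counts]

# Hypothetical (generating, selected) pairs: 10 simulated datasets per model
selections = ([("M1", "M1")] * 9 + [("M1", "M2")] +
              [("M2", "M2")] * 7 + [("M2", "M1")] * 3)
cm = confusion_matrix(selections, ["M1", "M2"])
diag = sum(cm[i][i] for i in range(2)) / 2  # mean correct-selection rate
```

In this toy case M2-generated data are misattributed to M1 on 30% of datasets, which exceeds the 20% warning threshold below: these two models would not be distinguishable under this design.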
### Quality Criteria

| Metric | Good | Concerning | Source |
|--------|------|------------|--------|
| Diagonal proportion | > 90% correct | < 70% correct | Wagenmakers et al., 2004 |
| Off-diagonal patterns | Symmetric confusion | Asymmetric (one model always "wins") | Wilson & Collins, 2019 |
**Warning:** If model A is selected when data are generated from model B more than 20% of the time, those models are not distinguishable with your experimental design (Wilson & Collins, 2019).
## Sample Size Effects

### How Trial Count Affects Recovery
Recovery quality improves with more trials per participant. Test recovery at multiple trial counts:
| Trial Count | Expected Recovery | Recommendation |
|-------------|-------------------|----------------|
| < 50 trials | Often poor (r < 0.7) | Increase trials or simplify model |
| 50-100 trials | Marginal for simple models | May suffice for 2-3 parameter models |
| 100-200 trials | Adequate for most models | Standard for DDM (Ratcliff & McKoon, 2008) |
| 200-500 trials | Good for complex models | Recommended for models with > 4 parameters |
| 500+ trials | Excellent for most models | Required for hierarchical models |
Source: Wilson & Collins (2019); Ratcliff & Tuerlinckx (2002) for DDM-specific guidance.
### Recovery as a Function of N
Plot recovery metrics (r, RMSE) as a function of trial count to determine the minimum viable N for your specific model and paradigm.
## Landscape Analysis

### Parameter Sensitivity Surfaces

For 1-2 key parameters, compute and visualize the objective function surface:

1. Fix all parameters except the target parameter(s)
2. Evaluate the objective function (e.g., negative log-likelihood) at a grid of values
3. Plot the surface (1D: line; 2D: contour or heatmap)
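The grid-evaluation steps above can be sketched for the 1-D case. The objective here is a toy one (Gaussian-mean NLL with sigma fixed at 1 and a tiny fabricated dataset), chosen only because its minimum is known analytically to sit at the sample mean.

```python
import math

def nll_profile(objective, lo, hi, n_points=41):
    """Evaluate a 1-D objective on an even grid (other parameters fixed)."""
    grid = [lo + i * (hi - lo) / (n_points - 1) for i in range(n_points)]
    return grid, [objective(g) for g in grid]

# Toy objective: NLL of a Gaussian mean, sigma fixed at 1 (hypothetical data)
data = [0.8, 1.1, 0.9, 1.3, 1.0]

def gauss_nll(mu):
    return sum(0.5 * (x - mu) ** 2 + 0.5 * math.log(2 * math.pi)
               for x in data)

grid, surface = nll_profile(gauss_nll, -1.0, 3.0)
best = grid[surface.index(min(surface))]  # grid point nearest the sample mean
```

Plotting `surface` against `grid` exposes the features in the table below: a single sharp dip, a broad flat region, multiple dips, or (in 2-D) an elongated ridge.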
**What to look for:**

| Surface Feature | Interpretation | Action |
|-----------------|----------------|--------|
| Single sharp minimum | Well-identified parameter | Proceed with confidence |
| Broad flat minimum | Parameter poorly constrained | Widen prior or collect more data |
| Multiple minima | Non-convex; local minima risk | Use multiple starting points; consider reparameterization |
| Ridge (elongated valley) | Parameter tradeoff | Two parameters are correlated; consider fixing one |
## Reporting Standards

### Minimum Reporting Checklist
When publishing a parameter recovery study:
- [ ] Number of simulated parameter sets (minimum 100; Wilson & Collins, 2019)
- [ ] Sampling strategy for ground-truth parameters (uniform, LHS, prior-based)
- [ ] Range of ground-truth values for each parameter (with justification)
- [ ] Number of simulated trials per dataset (must match real experiment)
- [ ] Fitting procedure used (same as for real data)
- [ ] Number of starting points for optimization
- [ ] Recovery metrics for each parameter: correlation (r), bias, RMSE
- [ ] Scatter plots: recovered vs. true for each parameter
- [ ] Model recovery confusion matrix (if performing model comparison)
- [ ] Recovery as a function of trial count (if applicable)
### Where to Report

- **Main text:** Summary of recovery quality (r values, key plots)
- **Supplementary:** Full correlation matrices, all scatter plots, landscape analyses

Parameter recovery is increasingly expected in top journals (Wilson & Collins, 2019; Navarro, 2019).
## Common Pitfalls

- **Testing recovery with too many trials:** Simulating 10,000 trials when the experiment has 100. Recovery will look excellent but is irrelevant to your actual data (Wilson & Collins, 2019).
- **Using different fitting procedures:** The recovery study must use the identical optimization pipeline as the real-data analysis. Different starting values, bounds, or algorithms invalidate the test.
- **Ignoring parameter correlations:** High marginal recovery (good r for each parameter) can coexist with strong parameter tradeoffs that distort interpretation. Always check the cross-parameter correlation matrix.
- **Reporting only correlation:** Correlation measures rank-order recovery but ignores systematic bias. A parameter can have r = 0.95 yet be consistently overestimated by 30%. Report bias and RMSE alongside r.
- **Sampling only near defaults:** If ground-truth values cluster around typical defaults, recovery may look good only in that region. Sample across the full plausible range.
- **Neglecting model recovery:** Good parameter recovery does not guarantee good model recovery. Two models can have individually recoverable parameters yet be indistinguishable when competing (Wagenmakers et al., 2004).
- **Confusing identifiability with validity:** A model can have perfectly recoverable parameters and still be a poor model of cognition. Recovery is necessary but not sufficient (Navarro, 2019).
## References
Anderson, J. R. (2007). How Can the Human Mind Occur in the Physical Universe? Oxford University Press.
Daw, N. D. (2011). Trial-by-trial data analysis using computational models. In M. R. Delgado, E. A. Phelps, & T. W. Robbins (Eds.), Decision Making, Affect, and Learning. Oxford University Press.
Heathcote, A., Brown, S. D., & Wagenmakers, E.-J. (2015). An introduction to good practices in cognitive modeling. In B. U. Forstmann & E.-J. Wagenmakers (Eds.), An Introduction to Model-Based Cognitive Neuroscience. Springer.
Macmillan, N. A., & Creelman, C. D. (2005). Detection Theory: A User's Guide (2nd ed.). Lawrence Erlbaum Associates.
McKay, M. D., Beckman, R. J., & Conover, W. J. (1979). A comparison of three methods for selecting values of input variables. Technometrics, 21(2), 239-245.
Navarro, D. J. (2019). Between the devil and the deep blue sea: Tensions between scientific judgement and statistical model selection. Computational Brain & Behavior, 2(1), 28-34.
Palestro, J. J., Sederberg, P. B., Osth, A. F., Van Zandt, T., & Turner, B. M. (2018). Likelihood-free methods for cognitive science. Springer.
Ratcliff, R., & McKoon, G. (2008). The diffusion decision model: Theory and data for two-choice decision tasks. Neural Computation, 20(4), 873-922.
Ratcliff, R., & Tuerlinckx, F. (2002). Estimating parameters of the diffusion model. Psychonomic Bulletin & Review, 9(3), 438-481.
Wagenmakers, E.-J., Ratcliff, R., Gomez, P., & Iverson, G. J. (2004). Assessing model mimicry using the parametric bootstrap. Journal of Mathematical Psychology, 48(1), 28-50.
Wilson, R. C., & Collins, A. G. (2019). Ten simple rules for the computational modeling of behavioral data. eLife, 8, e49547.
See references/ for diagnostic visualization templates and worked examples.