Domain-specific statistical modeling guidance for cognitive science and neuroscience, encoding when and how to apply mixed models, correction methods, Bayesian approaches, and effect size reporting
This skill encodes domain-specific statistical knowledge for cognitive science and neuroscience research. It addresses the modeling decisions, correction strategies, and reporting conventions that a general-purpose statistician or programmer would get wrong without training in the field. For concrete analysis recipes with code, see references/common-analyses.md.
Before executing the domain-specific steps below, you MUST consult the research-literacy skill for detailed methodology guidance.
This skill was generated by AI from academic literature. All parameters, thresholds, and citations require independent verification before use in research. If you find errors, please open an issue.
Critical domain knowledge: Clark (1973) demonstrated that failing to treat items as random effects inflates Type I error. This remains one of the most common statistical errors in cognitive science. If your stimuli are sampled from a larger population (e.g., words, faces, scenes), you must account for item variability.
```
Are your stimuli sampled from a larger population?
|
+-- YES --> Mixed-effects model with crossed random effects
|           (subjects and items)
|
+-- NO (e.g., fixed set of 4 task conditions) -->
    |
    +-- Any missing data, unbalanced cells, or continuous predictors?
    |   |
    |   +-- YES --> Mixed-effects model (subjects as random effect)
    |   |
    |   +-- NO --> Repeated-measures ANOVA is acceptable
    |
    +-- Need trial-level analysis (e.g., RT distributions)?
        |
        +-- YES --> Mixed-effects model (operates on individual trials)
        +-- NO --> Repeated-measures ANOVA on condition means
```
Barr et al. (2013) recommend fitting the maximal random effects structure justified by the design to minimize Type I error. This means including random intercepts and slopes for all within-unit factors.
For a typical 2x2 design with factors A (within-subjects, within-items) and B (within-subjects, between-items):
```r
# Maximal structure (Barr et al., 2013)
lmer(RT ~ A * B + (1 + A * B | Subject) + (1 + A | Item), data = d)
```
Convergence failures are common with complex random effects structures. Simplify in this order (Barr et al., 2013; Matuschek et al., 2017):

1. Remove correlations among random effects (`||` syntax in lme4)
2. Remove random slopes for interactions
3. Remove random slopes for main effects (last resort; start with control variables, not the effect of interest)

Do NOT simply drop all random slopes to achieve convergence. This inflates Type I error and undermines the purpose of mixed-effects modeling (Barr et al., 2013).
| Design | Random Effects | Rationale |
|---|---|---|
| Lexical decision (words as items) | `(1 + condition \| subj) + (1 + condition \| item)` | Words are sampled from the lexicon; condition varies within subjects and within items |
| Stroop task (fixed conditions) | `(1 + congruency \| subj)` | Congruency levels are fixed, not sampled; no item population to generalize to |
| Picture naming (pictures as items) | `(1 + SOA \| subj) + (1 \| picture)` | Pictures are sampled; random intercept captures item difficulty |
| Multi-site study | `(1 + condition \| subj) + (1 \| site)` | Sites introduce clustering; random site intercept captures between-site variability |
RT data in cognitive experiments are positively skewed, bounded below by physiological limits, and often contaminated by outliers. The choice of trimming and modeling approach therefore affects both Type I error and statistical power.
Apply these criteria before modeling (Ratcliff, 1993; Luce, 1986):
| Criterion | Threshold | Source |
|---|---|---|
| Fast outliers (anticipatory) | < 200 ms | Whelan, 2008; Ratcliff, 1993 |
| Slow absolute cutoff | > 2000-3000 ms (task-dependent) | Ratcliff, 1993 |
| Within-subject SD trimming | > 3 SD from participant's condition mean | Van Selst & Jolicoeur, 1994 |
| Within-subject MAD trimming | > 3 MAD from participant's condition median | Leys et al., 2013 (more robust to skew) |
Task-specific note: For simple RT tasks (e.g., detection), use 100 ms as the fast cutoff (Whelan, 2008). For choice RT tasks (e.g., lexical decision), use 200 ms (Ratcliff, 1993). Always report exclusion rates.
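As an illustration, the trimming criteria above can be combined in a small Python helper (a sketch, not from any package; `trim_rts` and its defaults are illustrative — the 1.4826 constant scales the MAD to be consistent with the SD under normality):

```python
import numpy as np

def trim_rts(rt_ms, fast_cutoff=200.0, slow_cutoff=3000.0, mad_criterion=3.0):
    """Return a boolean mask of RTs to KEEP for one participant in one
    condition: absolute cutoffs first (Ratcliff, 1993; Whelan, 2008),
    then a median +/- k*MAD criterion (Leys et al., 2013)."""
    rt = np.asarray(rt_ms, dtype=float)
    keep = (rt >= fast_cutoff) & (rt <= slow_cutoff)
    med = np.median(rt[keep])
    # Scaled MAD: 1.4826 * median absolute deviation from the median
    mad = 1.4826 * np.median(np.abs(rt[keep] - med))
    keep &= np.abs(rt - med) <= mad_criterion * mad
    return keep
```

Apply it per participant and condition, and report the resulting exclusion rate.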
```
Is your primary interest in RT distributions (not just means)?
|
+-- YES --> Drift Diffusion Model or ex-Gaussian fitting
|
+-- NO --> Choose a modeling approach:
    |
    +-- Option 1: Log-transform RT, then fit LMM (Gaussian)
    |   - Pro: Simple, widely understood
    |   - Con: Back-transformation of means is biased;
    |     changes the hypothesis being tested
    |     (Lo & Andrews, 2015)
    |
    +-- Option 2: Inverse-transform RT (1/RT = speed), then LMM
    |   - Pro: Often achieves better normality than log
    |   - Con: Same back-transformation issues as log
    |     (Ratcliff, 1993)
    |
    +-- Option 3 (Recommended): Generalized LMM with
        Gamma family + identity link
        - Pro: Models RT in original units; handles skew
          directly; avoids transformation issues
          (Lo & Andrews, 2015)
        - Con: Computationally slower; may have convergence
          issues with complex random effects
```
Recommended default: Gamma GLMM with identity link (Lo & Andrews, 2015). Report results on the original millisecond scale.
```r
# Recommended RT model (Lo & Andrews, 2015)
glmer(RT ~ condition * group + (1 + condition | subj) + (1 | item),
      family = Gamma(link = "identity"), data = d)
```
| Scenario | Method | Rationale | Source |
|---|---|---|---|
| Small number of planned contrasts (< 5) | No correction or Holm | Planned contrasts based on a priori hypotheses do not require correction if specified before data collection | Rubin, 2021 |
| All pairwise comparisons after ANOVA | Tukey HSD | Controls family-wise error for all pairwise comparisons; assumes equal variance | Tukey, 1953 |
| Many tests, correlated (e.g., EEG channels) | Cluster-based permutation | Respects spatial/temporal correlation structure | Maris & Oostenveld, 2007 |
| Many tests, independent | Bonferroni-Holm | More powerful than Bonferroni; step-down procedure | Holm, 1979 |
| Large-scale testing (fMRI voxels, genomics) | FDR (Benjamini-Hochberg) | Controls false discovery rate rather than family-wise error; appropriate when some false positives are tolerable | Benjamini & Hochberg, 1995 |
| Exploratory whole-brain fMRI | Cluster-level FWE (with cluster-forming threshold p < 0.001) | Eklund et al. (2016) showed that p < 0.01 cluster-forming threshold inflates false positive rates to ~70% | Eklund et al., 2016 |
| Confirmatory ROI analysis in fMRI | Small volume correction (SVC) with FWE | Restricts search space to a priori ROI | Worsley et al., 1996 |
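For the two generic procedures in the table, the logic is short enough to state in code. A Python sketch (pure numpy; function names are illustrative — in practice use `p.adjust` in R or `statsmodels.stats.multitest.multipletests`):

```python
import numpy as np

def holm(pvals, alpha=0.05):
    """Holm step-down procedure (Holm, 1979): compare the i-th smallest
    p-value against alpha / (m - i + 1); stop at the first failure.
    Returns a boolean rejection mask controlling family-wise error."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    reject = np.zeros(m, dtype=bool)
    for rank, idx in enumerate(np.argsort(p)):
        if p[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # step-down: all larger p-values also fail
    return reject

def benjamini_hochberg(pvals, q=0.05):
    """BH procedure (Benjamini & Hochberg, 1995): reject the k smallest
    p-values, where k is the largest i with p_(i) <= q * i / m.
    Controls the false discovery rate at level q."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    below = p[order] <= q * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])
        reject[order[: k + 1]] = True
    return reject
```

Note that BH is typically less conservative: for the same p-values it can reject tests that Holm retains.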
| BF10 Range | Evidence Category | Source |
|---|---|---|
| < 1/10 | Strong evidence for H0 | Jeffreys, 1961; Lee & Wagenmakers, 2013 |
| 1/10 to 1/3 | Moderate evidence for H0 | Lee & Wagenmakers, 2013 |
| 1/3 to 3 | Anecdotal / inconclusive | Lee & Wagenmakers, 2013 |
| 3 to 10 | Moderate evidence for H1 | Lee & Wagenmakers, 2013 |
| > 10 | Strong evidence for H1 | Lee & Wagenmakers, 2013 |
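For reporting scripts, the table's cut-offs can be encoded in a small lookup (an illustrative helper, not part of any package):

```python
def bf10_category(bf10):
    """Map a BF10 value onto the Lee & Wagenmakers (2013) evidence
    categories from the table above."""
    if bf10 < 1 / 10:
        return "strong evidence for H0"
    if bf10 < 1 / 3:
        return "moderate evidence for H0"
    if bf10 <= 3:
        return "anecdotal / inconclusive"
    if bf10 <= 10:
        return "moderate evidence for H1"
    return "strong evidence for H1"
```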
| Tool | Use Case | Language |
|---|---|---|
| BayesFactor | Standard designs (t-test, ANOVA, correlation, regression) | R |
| brms | Complex models (multilevel, non-Gaussian, multivariate) | R (Stan backend) |
| JASP | GUI-based Bayesian analysis for standard tests | Standalone |
| PyMC | Custom Bayesian models | Python |
Report the exact BF, not just the category (Wagenmakers et al., 2018):
"A Bayesian paired-samples t-test indicated moderate evidence for a difference between conditions, BF10 = 5.3 (default Cauchy prior, r = 0.707)."
Always specify: the prior distribution and its parameters (e.g., the Cauchy scale r), whether the Bayes factor is reported as BF10 or BF01, and the software and version used.
APA 7th edition (2020, Section 6.6) requires reporting effect sizes for all primary analyses. The specific measure depends on the test:
| Test | Effect Size | Interpretation Benchmarks | Source |
|---|---|---|---|
| t-test (between groups) | Cohen's d | 0.2 small, 0.5 medium, 0.8 large | Cohen, 1988 |
| t-test (within subjects) | Cohen's d_z or d_av | d_z uses SD of difference scores | Lakens, 2013 |
| One-way ANOVA | eta-squared or omega-squared | 0.01 small, 0.06 medium, 0.14 large | Cohen, 1988 |
| Factorial ANOVA | partial eta-squared | 0.01 small, 0.06 medium, 0.14 large | Cohen, 1988; Richardson, 2011 |
| Mixed-effects model | semi-partial R-squared | No universal benchmarks; report CI | Rights & Sterba, 2019 |
| Correlation | r | 0.1 small, 0.3 medium, 0.5 large | Cohen, 1988 |
| Chi-square | Cramer's V or phi | Depends on df | Cohen, 1988 |
Domain note: Always report confidence intervals around effect sizes (APA 7th, 2020). Use `effectsize` (R) or `statsmodels` (Python) for computation. The benchmarks above are Cohen's generic guidelines; paradigm-specific benchmarks are more informative (see ../cogsci-power-analysis/references/effect-sizes.md).
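For the two t-test rows of the table, the formulas are simple enough to verify by hand. A Python sketch (illustrative function names; in practice use the packages above):

```python
import numpy as np

def cohens_d(x, y):
    """Between-groups Cohen's d: mean difference over the pooled SD
    (Cohen, 1988)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    nx, ny = x.size, y.size
    pooled_var = ((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1)) / (nx + ny - 2)
    return (x.mean() - y.mean()) / np.sqrt(pooled_var)

def cohens_dz(x, y):
    """Within-subjects d_z: mean of the paired difference scores over
    their SD (Lakens, 2013). x and y are paired measurements."""
    diff = np.asarray(x, float) - np.asarray(y, float)
    return diff.mean() / diff.std(ddof=1)
```

Because d_z divides by the SD of difference scores rather than the raw SD, it is not directly comparable to between-groups d (Lakens, 2013).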
Traditional effect sizes are not straightforward for mixed models. Options:

- Semi-partial R-squared via the `r2glmm` or `effectsize` package (Rights & Sterba, 2019)
- Marginal and conditional R-squared via `MuMIn::r.squaredGLMM()` (Nakagawa & Schielzeth, 2013)

Problem: Analyzing condition means averaged over items, ignoring item variability, fails to generalize beyond the specific stimuli used (Clark, 1973).
Fix: Use mixed-effects models with crossed random effects for subjects and items.
Problem: Selecting voxels/channels/time-windows based on the effect of interest, then testing that same effect (Kriegeskorte et al., 2009). Inflates effect sizes by 2x or more (Vul et al., 2009).
Fix: Use independent localizer, leave-one-out cross-validation, or whole-brain corrected analysis.
Problem: ANOVA on proportion correct violates normality and homogeneity assumptions, especially at ceiling (> 90%) or floor (< 10%) (Jaeger, 2008; Dixon, 2008).
Fix: Use logistic mixed-effects model on binary (correct/incorrect) trial-level data.
Problem: Removing "outlier" participants based on the dependent variable (e.g., excluding subjects whose effects go in the wrong direction) without a priori criteria.
Fix: Define exclusion criteria before data collection. Base exclusions on performance metrics (accuracy below chance, excessive RTs), not on the effect of interest.
Problem: ANOVA on raw RT means violates normality. Condition means conceal distributional differences (Ratcliff, 1993).
Fix: Use Gamma GLMM (Lo & Andrews, 2015) or transform RTs, and supplement with distributional analysis if warranted.
Problem: Cluster-based inference with cluster-forming thresholds more lenient than p < 0.001 (uncorrected) produces unacceptable false positive rates up to 70% (Eklund et al., 2016).
Fix: Use voxel-level threshold of p < 0.001 (uncorrected) as minimum cluster-forming threshold, or use voxel-level FWE/FDR correction.
Problem: A "significant" correlation of r = 0.30 with N = 50 has a 95% CI of [0.02, 0.53] -- the true effect could be near zero (Cumming, 2014).
Fix: Always report bootstrap 95% CI for correlations. Use 10000 bootstrap samples (Efron & Tibshirani, 1993).
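The percentile bootstrap for a correlation resamples participants (x, y pairs together) with replacement. A minimal Python sketch, assuming `numpy`; `bootstrap_corr_ci` is a hypothetical helper:

```python
import numpy as np

def bootstrap_corr_ci(x, y, n_boot=10000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for Pearson's r (Efron & Tibshirani, 1993).
    Resamples (x, y) pairs jointly, preserving their dependence."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    rng = np.random.default_rng(seed)
    n = x.size
    rs = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, n)  # resample participants with replacement
        rs[b] = np.corrcoef(x[idx], y[idx])[0, 1]
    lo, hi = np.quantile(rs, [alpha / 2, 1 - alpha / 2])
    return lo, hi
```

In practice use `boot::boot.ci` (R) or `scipy.stats.bootstrap` (Python), which also offer bias-corrected (BCa) intervals.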
Reporting conventions in this skill follow APA 7th edition (2020) and Appelbaum et al. (2018).
See references/common-analyses.md for concrete analysis recipes with code patterns.