Domain-validated decision logic, formulas, and interpretation guidelines for applying Signal Detection Theory to cognitive science data
This skill encodes expert methodological knowledge for applying Signal Detection Theory (SDT) to behavioral and cognitive science data. SDT separates an observer's perceptual sensitivity from their decision criterion -- a distinction that raw accuracy conflates. A competent programmer without cognitive science training would typically compute percent correct, missing the critical insight that two observers with identical accuracy can differ drastically in their ability to detect signals vs. their willingness to say "yes."
Use SDT whenever:
Do not use standard SDT when:
Before executing the domain-specific steps below, you MUST:
For detailed methodology guidance, see the research-literacy skill.
This skill was generated by AI from academic literature. All parameters, thresholds, and citations require independent verification before use in research. If you find errors, please open an issue.
Every SDT analysis begins with classifying each trial into one of four categories:
| | Signal Present | Signal Absent |
|---|---|---|
| "Yes" Response | Hit (H) | False Alarm (FA) |
| "No" Response | Miss (M) | Correct Rejection (CR) |
From these four cells, compute two rates:
- Hit Rate = H / (H + M), the proportion of signal trials answered "yes"
- False Alarm Rate = FA / (FA + CR), the proportion of noise trials answered "yes"
d' measures the distance between the signal and noise distributions in standard deviation units, assuming equal-variance Gaussian distributions (Green & Swets, 1966, Ch. 1):
d' = z(Hit Rate) - z(False Alarm Rate)
where z() is the inverse of the standard normal CDF (the z-transform).
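As a minimal sketch, d' can be computed with Python's standard library (`statistics.NormalDist` supplies the inverse normal CDF); the rates below are illustrative, not from any dataset:

```python
from statistics import NormalDist

z = NormalDist().inv_cdf  # inverse standard normal CDF (the z-transform)

def d_prime(hit_rate: float, fa_rate: float) -> float:
    """Equal-variance Gaussian sensitivity: d' = z(H) - z(FA)."""
    return z(hit_rate) - z(fa_rate)

# Illustrative rates: 80/100 hits, 20/100 false alarms
print(round(d_prime(0.80, 0.20), 2))  # 1.68
```

Note that this assumes extreme rates (H = 1.0, FA = 0.0) have already been corrected; see the correction rules later in this skill.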
What d' values mean in practice (Macmillan & Creelman, 2005, Table 1.1):
| d' Value | Interpretation | Unbiased Yes/No % Correct | Practical Meaning |
|---|---|---|---|
| 0 | Chance performance | 50% | No discrimination ability |
| 0.5 | Low sensitivity | ~60% | Barely above chance |
| 1.0 | Moderate sensitivity | ~69% | Often used as threshold (Green & Swets, 1966, Ch. 4) |
| 2.0 | Good sensitivity | ~84% | Reliable discrimination |
| 2.5 | High sensitivity | ~90% | Strong discrimination |
| 3.0+ | Near-ceiling | >93% | Approaching perfect; check for floor/ceiling issues |
The typical experimental range avoiding floor/ceiling effects is d' = 0.5 to 2.5 (Macmillan & Creelman, 2005).
SDT provides three related bias measures; the choice among them matters when d' varies across conditions.
Criterion location c (Macmillan & Creelman, 2005, Ch. 2):
c = -0.5 x [z(Hit Rate) + z(False Alarm Rate)]
Likelihood ratio beta (Green & Swets, 1966, Ch. 1):
ln(beta) = d' x c
Relative criterion c' (Macmillan & Creelman, 2005, Ch. 2):
c' = c / d'
Normalizes criterion placement by sensitivity; useful when comparing bias across conditions with different d' values.
Which bias measure to use (Macmillan & Creelman, 2005, Ch. 2):
Is the task a single-interval (yes/no) design?
|
+-- YES --> Are assumptions of equal-variance Gaussian distributions met?
| |
| +-- YES --> Use d' = z(H) - z(FA) (Green & Swets, 1966)
| |
| +-- NO, distributions have unequal variance
| | --> Use da with estimated variance ratio
| | (Macmillan & Creelman, 2005, Ch. 3)
| |
| +-- NO, distributions are non-Gaussian or unknown
| --> Use Az (area under the ROC curve)
| (Swets, 1986; Macmillan & Creelman, 2005, Ch. 3)
|
+-- NO --> Is it a two-interval forced choice (2AFC/2IFC)?
|
+-- YES --> d'(2AFC) = z(proportion correct) x sqrt(2)
| (Green & Swets, 1966, Ch. 6; Macmillan & Creelman, 2005, Ch. 5)
|
+-- NO --> Is it same-different or ABX?
|
+-- YES --> Use paradigm-specific formulas
| (see references/sdt-formulas.md)
|
+-- NO --> Is it a rating-scale (confidence) design?
|
+-- YES --> Construct ROC from rating data;
use Az or fit parametric model
(Macmillan & Creelman, 2005, Ch. 3)
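For the 2AFC branch of the tree, the conversion can be sketched as follows; the sqrt(2) factor reflects the variance of the difference distribution, and the proportion correct below is illustrative:

```python
from statistics import NormalDist
import math

z = NormalDist().inv_cdf  # inverse standard normal CDF

def d_prime_from_2afc(p_correct: float) -> float:
    """Recover d' from 2AFC proportion correct: d' = z(p) * sqrt(2)."""
    return z(p_correct) * math.sqrt(2)

# 76% correct in 2AFC corresponds to a d' of about 1.0
print(round(d_prime_from_2afc(0.76), 2))
```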
Use the area under the ROC curve (Az) when:
- the signal and noise distributions are non-Gaussian or their form is unknown (Swets, 1986)
- rating-scale data allow an empirical ROC to be constructed
AUC benchmarks (Swets, Dawes, & Monahan, 2000):
| AUC Range | Interpretation |
|---|---|
| 0.50 | Chance (no discrimination) |
| 0.70 - 0.80 | Fair diagnostic accuracy |
| 0.80 - 0.90 | Good diagnostic accuracy |
| 0.90 - 1.00 | Excellent diagnostic accuracy |
The canonical SDT paradigm. On each trial, either a signal or noise is presented; the observer responds "yes" (signal present) or "no" (signal absent). Yields H and FA rates directly.
Two intervals are presented (one signal, one noise); the observer selects the signal interval. Only proportion correct is measured; there is no independent FA rate, and no bias measure can be computed.
Critical domain pitfall: A task where the observer chooses between two labels (e.g., "left" or "right") on a single stimulus is not a 2AFC -- it is a yes/no task in disguise (Macmillan & Creelman, 2005). True 2AFC requires two temporal or spatial intervals.
Observers make a detection judgment plus a confidence rating (e.g., 1-6 scale from "sure noise" to "sure signal"). Each confidence boundary yields a separate (H, FA) pair, constructing a multi-point ROC.
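A sketch of ROC construction from rating data, with made-up counts on a 6-point scale: cumulating responses from the most confident "signal" category downward yields one (FA, H) point per confidence boundary, and a trapezoidal area gives a nonparametric estimate of the area under the ROC (it slightly underestimates the parametric Az):

```python
# Hypothetical response counts per confidence level
# (6 = "sure signal" ... 1 = "sure noise"); illustrative only.
signal_counts = {6: 40, 5: 25, 4: 15, 3: 10, 2: 6, 1: 4}    # 100 signal trials
noise_counts  = {6: 5,  5: 10, 4: 15, 3: 20, 2: 25, 1: 25}  # 100 noise trials

def rating_roc(sig, noi):
    """Cumulate from the most confident 'signal' rating downward:
    each confidence boundary yields one (FA, H) operating point."""
    n_sig, n_noi = sum(sig.values()), sum(noi.values())
    points, h, fa = [], 0, 0
    for level in sorted(sig, reverse=True):
        h += sig[level]
        fa += noi[level]
        points.append((fa / n_noi, h / n_sig))
    return points  # final point is (1.0, 1.0)

def trapezoid_auc(points):
    """Trapezoidal area under the empirical ROC."""
    pts = [(0.0, 0.0)] + points
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(pts, pts[1:]))

roc = rating_roc(signal_counts, noise_counts)
print(round(trapezoid_auc(roc), 3))  # ~0.82: "good" on the AUC benchmarks above
```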
Two stimuli are presented; the observer judges "same" or "different." Two observer models exist (Macmillan & Creelman, 2005, Ch. 6):
These yield different d' formulas; see references/sdt-formulas.md.
Stimulus A, then B, then X (which matches A or B); the observer identifies X. Sensitivity depends on assumed observer strategy (Macmillan & Creelman, 2005, Ch. 6). See references/sdt-formulas.md.
When H = 1.0 or FA = 0.0, z-scores become infinite and d' is undefined. This is a common computational pitfall that requires correction.
1. The 1/(2N) rule (Macmillan & Kaplan, 1985): replace rates of 0 with 1/(2N) and rates of 1 with 1 - 1/(2N), where N is the number of trials of that stimulus class.
2. The log-linear rule (Hautus, 1995) -- recommended: add 0.5 to the hit and false-alarm counts and 1 to each trial total before computing rates.
Which to use: The log-linear rule is preferred because it produces less biased d' estimates and avoids the asymmetric bias of the 1/(2N) rule, which can either over- or underestimate d' (Hautus, 1995). Apply the log-linear correction routinely, not just when extremes occur, for consistency across participants and conditions.
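A sketch of the log-linear rule applied to raw counts (the counts below are illustrative):

```python
from statistics import NormalDist

z = NormalDist().inv_cdf  # inverse standard normal CDF

def loglinear_d_prime(hits: int, n_signal: int, fas: int, n_noise: int) -> float:
    """Hautus (1995) log-linear correction: add 0.5 to each count and 1
    to each trial total before converting to rates; apply on every
    dataset, not only when extremes occur."""
    hit_rate = (hits + 0.5) / (n_signal + 1)
    fa_rate = (fas + 0.5) / (n_noise + 1)
    return z(hit_rate) - z(fa_rate)

# A perfect hit rate (50/50) no longer yields an infinite d'
print(loglinear_d_prime(50, 50, 5, 50))
```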
Standard d' assumes signal and noise distributions have equal variance. In recognition memory, this assumption is routinely violated: zROC slopes are typically ~0.80 (not 1.0), indicating the old-item (target) distribution has ~25% more variance than the new-item (lure) distribution (Ratcliff, Sheu, & Gronlund, 1992; Mickes, Wixted, & Wais, 2007).
If variances are unequal and you compute standard d', the measure is not criterion-free -- it will vary with criterion placement even if true sensitivity is constant (Macmillan & Creelman, 2005, Ch. 3).
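When the zROC slope s (the ratio of noise to signal standard deviations) has been estimated from rating data, one common unequal-variance index is da. A sketch under that assumption; the default slope of 0.8 here is the typical recognition-memory value cited above, not a universal constant:

```python
from statistics import NormalDist
import math

z = NormalDist().inv_cdf  # inverse standard normal CDF

def d_a(hit_rate: float, fa_rate: float, slope: float = 0.8) -> float:
    """Unequal-variance sensitivity:
    d_a = sqrt(2 / (1 + s^2)) * (z(H) - s * z(FA)),
    where s is the zROC slope (Macmillan & Creelman, 2005, Ch. 3)."""
    return math.sqrt(2 / (1 + slope ** 2)) * (z(hit_rate) - slope * z(fa_rate))

# With slope = 1 this reduces to the standard equal-variance d'
print(round(d_a(0.80, 0.20, slope=1.0), 2))  # 1.68, same as d'
```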
See references/sdt-formulas.md for the unequal-variance formulas.

Using percent correct instead of d': Percent correct confounds sensitivity and bias. Two observers with identical discrimination ability but different criteria will have different accuracy scores (Green & Swets, 1966, Ch. 1).
Treating a single-stimulus forced choice as 2AFC: If only one stimulus is presented per trial and the observer picks a label, this is a yes/no design, not 2AFC. Using the 2AFC formula will yield incorrect d' values (Macmillan & Creelman, 2005).
Ignoring extreme rate corrections: Computing d' without correcting H = 1 or FA = 0 produces infinite values. Always apply the log-linear correction (Hautus, 1995).
Assuming equal variance in recognition memory: Recognition memory data almost always show unequal variance (zROC slope ~0.80). Standard d' is not criterion-free in this domain (Ratcliff, Sheu, & Gronlund, 1992).
Interpreting c as "response bias" without checking: c measures where the criterion is placed relative to distributions, not why it is placed there. A shift in c can reflect rational adaptation to base rates, not irrational bias (Macmillan & Creelman, 2005, Ch. 2).
Comparing d' across paradigms without conversion: d' values from yes/no and 2AFC designs are not directly comparable. d'(2AFC) = d'(yes/no) x sqrt(2). Failure to convert leads to erroneous sensitivity comparisons (Green & Swets, 1966, Ch. 6).
Averaging d' across participants without caution: d' is nonlinearly related to H and FA rates. Averaging H and FA rates first, then computing d', gives different results than averaging individual d' values. The appropriate method depends on the research question (Macmillan & Creelman, 2005, Ch. 8).
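The averaging issue can be seen in a two-participant sketch (the rates are made up, chosen so the participants share similar sensitivity but very different criteria):

```python
from statistics import NormalDist

z = NormalDist().inv_cdf  # inverse standard normal CDF

def d_prime(h, fa):
    return z(h) - z(fa)

# Two hypothetical participants: one liberal, one conservative
rates = [(0.95, 0.50), (0.60, 0.05)]

mean_of_d_primes = sum(d_prime(h, fa) for h, fa in rates) / len(rates)
h_mean = sum(h for h, _ in rates) / len(rates)
fa_mean = sum(fa for _, fa in rates) / len(rates)
d_prime_of_means = d_prime(h_mean, fa_mean)

# The two aggregation orders disagree because d' is nonlinear in the rates
print(round(mean_of_d_primes, 2), round(d_prime_of_means, 2))
```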
Based on Macmillan & Creelman (2005) and Stanislaw & Todorov (1999):
See references/sdt-formulas.md for detailed mathematical formulas and lookup tables.
See references/application-guide.md for domain-specific applications.