This skill encodes expert knowledge for selecting, administering, and interpreting Theory of Mind (ToM) assessments. It provides a construct taxonomy, task selection decision trees, age-appropriate recommendations, psychometric properties, and guidance on confounds. A general-purpose programmer would not know which ToM tasks are appropriate for which populations, the developmental sequence of ToM abilities, or the psychometric limitations of common measures.
When to Use This Skill
Selecting a ToM measure for a developmental, clinical, or adult study
Matching a ToM task to the target population (children, adults, ASD, brain injury, aging)
Designing a comprehensive ToM assessment battery
Evaluating the psychometric properties of a proposed ToM measure
Identifying confounds (language, executive function, IQ) that may affect ToM task performance
Interpreting ceiling/floor effects in ToM data
Research Planning Protocol
Before executing the domain-specific steps below, you MUST:
State the research question -- What specific question is this analysis/paradigm addressing?
Justify the method choice -- Why is this approach appropriate? What alternatives were considered?
Declare expected outcomes -- What results would support vs. refute the hypothesis?
Note assumptions and limitations -- What does this method assume? Where could it mislead?
Present the plan to the user and WAIT for confirmation before proceeding.
For detailed methodology guidance, see the research-literacy skill.
⚠️ Verification Notice
This skill was generated by AI from academic literature. All parameters, thresholds, and citations require independent verification before use in research. If you find errors, please open an issue.
ToM Construct Taxonomy
Developmental Hierarchy
ToM develops in a predictable sequence (Wellman & Liu, 2004). Tasks should be matched to the expected level:
| Level | Construct | Age of Emergence | Key Task | Source |
| --- | --- | --- | --- | --- |
| 1 | Diverse desires | ~3 years | Diverse desires task | Wellman & Liu, 2004 |
| 2 | Diverse beliefs | ~3-4 years | Diverse beliefs task | Wellman & Liu, 2004 |
| 3 | Knowledge access | ~4 years | Knowledge access task | Wellman & Liu, 2004 |
| 4 | First-order false belief | ~4-5 years | Change-of-location task (Maxi task: Wimmer & Perner, 1983; Sally-Anne variant: Baron-Cohen et al., 1985) | Wellman et al., 2001 |
| 5 | Hidden emotion | ~5-6 years | Appearance-reality emotion task | Wellman & Liu, 2004 |
| 6 | Second-order false belief | ~6-7 years | Ice-cream van task | Perner & Wimmer, 1985 |
| 7 | Faux pas recognition | ~9-11 years | Faux pas stories | Baron-Cohen et al., 1999 |
| 8 | Advanced/adult ToM | Adolescence-adult | Strange Stories, RMET | Happe, 1994; Baron-Cohen et al., 2001 |
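Because the first five levels are meant to form an ordered sequence, a child's responses can be summarized as a total score plus a check on whether the pass/fail pattern follows that order. The following is a minimal sketch, not the authors' scoring procedure: the task keys and the function name `score_scale` are hypothetical, and the order simply mirrors levels 1-5 of the table above.

```python
# Hypothetical task keys, ordered from easiest to hardest (levels 1-5 above).
SCALE_ORDER = [
    "diverse_desires",
    "diverse_beliefs",
    "knowledge_access",
    "false_belief",
    "hidden_emotion",
]


def score_scale(passed: dict[str, bool]) -> tuple[int, bool]:
    """Return (number of tasks passed, whether the pattern fits the sequence).

    A pattern fits the sequence when every passed task is easier than
    every failed task, i.e. passes come first and failures come last.
    """
    pattern = [bool(passed[task]) for task in SCALE_ORDER]
    total = sum(pattern)
    expected = [True] * total + [False] * (len(pattern) - total)
    return total, pattern == expected


# Example: passes the two easiest tasks only, so the pattern is consistent.
child = {
    "diverse_desires": True,
    "diverse_beliefs": True,
    "knowledge_access": False,
    "false_belief": False,
    "hidden_emotion": False,
}
total, consistent = score_scale(child)
```

A participant who fails an easier task but passes a harder one is flagged as inconsistent, which is worth inspecting before the total score is interpreted as a developmental level.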
Construct Dimensions
| Dimension | Description | Example Tasks |
| --- | --- | --- |
| Belief attribution | Understanding others' beliefs, especially false beliefs | Sally-Anne, unexpected contents |
| Desire attribution | Understanding others' desires differ from one's own | Diverse desires task |
| Intention attribution | Understanding goal-directed action and intentionality | Intentional vs. accidental actions |
| Emotion attribution | Understanding others' emotions from context/cues | Hidden emotion, RMET |
| Visual perspective-taking | Level 1: what others see; Level 2: how others see it | Director task, Flavell tasks |
| Implicit/spontaneous ToM | Automatic, non-verbal ToM processing | Anticipatory looking, VoE paradigms |
Task Selection Decision Tree
By Age Group
What is the participant's age?
|
+-- Infants (6-24 months)
| --> Implicit ToM tasks only
| --> Anticipatory looking (Southgate et al., 2007)
| --> Violation-of-expectation (Onishi & Baillargeon, 2005)
|
+-- Preschoolers (3-5 years)
| --> Wellman & Liu (2004) scale (5 tasks)
| --> Change of location: Maxi task (Wimmer & Perner, 1983) or Sally-Anne (Baron-Cohen et al., 1985)
| --> Unexpected contents / Smarties task (Gopnik & Astington, 1988)
|
+-- School-age (6-12 years)
| --> Second-order false belief (Perner & Wimmer, 1985)
| --> Faux pas stories (Baron-Cohen et al., 1999)
| --> Strange Stories (Happe, 1994) -- simplified versions
|
+-- Adolescents and Adults
--> Strange Stories (Happe, 1994)
--> RMET (Baron-Cohen et al., 2001)
--> Director task (Keysar et al., 2003)
--> Faux pas test (Baron-Cohen et al., 1999)
--> Movie for the Assessment of Social Cognition (MASC; Dziobek et al., 2006)
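The age-group tree above can be expressed as a simple lookup. This is a sketch under stated assumptions: the month boundaries are one reading of the age labels (for example, "3-5 years" is taken as 36-71 months), the names `AGE_BANDS` and `candidate_tasks` are hypothetical, and ages that fall between bands (such as 25-35 months) deliberately return `None` because the tree gives no recommendation for them.

```python
from typing import Optional

# (min_months, max_months or None for open-ended, label, candidate tasks)
# Boundaries are an assumed interpretation of the tree's age labels.
AGE_BANDS = [
    (6, 24, "Infants",
     ["Anticipatory looking", "Violation-of-expectation"]),
    (36, 71, "Preschoolers",
     ["Wellman & Liu scale", "Change of location", "Unexpected contents"]),
    (72, 155, "School-age",
     ["Second-order false belief", "Faux pas stories",
      "Strange Stories (simplified)"]),
    (156, None, "Adolescents and adults",
     ["Strange Stories", "RMET", "Director task", "Faux pas test", "MASC"]),
]


def candidate_tasks(age_months: int) -> Optional[tuple[str, list[str]]]:
    """Return (age band label, candidate tasks), or None if no band matches."""
    for low, high, label, tasks in AGE_BANDS:
        if age_months >= low and (high is None or age_months <= high):
            return label, tasks
    return None


band = candidate_tasks(48)  # a 4-year-old falls in the "Preschoolers" band
```

The lookup only narrows the candidates; the population and construct trees still need to be consulted before settling on a task.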
By Population
What is the target population?
|
+-- Typically developing children
| --> Wellman & Liu (2004) scale (most validated)
| --> Standard false belief tasks
|
+-- Autism spectrum (children)
| --> Sally-Anne (Baron-Cohen et al., 1985)
| --> Unexpected contents (Perner et al., 1989)
| --> Happe Strange Stories (if verbal)
| |
| NOTE: Many autistic individuals pass standard false
| belief tasks by age 6-8. Use advanced tasks to
| avoid ceiling effects (Happe, 1994).
|
+-- Autism spectrum (adults)
| --> RMET (Baron-Cohen et al., 2001)
| --> Faux pas test (Baron-Cohen et al., 1999)
| --> MASC (Dziobek et al., 2006)
| --> Director task (Keysar et al., 2003)
|
+-- Brain injury / neurological
| --> Faux pas test (Stone et al., 1998)
| --> Strange Stories (Happe, 1994)
| --> RMET (Baron-Cohen et al., 2001)
| --> Yoni task (Shamay-Tsoory & Aharon-Peretz, 2007)
|
+-- Aging / dementia
--> Faux pas test (Gregory et al., 2002)
--> RMET (Baron-Cohen et al., 2001)
--> Strange Stories (Happe, 1994)
--> Note: control for processing speed and working memory
By Construct
What ToM construct are you targeting?
|
+-- Belief attribution
| --> False belief tasks (Sally-Anne, unexpected contents)
| --> Second-order false belief
|
+-- Emotion recognition
| --> RMET (Baron-Cohen et al., 2001)
| --> Cambridge Mindreading Face-Voice Battery
|
+-- Social reasoning / pragmatics
| --> Faux pas test
| --> Strange Stories
|
+-- Visual perspective-taking
| --> Director task (Keysar et al., 2003)
| --> Flavell Level 1/2 tasks
|
+-- Implicit / spontaneous ToM
--> Anticipatory looking paradigms
--> Dot-perspective task (Samson et al., 2010)
Director task error rate: Even adults show egocentric errors on ~30-50% of critical trials (Keysar et al., 2003).
Director task age range: 7 years to adult (Dumontheil et al., 2010).
See references/task-database.md for the full task list with administration protocols.
Psychometric Considerations
Reliability Summary
| Task | Internal Consistency | Test-Retest | Source |
| --- | --- | --- | --- |
| Sally-Anne (single item) | N/A (binary) | Variable | Wellman et al., 2001 |
| Wellman & Liu Scale | Guttman scalability > 0.90 | Moderate | Wellman & Liu, 2004 |
| RMET | alpha ~ 0.60-0.70 | r ~ 0.63-0.83 | Olderbak et al., 2015; Fernandez-Abascal et al., 2013 |
| Faux pas test | alpha ~ 0.70-0.80 | Not well-established | Baron-Cohen et al., 1999 |
| Strange Stories | Inter-rater: kappa > 0.85 | Moderate | Happe, 1994 |
| MASC | alpha ~ 0.70 | Adequate | Dziobek et al., 2006 |
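The internal-consistency values in the table are Cronbach's alpha coefficients, which can be recomputed on your own sample rather than taken from the literature. The sketch below uses the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the function name and the data layout (one row per participant, one column per item) are assumptions made for illustration.

```python
from statistics import variance


def cronbach_alpha(scores: list[list[float]]) -> float:
    """Cronbach's alpha for a participants-by-items score matrix.

    scores[i][j] is the score of participant i on item j.
    Requires at least two items and at least two participants.
    """
    k = len(scores[0])  # number of items
    item_variances = [variance([row[j] for row in scores]) for j in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Illustrative data: three items that always agree give alpha = 1.
identical_items = [[1, 1, 1], [0, 0, 0], [1, 1, 1], [0, 0, 0]]
alpha = cronbach_alpha(identical_items)
```

Alpha is undefined when total scores have zero variance (for example, when every participant is at ceiling), so check the score distribution first.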
Validity Concerns
Ceiling effects: Standard false belief tasks show ceiling by age 5-6 in typical children. Use Wellman & Liu scale or advanced tasks (Wellman & Liu, 2004).
Floor effects: RMET and faux pas tests may show floor effects in clinical populations with severe deficits. Consider graded scoring.
Ecological validity: Structured ToM tasks may not predict real-world social behavior (German & Hehman, 2006).
Task purity: No ToM task measures only ToM. All tasks involve language, memory, executive function, and attention.
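Ceiling and floor effects can be screened for by counting how many participants sit at the extremes of the scoring range. The sketch below is illustrative: the function names are hypothetical, and the 15% default cutoff is an assumed placeholder rather than a value from the sources cited here, so set it to suit your study.

```python
def ceiling_floor_rates(scores, min_score, max_score):
    """Return (proportion at floor, proportion at ceiling)."""
    n = len(scores)
    at_floor = sum(s <= min_score for s in scores) / n
    at_ceiling = sum(s >= max_score for s in scores) / n
    return at_floor, at_ceiling


def flag_restricted_range(scores, min_score, max_score, threshold=0.15):
    """Label the sample 'floor', 'ceiling', 'both', or 'ok'.

    `threshold` is an assumed, analyst-chosen cutoff.
    """
    at_floor, at_ceiling = ceiling_floor_rates(scores, min_score, max_score)
    if at_floor > threshold and at_ceiling > threshold:
        return "both"
    if at_ceiling > threshold:
        return "ceiling"
    if at_floor > threshold:
        return "floor"
    return "ok"


# Illustrative scores on a 0-2 task: 3 of 5 participants are at the maximum.
rates = ceiling_floor_rates([0, 1, 2, 2, 2], min_score=0, max_score=2)
```

Running the check per group, not just on the pooled sample, helps catch cases where one group is at ceiling and the other is not.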
Confounds and Controls
Language
| Confound | Impact | Mitigation | Source |
| --- | --- | --- | --- |
| Verbal demands | False belief tasks require comprehension of complex sentences | Include vocabulary/language control measure | Milligan et al., 2007 |
| Narrative complexity | Second-order tasks have heavy memory load | Add comprehension check questions | Perner & Wimmer, 1985 |
| Word knowledge (RMET) | Vocabulary confound in forced-choice emotion labels | Control for verbal IQ | Olderbak et al., 2015 |
Executive Function
| Confound | Impact | Mitigation | Source |
| --- | --- | --- | --- |
| Inhibitory control | Must inhibit own knowledge to attribute false belief | Include inhibition measure (e.g., Stroop, day-night) | Carlson & Moses, 2001 |
| Working memory | Must hold multiple perspectives simultaneously | Control for WM span | Carlson & Moses, 2001 |
| Cognitive flexibility | Must switch between self and other perspective | Include set-shifting measure | Carlson & Moses, 2001 |
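One simple way to control statistically for a single confound is a first-order partial correlation: the association between a ToM score and an outcome after removing what both share with the control measure. The sketch below uses the standard formula for one covariate. The function and variable names are hypothetical, and studies with several covariates would normally use regression instead.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_r(x, y, z):
    """Correlation between x and y with covariate z partialled out."""
    r_xy, r_xz, r_yz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Illustrative values only: tom = ToM score, outcome = social measure,
# inhibition = control measure (e.g., an inhibitory control score).
tom = [1, 2, 3, 4]
outcome = [2, 1, 4, 3]
inhibition = [1, -1, -1, 1]
adjusted = partial_r(tom, outcome, inhibition)
```

In this toy data the covariate is unrelated to both variables, so the adjusted correlation equals the raw one; a large drop after adjustment would suggest the confound accounts for much of the apparent ToM effect.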
Recommended Control Measures
For any ToM study, include at minimum:
Verbal ability: Receptive vocabulary (e.g., PPVT) or verbal IQ subscale
Multiple constructs; includes real-time and reflective tasks
Neurological (adults): Faux pas + Strange Stories + RMET. Sensitive to frontal and right hemisphere lesions (Stone et al., 1998).
Aging research: Faux pas + RMET + Strange Stories. Control for processing speed; established aging norms.
Minimum Battery (2-3 tasks)
If time is limited, prioritize:
One false belief task (for belief attribution)
Faux pas or Strange Stories (for advanced ToM / social reasoning)
RMET (for emotion/mental state recognition -- if construct-relevant)
Common Pitfalls
Using a single task as the sole ToM measure: ToM is multidimensional. Single tasks have low reliability and capture only one construct. Use a battery (Wellman & Liu, 2004).
Ignoring ceiling/floor effects: Standard false belief tasks reach ceiling by age 5-6, and the RMET has only modest reliability. Check score distributions for restricted range before interpreting group differences.
Not controlling for language: Most ToM tasks have substantial verbal demands. Group differences in ToM may reflect language differences, especially in ASD (Milligan et al., 2007).
Confounding ToM with executive function: False belief tasks require inhibitory control. Include EF measures and control statistically or use low-EF-demand tasks (Carlson & Moses, 2001).
Age-inappropriate task selection: Giving first-order false belief to adults (ceiling) or faux pas to 4-year-olds (floor). Match task to developmental level.
Treating the RMET as a pure ToM measure: The RMET has low reliability (alpha ~ 0.60-0.70) and may measure emotion recognition more than mental state inference (Olderbak et al., 2015).
Assuming failed performance = absent ToM: Implicit/anticipatory looking studies suggest infants may have ToM understanding that explicit tasks fail to capture (Onishi & Baillargeon, 2005). Distinguish competence from performance.
Not including control stories: For faux pas and Strange Stories, physical/non-mental-state control stories are essential to rule out general comprehension deficits.
Minimum Reporting Checklist
ToM construct(s) targeted (belief, desire, emotion, perspective-taking)
Task(s) used with full citation and version
Administration method (live, video, computerized)
Scoring criteria and inter-rater reliability (for open-ended tasks)
Control questions included and pass rates
Confound measures included (language, EF, IQ)
Ceiling/floor analysis: report distribution of scores, not just means
Effect sizes and confidence intervals for group comparisons
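For the last checklist item, a standardized mean difference with an approximate confidence interval can be computed as follows. This is a sketch: it uses Cohen's d with a pooled standard deviation and a common large-sample approximation to its standard error, the function name is hypothetical, and small samples may call for Hedges' g or an exact interval instead.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d_with_ci(group_a, group_b, z=1.96):
    """Return (d, (lower, upper)) for two independent groups.

    Uses the pooled SD and the approximation
    SE(d) = sqrt((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))).
    z=1.96 gives an approximate 95% interval.
    """
    n1, n2 = len(group_a), len(group_b)
    pooled_sd = sqrt(
        ((n1 - 1) * variance(group_a) + (n2 - 1) * variance(group_b))
        / (n1 + n2 - 2)
    )
    d = (mean(group_a) - mean(group_b)) / pooled_sd
    se = sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))
    return d, (d - z * se, d + z * se)


# Illustrative scores only; real groups would be far larger.
d, (lower, upper) = cohens_d_with_ci([2, 4, 6], [1, 3, 5])
```

With samples this small the interval is very wide and includes zero, which is exactly the information a bare mean difference would hide.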
References
Baron-Cohen, S., Leslie, A. M., & Frith, U. (1985). Does the autistic child have a "theory of mind"? Cognition, 21(1), 37-46.
Baron-Cohen, S., O'Riordan, M., Stone, V., Jones, R., & Plaisted, K. (1999). Recognition of faux pas by normally developing children and children with Asperger syndrome or high-functioning autism. Journal of Autism and Developmental Disorders, 29(5), 407-418.
Baron-Cohen, S., Wheelwright, S., Hill, J., Raste, Y., & Plumb, I. (2001). The "Reading the Mind in the Eyes" test revised version. Journal of Child Psychology and Psychiatry, 42(2), 241-251.
Carlson, S. M., & Moses, L. J. (2001). Individual differences in inhibitory control and children's theory of mind. Child Development, 72(4), 1032-1053.
Dumontheil, I., Apperly, I. A., & Blakemore, S. J. (2010). Online usage of theory of mind continues to develop in late adolescence. Developmental Science, 13(2), 331-338.
Dziobek, I., Fleck, S., Kalbe, E., Rogers, K., Hassenstab, J., Brand, M., ... & Convit, A. (2006). Introducing MASC: A movie for the assessment of social cognition. Journal of Autism and Developmental Disorders, 36(5), 623-636.
Fernandez-Abascal, E. G., Cabello, R., Fernandez-Berrocal, P., & Baron-Cohen, S. (2013). Test-retest reliability of the "Reading the Mind in the Eyes" test: A one-year follow-up study. Molecular Autism, 4, 33.
German, T. P., & Hehman, J. A. (2006). Representational and executive selection resources in "theory of mind": Evidence from compromised belief-desire reasoning in old age. Cognition, 101(1), 129-152.
Gopnik, A., & Astington, J. W. (1988). Children's understanding of representational change and its relation to the understanding of false belief. Child Development, 59(1), 26-37.
Gregory, C., Lough, S., Stone, V., Erzinclioglu, S., Martin, L., Baron-Cohen, S., & Hodges, J. R. (2002). Theory of mind in patients with frontal variant frontotemporal dementia and Alzheimer's disease: Theoretical and practical implications. Brain, 125(4), 752-764.
Happe, F. G. (1994). An advanced test of theory of mind. Journal of Autism and Developmental Disorders, 24(2), 129-154.
Keysar, B., Lin, S., & Barr, D. J. (2003). Limits on theory of mind use in adults. Cognition, 89(1), 25-41.
Milligan, K., Astington, J. W., & Dack, L. A. (2007). Language and theory of mind: Meta-analysis of the relation between language ability and false-belief understanding. Child Development, 78(2), 622-646.
Olderbak, S., Wilhelm, O., Olaru, G., Geiger, M., Brenneman, M. W., & Roberts, R. D. (2015). A psychometric analysis of the Reading the Mind in the Eyes test: Toward a brief form for research and applied settings. Frontiers in Psychology, 6, 1503.
Onishi, K. H., & Baillargeon, R. (2005). Do 15-month-old infants understand false beliefs? Science, 308(5719), 255-258.
Perner, J., Leekam, S. R., & Wimmer, H. (1987). Three-year-olds' difficulty with false belief. British Journal of Developmental Psychology, 5(2), 125-137.
Perner, J., & Wimmer, H. (1985). "John thinks that Mary thinks that..." Attribution of second-order beliefs. Journal of Experimental Child Psychology, 39(3), 437-471.
Samson, D., Apperly, I. A., Braithwaite, J. J., Andrews, B. J., & Bodley Scott, S. E. (2010). Seeing it their way: Evidence for rapid and involuntary computation of what other people see. Journal of Experimental Psychology: Human Perception and Performance, 36(5), 1255-1266.
Shamay-Tsoory, S. G., & Aharon-Peretz, J. (2007). Dissociable prefrontal networks for cognitive and affective theory of mind. Neuropsychologia, 45(13), 3054-3067.
Southgate, V., Senju, A., & Csibra, G. (2007). Action anticipation through attribution of false belief by 2-year-olds. Psychological Science, 18(7), 587-592.
Stone, V. E., Baron-Cohen, S., & Knight, R. T. (1998). Frontal lobe contributions to theory of mind. Journal of Cognitive Neuroscience, 10(5), 640-656.
Wellman, H. M., Cross, D., & Watson, J. (2001). Meta-analysis of theory-of-mind development: The truth about false belief. Child Development, 72(3), 655-684.
Wellman, H. M., & Liu, D. (2004). Scaling of theory-of-mind tasks. Child Development, 75(2), 523-541.
Wimmer, H., & Perner, J. (1983). Beliefs about beliefs: Representation and constraining function of wrong beliefs in young children's understanding of deception. Cognition, 13(1), 103-128.
See references/ for the full task database with administration protocols and scoring rubrics.