Domain-validated multi-dimensional scoring system for divergent thinking tasks, including fluency, flexibility, originality, and automated semantic distance methods
This skill encodes expert methodological knowledge for scoring responses from divergent thinking tasks (Alternative Uses Task, Unusual Uses Task, instances tasks, etc.). It covers the four standard scoring dimensions — fluency, flexibility, originality, and elaboration — plus modern automated scoring using semantic distance. A general-purpose programmer would typically count responses (fluency) but would not know the domain-specific decisions around flexibility category systems, originality thresholds, inter-rater reliability requirements, or how to compute semantic distance as a creativity metric.
Before executing the domain-specific steps below, you MUST review the general methodology guidance in the research-literacy skill.
This skill was generated by AI from academic literature. All parameters, thresholds, and citations require independent verification before use in research. If you find errors, please open an issue.
| Dimension | What It Measures | Scoring Method | Automation | Source |
|---|---|---|---|---|
| Fluency | Quantity of responses | Count valid responses | Fully automated | Guilford, 1967 |
| Flexibility | Variety of conceptual categories | Count distinct categories | Semi-automated (COWA) | Reiter-Palmon et al., 2019 |
| Originality | Statistical rarity or novelty | Frequency <5% threshold or subjective rating | Semi-automated | Silvia et al., 2008 |
| Elaboration | Detail and development of ideas | Count additional details per response | Manual only | Guilford, 1967 |
| Semantic distance | Conceptual remoteness from prompt | GloVe/word2vec cosine distance | Fully automated | Beaty & Johnson, 2021 |
Fluency = the total number of valid, non-redundant responses a participant generates.
Domain insight: Fluency is the most reliable but least interesting creativity measure. It correlates with personality traits (openness) and general cognitive ability, but does not distinguish truly creative responses from merely numerous ones (Silvia et al., 2008).
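Fluency counting reduces to deduplicating and counting valid responses. A minimal sketch in Python, assuming responses have already been spell-checked (only case and whitespace are normalized here; fuller normalization is covered under common pitfalls below):

```python
def fluency_score(responses):
    """Count valid, non-redundant responses.

    Assumes responses are already spell-checked; duplicates after
    lowercasing and trimming whitespace are treated as redundant,
    and empty strings are dropped as invalid.
    """
    valid = [r.strip().lower() for r in responses if r.strip()]
    return len(set(valid))

# "Paperweight" and "paperweight " collapse to one response; "" is invalid.
print(fluency_score(["Paperweight", "build a wall", "paperweight ", ""]))  # 2
```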
Flexibility = the number of distinct conceptual categories across a participant's responses.
COWA (Category of Words from AUT) system (Reiter-Palmon et al., 2019):
Provides a standardized taxonomy of response categories for common AUT objects. Example categories for "brick":
| Category | Example Responses |
|---|---|
| Construction/Building | "build a wall," "build a house" |
| Weapon/Violence | "throw at someone," "use as a weapon" |
| Weight/Anchor | "paperweight," "doorstop," "anchor" |
| Art/Decoration | "sculpt into art," "garden decoration" |
| Sport/Exercise | "use as a dumbbell," "exercise weight" |
| Tool | "hammer," "grinding surface" |
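Semi-automated flexibility scoring can be sketched as a keyword-to-category lookup. The keyword map below is a hypothetical fragment built from the table above, not the published COWA taxonomy; responses with no keyword match are routed to manual coding rather than guessed:

```python
# Hypothetical keyword -> category lookup derived from the table above;
# a real system would use the full published taxonomy.
BRICK_CATEGORIES = {
    "wall": "Construction/Building", "house": "Construction/Building",
    "throw": "Weapon/Violence", "weapon": "Weapon/Violence",
    "paperweight": "Weight/Anchor", "doorstop": "Weight/Anchor",
    "anchor": "Weight/Anchor",
    "art": "Art/Decoration", "decoration": "Art/Decoration",
    "dumbbell": "Sport/Exercise", "exercise": "Sport/Exercise",
    "hammer": "Tool",
}

def flexibility_score(responses, lookup):
    """Flexibility = number of distinct categories across responses.
    Responses with no keyword match are returned for manual coding."""
    categories, unmatched = set(), []
    for r in responses:
        hits = [lookup[w] for w in r.lower().split() if w in lookup]
        if hits:
            categories.add(hits[0])
        else:
            unmatched.append(r)
    return len(categories), unmatched

score, to_code = flexibility_score(
    ["build a wall", "throw at someone", "paperweight", "garden gnome stand"],
    BRICK_CATEGORIES)
print(score, to_code)  # 3 ['garden gnome stand']
```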
A response is "original" if it is given by fewer than 5% of the sample (Wallach & Kogan, 1965; Lee & Chung, 2024).
Procedure:
1. Normalize all responses (spelling, capitalization, phrasing).
2. Tally how many participants in the sample gave each normalized response.
3. Score a response 1 (original) if fewer than 5% of participants gave it, 0 otherwise.
4. Sum (or average) originality points per participant.
Alternative thresholds: Some studies use <1% (very strict) or <10% (lenient). The 5% threshold is most common (Reiter-Palmon et al., 2019).
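The frequency-threshold procedure can be sketched as follows, assuming responses are already normalized; each distinct response counts once per participant when computing sample frequency:

```python
from collections import Counter

def originality_rarity(all_responses, threshold=0.05):
    """Score each response 1 (original) if given by fewer than
    `threshold` of the sample, else 0.

    all_responses: list of per-participant lists of normalized responses.
    Returns per-participant originality totals.
    """
    n = len(all_responses)
    # set() so a response counts once per participant
    freq = Counter(r for resp in all_responses for r in set(resp))
    return [sum(1 for r in set(resp) if freq[r] / n < threshold)
            for resp in all_responses]

# 20 of 21 participants say "build a wall"; one says "bird bath" (1/21 < 5%).
sample = [["build a wall"]] * 20 + [["bird bath"]]
print(originality_rarity(sample)[-1])  # 1
```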
Human raters judge each response for creativity on a Likert scale (Silvia et al., 2008).
Procedure:
1. Train 2-3 raters on the scale using anchor examples.
2. Have each rater independently rate every response (e.g., 1 = not at all creative, 5 = highly creative).
3. Check inter-rater reliability (ICC) before aggregating.
4. Average ratings across raters for each response.
```
Is your sample size large (N > 100)?
|
+-- YES --> Do you need fine-grained creativity distinctions?
|           |
|           +-- YES --> Use subjective rating (richer information)
|           |
|           +-- NO --> Use statistical rarity (objective, faster)
|
+-- NO --> Statistical rarity is unreliable with small N
           (rare responses may be rare by chance)
           --> Use subjective rating
```
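The decision tree above reduces to a small function:

```python
def choose_originality_method(n, need_fine_grained):
    """Encode the decision tree: statistical rarity needs a large
    sample; subjective rating gives richer information."""
    if n <= 100:
        return "subjective rating"  # rarity unreliable with small N
    return "subjective rating" if need_fine_grained else "statistical rarity"

print(choose_originality_method(40, False))   # subjective rating
print(choose_originality_method(250, False))  # statistical rarity
```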
Semantic distance measures how conceptually far a response is from the prompt word in a vector space model. More distant = more creative (Beaty & Johnson, 2021).
| Semantic Distance | Interpretation |
|---|---|
| Low (~0.3-0.5) | Response is semantically close to the object (e.g., "brick" → "build a wall") |
| Medium (~0.5-0.7) | Moderately creative (e.g., "brick" → "use as a paperweight") |
| High (~0.7-1.0) | Highly creative / remote association (e.g., "brick" → "use as a canvas for art") |
Validation: Semantic distance correlates with subjective originality ratings at r ≈ 0.40-0.60 and predicts real-world creative achievement (Beaty & Johnson, 2021; Organisciak et al., 2023).
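A sketch of the distance computation with toy 3-dimensional vectors for illustration; real scoring loads pretrained GloVe or word2vec embeddings (typically 300-d), and multi-word responses are commonly averaged element-wise:

```python
from math import sqrt

def cosine_distance(u, v):
    """1 - cosine similarity; higher = more semantically remote."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm

def response_vector(words, embeddings):
    """Average the word vectors of a multi-word response element-wise
    (this is what loses phrase-level meaning)."""
    vecs = [embeddings[w] for w in words if w in embeddings]
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

# Toy embeddings, hand-made for illustration only.
emb = {"brick": [1.0, 0.2, 0.0],
       "wall":  [0.9, 0.3, 0.1],
       "art":   [0.1, 0.9, 0.8]}

d = cosine_distance(emb["brick"], response_vector(["wall"], emb))
# "wall" is near "brick" (low distance); "art" is remote (high distance)
```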
| Advantage | Limitation |
|---|---|
| Fully automated, no rater training | Misses context — "use as food" for a brick is unusual but gets a moderate distance score |
| Objective and reproducible | Depends on the embedding model's training corpus |
| Scales to large datasets | Multi-word responses require averaging, which loses phrase-level meaning |
| No inter-rater reliability concerns | Not validated for all object types or languages |
Participants who generate more ideas (high fluency) have a higher probability of producing at least one statistically rare idea, inflating their originality scores (Silvia et al., 2008).
Recommendation: Use top-2 scoring (average the 2 most creative responses) when the primary interest is creative quality rather than quantity. This method has the best psychometric properties (Silvia et al., 2008).
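Top-2 scoring is straightforward once each response has a creativity score; adding more low-rated responses cannot raise a participant's total, which is what controls the fluency confound:

```python
def top2_originality(response_scores):
    """Average the two highest-rated responses (top-2 scoring).
    Extra low-rated responses leave the score unchanged."""
    top = sorted(response_scores, reverse=True)[:2]
    return sum(top) / len(top)

print(top2_originality([1.5, 4.0, 2.0, 4.5]))            # 4.25
print(top2_originality([1.5, 4.0, 2.0, 4.5, 1.0, 1.0]))  # still 4.25
```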
Scoring originality without normalizing text: "doorstop," "door stop," and "use as a door stop" are the same response. Normalize spelling, capitalization, and phrasing before computing frequency (Reiter-Palmon et al., 2019).
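A sketch of this normalization step; the filler-word pattern and compound list below are illustrative choices, not a published standard:

```python
import re

def normalize(response):
    """Collapse surface variants so frequency counts treat
    'doorstop', 'Door stop', and 'use as a door stop' alike."""
    r = response.lower().strip()
    r = re.sub(r"[^\w\s]", "", r)                        # drop punctuation
    r = re.sub(r"^(use (it )?as a|a|an|the)\s+", "", r)  # strip leading filler
    r = re.sub(r"\s+", " ", r)                           # collapse whitespace
    return r.replace("door stop", "doorstop")            # merge known compounds

variants = ["doorstop", "Door stop", "use as a door stop"]
print({normalize(v) for v in variants})  # {'doorstop'}
```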
Using statistical rarity with small samples: With N < 50, many responses appear "unique" simply because the sample is small. Use subjective ratings instead, or pool responses with published norms (Reiter-Palmon et al., 2019).
Ignoring inter-rater reliability: Subjective creativity scores reported without an ICC cannot be distinguished from individual rater bias; readers have no evidence the scores reflect genuine creativity differences. Always report the ICC with the model type specified (Lee & Chung, 2024).
Treating semantic distance as a complete creativity measure: Semantic distance captures novelty but not usefulness/appropriateness — the other key dimension of creativity (Runco & Jaeger, 2012). Combine with subjective ratings for a comprehensive assessment.
Averaging semantic distance across all responses including poor ones: Low-quality responses (gibberish, conventional uses) can dilute or inflate average distance. Clean data before computing semantic distance.
Not reporting which scoring method was used: Different methods yield different results. Always specify whether originality is statistical rarity, subjective rating, or semantic distance, and which threshold or scale was used.
Based on Reiter-Palmon et al. (2019) and Silvia et al. (2008):
See references/scoring-rubric.md for detailed scoring examples and training materials.