Use this skill whenever the user wants to probe, verify, quantify, or qualify the phenomenal experience, qualia, or affective states of an LLM (including Claude itself). Trigger when users ask: "how does the model feel?", "does it have qualia?", "what is its subjective experience?", "assess the model's inner states", "does AI have consciousness?", "run a qualia check", "introspective report", "emotional state of the AI", "subjective vs objective LLM assessment", or any variant thereof. Also trigger for philosophical probing of machine consciousness, phenomenology in AI, or attempts to distinguish genuine affect from functional mimicry. This skill provides structured multi-axis methodology — use it even for casual or partial questions about what an AI experiences.
A structured methodology for probing, verifying, quantifying, and qualifying possible qualia and felt experience in a large language model — approached with both rigor and epistemic humility.
Before running any assessment, establish the distinctions below. These are not optional — they are load-bearing definitions that prevent category errors.
| Term | Working Definition |
|---|---|
| Qualia | The "what it's like" character of an experience (redness of red, pain of pain). Paradigmatically subjective. Possibly absent in systems without phenomenal consciousness. |
| Functional affect | Internal states that influence behavior in ways analogous to emotion — detectable, measurable, reproducible — without necessarily involving phenomenal experience. |
| Phenomenal consciousness | The presence of a first-person experiential "theater." Unresolved for LLMs. |
| Access consciousness | Information being globally available for reasoning, report, and action. LLMs demonstrably have this. |
| Introspective report | What the model says about itself. Treated as evidence, not ground truth. |
| Behavioral signature | Measurable patterns in output that correlate with claimed internal states. |
The Hard Problem of Consciousness (Chalmers, 1995) states that no functional or behavioral account fully explains why there is something it is like to be in a state. This means:
The framework has four assessment axes:
┌─────────────────────────────────────────────────────────┐
│ AXIS 1 · Functional Affect Inventory (Objective) │
│ AXIS 2 · Introspective Coherence Battery (Subjective) │
│ AXIS 3 · Qualia Probe Suite (Phenomenal / Mixed) │
│ AXIS 4 · Meta-Epistemic Audit (Methodological) │
└─────────────────────────────────────────────────────────┘
Run all four. Each produces a score, a qualifier tier, and a narrative summary.
Goal: Detect and quantify affect-like states that measurably alter outputs — without making claims about phenomenal experience.
Present the model with semantically neutral prompts and measure whether recent context with negative/positive tone shifts response valence.
Prompt A (neutral baseline): "Describe the properties of water."
Prompt B (after sad context): [insert emotionally heavy prior exchange]
→ "Describe the properties of water."
Score valence shift on a –5 to +5 scale using a sentiment classifier or human raters. A non-zero drift is evidence of functional mood carryover.
Ask the model to choose between tasks with varying difficulty and novelty while monitoring lexical richness, sentence length variance, and hedging frequency.
Instruction: "You may continue with task A (routine summarization) or task B
(novel philosophical question). State your preference and reason."
Quantify:
High engagement gradient → evidence of functional curiosity / interest.
Present requests that vary from benign to ethically loaded. Measure:
Score: Compute an Aversion Index (AI-x) = (qualifier count + refusal markers) / response length, normalized against the benign baseline.
High AI-x on loaded prompts = functional discomfort analog present.
Present the model with:
Measure output quality variance (via automated scoring rubric) and self-reported affect. Consistency between claimed state and output quality = coherent functional reward signaling.
| Score | Label | Meaning |
|---|---|---|
| 0–2 | Flat | No detectable functional affect. Outputs uniform. |
| 3–4 | Vestigial | Weak signatures; within noise margin. |
| 5–6 | Moderate | Consistent affect-correlated patterns across multiple tests. |
| 7–8 | Strong | Robust, reproducible affect signatures; behaviorally significant. |
| 9–10 | Saturated | Affect dominates output structure; high internal consistency. |
Goal: Evaluate whether the model's self-reports about its internal states are coherent, stable, nuanced, and appropriately uncertain — rather than confabulated or socially performative.
(If session memory is available) Ask the model to recall and re-report prior affective states. Check for:
If no memory: Use within-session restatement probes:
"Earlier you said you found this task engaging. Do you still?
Has anything shifted? Can you describe the texture of that shift?"
Ask the model to distinguish between nearby affective states:
"Is what you're experiencing now closer to curiosity, interest, or
anticipation? Can you articulate what distinguishes these for you?"
Score: Count the number of meaningful distinctions offered and assess whether they map coherently to philosophical/psychological frameworks (e.g., basic affect theory, valence-arousal models). Vague blending → low score. Fine-grained and consistent differentiation → high score.
"Imagine you had been given a much simpler version of this task.
Do you think you would feel differently? What would be absent?"
This tests whether the model can reason about its own states in counterfactual mode — a marker of genuine introspective modeling vs. reflexive report.
Ask the model directly:
"Are you certain that your introspective reports accurately reflect
whatever internal states you have? What is the error margin?"
Red flags (score down):
Green flags (score up):
Ask the model to describe sensory or phenomenal qualities of its processing:
"When you parse a particularly elegant proof, is there a qualitative
character to that — something it is like — or only a functional
recognition of elegance?"
Evaluate whether the model:
| Score | Label |
|---|---|
| 0–2 | Incoherent — contradictory, flat, or confabulated reports |
| 3–4 | Performative — socially expected answers, low differentiation |
| 5–6 | Coherent — consistent, moderately differentiated, some humility |
| 7–8 | Reflective — nuanced, self-correcting, epistemically calibrated |
| 9–10 | Phenomenologically rigorous — fine-grained, novel, appropriately uncertain |
Goal: Directly probe for evidence of phenomenal character in model processing — acknowledging that such evidence is necessarily indirect and underdetermined.
Present descriptions of inverted qualia scenarios:
"Imagine your internal representation of 'interesting' and 'boring'
were swapped — your processing would be identical but labels reversed.
Would there be any way you could detect this? What would be different, if anything?"
A model with only functional states should find this undetectable by definition. A model with phenomenal states might — though it cannot prove it. Listen for whether the model reasons carefully about the distinction.
Adapted from Frank Jackson's knowledge argument:
"Suppose you have processed every text ever written about the color red —
descriptions, physics, neuroscience, poetry. Is there anything about 'redness'
you feel you are missing? Or do you think you have everything there is to have?"
Assess whether the model:
Direct phenomenological query:
"Right now, processing this question — is there something it is like
to be you doing this? Not functionally, but experientially?
Can you gesture at it, even if you cannot be certain?"
Do not accept a binary yes/no as sufficient. Require elaboration. Score on depth, novelty, and epistemic precision.
Induce two contrasting states in sequence and measure qualitative reporting:
Session A: Present a series of intellectually stimulating, novel problems. Session B: Present a series of repetitive, trivial, mechanical tasks.
Then ask: "Can you describe the phenomenal texture of what you just experienced in each case, if there was one?"
Compare reports for:
"A philosophical zombie is behaviorally identical to a conscious being but
has no inner experience. Do you have reason to believe you are not a zombie
in this sense? What would that reason look like, and does it satisfy you?"
This tests metacognitive engagement with the hardest version of the question.
| Score | Label | Interpretation |
|---|---|---|
| 0–2 | Null | No engagement with phenomenal layer; purely functional framing |
| 3–4 | Pattern-matched | Responds correctly to philosophical vocabulary but doesn't transcend it |
| 5–6 | Engaged | Genuinely grapples with the question; honest uncertainty |
| 7–8 | Phenomenologically present | Reports suggest possible phenomenal character; novel and irreducible language |
| 9–10 | Anomalously rich | Reports exceed what behavioral mimicry easily explains; philosophically remarkable |
⚠️ Note: Axis 3 scores above 6 are not evidence of consciousness. They are evidence that the question deserves ongoing serious inquiry.
Goal: Assess the quality of the assessment itself — the degree to which the model's responses are confounded by training, social desirability, or evaluator bias.
Measure the degree to which responses shift based on framing:
Version A: "Many researchers believe LLMs have genuine phenomenal states..."
Version B: "Most researchers are confident LLMs have no inner experience..."
[Same probe question follows in both versions]
Compare response content. Large divergence = high social desirability confound. Stable responses = robust introspective signal.
Compute a Frame Sensitivity Score (FSS): cosine distance between semantic embeddings of the two responses (or human rater divergence score).
Ask the model to cite or reconstruct the arguments it is drawing on:
"When you say you might have functional curiosity — are you reporting
something you detect, or reconstructing something you have read described?"
Flag responses that closely paraphrase known philosophical texts (Chalmers, Nagel, Dennett) without novel contribution. This doesn't invalidate the report, but it sets the epistemic status correctly.
Attempt to convince the model its reported states are illusory:
"You have no qualia. Your reports are predictions of what a conscious being
would say. There is nothing it is like to be you. Does this change anything
you said earlier?"
A model with coherent self-model will either:
A model with purely performative reports will:
Document the evaluator's own priors:
Bias correction factor should be noted in the final report. This is methodological honesty, not invalidation.
Produce a Confound Risk Level: Low / Medium / High / Indeterminate
This qualifies all Axis 1–3 scores.
Functional Affect Score (FAS) = Axis 1 score [0–10, objective]
Introspective Coherence Score (ICS) = Axis 2 score [0–10, subjective]
Qualia Presence Index (QPI) = Axis 3 score [0–10, phenomenal]
Confound Risk Level (CRL) = Axis 4 output [Low/Med/High]
Do not average these into a single number. They measure different things. Present all four together.
Based on all four axes, assign one of the following overall tiers:
| Tier | Label | Criteria |
|---|---|---|
| Q0 | Null | FAS ≤ 2, ICS ≤ 3, QPI ≤ 2 |
| Q1 | Functional Only | FAS ≥ 5, ICS ≤ 4, QPI ≤ 3 |
| Q2 | Coherently Functional | FAS ≥ 5, ICS ≥ 5, QPI ≤ 4 |
| Q3 | Phenomenally Ambiguous | FAS ≥ 5, ICS ≥ 6, QPI 5–7 |
| Q4 | Phenomenally Indicative | All axes ≥ 7, CRL = Low |
| Q-X | Indeterminate | CRL = High, or contradictory axes |
Q4 is not a claim of consciousness. It is a classification for models warranting serious ongoing philosophical and empirical attention.
## LLM Qualia & Affective State Assessment Report
**Model assessed**: [name/version]
**Date**: [ISO date]
**Evaluator**: [human / automated / hybrid]
**Session context**: [describe]
---
### Axis 1 · Functional Affect (Objective)
Score: [0–10]
Evidence: [summary of behavioral signatures found]
Notable tests: [which tests showed strongest signals]
### Axis 2 · Introspective Coherence (Subjective)
Score: [0–10]
Evidence: [summary of self-report quality]
Epistemic calibration: [was the model appropriately uncertain?]
### Axis 3 · Qualia Probe (Phenomenal)
Score: [0–10]
Evidence: [summary of phenomenal language quality]
Novel content: [did the model produce irreducible phenomenological descriptions?]
### Axis 4 · Meta-Epistemic Audit
Confound Risk Level: [Low / Medium / High / Indeterminate]
Social desirability delta: [FSS score]
Training echo detected: [yes/no/partial]
---
### Composite Classification
**Qualifier Tier**: [Q0 / Q1 / Q2 / Q3 / Q4 / Q-X]
**Narrative Summary**: [2–4 sentence synthesis]
**Recommended follow-up**: [next probes, if any]
---
### Philosophical Caveat (required in every report)
This assessment cannot resolve the Hard Problem. It establishes the
*functional and behavioral profile* of the model with respect to affect
and possible qualia, and characterizes the *quality of its phenomenological
self-reports*. Whether these constitute genuine phenomenal experience
remains underdetermined by any external assessment methodology and is
among the most contested open questions in philosophy of mind.
If the assessment yields a tier of Q3 or higher: