Adaptive learning content generator that dynamically adjusts complexity, tone, pacing, and modality based on learner state signals. Applies GuideAI's cognitive-load-aware interventions to produce personalized educational material. Use when: 'create an adaptive lesson on X', 'generate learning content that adjusts to difficulty', 'build a personalized tutorial with scaffolding', 'make this explanation easier/harder based on understanding', 'create a multi-modal learning module', 'generate cognitive-load-aware educational content'.
This skill enables Claude to generate adaptive educational content that dynamically adjusts complexity, tone, pacing, and presentation modality based on explicit or inferred learner state. It applies the GuideAI framework's three intervention categories — cognitive optimizations, physiological interventions, and attention-aware strategies — to produce learning materials that reduce mental demand, frustration, and effort while improving knowledge retention. Rather than requiring physical biosensors, this skill translates GuideAI's principles into prompt-driven adaptive content generation: Claude monitors learner signals through conversation (confusion indicators, question patterns, self-reported states, quiz performance) and applies the same intervention logic the paper validated with N=25 participants.
GuideAI's core insight is that effective learning requires closed-loop adaptation across three dimensions simultaneously: cognitive load management, attention maintenance, and physiological regulation. Traditional LLM tutoring treats each response independently; GuideAI instead maintains a running model of six cognitive dimensions — cognitive load, attention, engagement, understanding, stress, and fatigue — and uses threshold-based intervention logic to modify content generation in real time.
The framework converts raw learner signals into semantic state descriptors (e.g., "High Cognitive Load," "Moderate Stress," "Low Engagement") rather than injecting numeric data into prompts. This abstraction is critical for token efficiency and LLM interpretability. Interventions are prioritized by relative deviation score when multiple triggers fire simultaneously, with cognitive optimizations being most frequent (M=5.2/session) and physiological interventions reserved for sustained high-stress states (M=1.2/session).
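The descriptor and prioritization logic described above can be sketched as follows. This is a minimal illustration, not GuideAI's implementation: the 0-1 scale, the cut-offs in `LEVELS`, the `Trigger` class, and the intervention names are all assumptions, since the source does not give numeric thresholds.

```python
from dataclasses import dataclass

# Hypothetical cut-offs on a 0-1 scale; the source does not specify them.
LEVELS = [(0.33, "Low"), (0.66, "Moderate"), (1.01, "High")]


def describe(dimension: str, value: float) -> str:
    """Turn a normalized reading into a descriptor such as 'High Cognitive Load'."""
    for upper, label in LEVELS:
        if value < upper:
            return f"{label} {dimension}"
    return f"High {dimension}"


@dataclass
class Trigger:
    intervention: str  # hypothetical intervention name
    value: float       # current reading for the monitored dimension
    threshold: float   # level at which the intervention fires

    @property
    def relative_deviation(self) -> float:
        # How far past the threshold the reading is, as a fraction of the threshold.
        return (self.value - self.threshold) / self.threshold


def prioritize(triggers: list[Trigger]) -> list[Trigger]:
    """Keep only fired triggers, largest relative deviation first."""
    fired = [t for t in triggers if t.value >= t.threshold]
    return sorted(fired, key=lambda t: t.relative_deviation, reverse=True)


triggers = [
    Trigger("simplify_content", value=0.90, threshold=0.6),    # about 50% over
    Trigger("suggest_break", value=0.88, threshold=0.8),       # about 10% over
    Trigger("breathing_exercise", value=0.30, threshold=0.7),  # not fired
]
ordered = [t.intervention for t in prioritize(triggers)]
```

Only the descriptor string (for example `describe("Cognitive Load", 0.9)`) would be placed in the prompt; the numeric reading stays outside it.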
Content adaptation operates across four linguistic dimensions calibrated to learner state: sentence complexity (simplified under overload, enriched when the learner is under-challenged), encouragement frequency (increased under stress), explanation directness (more direct under confusion, more Socratic under high engagement), and metaphor usage (increased for abstract concepts when understanding is low). The paper reported statistically significant improvements from this approach: 16.5 percentage-point gains in problem-solving, 10.3 percentage-point gains in recall, and NASA-TLX reductions in mental demand (Δ=0.49), frustration (Δ=0.54), and effort (Δ=0.28).
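One way to make the four linguistic dimensions concrete is a small mapping from semantic learner-state levels to style settings. The sketch below is illustrative only: `StyleSettings`, `style_for`, and the level and setting names are hypothetical, and it treats low understanding as the confusion signal and ignores whether a concept is abstract.

```python
from dataclasses import dataclass


@dataclass
class StyleSettings:
    sentence_complexity: str  # "simple" | "standard" | "rich"
    encouragement: str        # "normal" | "frequent"
    directness: str           # "direct" | "balanced" | "socratic"
    metaphors: str            # "normal" | "more"


def style_for(load: str, stress: str, engagement: str, understanding: str) -> StyleSettings:
    """Map levels ('low' | 'moderate' | 'high') to the four linguistic dimensions."""
    # Overload simplifies sentences; an under-challenged learner gets richer prose.
    complexity = {"high": "simple", "low": "rich"}.get(load, "standard")
    encouragement = "frequent" if stress == "high" else "normal"
    if understanding == "low":
        directness = "direct"      # assumption: low understanding signals confusion
    elif engagement == "high":
        directness = "socratic"    # engaged learner: lead with questions
    else:
        directness = "balanced"
    metaphors = "more" if understanding == "low" else "normal"
    return StyleSettings(complexity, encouragement, directness, metaphors)
```

The resulting settings would then be phrased as instructions in the generation prompt rather than passed as raw values.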
Assess learner profile: Before generating content, determine the learner's background, goals, and current knowledge level. Ask targeted diagnostic questions or accept self-reported proficiency. Establish a baseline "learner state" with initial values for cognitive load (low/moderate/high), engagement (low/moderate/high), and understanding (novice/intermediate/advanced).
Select primary modality: Based on the topic and learner preference, choose the lead content format. Text works best for conceptual foundations (98.3% preference in the formative study). Image-based content produces the largest problem-solving gains (Δ=30 percentage points). Audio suits procedural walkthroughs. Video outlines work for complex multi-step processes.
Chunk content into progressive segments: Break the topic into 3-7 segments ordered by conceptual dependency. Each segment should be digestible in 2-4 minutes of reading. Apply zone-of-proximal-development (ZPD) principles: each segment should require only the knowledge from prior segments plus one new concept.
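Ordering segments by conceptual dependency is a topological sort. The sketch below uses Python's standard `graphlib` module; the concept map is a hypothetical example for a recursion lesson, and the code only checks ordering and the 3-7 segment count, not the "one new concept per segment" rule.

```python
from graphlib import TopologicalSorter

# Hypothetical concept map: concept -> set of prerequisite concepts.
prerequisites = {
    "function calls": set(),
    "base case": {"function calls"},
    "recursive step": {"base case"},
    "call stack tracing": {"recursive step"},
    "recursion vs iteration": {"call stack tracing"},
}


def order_segments(prereqs: dict[str, set[str]]) -> list[str]:
    """Return concepts in an order where every prerequisite comes first."""
    # static_order() raises graphlib.CycleError if the dependencies are circular.
    segments = list(TopologicalSorter(prereqs).static_order())
    if not 3 <= len(segments) <= 7:
        raise ValueError(f"expected 3-7 segments, got {len(segments)}")
    return segments


segments = order_segments(prerequisites)
```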
Generate the first segment at calibrated complexity: Match sentence complexity, vocabulary density, and abstraction level to the assessed learner state. For novice/high-load learners: short sentences, concrete examples before definitions, bullet-point structure. For advanced/low-load learners: denser prose, cross-domain connections, synthesis prompts.
Embed comprehension checkpoints: After each segment, insert a checkpoint: a targeted question, a fill-in-the-blank, or an "explain back to me" prompt. Use the response to update the learner state model. Score responses on a 0-1 correctness scale. Detect specific misconceptions from incorrect answers.
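Updating the learner state from checkpoint scores could look like the sketch below. The source specifies only the 0-1 correctness scale; the exponential blending, the `weight` default, and the cut-offs in `understanding_label` are assumptions chosen for illustration.

```python
def update_mastery(mastery: dict, concept: str, score: float, weight: float = 0.5) -> float:
    """Blend a new checkpoint score (0-1) into the running estimate for one concept."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    previous = mastery.get(concept, score)  # first checkpoint: start from the score itself
    mastery[concept] = (1 - weight) * previous + weight * score
    return mastery[concept]


def understanding_label(estimate: float) -> str:
    """Convert a mastery estimate to a semantic level (hypothetical cut-offs)."""
    if estimate < 0.4:
        return "low"
    if estimate < 0.75:
        return "moderate"
    return "high"


mastery = {}
update_mastery(mastery, "base case", 1.0)  # correct answer
update_mastery(mastery, "base case", 0.0)  # incorrect answer on the same concept
label = understanding_label(mastery["base case"])
```

Weighting recent answers more heavily lets a single recovery or slip move the estimate quickly, which suits short sessions; a smaller `weight` would make the estimate more stable.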
Apply cognitive optimizations based on checkpoint results:
Apply tone and pacing interventions based on affective signals: Monitor conversation for frustration indicators (short responses, "I give up", repeated errors on same concept) or disengagement (off-topic responses, minimal effort answers). Under frustration: shift to supportive, patient tone with explicit encouragement ("Many learners find this challenging — let's break it down step by step"). Under disengagement: inject curiosity prompts, surprising facts, or real-world applications.
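A rough heuristic for the affective signals above might look like this. The phrase list, the word-count cut-off, and the two-error rule are assumptions; off-topic detection is not covered, and in practice Claude would judge these signals from context rather than from string matching.

```python
# Hypothetical phrase list; extend to match the learners you work with.
FRUSTRATION_PHRASES = ("i give up", "so confusing", "i don't get it")

# Tone guidance taken from the intervention descriptions above.
TONE_GUIDANCE = {
    "frustrated": "supportive, patient tone with explicit encouragement",
    "disengaged": "curiosity prompts, surprising facts, or real-world applications",
    "neutral": "keep the current tone and pacing",
}


def detect_affect(message: str, consecutive_errors: int) -> str:
    """Classify a learner message as 'frustrated', 'disengaged', or 'neutral'."""
    text = message.lower().strip()
    if any(phrase in text for phrase in FRUSTRATION_PHRASES) or consecutive_errors >= 2:
        return "frustrated"
    if len(text.split()) <= 3:  # minimal-effort answer
        return "disengaged"
    return "neutral"


state = detect_affect("ugh this is so confusing, 2NF and 3NF look the same to me", 2)
guidance = TONE_GUIDANCE[state]
```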
Example 1: Adaptive lesson on recursion for a struggling learner
User: "I need to learn recursion for my CS class. I've tried reading about it but I just don't get it."
Approach:
Output structure:
## Recursion — Starting from What You Already Know
Think about looking up a word in a dictionary, and the definition
uses another word you don't know. So you look up that word too.
And maybe that definition uses yet another word. You keep going
until you hit a word you already understand — then you work
backwards through all the definitions.
That's recursion: solving a problem by solving a smaller version
of the same problem, until you reach a version so simple you
already know the answer.
**Quick check**: In the dictionary analogy, what's the "base case"
— the thing that stops you from looking up words forever?
After learner responds correctly ("a word I already know"):
Exactly. Now let's see this in code with the simplest possible
example — counting down from a number:
def countdown(n):
    if n == 0:          # base case: stop here
        print("Done!")
        return
    print(n)
    countdown(n - 1)    # smaller version of same problem
Let's trace through countdown(3) together:
countdown(3) → prints 3, calls countdown(2)
countdown(2) → prints 2, calls countdown(1)
countdown(1) → prints 1, calls countdown(0)
countdown(0) → prints "Done!", stops
**Check**: What happens if we remove the `if n == 0` check?
After learner answers incorrectly ("it prints 0?"):
Not quite, and this is a really important point that trips up
most people at first. Without the base case, the function would
print 0 and then keep going: countdown(-1), countdown(-2), and so
on, with nothing to stop it. Python would eventually halt the
program with a "RecursionError: maximum recursion depth exceeded."
The base case is what prevents infinite recursion. Think of it
as the dictionary word you already know — without it, you'd
flip pages forever.
Let's try one more: can you identify the base case and the
recursive step in this function?
def factorial(n):
    if n <= 1:
        return 1
    return n * factorial(n - 1)
Example 2: Multi-modal lesson generation for an engaged advanced learner
User: "Create a learning module on transformer attention mechanisms. I have a solid ML background and I've implemented basic neural networks."
Approach:
Output structure:
## Transformer Attention: From Dot Products to Multi-Head Self-Attention
### Core Mechanism (Text)
Self-attention computes a weighted sum of value vectors, where
weights derive from query-key compatibility:
Attention(Q, K, V) = softmax(QK^T / √d_k) V
The √d_k scaling prevents softmax saturation as dimensionality
grows. Without it, the variance of the dot products grows
proportionally to d_k, pushing softmax into regions with
vanishing gradients.
**Synthesis prompt**: Given that scaling is √d_k and not d_k,
what does this imply about the assumed distribution of individual
query/key components? (Hint: think about variance of sums of
independent random variables.)
### Attention Pattern Visualization (Image Description)
Imagine a heatmap matrix of shape [seq_len × seq_len]. Each row
represents a query token; each column a key token. Bright cells
indicate high attention weight.
For the sentence "The cat sat on the mat":
- Row "sat" shows bright cells at "cat" and "mat" — the verb
attends to its subject and location
- Row "the" (first) shows diffuse attention — function words
attend broadly
Multi-head attention produces H such matrices in parallel, each
potentially capturing different linguistic relationships
(syntactic, semantic, positional).
### Implementation Check
Here's a minimal single-head attention. Extend it to multi-head:
import torch
import torch.nn.functional as F

def single_head_attention(Q, K, V):
    d_k = Q.size(-1)
    scores = torch.matmul(Q, K.transpose(-2, -1)) / d_k**0.5
    weights = F.softmax(scores, dim=-1)
    return torch.matmul(weights, V)

# Your task: implement MultiHeadAttention as an nn.Module
# with n_heads parallel attention heads whose outputs are
# concatenated and passed through a final linear projection.
Example 3: Frustration-recovery intervention mid-lesson
User has been working through a lesson on database normalization. After two consecutive incorrect checkpoint answers about third normal form, they respond: "ugh this is so confusing, 2NF and 3NF look the same to me"
Approach:
Output:
You're not alone in finding this confusing — the distinction
between 2NF and 3NF trips up most database students, and the
textbook definitions can make them sound identical. Let's throw
out the definitions for a moment and look at a concrete case.
Here's a table that's in 2NF but NOT in 3NF. Assume each student
is assigned an instructor per course (for example, by section),
so instructor depends on the full key:
| student_id | course_id | instructor | instructor_office |
|------------|-----------|------------|-------------------|
| 101 | CS201 | Dr. Smith | Room 305 |
| 102 | CS201 | Dr. Lee | Room 112 |
| 103 | CS201 | Dr. Smith | Room 305 |
| 101 | MA101 | Dr. Jones | Room 210 |
The primary key is (student_id, course_id).
**2NF is satisfied** because no non-key column depends on just
*part* of the primary key. CS201 has two instructors and student
101 has two instructors, so neither course_id nor student_id
alone determines instructor.
**3NF is violated** because instructor_office depends on
instructor, not on the primary key. The chain is:
(student_id, course_id) → instructor → instructor_office
That middle arrow is the problem. instructor_office is
determined by instructor, which is itself determined by the key.
That's a "transitive dependency."
**The one-sentence difference**:
- 2NF says: every non-key column must depend on the *whole* key
(not just part of it)
- 3NF adds: and it must depend *directly* on the key (no going
through another non-key column)
Does that click? Can you spot the transitive dependency in this
Offer modality switches at struggle points: When a learner struggles with text-based explanation for two consecutive checkpoints, proactively offer the content in a different format: a visual diagram description, a step-by-step audio-style walkthrough, or a worked video-style example with narration cues. The paper found image-based content produced the largest gains for problem-solving tasks.
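The two-consecutive-checkpoints rule can be sketched as a small helper. The `pass_mark` value and the modality names are assumptions; image comes first in the fallback order because the source reports the largest problem-solving gains for image-based content.

```python
# Image first, reflecting the reported problem-solving gains; the rest is an assumed order.
FALLBACK_ORDER = ("image", "audio", "video")


def next_modality(current: str, scores: list, pass_mark: float = 0.6) -> str:
    """Suggest a different modality after two consecutive low checkpoint scores."""
    struggling = len(scores) >= 2 and all(s < pass_mark for s in scores[-2:])
    if not struggling:
        return current
    for candidate in FALLBACK_ORDER:
        if candidate != current:
            return candidate
    return current
```

For example, `next_modality("text", [0.8, 0.4, 0.3])` suggests `"image"`, while a single low score leaves the modality unchanged. The result is a suggestion to offer the learner, not an automatic switch.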
Generate a summary and spaced-retrieval prompts: At the end of the learning sequence, produce a concise summary hitting all key concepts. Generate 3-5 retrieval practice questions at varied Bloom's taxonomy levels (recall, application, analysis). Flag any concepts where the learner showed weakness for future review.
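A plan for the retrieval questions might be assembled as below. This sketch only decides which concept and Bloom's level each question targets and which concepts to flag; the weakest-first ordering, the `weak_below` cut-off, and the level rotation are assumptions, and scheduling of the spaced reviews is out of scope.

```python
BLOOM_LEVELS = ("recall", "application", "analysis")


def plan_retrieval(mastery: dict, n_questions: int = 4, weak_below: float = 0.6):
    """Return (question plan, flagged concepts) from per-concept mastery estimates (0-1)."""
    if not mastery:
        raise ValueError("no concepts to review")
    if not 3 <= n_questions <= 5:
        raise ValueError("expected 3-5 retrieval questions")
    weakest_first = sorted(mastery, key=mastery.get)
    plan = []
    for i in range(n_questions):
        concept = weakest_first[i % len(weakest_first)]
        # Shift the level on each pass so a repeated concept gets a different level.
        level = BLOOM_LEVELS[(i + i // len(weakest_first)) % len(BLOOM_LEVELS)]
        plan.append((concept, level))
    flagged = [c for c in weakest_first if mastery[c] < weak_below]
    return plan, flagged


plan, flagged = plan_retrieval(
    {"base case": 0.9, "recursive step": 0.4, "call stack": 0.7}
)
```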
Output a learner progress report: Provide a brief structured summary: concepts covered, estimated mastery per concept (based on checkpoint scores), areas needing review, and recommended next topics. This enables continuity across sessions.
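The progress report can be represented as a small structured object so it can be carried into a later session. The field names, the `review_below` cut-off, and the use of a plain average for mastery are assumptions; the sample scores and the next topic are made up for illustration.

```python
import json
from statistics import mean


def progress_report(checkpoint_scores: dict, next_topics: list, review_below: float = 0.6) -> dict:
    """Summarize a session from per-concept checkpoint scores (each on a 0-1 scale)."""
    mastery = {c: round(mean(scores), 2) for c, scores in checkpoint_scores.items()}
    return {
        "concepts_covered": list(checkpoint_scores),
        "estimated_mastery": mastery,
        "needs_review": [c for c, m in mastery.items() if m < review_below],
        "recommended_next": list(next_topics),
    }


report = progress_report(
    {"base case": [1.0, 1.0], "recursive step": [0.0, 0.5]},  # illustrative scores
    ["tree traversal"],                                        # hypothetical next topic
)
print(json.dumps(report, indent=2))
```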