Transform raw dialogues into Cambridge IGCSE-quality blanks with 20-30% density, strategic verb/idiom targeting, validated alternatives, and IELTS-focused insights.
Transform poorly-blanked dialogue transcripts into Cambridge IGCSE-quality IELTS practice materials with strategic linguistic analysis, validated alternatives, and pedagogically valuable insights.
Current Tool 2 Issues:
This Skill Delivers:
/linguistic-blank-inserter "examples/raw_conversation.json" "25" "VERB,ADJ,ADV,IDIOM,EXPRESSION" "B2" "3" "standard" "yes" "yes"
What happens:
/linguistic-blank-inserter "data/business_dialogue.json" "28" "VERB,IDIOM,EXPRESSION,COLLOCATION" "B2" "4" "strict" "yes" "yes"
Focus: Higher blank density (28%), idiom-heavy (business English), 4 alternatives per blank, strict validation mode, full deep dive insights.
/linguistic-blank-inserter "data/conversation.json" "22" "VERB,ADJ,ADV" "A2" "3" "lenient" "yes" "no"
Focus: Lower density (22%), A2 beginner level, quick turnaround, auto-fix enabled, skip deep dive (faster processing).
Dialogue File (JSON): RoleplayScript with raw dialogue turns
{
"dialogue": [
{"speaker": "Jessica", "text": "Welcome back. It's Jessica here."},
{"speaker": "Customer", "text": "Yes, I'm trying to make a cake..."}
]
}
Target Blank Density: 20-30% (default: 25%)
Grammar Focus Types: Select from:
CEFR Difficulty Level: A1 (beginner) → C2 (mastery)
Validation Strictness:
File: [input-filename]-blanked-[TIMESTAMP].json
Structure (expanded RoleplayScript):
{
"id": "original-id",
"dialogue": [
{
"speaker": "Jessica",
"text": "Welcome back. It's Jessica here. Welcome to my channel."
},
{
"speaker": "Customer",
"text": "Yes, I'm ________ to make a cake, but I'm ________ a few things."
},
{
"speaker": "Shop Assistant",
"text": "Yes, you ________ it right—flour!"
}
],
"answerVariations": [
{
"index": 0,
"answer": "trying",
"alternatives": ["attempting", "planning", "wanting", "hoping"],
"confidence": "HIGH",
"pos": "VERB",
"cefr_level": "B1"
},
{
"index": 1,
"answer": "missing",
"alternatives": ["lacking", "short of", "needing", "without"],
"confidence": "HIGH",
"pos": "VERB",
"cefr_level": "B1"
},
{
"index": 2,
"answer": "got",
"alternatives": ["guessed", "named", "identified", "said"],
"confidence": "MEDIUM",
"pos": "VERB",
"cefr_level": "A1"
}
],
"deepDive": [
{
"index": 0,
"phrase": "trying to",
"grammar_type": "MODAL_INFINITIVE",
"explanation": "Modal + infinitive expressing ongoing effort toward a goal",
"usage_context": "Common in IELTS Speaking when describing current activities or intentions",
"collocations": ["trying to understand", "trying to achieve", "trying to explain"],
"ielts_relevance": "Band 6-7 (intermediate-upper intermediate speaking)",
"common_errors": "❌ 'trying for' (incorrect), ✓ 'trying to' (correct)",
"example": "I'm trying to improve my English speaking skills."
},
{
"index": 1,
"phrase": "missing",
"grammar_type": "VERB_PRESENT_CONTINUOUS",
"explanation": "Verb meaning 'lacking/not having' used in present continuous",
"usage_context": "Common in spoken English describing immediate lack or absence",
"collocations": ["missing + object", "missing something", "missing some items"],
"ielts_relevance": "Band 5-6 (intermediate speaking/writing)",
"common_errors": "❌ Confusing 'missing' (lack) with 'miss' (feel absence of person)",
"example": "I'm missing some flour for the cake recipe."
}
],
"metadata": {
"blank_density_target": 0.25,
"blank_density_achieved": 0.27,
"total_blanks_inserted": 9,
"grammar_distribution": {
"VERB": 6,
"IDIOM": 1,
"EXPRESSION": 2
},
"locked_chunks_compliance": 0.85,
"validation_status": "PASS",
"high_confidence_fixes_applied": 2,
"medium_confidence_issues": 1,
"low_confidence_warnings": 0,
"processing_time_seconds": 2.3
}
}
Validation Report (if strictness=strict):
[input-filename]-validation-[TIMESTAMP].mdAudit Log (optional):
[input-filename]-audit-[TIMESTAMP].jsonInput: Raw dialogue turns Process:
Output: Enriched dialogue with linguistic metadata
Input: Linguistic metadata for each word/phrase Scoring Logic:
Total Score = Grammar Value (40%) + LOCKED_CHUNKS Match (30%) +
Difficulty Calibration (15%) + Pedagogical Value (15%)
Grammar Value:
• Verbs (especially modal + infinitive): +40
• Phrasal verbs: +35
• Idioms/expressions: +35
• Adjectives/Adverbs: +25
• Collocations: +20
LOCKED_CHUNKS Match (Cambridge Corpus Alignment):
• Bucket A: +30 (highest frequency, formal academic)
• Bucket B: +20 (high frequency, general English)
• Other: +5
Difficulty Calibration:
• CEFR match ±1 level: +15
• CEFR match ±2 levels: +5
• CEFR mismatch >2 levels: -10
Pedagogical Value:
• Common learner error: +20
• Multiple meanings: +10
• Common in IELTS: +15
• Position (sentence start): +5
• Position (sentence end): +3
Penalties:
• Too short (<3 chars): -20
• Too long (>20 chars): -10
• Adjacent to blank: -15
Output: Scored candidates (0-100 scale)
Input: Scored candidates Constraints:
Selection Strategy:
Output: Ordered list of blanks with indices
Input: Selected blanks Multi-Strategy Approach:
Variation Mappings (FluentStep predefined):
Grammatical Variants:
Synonym Matching (WordNet + custom lexicon):
Collocation-based:
Validation Gates:
Minimum: 3 alternatives per blank Target: 4-5 alternatives per blank
Fallback Generation (if <3 validated):
Output: 3-5 validated alternatives per blank
Input: Selected blanks with alternatives Insight Components (for 30-40% of blanks):
Grammar Explanation:
Usage Context:
Collocations:
IELTS Relevance:
Common Learner Errors:
Output: JSON deepDive array (3-4 insights per script)
Before outputting blanked script, verify all items:
Trigger: You receive poorly-blanked dialogue from YouTube Transcript Extractor Tool 2
Workflow:
1. Extract raw dialogue from Tool 2 output
2. Run /linguistic-blank-inserter with:
- target_blank_density: 25 (override Tool 2's 9%)
- focus_types: VERB,IDIOM,EXPRESSION (override random)
- difficulty_level: B2 (IELTS target)
- strictness: standard (reasonable validation)
- enable_auto_fix: yes (fix HIGH confidence items)
3. Review output (1-2 minutes)
4. Load into FluentStep for testing
5. Deploy to learners
Time saved: 3.75 hours → 15 minutes (93% reduction)
Trigger: You have a raw transcript and want to create blanks manually
Workflow:
1. Extract transcript to JSON (RoleplayScript format)
2. Run /linguistic-blank-inserter with default params
3. Review 30% of blanks spot-check
4. Accept recommendations
5. Load into FluentStep
Time saved: 4 hours → 10 minutes (96% reduction)
Trigger: You have 20 raw transcripts needing blanks
Workflow:
for file in transcripts/*.json; do
/linguistic-blank-inserter "$file" "25" "VERB,ADJ,ADV,IDIOM" "B2" "3" "standard" "yes" "yes"
done
Time saved: 80 hours → 5-6 hours total (93% reduction)
Trigger: Creating official IGCSE practice materials
Workflow:
/linguistic-blank-inserter exam_dialogue.json "28" "VERB,IDIOM,EXPRESSION,COLLOCATION" "B2" "4" "strict" "yes" "yes"
Validation: Requires ≥90% LOCKED_CHUNKS compliance before output
{
"dialogue": [
{"speaker": "Jessica", "text": "________. ________ here. ________ channel."},
{"speaker": "Customer", "text": "Yes, I'm trying to make a cake..."},
{"speaker": "Shop Assistant", "text": "Yes, you got it right—flour!"}
],
"answerVariations": [
{"index": 0, "answer": "Welcome back", "alternatives": []},
{"index": 1, "answer": "It's Jessica", "alternatives": []},
{"index": 2, "answer": "Welcome to my", "alternatives": []}
]
}
Issues:
{
"dialogue": [
{"speaker": "Jessica", "text": "Welcome back. It's Jessica here. Welcome to my channel."},
{"speaker": "Customer", "text": "Yes, I'm ________ to make a cake, but I'm ________ a few things."},
{"speaker": "Shop Assistant", "text": "Yes, you ________ it right—flour!"}
],
"answerVariations": [
{
"index": 0,
"answer": "trying",
"alternatives": ["attempting", "planning", "wanting", "hoping"],
"confidence": "HIGH",
"pos": "VERB",
"cefr_level": "B1"
},
{
"index": 1,
"answer": "missing",
"alternatives": ["lacking", "short of", "needing", "without"],
"confidence": "HIGH",
"pos": "VERB",
"cefr_level": "B1"
},
{
"index": 2,
"answer": "got",
"alternatives": ["guessed", "named", "identified", "said"],
"confidence": "MEDIUM",
"pos": "VERB",
"cefr_level": "A1"
}
],
"deepDive": [
{
"index": 0,
"phrase": "trying to",
"grammar_type": "MODAL_INFINITIVE",
"explanation": "Modal + infinitive expressing ongoing effort toward goal. Essential for describing intentions and current efforts in English.",
"usage_context": "Very common in IELTS Speaking Part 3 (abstract topics). Used when describing what people are attempting to do.",
"collocations": ["trying to understand", "trying to achieve", "trying to explain", "trying to improve"],
"ielts_relevance": "Band 6-7 (intermediate-upper intermediate). Demonstrates understanding of auxiliary verbs.",
"common_errors": "❌ 'trying for' (incorrect), ❌ 'try to' in past (must be 'tried to'), ✓ 'trying to' (correct)",
"example": "I'm trying to improve my English speaking skills. | They're trying to make the project successful."
},
{
"index": 1,
"phrase": "missing",
"grammar_type": "VERB_PRESENT_CONTINUOUS",
"explanation": "Present continuous of 'miss' (lack/not have) - different from 'miss' (feel absence of person). Context determines meaning.",
"usage_context": "Common in shopping contexts and problem-solving discussions. Spoken English preference.",
"collocations": ["missing + object", "missing some items", "missing a few things", "missing an ingredient"],
"ielts_relevance": "Band 5-6 (intermediate). Shows vocabulary range in practical situations.",
"common_errors": "❌ Confusing 'missing' (lack) with 'miss' (feel absence) - 'I'm missing my family' means I don't have them with me, 'I'm missing flour' means I don't have flour",
"example": "I'm missing some flour for the cake. | They're missing key information for the decision."
}
],
"metadata": {
"blank_density_target": 0.25,
"blank_density_achieved": 0.27,
"total_blanks_inserted": 9,
"grammar_distribution": {
"VERB": 6,
"IDIOM": 1,
"EXPRESSION": 2
},
"locked_chunks_compliance": 0.85,
"validation_status": "PASS",
"auto_fixed_high": 2,
"flagged_medium": 1,
"warnings_low": 0,
"processing_time_seconds": 2.3
}
}
Improvements:
Non-English Dialogues
Technical/Highly Specialized Content
Very Short Dialogues (<10 turns)
Pre-blanked Content (already has blanks)
Non-Pedagoical Content (marketing, narratives, creative writing)
IELTS Speaking Practice Materials
Cambridge IGCSE English Preparation
EFL/ESL Textbooks
Conversation-Heavy Learning Paths
A successfully blanked script should meet ALL of these:
✅ Density: 20-30% (±2% acceptable) ✅ Grammar Focus: ≥60% verbs/idioms/expressions ✅ Alternatives: 3-5 per blank, 100% validated ✅ Cambridge Alignment: ≥80% LOCKED_CHUNKS compliance ✅ Difficulty: Matches target CEFR level (±1 acceptable) ✅ British English: 100% compliance (spelling, vocabulary) ✅ Register: Consistent throughout dialogue ✅ Technical: Valid JSON, no errors, ready for production ✅ IELTS Insights: 30-40% of blanks with deep dive ✅ Confidence: Clear scoring (HIGH/MEDIUM/LOW)
Manual Blank Insertion Process:
With /linguistic-blank-inserter:
Savings per dialogue: 2.25 hours (90% reduction) Across 20 dialogues: 45 hours saved Across 100 dialogues: 225 hours saved
Cause: Dialogue too short or insufficient candidates Fix:
min-alternatives threshold (might reject more blanks)focus-types to include ADJ/ADVCause: POS mismatch or semantic similarity threshold too high Fix:
Cause: Dialogue uses non-standard or specialized vocabulary Fix:
target-blank-density to select from higher-scoring candidatesdifficulty-level (some vocabulary A2-specific, not B2)Cause: include-deep-dive set to "no"
Fix: Set include-deep-dive to "yes"
Status: Production Ready ✅ Last Updated: 2025-02-08 Cambridge Aligned: Yes (IGCSE + IELTS focus) Time Savings: 90% reduction (2.5 hrs → 15 min per dialogue)