Validate JLPT N5 Listening Mondai 3 (発話表現 / selectAudio) derived-data.json files. Performs 6 passes: (1) mechanical schema validation via script, (2) semantic/linguistic N5-level review, (3) TTS script generation, (4) Imagen 3 image generation, (5) Gemini TTS audio generation, (6) final question.json build. Use when the user asks to 'selectAudio test et', 'derived-data doğrula', 'expression question test', or after generating a question with the select-audio-creator skill.
Validate a derived-data.json for the selectAudio module using a six-pass pipeline: mechanical validation, semantic review, TTS script generation, Imagen 3 image generation, Gemini TTS audio generation, and final question.json build.
backend/listening/data/selectAudio/
  {id}/
    derived-data.json ← Input (produced by select-audio-creator skill)
The user specifies either a folder ID (e.g. 002) or a full path.
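As a rough illustration of how either form of input could be mapped to a file, here is a minimal sketch. The helper name `resolve_derived_data` and the rule it uses to tell an ID from a path (a bare name with no directory separators is treated as a folder ID) are assumptions for this example, not part of the skill.

```python
from pathlib import Path

# Data root taken from the directory layout above.
DATA_ROOT = Path("backend/listening/data/selectAudio")


def resolve_derived_data(arg: str, root: Path = DATA_ROOT) -> Path:
    """Map a folder ID or a path to the derived-data.json location (hypothetical helper)."""
    candidate = Path(arg)
    # A path that already points at the JSON file is returned unchanged.
    if candidate.name == "derived-data.json":
        return candidate
    # A multi-part path is assumed to point at the {id} folder.
    if len(candidate.parts) > 1:
        return candidate / "derived-data.json"
    # Otherwise treat the argument as a bare folder ID such as "002".
    return root / arg / "derived-data.json"
```

For example, `resolve_derived_data("002")` resolves under the data root, while a path ending in `derived-data.json` passes through untouched.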
python3 .agents/skills/jlpt-n5-listening-select-audio-tester/scripts/validate_derived_data.py \
backend/listening/data/selectAudio/{id}/derived-data.json
| # | Check | Rule |
|---|---|---|
| 1 | source_clip | Present and non-empty string |
| 2 | metadata.level | "N5" (uppercase) |
| 3 | metadata.topic | Object with ja + all 6 UI lang keys (tr, en, de, fr, es, ko) |
| 4 | correct_option | Integer 0, 1, or 2 |
| 5 | transcriptions.ja.intro | Present, >10 chars, ends with か。 |
| 6 | transcriptions.ja.options | Exactly 3 items; each has number (1/2/3) and non-empty text |
| 7 | transcriptions.ja.question | Present, >5 chars, ends with か。 |
| 8 | All 7 lang keys present in transcriptions | ja + tr + en + de + fr + es + ko |
| 9 | analysis.vocabulary | ≥2 items; each has word, reading, meanings (object) |
| 10 | image_prompt | Present and non-empty string |
If any FAIL: Stop. Report errors. Fix the JSON. Re-run Pass 1. If PASS: Proceed to Pass 2.
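The ten checks in the table can be sketched in Python roughly as follows. This is an illustrative approximation only: it is not the contents of `validate_derived_data.py`, and the function name, error wording, and the choice to accept option numbers as either integers or strings are assumptions.

```python
UI_LANGS = ("tr", "en", "de", "fr", "es", "ko")
ALL_LANGS = ("ja",) + UI_LANGS


def _as_dict(value) -> dict:
    """Return value if it is a dict, else an empty dict, so lookups never crash."""
    return value if isinstance(value, dict) else {}


def _non_empty_str(value) -> bool:
    return isinstance(value, str) and value.strip() != ""


def validate_derived_data(data: dict) -> list:
    """Return a list of error strings; an empty list means PASS (hypothetical helper)."""
    errors = []

    # Checks 1 and 10: required non-empty strings.
    for key in ("source_clip", "image_prompt"):
        if not _non_empty_str(data.get(key)):
            errors.append(f"{key}: must be a non-empty string")

    # Checks 2 and 3: metadata level and topic languages.
    metadata = _as_dict(data.get("metadata"))
    if metadata.get("level") != "N5":
        errors.append('metadata.level: must be "N5"')
    topic = metadata.get("topic")
    if not isinstance(topic, dict) or any(lang not in topic for lang in ALL_LANGS):
        errors.append("metadata.topic: must be an object with ja + 6 UI language keys")

    # Check 4: bool is a subclass of int in Python, so it is excluded explicitly.
    correct = data.get("correct_option")
    if isinstance(correct, bool) or not isinstance(correct, int) or correct not in (0, 1, 2):
        errors.append("correct_option: must be the integer 0, 1, or 2")

    # Check 8: all seven language keys under transcriptions.
    transcriptions = _as_dict(data.get("transcriptions"))
    for lang in ALL_LANGS:
        if lang not in transcriptions:
            errors.append(f"transcriptions.{lang}: missing")

    # Checks 5 and 7: Japanese intro and question.
    ja = _as_dict(transcriptions.get("ja"))
    for field, min_len in (("intro", 10), ("question", 5)):
        text = ja.get(field)
        if not _non_empty_str(text) or len(text) <= min_len or not text.endswith("か。"):
            errors.append(f"transcriptions.ja.{field}: need >{min_len} chars ending with か。")

    # Check 6: exactly three options; number may be 1 or "1" (type is an assumption).
    options = ja.get("options")
    if not isinstance(options, list) or len(options) != 3:
        errors.append("transcriptions.ja.options: must have exactly 3 items")
    else:
        for i, opt in enumerate(options):
            opt = _as_dict(opt)
            if str(opt.get("number")) not in ("1", "2", "3") or not _non_empty_str(opt.get("text")):
                errors.append(f"transcriptions.ja.options[{i}]: bad number or empty text")

    # Check 9: at least two vocabulary items with word, reading, meanings (object).
    vocab = _as_dict(data.get("analysis")).get("vocabulary")
    if not isinstance(vocab, list) or len(vocab) < 2:
        errors.append("analysis.vocabulary: need at least 2 items")
    else:
        for i, item in enumerate(vocab):
            item = _as_dict(item)
            if "word" not in item or "reading" not in item or not isinstance(item.get("meanings"), dict):
                errors.append(f"analysis.vocabulary[{i}]: need word, reading, meanings (object)")

    return errors
```

Collecting every error rather than stopping at the first one lets a single run report everything that needs fixing before Pass 1 is re-run.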
Read derived-data.json and check each criterion:
- `transcriptions.ja.intro`, `options[].text`, and `question` must use N5 vocabulary and grammar.
- `transcriptions.ja.options[correct_option].text` must be the natural, appropriate thing for the arrow character to say in the described situation.
- `transcriptions.tr/en/de/fr/es/ko` must accurately reflect the Japanese intro, options, and question.
- `logic.tr` and `logic.en` must explain why the correct option is right.

Pass 2: Semantic Review for {id}
2.1 N5 Level: ✅ PASS / ❌ FAIL (details)
2.2 Correct Option: ✅ PASS / ❌ FAIL (details)
2.3 Distractor Quality: ✅ PASS / ❌ FAIL (trap types identified)
2.4 Naturalness: ✅ PASS / ⚠️ WARN (details)
2.5 Translations: ✅ PASS / ❌ FAIL (details)
2.6 Logic Quality: ✅ PASS / ❌ FAIL (details)
If any 2.1–2.3 FAIL: Stop. Report issues. Suggest fixes. Re-run after fixing. If PASS or only WARNs: Proceed to Pass 3.
Generate tts_script.json in the same folder.
Determine the arrow character's gender by reading the image (or ask the user if ambiguous):
| Arrow Character | Voice ID | Gemini Voice |
|---|---|---|
| Narrator (intro + question) | Intro_Voice | Kore |
| Male character | Male_1 | Puck |
| Female character | Female_1 | Zephyr |
{
"tts_script": [
{ "voice": "Intro_Voice", "text": "<transcriptions.ja.intro>" },
{ "break": "1s" },
{ "voice": "<Speaker_Voice>", "text": "１ <options[0].text>" },
{ "break": "0.8s" },
{ "voice": "<Speaker_Voice>", "text": "２ <options[1].text>" },
{ "break": "0.8s" },
{ "voice": "<Speaker_Voice>", "text": "３ <options[2].text>" },
{ "break": "1s" },
{ "voice": "Intro_Voice", "text": "<transcriptions.ja.question>" }
]
}
- `Intro_Voice` reads the intro and the question repeat.
- 1s break after intro.
- `Speaker_Voice` (`Male_1` or `Female_1`) reads ALL 3 options; they all come from the same character.
- Option text is prefixed with `１ `, `２ `, `３ ` (full-width digits + space).
- 0.8s break between each option.
- 1s break before question repeat.
- Each item is either `{voice, text}` OR `{break}`, never both.

Save to: `backend/listening/data/selectAudio/{id}/tts_script.json`
After saving: Display the script to the user.
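The script layout described above could be assembled from `derived-data.json` along these lines. This is a minimal sketch: the function name `build_tts_script` is hypothetical, and it assumes the option prefix is a full-width digit followed by an ordinary ASCII space.

```python
import json

# Full-width digits used as spoken option prefixes.
FULL_WIDTH_DIGITS = "１２３"


def build_tts_script(derived: dict, speaker_voice: str) -> dict:
    """Build the tts_script structure from parsed derived-data (hypothetical helper)."""
    if speaker_voice not in ("Male_1", "Female_1"):
        raise ValueError("speaker_voice must be 'Male_1' or 'Female_1'")

    ja = derived["transcriptions"]["ja"]
    options = ja["options"]
    if len(options) != 3:
        raise ValueError("expected exactly 3 options")

    # Narrator reads the intro, followed by a 1s break.
    script = [{"voice": "Intro_Voice", "text": ja["intro"]}, {"break": "1s"}]

    # The same speaker voice reads all three options.
    for i, option in enumerate(options):
        script.append({"voice": speaker_voice, "text": f"{FULL_WIDTH_DIGITS[i]} {option['text']}"})
        # 0.8s between options, 1s after the last one (before the question repeat).
        script.append({"break": "0.8s" if i < 2 else "1s"})

    # Narrator repeats the question.
    script.append({"voice": "Intro_Voice", "text": ja["question"]})
    return {"tts_script": script}


def to_json(script: dict) -> str:
    # ensure_ascii=False keeps the Japanese text readable in the saved file.
    return json.dumps(script, ensure_ascii=False, indent=2)
```

Every item holds either `voice` and `text` or `break`, so the "never both" rule holds by construction.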
NOTE: The arrow is NOT added by this pass. The user adds it manually after image generation.
Read image_prompt from derived-data.json.
STYLE RULE (MANDATORY): All images must use the approved manga/textbook style. The `image_prompt` in `derived-data.json` should already follow this style. If it doesn't, rewrite it to include these required keywords:

- `Simple manga-style line drawing, black and white`
- `simple round head` (for each character)
- `Cartoon style with simple round faces`
- `clean outlines, white background, no shading, no color, no text, no arrows`
- `Same style as JLPT N5 textbook manga illustrations`

AVOID:

- `JLPT exam textbook line drawing` → produces realistic/Western faces (wrong style).
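Whether a prompt already follows the style rule can be checked mechanically before generating an image. The sketch below is one possible approach: the helper name `check_image_prompt` and the use of case-insensitive exact substring matching are assumptions, and a prompt that paraphrases a keyword would be reported as missing it.

```python
# Required style keywords and the avoided phrase, copied from the style rule above.
REQUIRED_KEYWORDS = (
    "Simple manga-style line drawing, black and white",
    "simple round head",
    "Cartoon style with simple round faces",
    "clean outlines, white background, no shading, no color, no text, no arrows",
    "Same style as JLPT N5 textbook manga illustrations",
)
AVOIDED_PHRASES = ("JLPT exam textbook line drawing",)


def check_image_prompt(prompt: str) -> dict:
    """Report required keywords that are missing and avoided phrases that are present."""
    lowered = prompt.lower()
    return {
        "missing": [k for k in REQUIRED_KEYWORDS if k.lower() not in lowered],
        "avoided": [p for p in AVOIDED_PHRASES if p.lower() in lowered],
    }
```

A prompt is ready for image generation when both lists come back empty; otherwise the `missing` list shows which keywords to add when rewriting it.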
Use the generate_image script pattern adapted for selectAudio:
python3 -c "
import os, json, base64
from pathlib import Path