Convert a YouTube tutorial video into durable, referenceable learning artifacts stored under UA_ARTIFACTS_DIR. Always produces CONCEPT.md + manifest.json, and conditionally produces runnable implementation artifacts when the content is truly software/coding. USE when a user provides a YouTube URL and wants to learn, understand, implement from, or deeply study a video. Also trigger when a webhook/hook payload contains a YouTube URL with a learning/tutorial intent. Trigger phrases: "create a tutorial from this video", "help me learn from this YouTube link", "implement what's shown in this video", "turn this YouTube video into a guide", "make me study notes from this", "explain and implement this YouTube tutorial", "I want to implement this", "break down this video for me".
Converts a YouTube tutorial into durable, re-usable learning artifacts. The output should be understandable without watching the video.
Mandatory Dependency: Always invoke the `youtube-transcript-metadata` skill first (Step 3). Never fetch transcripts or metadata inline — the transcript skill is the single source of truth.
| File | Required | Purpose |
|---|---|---|
| `manifest.json` | ✅ Always | Provenance, status, retention map |
| `README.md` | ✅ Always | One-page summary with metadata context block |
| `CONCEPT.md` | ✅ Always | Standalone tutorial — understandable without watching the video |
| `IMPLEMENTATION.md` | ✅ Usually | Prerequisites/steps; for concept-only this can be procedural (recipe/runbook) |
| `implementation/` | ⬜ Conditional | Runnable code/scripts only for software/coding tutorials |
| `visuals/gemini_video_analysis.md` | ⬜ Best-effort | Timestamped visual analysis from Gemini |
| `research/sources.md` | ⬜ When gaps exist | Gap-filling sources and citations |
| `transcript.clean.txt` | ⬜ Recommended | Deduplicated transcript (retention: `temp`) |
All durable outputs go under:
<resolved_artifacts_root>/youtube-tutorial-creation/{YYYY-MM-DD}/{video-slug}__{HHMMSS}/
⚠️ Path hygiene: Never put a literal `UA_ARTIFACTS_DIR` in a path.
BAD: `UA_ARTIFACTS_DIR/...` or `/opt/universal_agent/UA_ARTIFACTS_DIR/...`
GOOD: Resolve the root first (Step 2 below), then append `youtube-tutorial-creation/...`
Never rely on Bash inheriting env vars. Resolve the artifacts root explicitly:
python3 -c "from universal_agent.artifacts import resolve_artifacts_dir; print(resolve_artifacts_dir())"
If this fails, STOP and report the error. Do not fall back to writing only under transient scratch.
Then:
- Create a `video-slug` from the video title (preferred) or URL video ID.
- Create `manifest.json` immediately — fill fields as you go.

Tool preference: Use `write_text_file` for all writes into `UA_ARTIFACTS_DIR`. Native `Write` may be restricted to the session workspace depending on runtime.
Run the core ingestion script with parallel transcript + metadata extraction:
UV_CACHE_DIR=/tmp/uv_cache uv run .claude/skills/youtube-transcript-metadata/scripts/fetch_youtube_transcript_metadata.py \
--url "<YOUTUBE_URL>" \
--language en \
--json-out "$CURRENT_RUN_WORKSPACE/downloads/youtube_ingest.json" \
--transcript-out "$CURRENT_RUN_WORKSPACE/downloads/transcript.txt" \
--pretty
`CURRENT_RUN_WORKSPACE` is the canonical workspace variable. `CURRENT_SESSION_WORKSPACE` may still be present as a legacy alias during migration, but new examples should prefer the run workspace name.
Read `youtube_ingest.json` and look at `ok`, `failure_class`, `transcript_text`, and `metadata`.
Degraded-mode decision tree:
| Situation | Action |
|---|---|
| `ok=true` | Continue to Step 3b (normal path) |
| `ok=false`, `failure_class=request_blocked` | Retry once; if still blocked, proceed degraded with metadata only |
| `ok=false`, `failure_class=empty_or_low_quality_transcript` | Use Gemini visual analysis as primary source; set status `degraded_transcript_only` |
| `ok=false`, metadata succeeded | Preserve metadata in manifest; proceed degraded |
| Both transcript and metadata failed | Set status `failed`; still write `manifest.json` + `README.md` with error detail |
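The decision tree above can be sketched as a small dispatcher. The field names follow `youtube_ingest.json` as described; the returned action labels are illustrative, not part of any contract:

```python
def classify_ingest(ingest: dict) -> str:
    """Map a parsed youtube_ingest.json payload to an action per the decision tree."""
    if ingest.get("ok"):
        return "continue_normal_path"
    failure = ingest.get("failure_class")
    if failure == "request_blocked":
        # Caller retries once before treating the run as degraded.
        return "retry_then_degraded_metadata_only"
    if failure == "empty_or_low_quality_transcript":
        # Gemini visual analysis becomes the primary source.
        return "gemini_visual_primary_degraded_transcript_only"
    if ingest.get("metadata"):
        # Transcript failed but metadata survived: preserve it in the manifest.
        return "preserve_metadata_proceed_degraded"
    # Both transcript and metadata failed: status=failed, still write manifest + README.
    return "failed_write_manifest_and_readme"
```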
Source-of-truth policy:
- Transcripts: `YouTubeTranscriptApi().fetch(video_id)` — never `get_transcript()` (legacy)
- Metadata: `yt-dlp` only — never use yt-dlp for transcript text

Caption transcripts often contain consecutive duplicate lines. Deduplicate:
python3 - <<'PY'
import os
from pathlib import Path
ws = Path(
os.environ.get("CURRENT_RUN_WORKSPACE")
or os.environ["CURRENT_SESSION_WORKSPACE"]
)
src = ws / "downloads" / "transcript.txt"
dst = ws / "downloads" / "transcript.clean.txt"
lines = src.read_text(encoding="utf-8", errors="replace").splitlines()
out, prev = [], None
for line in lines:
if line != prev:
out.append(line)
prev = line
dst.write_text("\n".join(out).strip() + "\n", encoding="utf-8")
print(f"Wrote {dst} ({len(out)} lines, from {len(lines)} original)")
PY
Do a full-file read of the cleaned transcript to avoid "cut-off" issues from partial previews, and produce a stats snapshot for the manifest:
python3 - <<'PY'
import json, os
from pathlib import Path
ws = Path(
os.environ.get("CURRENT_RUN_WORKSPACE")
or os.environ["CURRENT_SESSION_WORKSPACE"]
)
src = ws / "downloads" / "transcript.clean.txt"
dst = ws / "downloads" / "transcript.stats.json"
text = src.read_text(encoding="utf-8", errors="replace")
lines = text.splitlines()
stats = {
"path": str(src),
"bytes": len(text.encode("utf-8", errors="replace")),
"chars": len(text),
"lines": len(lines),
"head": lines[:5],
"tail": lines[-5:],
}
dst.write_text(json.dumps(stats, indent=2), encoding="utf-8")
print(f"Wrote {dst} ({stats['lines']} lines, {stats['chars']} chars)")
PY
Carry key fields from `youtube_ingest.json` into:
- `manifest.json`: title, channel, duration, upload_date, metadata_status, metadata_source
- `README.md`: context block at the top (URL, title, channel, duration, upload date)

Attempt Gemini multimodal analysis against the YouTube URL:
UV_CACHE_DIR=/tmp/uv_cache uv run .claude/skills/youtube-tutorial-creation/scripts/gemini_video_analysis.py \
--url "<YOUTUBE_URL>" \
--out "<run_dir>/visuals/gemini_video_analysis.md" \
--json-out "<run_dir>/visuals/gemini_video_analysis.json"
If the script fails (no API key, model unavailable, rate limited), do not skip the whole skill.
Set extraction.visual = "attempted_failed" in the manifest and continue with transcript only.
Do NOT skip visual analysis just because you assume the transcript is sufficient. Attempt it, record the result, and proceed.
Self-test (verifies imports only, no API call):
UV_CACHE_DIR=/tmp/uv_cache uv run .claude/skills/youtube-tutorial-creation/scripts/gemini_video_analysis.py --self-test
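Recording the visual-analysis outcome can be sketched as a small manifest update. This assumes `manifest.json` already exists in the run directory; the note wording is illustrative:

```python
import json
from pathlib import Path

def record_visual_result(manifest_path: Path, succeeded: bool) -> dict:
    """Set extraction.visual after the Gemini script runs, whether it succeeded or not."""
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    outcome = "attempted_succeeded" if succeeded else "attempted_failed"
    manifest.setdefault("extraction", {})["visual"] = outcome
    if not succeeded:
        # Failure is recorded, not fatal: the run continues with transcript only.
        manifest.setdefault("notes", []).append(
            "Gemini visual analysis failed; continuing with transcript only."
        )
    manifest_path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return manifest
```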
Merge "what they said" (transcript) with "what they showed" (visual findings):
research/sources.md with URLs and access datesWrite each artifact to the run directory. Quality bar for each:
- `README.md` — one-page summary; links to `CONCEPT.md` and `IMPLEMENTATION.md`
- `CONCEPT.md` — standalone tutorial, no video required; draws on `visuals/` when available, including `visuals/code-extractions/`
- `IMPLEMENTATION.md` — practical runbook (or procedural guide for concept-only); references `implementation/` scripts
- `implementation/` — runnable code (only when `learning_mode=concept_plus_implementation`); seeded from `visuals/code-extractions/` where applicable

Update `manifest.json` to its final state. See `references/output_contract.md` for the full schema. Required fields:
{
"skill": "youtube-tutorial-creation",
"status": "full | degraded_transcript_only | failed",
"learning_mode": "concept_only | concept_plus_implementation",
"video_url": "...",
"video_id": "...",
"source": "manual | composio | direct",
"metadata": { "title": "...", "channel": "...", "duration": 0, "upload_date": "...", "metadata_status": "...", "metadata_source": "yt_dlp" },
"extraction": {
"transcript": "attempted_succeeded | attempted_failed | not_attempted",
"metadata": "attempted_succeeded | attempted_failed | not_attempted",
"visual": "attempted_succeeded | attempted_failed | not_attempted"
},
"outputs": { "CONCEPT.md": "...", "IMPLEMENTATION.md": "...", "manifest.json": "..." },
"retention": { "transcript.txt": "temp", "transcript.clean.txt": "temp" },
"notes": []
}
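A sanity check against these required fields can be sketched as follows. The helper is illustrative, not part of the skill's tooling; the allowed values come from the schema above:

```python
def validate_manifest(manifest: dict) -> list[str]:
    """Return a list of problems; an empty list means the required contract is met."""
    problems = []
    required = ("skill", "status", "learning_mode", "video_url", "video_id",
                "source", "metadata", "extraction", "outputs", "retention", "notes")
    for key in required:
        if key not in manifest:
            problems.append(f"missing field: {key}")
    if manifest.get("status") not in {"full", "degraded_transcript_only", "failed"}:
        problems.append(f"invalid status: {manifest.get('status')!r}")
    if manifest.get("learning_mode") not in {"concept_only", "concept_plus_implementation"}:
        problems.append(f"invalid learning_mode: {manifest.get('learning_mode')!r}")
    allowed = {"attempted_succeeded", "attempted_failed", "not_attempted"}
    for stage, value in (manifest.get("extraction") or {}).items():
        if value not in allowed:
            problems.append(f"invalid extraction.{stage}: {value!r}")
    return problems
```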
Every Python script in implementation/ must be runnable without a separate venv.
Requirements:
# /// script
# requires-python = ">=3.11"
# dependencies = ["google-genai>=1.0.0", "python-dotenv>=1.0.1"]
# ///
- Provide a `--self-test` flag that checks imports + basic object construction without requiring secrets.
- Load environment via `load_dotenv(find_dotenv(usecwd=True))`.
- Supported API key variables: `GOOGLE_API_KEY`, `GEMINI_API_KEY`, `Z_AI_API_KEY`, `ANTHROPIC_API_KEY`.
- Verify with `UV_CACHE_DIR=/tmp/uv_cache uv run implementation/<script>.py --self-test`.

Hard rules:
- Do NOT run `pip install`, `uv pip install`, or `uv add` as part of a skill run.
- If deps are missing, fix the PEP 723 header and re-run via `uv run`.
SDK drift note: Prefer `google.genai` for all new code. If the tutorial video uses `google.generativeai`, note it as "video used legacy SDK" and implement with the current SDK.
- All durable outputs exist under `UA_ARTIFACTS_DIR` (never only in run scratch)
- `manifest.json` exists, is accurate, and has a valid status
- `CONCEPT.md` is understandable without watching the video
- `implementation/` scripts pass `--self-test` (or have a clear documented reason if not applicable)

Read these when you need deeper detail:
| File | When to read |
|---|---|
| `references/output_contract.md` | Full manifest schema, required vs optional files, status/mode values |
| `references/ingestion_and_tooling.md` | Tool selection decision matrix, runtime strategy |
| `references/composio_wiring_checklist.md` | Composio + webhook ingress validation |