Skill Profile

Mk Youtube Transcript Summarize

Name: Mk Youtube Transcript Summarize
Author: kouko

Summarize YouTube video content with structured output. Use when the user wants a detailed summary generated from a transcript file path.

kouko · 0 stars · February 18, 2026

Occupation
Category: Media

Skill Content

YouTube Video Summary

Generate a structured, high-quality summary of a YouTube video from its transcript.

Quick Start

/mk-youtube-transcript-summarize <transcript_file_path> [--force]
/mk-youtube-transcript-summarize --check <URL_or_video_id>

Parameters

Parameter	Required	Default	Description
transcript_file_path	Yes*	-	Path to transcript file (.txt). *Not needed with `--check`.
--force	No	false	Regenerate the summary even if a cached file exists
--check	No	false	Check if summary exists for a URL or video_id (no transcript file needed)

Examples

/mk-youtube-get-caption https://youtube.com/watch?v=xxx
→ outputs transcript file path

/mk-youtube-transcript-summarize /path/to/captions/20091025__VIDEO_ID.en.txt
→ generates structured summary saved to {baseDir}/data/20091025__VIDEO_ID.en.md
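The example above suggests that the summary keeps the transcript's base name and swaps `.txt` for `.md` under `{baseDir}/data`. A minimal Python sketch of that mapping, assuming exactly this naming rule; the function name `summary_path_for` and the `base_dir` argument are hypothetical and not part of the skill:

```python
from pathlib import PurePosixPath


def summary_path_for(transcript_path: str, base_dir: str) -> PurePosixPath:
    """Map a transcript path to the summary path shown in the example.

    Assumption: only the final .txt suffix changes, so a name such as
    20091025__VIDEO_ID.en.txt becomes 20091025__VIDEO_ID.en.md.
    """
    transcript = PurePosixPath(transcript_path)
    if transcript.suffix != ".txt":
        raise ValueError(f"expected a .txt transcript, got {transcript.name!r}")
    # with_suffix replaces only the last suffix, so the ".en" part survives.
    return PurePosixPath(base_dir) / "data" / transcript.with_suffix(".md").name


print(summary_path_for("/path/to/captions/20091025__VIDEO_ID.en.txt", "/skills/mk"))
```

Here `/skills/mk` is only a stand-in for `{baseDir}`, whose real value the page does not state.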

Check Mode Examples

/mk-youtube-transcript-summarize --check <URL_or_video_id>

Check Mode Output

When a summary already exists:

{
  "status": "success",
  "exists": true,
  "output_summary": "{baseDir}/data/20091025__dQw4w9WgXcQ.en.md",
  "summary_char_count": 5000,
  "summary_line_count": 120,
  "video_id": "dQw4w9WgXcQ",
  "title": "Video Title",
  "channel": "Channel Name",
  "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
}

When no summary exists yet (the metadata fields are empty):

{
  "status": "success",
  "exists": false,
  "video_id": "dQw4w9WgXcQ",
  "title": "",
  "channel": "",
  "url": ""
}
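A caller can branch on the `exists` field of the check output to decide whether a summary still has to be generated. The sketch below is a hedged illustration in Python: how the JSON text is captured is not specified on this page, the helper name `needs_summary` is hypothetical, and treating any status other than `"success"` as an error is an assumption, since only success responses are shown.

```python
import json


def needs_summary(check_output: str) -> bool:
    """Return True when the check result reports no cached summary."""
    result = json.loads(check_output)
    # Assumption: any status other than "success" should stop the workflow.
    if result.get("status") != "success":
        raise RuntimeError(f"check failed: {result!r}")
    return not result.get("exists", False)


cached = '{"status": "success", "exists": true, "video_id": "dQw4w9WgXcQ"}'
missing = '{"status": "success", "exists": false, "video_id": "dQw4w9WgXcQ"}'
print(needs_summary(cached), needs_summary(missing))
```

When `exists` is true, the `output_summary` field already points at the file to read, so the transcript does not need to be processed again unless `--force` is wanted.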

Strategy: `chunked` (> 200,000 chars, > 5 hr EN)

Calculate chunk count: ceil(line_count / 1000), so each chunk is about 1,000 lines

Spawn parallel subagents using the Task tool (subagent_type: "general-purpose"):

- Each subagent receives: the file path, start line offset, end line limit, and summarization instructions
- Chunk overlap: Each chunk includes a 50-line overlap with the previous chunk for context continuity (chunk 1: lines 1–1000, chunk 2: lines 951–2000, chunk 3: lines 1951–3000, etc.). The first chunk has no leading overlap.

Each subagent prompt:

Read the file at {file_path} from line {start_offset} to line {end_limit} using the Read tool (with offset and limit parameters).
Then produce a summary of this section with 5-10 bullet points covering:
- Main topics and arguments discussed
- Key data points (numbers, dates, names) in plain text
- Use sub-bullets for supporting details under a main point (max 2 levels)
- Notable quotes as blockquotes
Write the summary in the same language as the transcript (this will be synthesized into the user's language in the final step).
IMPORTANT — Boundary continuity: If the beginning of your chunk clearly continues a topic from a previous section, prefix your first bullet with [continues from previous]. If the end of your chunk is mid-topic and clearly continues into the next section, suffix your last bullet with [continues to next]. This helps the synthesis step merge cross-chunk topics.

- Use model: "haiku" for cost efficiency

Collect all subagent section summaries
Synthesize a final structured summary in the main conversation following the Summary Generation Rules
- During synthesis, check for [continues from previous] and [continues to next] markers across adjacent chunks — merge bullets that belong to the same topic into a single coherent section rather than repeating them
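The chunk arithmetic and the boundary markers described in the steps above can be sketched in Python. This is an illustration under stated assumptions, not the skill's implementation: the names `chunk_ranges` and `linked_boundaries` are hypothetical, line ranges are 1-based and inclusive as in the overlap example, and treating either marker on its own as enough to link two chunks is an assumption.

```python
import math

CHUNK_LINES = 1000   # each chunk covers about 1000 lines
OVERLAP_LINES = 50   # leading overlap with the previous chunk

FROM_PREVIOUS = "[continues from previous]"
TO_NEXT = "[continues to next]"


def chunk_ranges(line_count: int) -> list[tuple[int, int]]:
    """Return 1-based inclusive (start, end) line ranges for each chunk."""
    chunk_count = math.ceil(line_count / CHUNK_LINES)
    ranges = []
    for index in range(chunk_count):
        start = index * CHUNK_LINES + 1
        if index > 0:
            start -= OVERLAP_LINES  # the first chunk has no leading overlap
        end = min((index + 1) * CHUNK_LINES, line_count)
        ranges.append((start, end))
    return ranges


def linked_boundaries(section_summaries: list[str]) -> list[int]:
    """Return each index i where chunk i and chunk i + 1 share a topic."""
    linked = []
    for i in range(len(section_summaries) - 1):
        ends_open = TO_NEXT in section_summaries[i]
        starts_open = FROM_PREVIOUS in section_summaries[i + 1]
        if ends_open or starts_open:
            linked.append(i)
    return linked


print(chunk_ranges(2500))
```

For a 2,500-line transcript this yields three ranges that match the overlap pattern in the steps, with the last range cut off at the final line. The indexes returned by `linked_boundaries` mark where the synthesis step would merge bullets instead of repeating them.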

Summary Generation Rules

Output Structure

## Video Info (optional)

| Field | Value |
|-------|-------|
| **Title** | {title} |
| **Channel** | {channel} |
| **Duration** | {duration_string} |
| **Views** | {view_count, formatted with commas} |
| **Upload Date** | {upload_date, formatted as YYYY-MM-DD} |
| **Subtitle** | {subtitle_type} ({transcript_language}) |
| **URL** | {url} |

## Content Summary

#### {Section Title 1}

- Main point expressed concisely
  - Supporting detail or data point
- Another independent point
- ...

#### {Section Title 2}

- ...

(Continue for all logical sections)

## Key Takeaways

- 5-8 most important conclusions or insights from the video

Content Rules

Section structure: Divide the summary into logical sections using H4 (####) headings
- If the video description contains chapter timestamps, use those as the section skeleton
- Otherwise, identify 5-10 logical topic shifts in the transcript
- Each section should have 3-7 bullet points
- Prefer more sections with focused content over fewer sections with broad content
Bullet formatting:
- Each bullet should express one main point concisely (target: under 60 characters for CJK, under 120 characters for English)
- When a bullet has 2+ supporting details (data, examples, sub-arguments), use second-level sub-bullets
- NEVER nest beyond 2 levels (no sub-sub-bullets)
- When details are independent facts without a shared parent topic, keep them as separate top-level bullets
Ordering:
- Arrange sections in the order topics appear in the video (chronological/narrative flow)
- Within each section, list bullets in the order they were discussed in the transcript (source order)
- Do NOT reorder bullets by perceived importance — preserve the speaker's narrative progression
Data preservation: Always preserve specific data points in plain text
- Percentages, dollar amounts, dates, statistics — include as-is, no formatting
- Person names, company names, product names — include as-is, no formatting
- Direct quotes that are impactful → use blockquote format
- Do NOT use inline bold (**text**) anywhere in the summary body (Content Summary and Key Takeaways sections)
Language: Write the summary in the user's conversation language
- Detect the language the user is using in the current conversation
- Write the entire summary (section titles, bullet points, key takeaways) in that language
- Example: If user speaks Traditional Chinese, summarize in 繁體中文 regardless of transcript language
Length: Target compression ratio based on processing strategy:

Strategy	Compression	Guideline
`standard`	20-30%	Short content, detailed coverage
`sectioned`	15-20%	Medium-long content, balanced density
`chunked`	10-15%	Very long content, high-level synthesis
Tone: Maintain an informative, neutral tone
- Present the speaker's arguments faithfully
- Do not add opinions or editorial commentary
- Use active voice
Key Takeaways: End with 5-8 bullet points summarizing the most important insights
- These should be standalone — understandable without reading the full summary
- Prioritize actionable insights and surprising findings
- Order by importance/impact (most significant insight first, descending)
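Several of the formatting rules above are mechanical enough to check automatically. The following Python sketch flags inline bold, nesting deeper than two levels, and bullets at or over the target length. It is a rough aid under assumptions: the function name `lint_summary_body` is hypothetical, nesting is inferred from two-space indentation as in the template, CJK text is detected by a simple character-range heuristic, and the length targets are treated as soft limits. It is meant for the Content Summary and Key Takeaways sections only, since the Video Info table uses bold on purpose.

```python
import re

BOLD = re.compile(r"\*\*.+?\*\*")
BULLET = re.compile(r"^(?P<indent> *)- (?P<text>.*)$")
# Heuristic: kana, common CJK ideographs, and Hangul syllables.
CJK = re.compile(r"[\u3040-\u30ff\u4e00-\u9fff\uac00-\ud7af]")


def lint_summary_body(markdown: str, indent_width: int = 2) -> list[tuple[int, str]]:
    """Return (line_number, problem) pairs for the summary body."""
    problems = []
    for number, line in enumerate(markdown.splitlines(), start=1):
        if BOLD.search(line):
            problems.append((number, "inline bold"))
        match = BULLET.match(line)
        if match is None:
            continue  # headings, blockquotes, and blank lines are not checked
        level = len(match["indent"]) // indent_width + 1
        if level > 2:
            problems.append((number, "nested beyond 2 levels"))
        text = match["text"]
        limit = 60 if CJK.search(text) else 120
        if len(text) >= limit:
            problems.append((number, f"bullet length {len(text)} >= {limit}"))
    return problems


sample = "- Short point\n  - detail\n    - too deep\n- **bold** point"
print(lint_summary_body(sample))
```

An empty result does not mean the summary is good. Ordering, faithfulness, and tone still need a human or model review.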

---
Summary saved to: `{output_summary path from JSON}`

Output Format

Normal Mode Output

{
  "status": "success",
  "source_transcript": "/path/to/20091025__VIDEO_ID.en.txt",
  "output_summary": "{baseDir}/data/20091025__VIDEO_ID.en.md",
  "char_count": 30000,
  "line_count": 450,
  "strategy": "standard",
  "cached": false,
  "video_id": "dQw4w9WgXcQ",
  "title": "Video Title",
  "channel": "Channel Name",
  "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
}

{
  "status": "success",
  "source_transcript": "/path/to/20091025__VIDEO_ID.en.txt",
  "output_summary": "{baseDir}/data/20091025__VIDEO_ID.en.md",
  "cached": true,
  "summary_char_count": 5000,
  "summary_line_count": 120,
  "video_id": "dQw4w9WgXcQ",
  "title": "Video Title",
  "channel": "Channel Name",
  "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
}
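The `strategy` and `char_count` fields in the output above relate through the character thresholds in the Processing Strategy headings and the compression table in the Length rule. A minimal Python sketch of that relationship follows. It assumes that the compression ratio means summary characters divided by transcript characters, and it ignores the Fallback Rules, whose content is not shown on this page. The names `pick_strategy` and `target_summary_chars` are hypothetical.

```python
# Compression targets per strategy, as whole percentages (low, high).
COMPRESSION_PERCENT = {
    "standard": (20, 30),
    "sectioned": (15, 20),
    "chunked": (10, 15),
}


def pick_strategy(char_count: int) -> str:
    """Choose a strategy from the transcript's character count."""
    if char_count < 80_000:
        return "standard"
    if char_count <= 200_000:
        return "sectioned"
    return "chunked"


def target_summary_chars(char_count: int) -> tuple[str, int, int]:
    """Return (strategy, minimum, maximum) target summary size in characters."""
    strategy = pick_strategy(char_count)
    low, high = COMPRESSION_PERCENT[strategy]
    # Integer arithmetic avoids floating-point rounding surprises.
    return strategy, char_count * low // 100, char_count * high // 100


print(target_summary_chars(30_000))
```

With the 30,000-character transcript from the example output, the sketch selects `standard`, which agrees with the `strategy` field shown there.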

Mk Youtube Transcript Summarize

YouTube Video Summary

Quick Start

Parameters

Examples

Check Mode

Check Mode Examples

Check Mode Output

How it Works

Processing Strategy

Strategy: `standard` (< 80,000 chars, ~2 hr EN)

Strategy: `sectioned` (80,000–200,000 chars, ~2 hr – 5 hr EN)

Strategy: `chunked` (> 200,000 chars, > 5 hr EN)

Fallback Rules

Summary Generation Rules

Output Structure

Video Info Table

Content Rules

Save Summary to File

Output Format

Normal Mode Output

Check Mode Output

Notes

Related Skills

Songsee

Video Frames

Gifgrep

Qqbot Media

Camsnap

Openai Whisper Api
