Turn video links into structured notes with an independent, local-first pipeline: try open-source subtitle extraction first, then fall back to local audio download and local transcription. Trigger when the user sends a Bilibili/YouTube/Douyin/Kuaishou link and asks for a summary, notes, key points, action items, subtitles, a transcript, or markdown output.
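As a rough illustration of the trigger condition, the sketch below maps a URL to one of the four platform names. The function name `detect_platform` and the hostname patterns are assumptions made for this example; they are not taken from the skill's scripts.

```shell
# Hypothetical helper: map a video URL to a platform name.
# The hostname patterns below are assumptions, not the skill's own logic.
detect_platform() {
  case "$1" in
    *bilibili.com/*|*b23.tv/*)  echo "bilibili" ;;
    *youtube.com/*|*youtu.be/*) echo "youtube" ;;
    *douyin.com/*)              echo "douyin" ;;
    *kuaishou.com/*)            echo "kuaishou" ;;
    *)                          echo "unknown"; return 1 ;;
  esac
}

detect_platform "https://www.bilibili.com/video/BV1xx411c7mD"
```

A non-zero return status for unrecognized hosts lets a caller decide whether to skip the skill or ask the user for a supported link.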
This repository root is directly usable as an OpenClaw skill folder: keep SKILL.md and scripts/ together, then copy or symlink this directory into your OpenClaw skills/ directory.
Use a hybrid pipeline that behaves like a product, not a pile of scripts:
yt-dlp in the spirit of BiliNote.
cookies.txt path for Bilibili when needed.
bilibili, youtube, douyin, or kuaishou.
bash {baseDir}/scripts/generate_note.sh --url "<video_url>"
This entry already does the app-like retries:
cookies.txt path
bash {baseDir}/scripts/fetch_subtitle_via_ytdlp.sh --url "<video_url>"
bash {baseDir}/scripts/resolve_transcript.sh --url "<video_url>"
curl -fsS http://127.0.0.1:8691/api/get_all_providers >/dev/null
Quick hybrid entry:
bash {baseDir}/scripts/generate_note.sh \
--url "<video_url>"
Optional flags:
--backend hybrid|local-bilinote (default hybrid)
--profile fast|balanced|quality (default fast)
--cookies-file /path/to/cookies.txt
--cookies-from-browser chrome
Default profile guide:
fast: use local base + beam_size=1, best for OpenClaw day-to-day speed
balanced: use local small + beam_size=2
quality: use local small + beam_size=5
For OpenClaw-triggered runs, prefer:
bash {baseDir}/scripts/generate_note.sh --url "<video_url>" --profile fast
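The profile guide can be read as a small lookup table. The sketch below restates it as a shell function; `profile_settings` is a hypothetical name, and the real scripts may resolve profiles differently.

```shell
# Hypothetical lookup: print "<whisper model> <beam_size>" for a profile.
# Values are copied from the profile guide; the function itself is a sketch.
profile_settings() {
  case "${1:-fast}" in
    fast)     echo "base 1" ;;
    balanced) echo "small 2" ;;
    quality)  echo "small 5" ;;
    *)        echo "unknown profile: $1" >&2; return 1 ;;
  esac
}

# Split the pair into positional parameters and print it.
set -- $(profile_settings balanced)
echo "model=$1 beam_size=$2"
```

Calling it with no argument falls back to `fast`, mirroring the documented default.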
Do not call generate_note_via_local_backend.sh and do not probe the desktop app unless the user explicitly says to use the legacy BiliNote desktop route.
By default, the skill does not auto-read Chrome/Edge/Brave cookies on macOS, so it avoids unexpected Keychain popups.
Only use --cookies-from-browser ... when you explicitly want that behavior.
For MiniMax note synthesis, the easiest setup is to put the key in:
~/.openclaw/minimax.key
The file can contain just the raw key on one line. openclaw.json remains a fallback source.
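A minimal sketch of the key-file lookup, assuming the key is the first line of the file. `read_key_file` is a hypothetical helper, not a function from the skill; how the skill actually parses openclaw.json is not shown here.

```shell
# Hypothetical helper: print the key from a one-line key file with
# whitespace stripped. Returns non-zero when the file is missing or empty,
# so a caller can fall back to openclaw.json.
read_key_file() {
  [ -r "$1" ] || return 1
  key=$(head -n 1 "$1" | tr -d '[:space:]')
  [ -n "$key" ] || return 1
  printf '%s\n' "$key"
}

if read_key_file "${HOME:-.}/.openclaw/minimax.key" >/dev/null; then
  echo "key file found"
else
  echo "no key file; fall back to openclaw.json"
fi
```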
For a permanent app-like setup on Bilibili, prefer putting a Netscape-format cookies file at:
~/.openclaw/bilibili.cookies.txt
After that, keep using only:
bash {baseDir}/scripts/generate_note.sh --url "<video_url>"
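Before relying on the cookies file, a quick sanity check can catch a wrong export format. The sketch below assumes, per the yt-dlp documentation as I understand it, that a Netscape-format file starts with `# Netscape HTTP Cookie File` or `# HTTP Cookie File`. `is_netscape_cookies` is a hypothetical helper, not part of the skill.

```shell
# Hypothetical check: does the first line look like a Netscape cookies header?
is_netscape_cookies() {
  [ -r "$1" ] || return 1
  first=$(head -n 1 "$1" | tr -d '\r')
  case "$first" in
    "# Netscape HTTP Cookie File"*|"# HTTP Cookie File"*) return 0 ;;
    *) return 1 ;;
  esac
}

cookies="${HOME:-.}/.openclaw/bilibili.cookies.txt"
if is_netscape_cookies "$cookies"; then
  echo "cookies file looks usable"
else
  echo "cookies file missing or not in Netscape format"
fi
```

This only inspects the header line; it does not verify that the cookies are valid or unexpired.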
The skill also remembers the last successful cookie source in:
~/.openclaw/bilinote.cookie-source.tsv
By default, remembered browser-cookie sources are not replayed automatically on macOS, which avoids unexpected Keychain prompts.
The skill now also caches successful transcript packs and final notes under:
~/.openclaw/cache/bilinote
This makes repeated runs of the same video much faster. To bypass cache for a one-off retry:
export BILINOTE_DISABLE_CACHE=1
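To make the cache switch concrete, here is a minimal sketch of how a script might honor it. Both helpers are hypothetical, and the checksum-based key is an assumption for illustration only; the skill's actual cache layout under ~/.openclaw/cache/bilinote is not documented here.

```shell
# Hypothetical: the cache is used unless BILINOTE_DISABLE_CACHE is exactly "1".
cache_enabled() {
  [ "${BILINOTE_DISABLE_CACHE:-0}" != "1" ]
}

# Hypothetical: derive a stable key from the URL (assumed scheme, POSIX cksum).
cache_key() {
  printf '%s' "$1" | cksum | cut -d ' ' -f 1
}

url="https://www.bilibili.com/video/BV1xx411c7mD"
if cache_enabled; then
  echo "would look up cache entry $(cache_key "$url")"
else
  echo "cache bypassed"
fi
```

Because `export` persists for the rest of the shell session, run `unset BILINOTE_DISABLE_CACHE` afterwards to return to cached runs.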
Primary path:
bash {baseDir}/scripts/fetch_subtitle_via_ytdlp.sh --url "<video_url>" --json
bash {baseDir}/scripts/resolve_transcript.sh --url "<video_url>" --json
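The subtitle-first ordering amounts to a "first step that succeeds wins" chain. The sketch below shows that pattern with stand-in functions; `try_steps`, `subtitle_step`, and `transcript_step` are hypothetical names. Whether resolve_transcript.sh already performs this fallback internally is not stated here.

```shell
# Run each named step in order and stop at the first one that succeeds.
try_steps() {
  for step in "$@"; do
    if "$step"; then
      echo "succeeded: $step"
      return 0
    fi
    echo "failed: $step" >&2
  done
  return 1
}

# Stand-ins for the real scripts (assumptions for this example only).
subtitle_step()   { false; }  # e.g. no subtitles available
transcript_step() { true; }   # e.g. audio download + local transcription

try_steps subtitle_step transcript_step
```

In practice each stand-in would wrap one of the commands above and return its exit status.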
This keeps the app-like convenience layer while staying independent:
When this skill is triggered:
resolve_transcript.sh --url "<video_url>" --json.
Return:
Do not rely on shell scripts alone to write the final note when transcript material is available; the agent should do the final synthesis.
Bundled synthesizer:
node --import tsx {baseDir}/scripts/synthesize_note.ts --input /tmp/transcript.json
It reads MiniMax config from ~/.openclaw/openclaw.json and falls back to heuristic notes if model calls are unavailable.
If the user explicitly wants the legacy desktop path and BiliNote returns 下载音频失败 or [Errno 32] Broken pipe, diagnose before blaming the provider/model:
bash {baseDir}/scripts/diagnose_bilinote.sh
What this usually means:
yt-dlp / ffmpeg / subprocess IO).
bilibili and youtube, it is probably not a single bad link.
127.0.0.1:8691 is unreachable, the desktop app is not running.
When local subtitles are unavailable, or when the user wants a download-first fallback:
bash {baseDir}/scripts/download_audio_direct.sh --url "<video_url>"
whisper CLI
mlx-whisper on macOS / Apple Silicon
faster-whisper
faster-whisper model directory when available, including the bundled BiliNote model path
Default quality/speed tradeoff:
bilinote now defaults to profile fast, which maps to Whisper base for better speed in OpenClaw.
BILINOTE_WHISPER_MODEL=base.
BILINOTE_WHISPER_MODEL=medium.
faster-whisper with int8, explicit CPU thread count, beam_size=2, and VAD enabled.
BILINOTE_WHISPER_BEAM_SIZE=1.
BILINOTE_WHISPER_BEAM_SIZE=5.
BILINOTE_WHISPER_VAD=0.
BILINOTE_WHISPER_CPU_THREADS=<n>.
For Bilibili audio fallback, local transcription now defaults to zh unless you override:
export BILINOTE_AUDIO_LANGUAGE=zh
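The environment variables above behave like overrides on top of defaults. The sketch below shows one way to resolve them with shell parameter expansion. The fallback values are read off the text above and are assumptions: `vad=1` stands for "VAD enabled", `auto` is a placeholder for an unset thread count, and the default beam size may instead depend on the chosen profile. `effective_asr_settings` is a hypothetical name.

```shell
# Hypothetical: print the effective ASR settings, preferring env overrides.
effective_asr_settings() {
  model="${BILINOTE_WHISPER_MODEL:-base}"
  beam="${BILINOTE_WHISPER_BEAM_SIZE:-2}"
  vad="${BILINOTE_WHISPER_VAD:-1}"
  threads="${BILINOTE_WHISPER_CPU_THREADS:-auto}"
  language="${BILINOTE_AUDIO_LANGUAGE:-zh}"
  echo "model=$model beam_size=$beam vad=$vad cpu_threads=$threads language=$language"
}

effective_asr_settings
```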
Recommended setup:
bash {baseDir}/scripts/setup_local_asr.sh mlx-whisper
bash {baseDir}/scripts/transcribe_audio_fallback.sh --audio /path/to/audio.m4a
Use this fallback especially when:
--backend local-bilinote is explicitly chosen.