Transcribe a meeting recording from the Rodecaster SD card, Google Drive, or a local file. Creates a meeting note with summary, decisions, and action items, plus an MP3 archive. Use when the user types /transcribe-meeting or asks to transcribe a recording.
Transcribe a meeting recording and create a structured meeting note in the Obsidian vault with summary, decisions, action items, and full transcript.
Input sources (checked in order):
Recording modes:
Use obsidian vault info=path to get the vault root.
Try sources in order:
Option A: Auto-Detect (Primary)
If no explicit URL or file path was given:
Discover screen recordings:
bash skills/obsidian-transcribe-meeting/scripts/find-screenrecordings.sh "{date}"
Save the JSON output to a temp file (e.g., /tmp/screenrecs-{date}.json).
Discover Rodecaster recordings:
OBSIDIAN_RODECASTER_MOUNT=/run/media/olivier/RodeCaster/RODECaster bash skills/obsidian-transcribe-meeting/scripts/find-recordings.sh "{date}"
If the OBSIDIAN_RODECASTER_MOUNT path fails (directory not found), retry without the variable so the script can auto-detect the mount.
Save the JSON output to a temp file (e.g., /tmp/rodecaster-{date}.json).
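The mount-then-fallback behavior can be sketched with a stand-in for find-recordings.sh (hypothetical helper, for illustration only):

```shell
# Stand-in for find-recordings.sh: fail when the pinned mount is missing,
# succeed (auto-detect) when the variable is unset.
find_recordings() {
  if [ -n "${OBSIDIAN_RODECASTER_MOUNT:-}" ] && [ ! -d "$OBSIDIAN_RODECASTER_MOUNT" ]; then
    return 1  # pinned mount not present
  fi
  echo "ok (mount: ${OBSIDIAN_RODECASTER_MOUNT:-auto-detect})"
}

OBSIDIAN_RODECASTER_MOUNT=/nonexistent/RodeCaster
result=$(find_recordings) || { unset OBSIDIAN_RODECASTER_MOUNT; result=$(find_recordings); }
echo "$result"
```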
Match recordings by time overlap:
bash skills/obsidian-transcribe-meeting/scripts/match-recordings.sh /tmp/screenrecs-{date}.json /tmp/rodecaster-{date}.json
This produces groups with mode (omarchy+rodecaster, omarchy-only, rodecaster-only), video, audio, and transcribe_from fields.
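The time-overlap test behind this grouping can be sketched as (hypothetical helper; times given as epoch seconds):

```shell
# Two recordings belong in one group when their time ranges intersect.
# overlaps S1 E1 S2 E2 -> exit 0 (true) if [S1,E1) and [S2,E2) overlap
overlaps() {
  [ "$1" -lt "$4" ] && [ "$3" -lt "$2" ]
}

overlaps 1000 2000 1500 2500 && echo "group together"
overlaps 1000 2000 2100 2500 || echo "separate groups"
```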
Check idempotency — for each group, search for existing meeting notes:
recording: field (for groups with Rodecaster audio):
obsidian search query='recording: "{folder}"' path=Meetings
video_file: field (for groups with screen recordings):
obsidian search query='video_file: "{filename}"' path=Meetings
Skip any group that already has a meeting note.
Present findings to the user with mode info for each group:
Found 2 recording groups:
Option B: Google Drive URL
If the input contains drive.google.com:
bash skills/obsidian-transcribe-meeting/scripts/download-gdrive.sh "<url>"
Check idempotency via the audio_url field (see the Idempotency section).
Option C: Local File Path
If the input is a local file path, use it directly.
Before transcription, gather participant names and project vocabulary to improve whisper accuracy and action item attribution. Run the three tracks below in parallel to minimize added time.
Track 1: Screenshots (authoritative for participants)
Find screenshots taken during the meeting window:
bash skills/obsidian-transcribe-meeting/scripts/find-screenshots.sh "{date}" "{start_time}" "{duration_secs}"
Save the JSON result for reuse in Phase 3.5. For each screenshot found, read the image and look for:
Collect all names found across all screenshots. These are the authoritative participants. If screenshots show who is in the meeting, that list takes priority over all other sources. Do not add names from project pages or other indirect sources that contradict or dilute what the screenshots show.
Track 2: Vault context (vocabulary only, not participants)
Read the daily note (Daily Notes/{date}.md) and any linked project pages (Projects/{project}.md) and extract:
Important: Do NOT pull team member names from project pages to use as participants. A project page lists everyone on the team, not who attended this specific meeting. Using project-level names causes misattribution (e.g., listing a client in an internal standup).
Track 3: Quick first-pass
ffmpeg -i "{audio-file}" -t 90 -c:a pcm_s16le -ar 16000 -ac 1 /tmp/firstpass_$$.wav -y -loglevel warning
bash skills/obsidian-transcribe-meeting/scripts/transcribe.sh /tmp/firstpass_$$.wav "{engine}"
rm /tmp/firstpass_$$.wav
Combine results
Participant list priority (highest to lowest):
Do NOT add names from project pages to the participant list. Project pages provide vocabulary for the whisper prompt only.
Build the whisper prompt:
Meeting participants: Olivier, Kanish, Tara, Dinesh, Adam. Project: KHov. Topics: deployment pipelines, QA issues, container apps.
Store this as WHISPER_PROMPT for Phase 2 and keep the participant list for Phase 3.
If no context was gathered from any track (no screenshots, no daily note match, no names in first-pass), proceed without a prompt. This is not an error.
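Assembling the prompt from the combined context might look like this (illustrative names and topics, matching the example above):

```shell
# Illustrative context gathered in Phase 1.5
participants="Olivier, Kanish, Tara"
project="KHov"
topics="deployment pipelines, QA issues"
WHISPER_PROMPT="Meeting participants: ${participants}. Project: ${project}. Topics: ${topics}."
echo "$WHISPER_PROMPT"
```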
Check echo $OBSIDIAN_WHISPER_ENGINE — defaults to openai if unset.
WHISPER_PROMPT="{prompt from Phase 1.5}" bash skills/obsidian-transcribe-meeting/scripts/transcribe.sh "<audio-file>" "<engine>"
If no prompt was gathered in Phase 1.5, omit the WHISPER_PROMPT variable.
The script outputs {start, end, text} segments.
When diarization is enabled, each segment also carries a speaker field (e.g. "SPEAKER_00").
VAD (Voice Activity Detection) is controlled by OBSIDIAN_VAD_MODEL:
- silero (default) — Silero VAD strips non-speech segments before transcription. Fast, CPU-only. Reduces Whisper hallucination and improves chunking.
- none — no VAD; uses silence trimming + fixed chunking.
- pyannote — pyannote.audio speaker diarization. Provides VAD + speaker labels so the transcript is attributed per-speaker. Requires GPU (ROCm/CUDA) and HF_TOKEN for HuggingFace model access.
- pyannote,silero — comma-separated fallback chain. Tries pyannote first; if it fails, tries silero; if both fail, falls back to default chunking.

On systems with externally-managed Python (e.g. Arch), install torch and pyannote in a venv and set OBSIDIAN_VAD_VENV to its path. The script will use $OBSIDIAN_VAD_VENV/bin/python3 instead of system python3.
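The comma-separated fallback chain can be sketched as follows (try_backend is a hypothetical probe; here it pretends only silero is available):

```shell
# try_backend is a hypothetical probe; the real script would attempt the model load.
try_backend() {
  [ "$1" = "silero" ]  # pretend pyannote is unavailable (no GPU) on this machine
}

OBSIDIAN_VAD_MODEL="pyannote,silero"
vad_used=""
IFS=',' read -ra backends <<< "$OBSIDIAN_VAD_MODEL"
for b in "${backends[@]}"; do
  if try_backend "$b"; then vad_used="$b"; break; fi
done
echo "VAD backend: ${vad_used:-none (default chunking)}"
```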
When VAD provides speaker labels (pyannote mode), Phase 3 should use them to attribute speech in the transcript and improve summarization (decisions, action items attributed to specific speakers).
Using the transcript segments and the participant list from Phase 1.5, generate:
Action items with @Name attribution where possible (checklist, omit if none).
Action item attribution rules:
When the speaker is unclear, attribute to @(Team) rather than guessing.
Format the transcript with timestamps:
[H:MM:SS] Text of the segment...
When speaker labels are present (pyannote mode), format as:
[H:MM:SS] **Speaker A**: Text of the segment...
[H:MM:SS] **Speaker B**: Text of the segment...
Speaker labels from pyannote are generic (SPEAKER_00, SPEAKER_01). Replace with real names if participants are known from context, daily notes, or people-context.md. Otherwise use Speaker A, Speaker B, etc.
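Converting a segment's start time in seconds into the [H:MM:SS] prefix can be sketched as:

```shell
# Turn a segment start time (seconds, possibly fractional) into [H:MM:SS]
fmt_ts() {
  local t=${1%.*}  # drop the fractional part
  printf '[%d:%02d:%02d]' $((t / 3600)) $((t % 3600 / 60)) $((t % 60))
}

echo "$(fmt_ts 3725.4) **Speaker A**: Text of the segment..."
```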
Create the meeting note with obsidian create path="Meetings/{date} {Title}.md" content="{formatted content}":
---
Determine audio source for transcription:
- omarchy+rodecaster → audio = Rodecaster WAV (the audio.path field)
- omarchy-only → extract audio from the screen recording:
bash skills/obsidian-transcribe-meeting/scripts/extract-audio.sh "{video.path}"
Capture the WAV path from stdout.
- rodecaster-only → audio = Rodecaster WAV (the audio.path field)
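The dispatch on a group's mode field can be sketched as (pick_source is a hypothetical helper naming which source to use):

```shell
# Hypothetical dispatch on a group's mode field
pick_source() {
  case "$1" in
    omarchy+rodecaster|rodecaster-only) echo "rodecaster-wav" ;;
    omarchy-only)                       echo "extract-from-video" ;;
    *)                                  echo "unknown" ;;
  esac
}

pick_source omarchy-only      # prints: extract-from-video
pick_source rodecaster-only   # prints: rodecaster-wav
```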