Transform lecture materials (slides, PDFs, text snippets, transcripts, audio, or video recordings) into comprehensive, exam-ready Markdown notes for Obsidian. Use when the user provides lecture content, study materials, or asks to take notes, summarize lectures, or create study guides. Outputs structured notes with key concepts, definitions, LaTeX math, and exam-important callouts.
Transform lecture materials into structured, exam-ready Markdown notes for Obsidian.
When the source material is an audio or video file, always create a local transcript first and use that transcript as the primary input for note generation.
- Use `openai/gpt-4o-audio-mini` and the `input_audio` format.
- For video sources, extract the audio track with `ffmpeg`, then transcribe that extracted audio with OpenRouter.
- The `.txt` transcript is the canonical note-generation source.

Before using audio or video transcription, make sure `OPENROUTER_API_KEY` is defined in the environment.
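A quick pre-flight check can catch a missing key before any upload; a minimal sketch for a POSIX shell (the function name and message are illustrative):

```shell
# Succeeds only when OPENROUTER_API_KEY is set and non-empty
check_openrouter_key() {
  [ -n "${OPENROUTER_API_KEY:-}" ]
}

check_openrouter_key || echo "OPENROUTER_API_KEY is not set" >&2
```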
Use OpenRouter's chat completions API with multimodal input_audio. The prompt should explicitly ask for a verbatim transcript with minimal cleanup.
```bash
AUDIO_BASE64=$(base64 < "Lecture 05.audio.wav" | tr -d '\n')

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -d "{
    \"model\": \"openai/gpt-4o-audio-mini\",
    \"messages\": [
      {
        \"role\": \"user\",
        \"content\": [
          {
            \"type\": \"text\",
            \"text\": \"Transcribe this lecture audio as plain text. Keep the wording faithful to the speaker, preserve important structure, and do not summarize.\"
          },
          {
            \"type\": \"input_audio\",
            \"input_audio\": {
              \"data\": \"$AUDIO_BASE64\",
              \"format\": \"wav\"
            }
          }
        ]
      }
    ]
  }"
```
For actual skill usage, save the JSON response first, then extract the transcript into a plain text file:
```bash
AUDIO_BASE64=$(base64 < "Lecture 05.audio.wav" | tr -d '\n')

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -d "{
    \"model\": \"openai/gpt-4o-audio-mini\",
    \"messages\": [
      {
        \"role\": \"user\",
        \"content\": [
          {
            \"type\": \"text\",
            \"text\": \"Transcribe this lecture audio as plain text. Keep the wording faithful to the speaker, preserve important structure, and do not summarize. Return only the transcript text.\"
          },
          {
            \"type\": \"input_audio\",
            \"input_audio\": {
              \"data\": \"$AUDIO_BASE64\",
              \"format\": \"wav\"
            }
          }
        ]
      }
    ]
  }" \
  > "Lecture 05.transcript.response.json"

# Handle both content shapes: a plain string, or an array of content parts.
jq -r '.choices[0].message.content
       | if type == "array"
         then map(select(.type == "text") | .text) | join("\n")
         else . end' \
  "Lecture 05.transcript.response.json" > "Lecture 05.transcript.txt"
```
Given an input file like `Lecture 05.mp4` or `Lecture 05.m4a`, save derived files alongside it using predictable names:
- `Lecture 05.audio.wav` for audio extracted from video or normalized for upload
- `Lecture 05.transcript.response.json` for the OpenRouter API response
- `Lecture 05.transcript.txt` for the plain text transcript used to generate notes

Use `ffmpeg` to extract audio from video before transcription. A safe default is:
```bash
ffmpeg -i "Lecture 05.mp4" -vn -ac 1 -ar 16000 -c:a pcm_s16le "Lecture 05.audio.wav"
```
If the source is already audio but not in a convenient format, normalize it to .wav before base64 encoding and upload.
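The derived-file names can be computed mechanically from the input path; a small sketch using POSIX parameter expansion (the input filename is illustrative):

```shell
input="Lecture 05.mp4"
stem="${input%.*}"   # drop the last extension -> "Lecture 05"

audio="${stem}.audio.wav"
response="${stem}.transcript.response.json"
transcript="${stem}.transcript.txt"

printf '%s\n' "$audio" "$response" "$transcript"
```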
Output all notes inside fenced codeblocks. When content spans multiple chapters or distinct topics, create separate files.
Name each file `Chapter X - Title.md` or `Unit X - Title.md`.
---