Skill file

Summarize Call

Name: Summarize Call
Author: reysu

Transcribe a call recording with speaker diarization, summarize it, and create Obsidian vault notes (call note, transcript, person notes for participants). Works with video or audio files. Supports local transcription (mlx_whisper + pyannote) or ElevenLabs Scribe.

reysu · 115 stars · 2026/04/13

Occupation
Category: Documents

Skill content

Takes a call recording (video or audio), transcribes it with speaker labels, summarizes it, and writes structured notes into your Obsidian vault.

Requirements

Vault structure — the skill expects these folders inside your Obsidian vault. Folder names are defaults; override them in the Configuration block below if your vault uses different names.

| Folder | Purpose |
| --- | --- |
| `03 Meetings/` | Where call notes and transcripts land |
| `04 People/` | Person notes for participants |
| `02 Daily/YYYY/MM/` | Daily notes, named `MM-DD-YY ddd.md` |
| `_Templates/` | Note templates; the skill installs `new person template.md` here on first run |

CLI tools — install these before first use, or let Step 0 walk you through it:

Tool	Purpose	Install


VAULT_ROOT       = $VAULT_ROOT        # auto-detected if not set (see Step 0a)
MEETINGS_DIR     = 03 Meetings
PEOPLE_DIR       = 04 People
DAILY_DIR        = 02 Daily
TEMPLATES_DIR    = _Templates

vault=""
if [ -n "$VAULT_ROOT" ]; then
  vault="$VAULT_ROOT"
else
  dir="$PWD"
  while [ "$dir" != "/" ]; do
    if [ -d "$dir/.obsidian" ]; then vault="$dir"; break; fi
    dir="$(dirname "$dir")"
  done
fi
echo "Vault: ${vault:-NOT FOUND}"

# Fall back to the auto-detected vault when VAULT_ROOT is not set
root="${VAULT_ROOT:-$vault}"
for d in "$MEETINGS_DIR" "$PEOPLE_DIR" "$DAILY_DIR" "$TEMPLATES_DIR"; do
  [ -d "$root/$d" ] || echo "MISSING: $d"
done

command -v ffmpeg >/dev/null 2>&1 || echo "MISSING: ffmpeg"

command -v mlx_whisper >/dev/null 2>&1 || echo "MISSING: mlx_whisper"

PYANNOTE_ENV="${XDG_DATA_HOME:-$HOME/.local/share}/summarize-call/pyannote-env"
[ -d "$PYANNOTE_ENV" ] || echo "MISSING: pyannote venv"

mkdir -p "$(dirname "$PYANNOTE_ENV")"
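# Create the venv with uv, or with python3 -m venv if uv is not installed
if command -v uv >/dev/null 2>&1; then
  uv venv "$PYANNOTE_ENV" && source "$PYANNOTE_ENV/bin/activate"
  uv pip install pyannote.audio torch torchaudio
else
  python3 -m venv "$PYANNOTE_ENV" && source "$PYANNOTE_ENV/bin/activate"
  pip install pyannote.audio torch torchaudio
fi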

[ -n "$HF_TOKEN" ] || echo "MISSING: HF_TOKEN"

[ -n "$ELEVENLABS_API_KEY" ] || echo "MISSING: ELEVENLABS_API_KEY"

# -y (overwrite) goes before the output so it is not treated as a trailing option
ffmpeg -y -v quiet -i "<input>" -vn -acodec pcm_s16le -ar 16000 -ac 1 "/tmp/<name>.wav"

mlx_whisper --model mlx-community/whisper-large-v3-turbo --language en \
  --output-dir /tmp --output-format json \
  --condition-on-previous-text False /tmp/<name>.wav
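
The command above writes its JSON result into `/tmp`. A minimal sketch of reading that file back, assuming it follows the usual Whisper layout (a top-level `segments` list whose entries carry `start`, `end`, and `text`). `load_segments` is a hypothetical helper name, not part of the skill.

```python
import json
from pathlib import Path


def load_segments(json_path):
    """Return (start, end, text) tuples from a Whisper-style JSON transcript."""
    data = json.loads(Path(json_path).read_text(encoding="utf-8"))
    segments = []
    for seg in data.get("segments", []):
        text = seg.get("text", "").strip()
        if text:  # skip segments that contain only whitespace
            segments.append((float(seg["start"]), float(seg["end"]), text))
    return segments
```

Returning plain tuples keeps the later merge step independent of the transcription backend.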

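from pyannote.audio import Pipeline
import torch, os

# Load the pretrained diarization pipeline (needs a Hugging Face token)
pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1",
    use_auth_token=os.environ["HF_TOKEN"]  # pyannote.audio 4.x names this keyword `token`
)
device = "cuda" if torch.cuda.is_available() else ("mps" if torch.backends.mps.is_available() else "cpu")
pipeline.to(torch.device(device))  # GPU if available, CPU otherwise

output = pipeline("/tmp/<name>.wav", num_speakers=<N>)
# pyannote.audio 4.x wraps the result; 3.x returns the Annotation directly
annotation = getattr(output, "speaker_diarization", output)
segments = []
for turn, _, speaker in annotation.itertracks(yield_label=True):
    # save turn.start, turn.end, speaker
    segments.append({"start": turn.start, "end": turn.end, "speaker": speaker})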
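
The skill does not spell out how transcription segments and diarization turns are combined into one speaker-labeled transcript. One common approach, sketched here as an assumption, is to give each transcript segment the speaker whose turn overlaps it the longest. `assign_speakers` and the tuple layouts are hypothetical.

```python
def assign_speakers(segments, turns):
    """Label each transcript segment with the speaker whose turn overlaps it most.

    segments: list of (start, end, text) from transcription
    turns:    list of (start, end, speaker) from diarization
    """
    labeled = []
    for seg_start, seg_end, text in segments:
        best_speaker, best_overlap = "UNKNOWN", 0.0
        for turn_start, turn_end, speaker in turns:
            # Length of the intersection of the two time intervals (negative if disjoint)
            overlap = min(seg_end, turn_end) - max(seg_start, turn_start)
            if overlap > best_overlap:
                best_speaker, best_overlap = speaker, overlap
        labeled.append((seg_start, seg_end, best_speaker, text))
    return labeled
```

Segments that overlap no turn at all keep the `UNKNOWN` label, so they can be reviewed by hand rather than silently attributed.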

import requests, os

url = "https://api.elevenlabs.io/v1/speech-to-text"
headers = {"xi-api-key": os.environ["ELEVENLABS_API_KEY"]}

with open("/tmp/<name>.wav", "rb") as f:
    response = requests.post(
        url,
        headers=headers,
        files={"file": f},
        data={
            "model_id": "scribe_v1",
            "language_code": "<lang>",  # e.g. "eng", "jpn"
            "diarize": "true",
            "timestamps_granularity": "word",
            "num_speakers": <N>
        }
    )

response.raise_for_status()  # stop on HTTP errors instead of parsing an error body
result = response.json()
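
A minimal sketch of turning the Scribe response into speaker turns. It assumes `result` carries a `words` list whose entries have `text`, `start`, `end`, and `speaker_id`, and that whitespace arrives as separate entries. `words_to_turns` is a hypothetical helper name.

```python
def words_to_turns(result):
    """Group consecutive words by speaker into (start, end, speaker, text) turns."""
    turns = []
    for word in result.get("words", []):
        speaker = word.get("speaker_id") or "UNKNOWN"
        if turns and turns[-1]["speaker"] == speaker:
            # Same speaker keeps talking: extend the current turn
            turns[-1]["text"] += word["text"]
            if word.get("end") is not None:
                turns[-1]["end"] = word["end"]
        else:
            # Speaker changed: start a new turn
            turns.append({
                "start": word.get("start"),
                "end": word.get("end"),
                "speaker": speaker,
                "text": word["text"],
            })
    return [(t["start"], t["end"], t["speaker"], t["text"].strip()) for t in turns]
```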

Location: $MEETINGS_DIR/<MM-DD-YY Day Participant1 x Participant2> Transcript.md
Content: the merged, speaker-labeled transcript with timestamps

Frontmatter:

---
date: YYYY-MM-DD
duration: <seconds>
meeting: "[[<Call Note Title>]]"
unread: true
---
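
A minimal sketch of assembling the transcript note from the frontmatter fields above. The per-line layout (`**Speaker** [HH:MM:SS]: text`) and both function names are assumptions for illustration; the skill only requires speaker labels and timestamps.

```python
def format_timestamp(seconds):
    """Format a second offset as HH:MM:SS."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def render_transcript(date, duration, call_note_title, turns):
    """Build transcript note text: frontmatter followed by one line per turn.

    turns: list of (start, end, speaker, text)
    """
    lines = [
        "---",
        f"date: {date}",
        f"duration: {int(duration)}",
        f'meeting: "[[{call_note_title}]]"',
        "unread: true",
        "---",
        "",
    ]
    for start, _end, speaker, text in turns:
        # Hypothetical line layout; adjust to the vault's own convention
        lines.append(f"**{speaker}** [{format_timestamp(start)}]: {text}")
    return "\n".join(lines) + "\n"
```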

Location: $MEETINGS_DIR/<MM-DD-YY Day Participant1 x Participant2>.md

Frontmatter:

---
created: YYYY-MM-DDTHH:MM
updated: YYYY-MM-DDTHH:MM
tags: [call]
date: YYYY-MM-DD
start: HH:MM
end: HH:MM
duration: <seconds>
people: ["[[Participant 1]]", "[[Participant 2]]"]
summary: "1-line description of call topics"
transcript: "[[<Call Note Title> Transcript]]"
unread: true
---
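
A minimal sketch of deriving the call note title and the `people` field from a date and a participant list. It assumes `Day` in the filename pattern is the three-letter English weekday, matching the `ddd` used for daily notes. `call_note_title` and `people_field` are hypothetical helper names.

```python
from datetime import date

# Fixed list so the result does not depend on the system locale
WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def call_note_title(day, participants):
    """Build 'MM-DD-YY Day Participant1 x Participant2' for a call."""
    stamp = f"{day.month:02d}-{day.day:02d}-{day.year % 100:02d}"
    return f"{stamp} {WEEKDAYS[day.weekday()]} {' x '.join(participants)}"


def people_field(participants):
    """Render participants as a YAML inline list of quoted wikilinks."""
    return "[" + ", ".join(f'"[[{name}]]"' for name in participants) + "]"
```

The transcript note reuses the same title with ` Transcript` appended, so both files sort next to each other in the meetings folder.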

No # Title heading — filename is the title

Body structure:

> [!tldr]
> [2-3 sentence overview]

## Key Topics
- ...

## Decisions
- ...

## Action Items
- [ ] ...

## Notable Quotes
> [!quote] [[Participant 1]]
> "..."

## People Mentioned
- [[Person Name]] — brief context

summary frontmatter field is mandatory — never omit it
Wikilink everything — people, companies, products, concepts, places
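
Since the `summary` field must never be omitted, a small check before writing the note can catch the mistake early. This is a sketch under the assumption that frontmatter keys are simple top-level `key: value` lines; it is not a full YAML parser, and both function names are hypothetical.

```python
def frontmatter_fields(note_text):
    """Return top-level 'key: value' pairs from a leading '---' frontmatter block."""
    lines = note_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        # Indented lines and list items belong to nested values and are skipped
        if ":" in line and not line.startswith((" ", "\t", "-")):
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return {}  # no closing '---': treat as missing frontmatter


def has_summary(note_text):
    """True when the note has a non-empty summary field in its frontmatter."""
    value = frontmatter_fields(note_text).get("summary", "")
    return bool(value.strip("\"' "))
```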

Update $VAULT_ROOT/$DAILY_DIR/YYYY/MM/MM-DD-YY ddd.md (create YYYY/MM/ if missing)
No # Title heading — filename is the title
Set unread: true in frontmatter

Add under a ## calls/meetings section:

- [[<Call Note Title>]] — brief description
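
A minimal sketch of the daily note update: resolve the note path from the date, then insert the link line under the `## calls/meetings` section, creating the section when it is absent. It assumes `ddd` means the three-letter English weekday; `daily_note_path` and `add_under_section` are hypothetical helper names.

```python
import re
from datetime import date
from pathlib import Path

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
SECTION = "## calls/meetings"


def daily_note_path(vault_root, day, daily_dir="02 Daily"):
    """Return <vault>/<daily_dir>/YYYY/MM/MM-DD-YY ddd.md for the given date."""
    name = f"{day.month:02d}-{day.day:02d}-{day.year % 100:02d} {WEEKDAYS[day.weekday()]}.md"
    return Path(vault_root) / daily_dir / f"{day.year:04d}" / f"{day.month:02d}" / name


def add_under_section(note_text, entry, section=SECTION):
    """Insert entry as the last item of the section, adding the section if missing."""
    lines = note_text.splitlines()
    if section in lines:
        idx = lines.index(section) + 1
        # Advance to the next heading (or end of file)
        while idx < len(lines) and not re.match(r"#{1,6} ", lines[idx]):
            idx += 1
        # Step back over blank lines so the entry joins the existing list
        while idx > 0 and lines[idx - 1].strip() == "":
            idx -= 1
        lines.insert(idx, entry)
    else:
        if lines and lines[-1].strip():
            lines.append("")
        lines += [section, entry]
    return "\n".join(lines) + "\n"
```

Before writing, `daily_note_path(...).parent.mkdir(parents=True, exist_ok=True)` covers the case where the `YYYY/MM/` folder does not exist yet.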

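# Extract unique wikilink targets (alias and heading parts dropped), then report
# every target that has no matching note; read line by line so names with spaces survive
grep -oE '\[\[[^]|#^]+' "<call_note_path>" | sed 's/\[\[//' | sort -u |
while IFS= read -r term; do
  found=$(find "$VAULT_ROOT" -name "$term.md" -not -path "*/.Trash/*" 2>/dev/null | head -1)
  [ -z "$found" ] && echo "MISSING: $term"
done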
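
The same check can be sketched in Python, which makes the pattern easier to read: capture everything after `[[` up to the first `]`, `|`, `#`, or `^`, so aliases and heading or block references are dropped. The function names are hypothetical, and matching notes by file stem is an assumption that mirrors the `find -name` call above.

```python
import re
from pathlib import Path

# Text after '[[' up to the first ']', '|', '#', or '^'
WIKILINK = re.compile(r"\[\[([^\]|#^]+)")


def wikilink_targets(note_text):
    """Return the sorted, de-duplicated note names linked from the text."""
    return sorted({m.strip() for m in WIKILINK.findall(note_text)})


def missing_notes(note_text, vault_root):
    """Return linked note names that have no .md file in the vault (ignoring .Trash)."""
    root = Path(vault_root)
    existing = {p.stem for p in root.rglob("*.md") if ".Trash" not in p.parts}
    return [t for t in wikilink_targets(note_text) if t not in existing]
```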

Summarize Call

Requirements

Configuration

Trigger

Inputs

Step 0: Bootstrap check (first run)

0a. Resolve the vault root

0b. Check required folders

0c. Check required CLI tools

0d. Install the person template if missing

Step 0.5: Determine depth mode

Step 1: Choose transcription method

Option A: Local — walkthrough if not set up

Option B: ElevenLabs Scribe — walkthrough if not set up

Step 2: Extract audio

Step 3: Transcribe + diarize

Option A: Local (mlx_whisper + pyannote)

Option B: ElevenLabs Scribe

Step 4: Summarize

Step 5: Create vault notes

Transcript file

Call note

Person notes

Daily note

Step 6: Handle mid-call name-drops

Detailed mode

Minimal mode

Key rules
