Obsidian Reading Pipeline

Extract content from URLs and create properly linked vault notes and review notes.

Extraction Script

# Single URL extraction
uv run python .claude/skills/obsidian-read/scripts/extract_url.py <url> --json

# Parse URLs from raw inbox
uv run python .claude/skills/obsidian-read/scripts/extract_url.py --from-file "staging/To Read Later.md" --json

# Verbose output for debugging
uv run python .claude/skills/obsidian-read/scripts/extract_url.py <url> --verbose

Fallback chain: Jina Reader (primary, ~0.3s) → Wayback Machine (archived snapshots).
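The fallback chain above can be sketched as an ordered list of extractors tried in turn. This is only an illustration, not the actual logic of `extract_url.py`: the function names `extract_with_fallback`, `jina_reader_url`, and `wayback_lookup_url` are hypothetical, and the extractors are passed in as plain callables so the chain can be exercised without network access.

```python
from typing import Callable, Optional
from urllib.parse import quote

# Hypothetical helper: Jina Reader is addressed by prefixing the target URL.
def jina_reader_url(url: str) -> str:
    return "https://r.jina.ai/" + url

# Hypothetical helper: Wayback Machine availability lookup for a target URL.
def wayback_lookup_url(url: str) -> str:
    return "https://archive.org/wayback/available?url=" + quote(url, safe="")

Extractor = Callable[[str], Optional[str]]

def extract_with_fallback(url: str, extractors: list[tuple[str, Extractor]]) -> dict:
    """Try each (name, extractor) pair in order and return the first non-empty result."""
    errors = []
    for name, extractor in extractors:
        try:
            content = extractor(url)
        except Exception as exc:  # any failure moves on to the next extractor
            errors.append(f"{name}: {exc}")
            continue
        if content and content.strip():
            return {"url": url, "method": name, "content": content, "errors": errors}
        errors.append(f"{name}: empty content")
    # Every extractor failed; the caller decides what to suggest next.
    return {"url": url, "method": None, "content": None, "errors": errors}
```

Keeping the chain as data means another extractor could be appended without touching the loop, and the `errors` list records why each earlier step was skipped.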

Optional local alternative: defuddle-cli, by kepano (Obsidian's CEO). Install and run it with npm install -g defuddle-cli && defuddle <url>. It makes no API calls, so it works well for paywalled content you can access locally.

Modes

Single URL Mode ("read this URL")

1. Extract content via `extract_url.py <url> --json`.
2. Present a summary: title, word count, extraction method, content size category, and content preview.
3. Check `staging/` for existing notes that contain the URL or match keywords from the title, and offer to use that content instead of creating the note from scratch.
4. Ask the user:
   - Parent MOC: suggest one based on content (e.g., Machine Learning, Generative AI Resources, Software Engineering)
   - Tags: suggest 1-3 domain tags from the canonical tag list in CLAUDE.md
5. Create the full text note (always created by default):
```
---
type: literature
title: "Article Title"
source: "Author, Title (Year)"
url: "https://..."
created: YYYY-MM-DD
up: "[[Article Title Review Note]]"
tags:
  - literature
  - full-text
  - domain-tag
---
```
The note is named `Article Title Full Text.md` and contains the verbatim extracted markdown. Its `up:` property points to the review note (NOT the MOC) to avoid cluttering MOC listings, and it is tagged `full-text` for easy Dataview filtering.
6. Create the review note at the vault root:
```
---
type: review
title: "Article Title"
source: "Author, Title (Year)"
url: "https://..."
full-text: "[[Article Title Full Text]]"
created: YYYY-MM-DD
up: "[[Domain MOC]]"
related:
  - "[[Related Note]]"
tags:
  - review
  - domain-tag
---
```
The full-text: property links to the literature note with the complete article text.
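The two frontmatter templates can be generated from the same inputs. The sketch below assumes the note naming shown in the templates (`<title> Full Text` and `<title> Review Note`); the function names are hypothetical, and titles containing double quotes would need escaping that is not handled here.

```python
from datetime import date

def _frontmatter(lines):
    """Wrap property lines in YAML frontmatter delimiters."""
    return "\n".join(["---", *lines, "---"]) + "\n"

def full_text_frontmatter(title, source, url, domain_tags, created=None):
    """Frontmatter for the full text note; `up` points at the review note."""
    created = created or date.today().isoformat()
    tags = ["literature", "full-text", *domain_tags]
    return _frontmatter([
        "type: literature",
        f'title: "{title}"',
        f'source: "{source}"',
        f'url: "{url}"',
        f"created: {created}",
        f'up: "[[{title} Review Note]]"',
        "tags:",
        *[f"  - {tag}" for tag in tags],
    ])

def review_frontmatter(title, source, url, moc, domain_tags, related=(), created=None):
    """Frontmatter for the review note; `up` points at the domain MOC."""
    created = created or date.today().isoformat()
    lines = [
        "type: review",
        f'title: "{title}"',
        f'source: "{source}"',
        f'url: "{url}"',
        f'full-text: "[[{title} Full Text]]"',
        f"created: {created}",
        f'up: "[[{moc}]]"',
    ]
    if related:
        lines.append("related:")
        lines += [f'  - "[[{note}]]"' for note in related]
    lines.append("tags:")
    lines += [f"  - {tag}" for tag in ["review", *domain_tags]]
    return _frontmatter(lines)
```

Generating both blocks from one title keeps the `up:` and `full-text:` links pointing at each other consistently.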

Review notes include three sections: Summary, Key Takeaways, and Personal Reflection.

Important: `up:` points to the domain MOC (e.g., Machine Learning, Generative AI Resources), NOT a generic "Reviews MOC" or "Sources MOC". See the obsidian-review skill.

Important: Only the review note gets added to the MOC. The full text note stays out of MOC listings; it is discoverable via the `full-text:` property and the `full-text` tag.
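The check of `staging/` for existing notes, described in the steps above, might look like the following. It is a rough sketch under stated assumptions: the matching rule (exact URL substring, or at least two title keywords of four or more letters) and the names `title_keywords` and `find_existing_notes` are hypothetical, not taken from the skill itself.

```python
import re
from pathlib import Path

# Assumed stopword list; tune to taste.
STOPWORDS = {"with", "from", "that", "this", "what", "your", "into", "about"}

def title_keywords(title, min_length=4):
    """Lowercased title words long enough to be meaningful."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return {w for w in words if len(w) >= min_length and w not in STOPWORDS}

def find_existing_notes(staging_dir, url, title, min_keyword_hits=2):
    """Return (filename, reason) pairs for staging notes that may already cover the article."""
    keywords = title_keywords(title)
    matches = []
    for path in sorted(Path(staging_dir).glob("*.md")):
        text = path.read_text(encoding="utf-8", errors="replace")
        if url in text:
            matches.append((path.name, "url"))
            continue
        # Keyword match considers both the file name and the body.
        haystack = (path.stem + "\n" + text).lower()
        hits = sum(1 for kw in keywords if kw in haystack)
        if hits >= min_keyword_hits:
            matches.append((path.name, "keywords"))
    return matches
```

Any matches are only candidates to offer the user; the decision to reuse existing content stays with them.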

Two-File Reading List

| File | Role |
|------|------|
| `staging/To Read Later.md` | Raw URL inbox. Clip URLs here from phone, browser, etc. This is the processing queue. |
| `To Read Later.md` | Curated reading list. Checkbox entries with 1-line context + link to extracted vault note. Unread / Read sections. |
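Moving an item from the raw inbox to the curated list involves two small steps: pulling URLs out of free-form markdown, and writing a checkbox entry. The sketch below is an assumption about how that could be done, not the behaviour of `--from-file`; `parse_inbox_urls` and `reading_list_entry` are hypothetical names, and the exact entry format is a guess at "checkbox + context + link".

```python
import re

# Stops at whitespace and common markdown delimiters; URLs containing
# parentheses will be truncated by this simple pattern.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")

def parse_inbox_urls(text):
    """Return unique URLs in order of first appearance."""
    seen = set()
    urls = []
    for match in URL_PATTERN.findall(text):
        url = match.rstrip(".,;:!?")  # drop trailing sentence punctuation
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls

def reading_list_entry(note_name, context, read=False):
    """Format one curated entry: checkbox, wikilink to the vault note, 1-line context."""
    box = "x" if read else " "
    return f"- [{box}] [[{note_name}]]: {context}"
```

Deduplicating at parse time keeps a URL clipped twice from being extracted twice.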

Error Handling

| Error | Suggestion |
|-------|------------|
| Jina Reader HTTP 402 | Rate limited: wait 30s and retry |
| Jina Reader HTTP 403 | Site blocks Jina: Wayback fallback will be tried automatically |
| No Wayback snapshot | Article too new or not indexed: try `defuddle-cli` locally, or paste content manually |
| Timeout | Increase with `--timeout 60` |
| Empty content | Site may use heavy JS rendering: try `defuddle-cli` or paste manually |
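The "wait 30s and retry" advice for HTTP 402 could be automated with a small wrapper. This is a sketch only: `fetch_with_retry` is a hypothetical helper, the `fetch` callable is assumed to return a `(status, body)` pair, and `sleep` is injectable so the wait can be skipped in tests.

```python
import time

def fetch_with_retry(fetch, url, retries=1, wait_seconds=30.0, sleep=time.sleep):
    """Call fetch(url); on HTTP 402, wait and retry up to `retries` times."""
    status, body = fetch(url)
    attempts = 0
    while status == 402 and attempts < retries:
        sleep(wait_seconds)  # back off before asking again
        attempts += 1
        status, body = fetch(url)
    # Any other status (including a persistent 402) is returned to the caller.
    return status, body
```

Other statuses are passed through untouched, so a 403 can still fall through to the Wayback step.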
