Read and extract content from a specific paper or book in the user's Zotero library. Use this skill when the user wants to know what a specific item says about a topic, needs content from particular sections/pages, or asks questions that require reading the actual document. Triggers on: "what does [paper] say about X", "read chapter 3", "find the section on X in [book]", "extract the equation for X", "summarize [paper]", or any request that requires extracting content from a known Zotero item. This skill assumes the item is already identified — if the user hasn't specified which paper/book, use zotero-find first.
Extract and present content from a specific item in the user's Zotero library, with precise page references. Downloads and caches PDFs on demand via the Zotero Web API, then uses pypdf for page-range extraction and outline discovery.
Same as zotero-find: credentials via env vars or ~/.config/zotero-assistant/env (see zotero-find SKILL.md for details), plus uv. The script at scripts/zotero_read.py declares its dependencies inline (httpx, pypdf) and bootstraps them via uv run. First run downloads deps (~5s); subsequent runs are instant.
PDFs are cached at /tmp/zotero-cache/<attachment_key>.pdf. Override with ZOTERO_CACHE_DIR=/your/path. To force a re-download, delete the cached file.
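As a rough illustration of the cache layout described above, the sketch below resolves where a given attachment's PDF would live and removes it to force a re-download. The function names `cached_pdf_path` and `forget_cached_pdf` are hypothetical helpers, not part of zotero_read.py; only the default directory, the `ZOTERO_CACHE_DIR` override, and the `<attachment_key>.pdf` naming come from this document.

```python
import os
from pathlib import Path

DEFAULT_CACHE_DIR = "/tmp/zotero-cache"  # default location stated in this document


def cached_pdf_path(attachment_key: str) -> Path:
    """Return the path where this attachment's PDF would be cached."""
    # ZOTERO_CACHE_DIR overrides the default when it is set and non-empty.
    cache_dir = os.environ.get("ZOTERO_CACHE_DIR") or DEFAULT_CACHE_DIR
    return Path(cache_dir) / f"{attachment_key}.pdf"


def forget_cached_pdf(attachment_key: str) -> bool:
    """Delete the cached file, if any, so the next read downloads it again."""
    path = cached_pdf_path(attachment_key)
    if path.exists():
        path.unlink()
        return True
    return False
```

For example, `forget_cached_pdf("4DQ2NUMB")` would return `True` only if a cached copy existed and was removed.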
The script lives at scripts/zotero_read.py in the same directory as this SKILL.md. When installed via npx skills add, this resolves to ~/.agents/skills/zotero-read/scripts/zotero_read.py. In all examples below, zotero_read.py stands for that absolute path.
The caller should supply a Zotero item key (8 uppercase alphanumeric characters, e.g. DFXGQV8D). If the user hasn't named the item precisely, use zotero-find first to resolve it.
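A quick shape check can catch a title or citation being passed where a key is expected. The sketch below assumes keys are exactly eight characters drawn from uppercase letters and digits, which matches the example DFXGQV8D; `looks_like_item_key` is a hypothetical helper, and passing the check does not prove the item exists in the library.

```python
import re

# Assumption: 8 characters, uppercase letters or digits, as in DFXGQV8D.
ITEM_KEY_RE = re.compile(r"[A-Z0-9]{8}")


def looks_like_item_key(value: str) -> bool:
    """True if the string has the shape of a Zotero item key."""
    return ITEM_KEY_RE.fullmatch(value) is not None
```

Anything that fails this check is a candidate for zotero-find rather than zotero-read.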
Before extracting content, grab abstract + outline + annotations in parallel. These tell you where to look and surface the user's highlighted passages (which are often the fastest path to the answer):
uv run zotero_read.py abstract KEY # title, creators, year, abstract, tags
uv run zotero_read.py outline KEY # section headings with page numbers
uv run zotero_read.py annotations KEY # user highlights and notes
Run these as three parallel tool calls in one turn.
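Outside an agent loop, the same fan-out could be scripted. The sketch below runs the three triage subcommands concurrently with a thread pool. It assumes every subcommand prints JSON on stdout (the document only confirms this for outline and pages), and `run_subcommand` and `triage` are hypothetical names.

```python
import json
import subprocess
from concurrent.futures import ThreadPoolExecutor

SCRIPT = "zotero_read.py"  # stands for the absolute path to the script
TRIAGE = ("abstract", "outline", "annotations")


def run_subcommand(subcommand: str, key: str):
    """Run one subcommand and parse its stdout as JSON (assumed format)."""
    proc = subprocess.run(
        ["uv", "run", SCRIPT, subcommand, key],
        capture_output=True,
        text=True,
        check=True,  # raises CalledProcessError on a non-zero exit code
    )
    return json.loads(proc.stdout)


def triage(key: str, runner=run_subcommand) -> dict:
    """Fetch abstract, outline and annotations concurrently."""
    with ThreadPoolExecutor(max_workers=len(TRIAGE)) as pool:
        futures = {name: pool.submit(runner, name, key) for name in TRIAGE}
        return {name: future.result() for name, future in futures.items()}
```

The `runner` parameter exists only so the fan-out can be exercised without network access or credentials.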
The outline output comes from the PDF's embedded outline when present (common for arXiv preprints, LaTeX/elsarticle papers, books with real TOCs) or from a heuristic scan of the first 20 pages (for older scans, thesis PDFs, or papers without embedded metadata). The JSON includes source: "embedded" | "heuristic" so you know which you got.
Why annotations first: if the user has highlighted relevant passages, those are the fastest path to the answer — no page scanning needed.
If the user asks about a specific term/concept and the outline doesn't make it obvious which pages to read, use search to locate it:
uv run zotero_read.py search KEY "vertex-patch"
Returns all pages containing the term (case-insensitive) with surrounding-line context snippets. Use the page numbers from the results to target your pages extraction in Step 3.
This bridges the gap between fulltext (fast but no page numbers) and pages (precise but requires knowing which pages). Especially useful for large documents where scanning the outline isn't enough.
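To make the behaviour concrete, here is a minimal sketch of a case-insensitive page search with surrounding-line context. It is not the script's actual implementation: `search_pages`, its `context` parameter, and the `{"page", "snippet"}` hit shape are assumptions chosen for illustration, and the input is a plain mapping from page number to page text.

```python
def search_pages(pages: dict, term: str, context: int = 1) -> list:
    """Return one hit per matching line, with `context` lines on each side."""
    needle = term.casefold()
    hits = []
    for page_no in sorted(pages):
        lines = pages[page_no].splitlines()
        for i, line in enumerate(lines):
            if needle in line.casefold():
                # Clamp the context window to the bounds of the page.
                lo = max(0, i - context)
                hi = min(len(lines), i + context + 1)
                hits.append({"page": page_no, "snippet": "\n".join(lines[lo:hi])})
    return hits
```

The page numbers in the hits are what would feed into a later pages call.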
For short papers, use the fulltext subcommand: it returns Zotero's pre-indexed text without downloading the PDF, and needs no page range:
uv run zotero_read.py fulltext KEY
Output includes content (the full text) plus indexed_pages and total_pages. Fast and zero-cost.
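One use of indexed_pages and total_pages is to check whether the indexed text covers the whole document before relying on it. The sketch below assumes both fields are integer page counts and that indexed_pages below total_pages means the text is partial; the helper names are hypothetical.

```python
def index_coverage(result: dict) -> float:
    """Fraction of the document's pages that the indexed text covers."""
    total = result.get("total_pages") or 0
    indexed = result.get("indexed_pages") or 0
    return indexed / total if total else 0.0


def fulltext_is_complete(result: dict) -> bool:
    """True when there is content and every page was indexed."""
    return bool(result.get("content")) and index_coverage(result) >= 1.0
```

When this returns `False`, falling back to pages for the missing part of the document is one reasonable option.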
Phase 1: targeted extraction. Scan the full outline (from Step 2) for sections relevant to the user's question. A topic may appear in multiple chapters. Pick the top 3–4 most relevant sections (~30 pages total), and extract them in parallel:
uv run zotero_read.py pages KEY 14-18 # Section 4.2
uv run zotero_read.py pages KEY 19-21 # Section 4.3
uv run zotero_read.py pages KEY 5-7 # Section 2.1 for background
The first call downloads the PDF and caches it (~3s); subsequent calls read from the cache (~1s).
Present Phase 1 results first. If the user wants more depth, extract additional ranges in Phase 2.
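The Phase 1 selection can be thought of as a greedy pick under a page budget. The sketch below assumes the outline has already been reduced to `(title, first_page, last_page)` tuples ordered from most to least relevant; that tuple shape, the function name `pick_sections`, and treating "~30 pages" as a hard cap are all illustrative assumptions, and the relevance ranking itself is left to the caller.

```python
def pick_sections(ranked: list, max_sections: int = 4, page_budget: int = 30) -> list:
    """Keep the most relevant sections that fit within the page budget.

    `ranked` holds (title, first_page, last_page) tuples, most relevant first.
    Returns (title, range_string) pairs usable as `pages KEY RANGE` arguments.
    """
    chosen, used = [], 0
    for title, first, last in ranked:
        if len(chosen) == max_sections:
            break
        length = last - first + 1
        if used + length > page_budget:
            continue  # too long to fit; try the next candidate
        page_range = f"{first}-{last}" if last > first else str(first)
        chosen.append((title, page_range))
        used += length
    return chosen
```

Sections skipped for exceeding the budget are natural candidates for Phase 2.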
If the user says "tell me more" or "what about section X", pull additional page ranges using the same pages subcommand. The outline and Phase 1 results inform where to look next.
pages KEY RANGE returns JSON:
{
  "item": "DFXGQV8D",
  "attachment": "4DQ2NUMB",
  "citation": "Wichrowski (2025)",
  "total_pages": 21,
  "requested_range": "1-2",
  "extracted_range": "1-2",
  "text": "\n--- Page 1 ---\n<content>\n\n--- Page 2 ---\n<content>\n"
}
The text field has --- Page N --- markers between pages so you can attribute quotes to specific pages. Use the citation field for short inline references.
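Splitting the text field on its page markers makes page attribution mechanical. The sketch below relies only on the `--- Page N ---` marker format shown above; `split_pages` and `page_of` are hypothetical helpers, and a page body that happens to contain a line in the marker format would confuse the split.

```python
import re

# Matches a marker that occupies a whole line, e.g. "--- Page 12 ---".
PAGE_MARKER = re.compile(r"^--- Page (\d+) ---$", re.MULTILINE)


def split_pages(text: str) -> dict:
    """Map page number to page text using the '--- Page N ---' markers."""
    # With one capturing group, split() yields
    # [preamble, "1", body1, "2", body2, ...].
    parts = PAGE_MARKER.split(text)
    return {int(n): body.strip() for n, body in zip(parts[1::2], parts[2::2])}


def page_of(quote: str, pages: dict):
    """Return the first page whose text contains the quote, or None."""
    for page_no, body in pages.items():
        if quote in body:
            return page_no
    return None
```

A quote that `page_of` cannot locate should not be given a page number.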
The user must always be able to distinguish between what the document says and your interpretation. Mixing the two risks presenting hallucinated content as if it came from the source.
1. Direct quote — blockquote with exact page reference:
"The exponential filter is applied in the modal space, damping the highest-order modes..." — Hesthaven & Warburton (2008), Section 6.1, p. 134
2. Your summary — clearly labeled:
Summary: The authors use modal filtering to stabilize aliasing-driven instabilities. The filter transfer function (Eq. 6.3, p. 134) controls which modes are damped.
Take page numbers only from the output of pages KEY RANGE. Text from fulltext has no page attribution, so don't invent page numbers for it.
- (Author Year, Section X.Y, p. N) for single pages
- (Author Year, Section X.Y, pp. N–M) for ranges
uv run zotero_read.py bibtex KEY
Returns the item's BibTeX entry directly from Zotero (includes the citekey Zotero has assigned).
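Alongside the BibTeX export, short inline references in the (Author Year, Section X.Y, p. N) and pp. N–M forms can be assembled from the citation field of a pages result. The sketch below is one possible formatter; `cite` is a hypothetical helper, and it assumes the citation field looks like "Wichrowski (2025)", as in the sample JSON.

```python
def cite(citation: str, section: str, first_page: int, last_page=None) -> str:
    """Format '(Author Year, Section X.Y, p. N)' or the 'pp. N-M' range form."""
    # "Wichrowski (2025)" becomes "Wichrowski 2025".
    author_year = citation.replace("(", "").replace(")", "")
    if last_page is None or last_page == first_page:
        pages = f"p. {first_page}"
    else:
        pages = f"pp. {first_page}\u2013{last_page}"  # en dash between pages
    return f"({author_year}, Section {section}, {pages})"
```

The section label and page numbers still have to come from the outline and the page markers; the helper only handles formatting.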
| Subcommand | Purpose | Downloads PDF? |
|---|---|---|
| metadata KEY | Full item metadata as JSON | No |
| abstract KEY | Title, creators, year, abstract, tags | No |
| children KEY | List attachments, notes, annotations | No |
| bibtex KEY | BibTeX export | No |
| fulltext KEY | Zotero's indexed text (no page numbers) | No |
| outline KEY [--scan-pages N] | Section headings with page numbers (heuristic scans N pages, default 20) | Yes (first use) |
| pages KEY RANGE | Extract page range, e.g. 10-15 or 7 | Yes (first use) |
| search KEY TERM | Find all pages containing TERM, with context snippets | Yes (first use) |
| annotations KEY | User highlights on the PDF | No |
| notes KEY | Standalone notes attached to the item | No |
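The RANGE argument accepts either a span such as 10-15 or a single page such as 7. As a sketch of how such a value might be validated before calling the script, the hypothetical `parse_page_range` below returns a 1-based inclusive pair. It rejects out-of-bounds ranges outright, which is an assumption: the separate requested_range and extracted_range fields in the pages output suggest the real script may clamp instead.

```python
def parse_page_range(spec: str, total_pages: int) -> tuple:
    """Turn '10-15' or '7' into a 1-based inclusive (first, last) pair."""
    first_s, sep, last_s = spec.strip().partition("-")
    try:
        first = int(first_s)
        last = int(last_s) if sep else first
    except ValueError:
        raise ValueError(f"invalid page range: {spec!r}") from None
    # Assumption: anything outside 1..total_pages is treated as an error.
    if not 1 <= first <= last <= total_pages:
        raise ValueError(f"page range {spec!r} is outside 1-{total_pages}")
    return first, last
```

Here `total_pages` would come from an earlier pages or fulltext result for the same item.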
- Prefer fulltext over many pages calls for short papers: one HTTP call instead of downloading the PDF.
- PDFs are cached in /tmp/zotero-cache/. Repeated reads of the same paper are instant.
- If the item has no stored PDF attachment, the pages/outline/annotations/fulltext subcommands will exit with code 3 and a clear message. Fall back to abstract or metadata and inform the user.
Exit codes:
- 0 success
- 1 HTTP or PDF parse error
- 2 missing env vars, invalid arguments, or invalid page range
- 3 no stored PDF attachment under the item
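A caller that wraps the script could map these exit codes to a next action. The sketch below is a hedged illustration: the code meanings are taken from this document, while `next_step` and the wording of its suggestions for codes 1 and 2 are assumptions.

```python
EXIT_MEANINGS = {
    0: "success",
    1: "HTTP or PDF parse error",
    2: "missing env vars, invalid arguments, or invalid page range",
    3: "no stored PDF attachment under the item",
}


def next_step(exit_code: int) -> str:
    """Suggest what to do after a zotero_read.py call finishes."""
    if exit_code == 0:
        return "use the output"
    if exit_code == 3:
        # Documented fallback when the item has no stored PDF.
        return "fall back to abstract or metadata and inform the user"
    meaning = EXIT_MEANINGS.get(exit_code)
    if meaning is None:
        return f"unexpected exit code {exit_code}"
    return f"report to the user: {meaning}"
```

With `subprocess.run`, the value to pass in is the `returncode` attribute of the completed process.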