Convert files and URLs to Markdown using Microsoft's markitdown library via uvx (zero-install). This skill should be used when the user provides a file (PDF, DOCX, PPTX, XLSX, HTML, CSV, JSON, XML, EPUB, Jupyter notebook, audio, ZIP) or a URL (including YouTube, Wikipedia, RSS feeds) and wants its content converted to Markdown for context ingestion. Also triggers when the user explicitly asks to convert a file to Markdown, read a non-text document, extract content from a URL, batch convert documents, or do document analysis.
Convert files and URLs to Markdown using Microsoft's markitdown via uvx (zero-install). Preserves document structure (headings, lists, tables, links) for optimal LLM context ingestion.
Trigger this setup on first install when the user runs /markitdown setup or says "set up markitdown" / "configure markitdown". The goal: register a preference in the user's global Claude config so Claude auto-prefers this skill whenever a file or URL needs to be read.
Determine the target config file. Default to ~/.claude/CLAUDE.md (global user config). If the user explicitly asks for project-level (--project or "project-level"), target ./CLAUDE.md in the current working directory.
Check for existing registration. Read the target file. If it already contains a ## File & URL Reading heading, stop and tell the user: "markitdown is already registered in <path>. No changes made."
Append the block (idempotent). If the file does not exist, create it. Append exactly:
## File & URL Reading
- When the user provides a file path or URL to read, invoke the `markitdown` skill (via Skill tool) first
- Supported by markitdown: PDF, DOCX, PPTX, XLSX/XLS, HTML, EPUB, CSV, JSON, XML, ZIP, audio (WAV/MP3), YouTube URLs, general web URLs
- Use Read tool directly instead for:
- Plain text: `.txt`, `.md`
- Source code: `.ts`, `.js`, `.py`, `.go`, etc.
- Images: `.jpg`, `.png`, `.gif`, `.webp`, etc. — Claude reads natively (multimodal); markitdown does support OCR but Read is preferred
Report to user: Added "File & URL Reading" section to <path>. Claude will now auto-prefer markitdown for files and URLs.
Skip if the user is asking for a single conversion — setup is a one-off registration, not something to run every invocation.
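The check-then-append steps above can be sketched as a small shell helper. This is a minimal sketch, not the skill's actual implementation: the function name `register_block` is hypothetical, and it assumes the block to append arrives on stdin and that a whole-line match on the heading is enough to detect an earlier registration.

```shell
#!/bin/sh
# Hypothetical helper (not part of markitdown): append a block to a config
# file only when its heading is not already present.
# Usage: register_block <target-file> <heading> < block.md
register_block() {
    target=$1
    heading=$2

    # Create the parent directory and the file if they do not exist yet.
    mkdir -p "$(dirname "$target")"
    [ -f "$target" ] || : > "$target"

    # -x matches the whole line, -F treats the heading as a literal string.
    if grep -qxF -- "$heading" "$target"; then
        echo "markitdown is already registered in $target. No changes made."
        return 0
    fi

    # Add a blank separator line when the file already has content,
    # then append the block read from stdin.
    { [ -s "$target" ] && echo; cat; } >> "$target"
    printf 'Appended block with heading "%s" to %s.\n' "$heading" "$target"
}
```

For the global config this might be called as `register_block ~/.claude/CLAUDE.md "## File & URL Reading" < block.md`, where `block.md` is an assumed file holding the exact block to append. Running it a second time leaves the file unchanged.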
Trigger this skill when:
Do NOT use for:
| File Type | Use Case | Command |
|---|---|---|
| PDF | Reports, papers | markitdown report.pdf |
| DOCX | Word documents | markitdown document.docx |
| PPTX | Presentations | markitdown slides.pptx |
| XLSX/XLS | Spreadsheets, data tables | markitdown data.xlsx |
| HTML | Web pages | markitdown page.html |
| URL | Live web content | markitdown "https://example.com" |
| YouTube | Video transcripts | markitdown "https://youtube.com/watch?v=..." |
| Wikipedia | Wiki articles | markitdown "https://en.wikipedia.org/wiki/..." |
| RSS/Atom | Feed content | markitdown "https://example.com/feed.xml" |
| .ipynb | Jupyter notebooks | markitdown notebook.ipynb |
| CSV/JSON/XML | Structured data | markitdown data.csv |
| ZIP | Archive contents (iterates) | markitdown archive.zip |
| Audio | EXIF metadata and speech transcription | markitdown recording.wav |
| EPUB | E-books | markitdown book.epub |
| MSG | Outlook emails | markitdown email.msg |
All commands above are shorthand for:
uvx --from 'markitdown[all]' markitdown "<source>"
Options:
- `-o <output.md>`: write to file instead of stdout
- `-p` / `--use-plugins`: enable 3rd-party plugins
- `-x <ext>`: hint file extension (useful when reading from stdin)
- `-d`: use Azure Document Intelligence (requires `-e <endpoint>`)
- `--keep-data-uris`: keep base64-encoded images in output (truncated by default)

uvx --from 'markitdown[all]' markitdown "<source>"
- Write to a file with `-o /tmp/markitdown_output.md`, then read relevant sections as needed
- If the user specifies an output path, use `-o` with the user's specified path

After conversion, use the Markdown content to answer the user's questions or proceed with their task.
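One way to "read relevant sections as needed" from a saved conversion is to list its headings first and then print only a line range. The helpers below are a rough sketch with hypothetical names (`md_outline`, `md_lines`). They assume ATX-style `#` headings, and they will also match `#` comment lines inside code blocks.

```shell
#!/bin/sh
# Hypothetical helpers for skimming a converted Markdown file.

# Print every line that starts with '#', prefixed by its line number.
md_outline() {
    grep -n '^#' "$1"
}

# Print lines <start> through <end> of the file.
# Usage: md_lines <file> <start> <end>
md_lines() {
    sed -n "${2},${3}p" "$1"
}
```

After a conversion with `-o /tmp/markitdown_output.md`, `md_outline /tmp/markitdown_output.md` shows where each section begins, and `md_lines /tmp/markitdown_output.md 40 80` prints one chosen slice (the line numbers here are placeholders).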
To convert multiple files in a directory:
for f in /path/to/docs/*.pdf; do
uvx --from 'markitdown[all]' markitdown "$f" -o "${f%.pdf}.md"
done
For parallel batch conversion with multiple file types:
find /path/to/docs -type f \( -name "*.pdf" -o -name "*.docx" -o -name "*.pptx" \) -print0 | \
xargs -0 -P 4 -I {} sh -c 'uvx --from "markitdown[all]" markitdown "$1" -o "${1%.*}.md"' _ {}
| Error | Resolution |
|---|---|
| uvx not found | Inform the user to install uv: curl -LsSf https://astral.sh/uv/install.sh \| sh |
| Conversion fails on a URL | Verify the URL is accessible; try fetching with curl first |
| Empty output | The file may be image-only; inform the user that text extraction was not possible |
| Stdin input | Pipe content with extension hint: cat file \| uvx --from 'markitdown[all]' markitdown -x .html |
| Import/dependency error | Ensure Python >= 3.10 is available; uvx handles the rest |
| Partial format support | Try selective extras: uvx --from 'markitdown[pdf,docx]' markitdown file |
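Two rows of the table above, "uvx not found" and "Empty output", can be checked mechanically before and after a conversion. The following is a minimal sketch with hypothetical function names (`require_uvx`, `check_output`). The messages paraphrase the table and are not output produced by markitdown itself.

```shell
#!/bin/sh
# Hypothetical pre- and post-conversion checks.

# Fail with the install hint when uvx is not available.
require_uvx() {
    if ! command -v uvx >/dev/null 2>&1; then
        echo "uvx not found. Install uv: curl -LsSf https://astral.sh/uv/install.sh | sh" >&2
        return 1
    fi
}

# Fail when the converted file is missing or empty.
# Usage: check_output <output.md>
check_output() {
    if [ ! -s "$1" ]; then
        echo "Empty output: the source may be image-only; text extraction was not possible." >&2
        return 1
    fi
}
```

A conversion could then be wrapped as `require_uvx && uvx --from 'markitdown[all]' markitdown report.pdf -o out.md && check_output out.md`.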
For integration with Claude Desktop or other MCP-compatible clients, markitdown provides a dedicated MCP server:
pip install markitdown-mcp
Run markitdown in an isolated container without any local installation:
docker run --rm -i ghcr.io/microsoft/markitdown:latest < document.pdf > output.md
To reduce download size when only specific formats are needed:
uvx --from 'markitdown[pdf]' markitdown report.pdf
uvx --from 'markitdown[docx,pptx]' markitdown presentation.pptx
Available extras: pdf, docx, pptx, xlsx, xls, outlook, az-doc-intel, audio-transcription, youtube-transcription.
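Choosing an extra from the file extension could be sketched as below. The function name `extras_for` is hypothetical, the mapping covers only extras named in the list above, and falling back to `all` for every other extension is an assumption made for simplicity. Matching is case-sensitive, so `REPORT.PDF` also falls back to `all`.

```shell
#!/bin/sh
# Hypothetical helper: map a file name to a markitdown extra.
# Usage: extras_for <file>
extras_for() {
    case "${1##*.}" in
        # These extensions share their name with an extra.
        pdf|docx|pptx|xlsx|xls) echo "${1##*.}" ;;
        # Outlook .msg files use the "outlook" extra.
        msg) echo "outlook" ;;
        # Assumption: anything else falls back to the full install.
        *) echo "all" ;;
    esac
}
```

Usage might look like `uvx --from "markitdown[$(extras_for report.pdf)]" markitdown report.pdf`.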
# Convert a PDF
uvx --from 'markitdown[all]' markitdown report.pdf
# Convert a URL
uvx --from 'markitdown[all]' markitdown "https://example.com/article"
# Convert and save to file
uvx --from 'markitdown[all]' markitdown presentation.pptx -o /tmp/slides.md
# YouTube transcript
uvx --from 'markitdown[all]' markitdown "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
# Wikipedia article
uvx --from 'markitdown[all]' markitdown "https://en.wikipedia.org/wiki/Markdown"
# Jupyter notebook
uvx --from 'markitdown[all]' markitdown analysis.ipynb
# Pipe from stdin
cat page.html | uvx --from 'markitdown[all]' markitdown -x .html
# Batch convert all PDFs in a directory
for f in *.pdf; do uvx --from 'markitdown[all]' markitdown "$f" -o "${f%.pdf}.md"; done