Convert local PDF and image files into markdown using Mistral OCR. Use when an agent needs reliable document text extraction for downstream analysis, summarization, chunking, indexing, or QA workflows where the source input is a PDF/image file and the output must be markdown.
Use the bundled TypeScript CLI to convert one local PDF/image file into one markdown file via Mistral OCR.
Run the npm commands below from the scripts/ directory.
npm install
npm run build
node dist/cli.js <input-file> [--output <output-md>]
Examples:
node dist/cli.js ../sample_files/example.pdf
node dist/cli.js /any/path/to/example.pdf --output-dir ../out
node dist/cli.js ../sample_files/example.pdf --output ../out/example.md
node dist/cli.js ../sample_files/example.png --pages 0,1,2
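The --pages value in the last example is a comma-separated list of zero-based page indexes. A minimal sketch of how such a value might be parsed is shown here; the function name `parsePages` and the strict validation are assumptions for illustration, not the bundled CLI's actual code.

```typescript
// Hypothetical helper: turn a --pages value such as "0,1,2" into numbers.
// Assumption: anything other than a non-negative integer is rejected.
function parsePages(csv: string): number[] {
  return csv.split(",").map((part) => {
    const text = part.trim();
    if (!/^\d+$/.test(text)) {
      throw new Error(`Invalid page index: "${part}"`);
    }
    return Number(text);
  });
}

// Zero-based: "0,1,2" selects the first three pages.
const firstThreePages = parsePages("0,1,2");
```

Rejecting malformed input early keeps a typo in the flag from turning into a confusing API error later.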
Provide Mistral API key in this precedence order:
1. --api-key <key>
2. MISTRAL_API_KEY in environment
3. .env file (auto-loaded from --env-file or nearest .env in current/parent directories)

Prefer MISTRAL_API_KEY in environment for CI/automation and .env for local development.
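The precedence order above can be illustrated with a short sketch. The `ApiKeySources` shape and the `resolveApiKey` name are hypothetical, and skipping empty or whitespace-only values is an assumption; the bundled CLI may behave differently.

```typescript
// Hypothetical container for the three places a key can come from.
interface ApiKeySources {
  cliFlag?: string;                         // value passed via --api-key
  env: Record<string, string | undefined>;  // e.g. process.env
  dotenv: Record<string, string>;           // variables parsed from the .env file
}

// Returns the first non-empty key in precedence order:
// --api-key, then MISTRAL_API_KEY in the environment, then the .env file.
function resolveApiKey(sources: ApiKeySources): string {
  const candidates: Array<string | undefined> = [
    sources.cliFlag,
    sources.env["MISTRAL_API_KEY"],
    sources.dotenv["MISTRAL_API_KEY"],
  ];
  for (const candidate of candidates) {
    const trimmed = candidate?.trim();
    // Assumption: blank values count as "not provided".
    if (trimmed) return trimmed;
  }
  throw new Error("No Mistral API key found: pass --api-key or set MISTRAL_API_KEY");
}
```

Passing the sources in as plain objects keeps the function easy to test without touching the real environment.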
Write one markdown file.
Default output path: ./ocr-output/<input-basename>.md (relative to current working directory).
Use --output for an exact file path or --output-dir for a custom output folder.
Options:
- --model <name>: OCR model (default mistral-ocr-latest)
- --output-dir <path>: output directory when --output is not specified
- --pages <csv>: zero-based page indexes
- --keep-uploaded-file: skip cleanup of uploaded file in Mistral storage
- --include-image-base64: request image payloads from OCR response

Read references/mistral-ocr-api.md when changing OCR request fields or response handling.
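The output rules described above (a default of ./ocr-output/<input-basename>.md, with --output taking priority over --output-dir) might be sketched as follows. The names `OutputOptions`, `inputBasename`, and `resolveOutputPath` are hypothetical. To stay dependency-free, the sketch handles POSIX-style paths by hand; a real CLI would more likely rely on Node's path module.

```typescript
// Hypothetical subset of the parsed CLI flags.
interface OutputOptions {
  output?: string;     // --output: exact markdown file path
  outputDir?: string;  // --output-dir: folder for the generated file
}

// Strip directories and the final extension: "/a/b/example.pdf" -> "example".
function inputBasename(inputPath: string): string {
  const fileName = inputPath.split("/").pop() ?? inputPath;
  const dot = fileName.lastIndexOf(".");
  // dot > 0 keeps dotfiles such as ".env" intact.
  return dot > 0 ? fileName.slice(0, dot) : fileName;
}

// --output wins; otherwise use --output-dir, falling back to ./ocr-output.
function resolveOutputPath(inputPath: string, options: OutputOptions = {}): string {
  if (options.output) return options.output;
  const dir = (options.outputDir ?? "./ocr-output").replace(/\/+$/, "");
  return `${dir}/${inputBasename(inputPath)}.md`;
}
```

Under these assumptions, `resolveOutputPath("../sample_files/example.pdf")` yields `./ocr-output/example.md`, and adding `{ outputDir: "../out" }` yields `../out/example.md`.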