Extract text from images using the GLM-OCR API. Supports images and PDFs with high-accuracy OCR, table recognition, formula extraction, and handwriting recognition. Use this skill whenever the user wants to extract text from images, perform OCR on pictures, scan documents, convert images to text, or process any image files to get their textual content.
Extract text from images and PDFs using the GLM-OCR layout parsing API.
| Resource | Link |
|---|---|
| Get API Key | https://www.bigmodel.cn/usercenter/proj-mgmt/apikeys |
| GitHub | https://github.com/zai-org/GLM-OCR |
ZHIPU_API_KEY (and an optional timeout) is read from environment variables.

⛔ MANDATORY RESTRICTIONS - DO NOT VIOLATE ⛔

python scripts/glm_ocr_cli.py

python scripts/config_setup.py setup --api-key YOUR_KEY
python scripts/glm_ocr_cli.py --file-url "URL provided by user"
python scripts/glm_ocr_cli.py --file /path/to/image.jpg
python scripts/glm_ocr_cli.py --file-url "URL" --output result.json
python {baseDir}/scripts/glm_ocr_cli.py (--file-url URL | --file PATH) [--output FILE] [--pretty]
| Parameter | Required | Description |
|---|---|---|
| `--file-url` | One of | URL to image/PDF |
| `--file` | One of | Local file path to image/PDF |
| `--output`, `-o` | No | Save result JSON to file |
| `--pretty` | No | Pretty-print JSON output |
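
The "one of" rule for `--file-url` and `--file` can be checked before the script is launched. The following is a minimal sketch that builds the argument list for the documented CLI. The helper name `build_ocr_command` is hypothetical and not part of the skill, and the script path is assumed to be relative to the working directory.

```python
import sys
from typing import List, Optional

# Script path as it appears in the usage examples; adjust for {baseDir} if needed.
SCRIPT = "scripts/glm_ocr_cli.py"


def build_ocr_command(
    file_url: Optional[str] = None,
    file: Optional[str] = None,
    output: Optional[str] = None,
    pretty: bool = False,
) -> List[str]:
    """Build the argv list for glm_ocr_cli.py (hypothetical helper)."""
    # Exactly one input source must be given: a URL or a local path.
    if (file_url is None) == (file is None):
        raise ValueError("pass exactly one of file_url or file")

    cmd = [sys.executable, SCRIPT]
    if file_url is not None:
        cmd += ["--file-url", file_url]
    else:
        cmd += ["--file", file]
    if output is not None:
        cmd += ["--output", output]
    if pretty:
        cmd.append("--pretty")
    return cmd
```

The resulting list can be handed to `subprocess.run(cmd, capture_output=True, text=True)`. Passing arguments as a list rather than a shell string avoids quoting problems with URLs and paths that contain spaces.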
{
  "ok": true,
  "text": "# Extracted text in Markdown...",
  "layout_details": [[...]],
  "result": { "raw_api_response": "..." },
  "error": null,
  "source": "/path/to/file.jpg",
  "source_type": "file"
}
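
A consumer of this output would typically check `ok` first and read `text` only on success. The sketch below assumes the result is available as a JSON string, for example read from the file written with `--output`. The helper name `extract_text` is hypothetical.

```python
import json


def extract_text(result_json: str) -> str:
    """Return the Markdown text from a result document, or raise on failure."""
    result = json.loads(result_json)
    if not result.get("ok"):
        # Surface the error details exactly as reported.
        raise RuntimeError(f"OCR failed: {result.get('error')}")
    return result.get("text", "")


# Trimmed-down sample with only the fields this helper reads.
sample = '{"ok": true, "text": "# Title", "error": null}'
print(extract_text(sample))  # prints "# Title"
```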
Key fields:
- ok: whether extraction succeeded
- text: extracted text in Markdown (use this for display)
- layout_details: layout analysis details
- result: raw API response
- error: error details on failure

API key not configured:
Error: ZHIPU_API_KEY not configured. Get your API key at: https://www.bigmodel.cn/usercenter/proj-mgmt/apikeys
→ Show exact error to user, guide them to configure
Authentication failed (401/403): API key invalid/expired → reconfigure
Rate limit (429): Quota exhausted → inform user to wait
File not found: Local file missing → check path
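
The HTTP cases above can be expressed as a small lookup. This is only a sketch: the helper name `guidance_for_status` is hypothetical, how a status code is obtained from the `error` field is not specified here, and the fallback message is an assumption.

```python
def guidance_for_status(status: int) -> str:
    """Map an HTTP status code to the user guidance listed above."""
    if status in (401, 403):
        return "API key invalid or expired: reconfigure ZHIPU_API_KEY."
    if status == 429:
        return "Quota exhausted: wait before retrying."
    # Assumed fallback for codes not covered by the list above.
    return "Unexpected error: show the exact message to the user."
```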
references/output_schema.md — detailed output format specification