Glmocr | Skills Pool

Skill Profile

Glmocr

Extract text from images and PDFs using the GLM-OCR API, with high-accuracy OCR, table recognition, formula extraction, and handwriting recognition. Use this skill whenever the user wants to extract text from images, perform OCR on pictures, scan documents, convert images to text, or process any image file to get its textual content.

zai-org | 327 stars | April 15, 2026

Occupation
Category: Documents

Skill Content

GLM-OCR Text Extraction Skill

Extract text from images and PDFs using the GLM-OCR layout parsing API.

When to Use

Extract text from images (PNG, JPG) and PDFs
Convert screenshots to text
Process scanned documents
OCR photos containing text (including handwritten text)
Recognize tables and formulas in documents
User mentions "OCR", "文字识别" (text recognition), or "文档解析" (document parsing)

Key Features

Table recognition: Detects and converts tables to Markdown format
Formula extraction: LaTeX format output
Handwriting support: Strong recognition for handwritten text
Local file & URL: Supports both local files and remote URLs
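Because tables come back as Markdown, a caller may want to pull them out of the extracted text for further processing. The following is a minimal sketch, assuming the tables use the common pipe-delimited Markdown syntax (the source only says "Markdown format"); `find_markdown_tables` is a hypothetical helper, not part of the skill, and it does not handle escaped pipes inside cells.

```python
from typing import List


def find_markdown_tables(markdown: str) -> List[List[List[str]]]:
    """Return each pipe-delimited table as a list of rows of cell strings.

    Assumption: every table row starts and ends with "|".
    """
    tables: List[List[List[str]]] = []
    current: List[List[str]] = []
    for line in markdown.splitlines():
        stripped = line.strip()
        if len(stripped) > 1 and stripped.startswith("|") and stripped.endswith("|"):
            cells = [cell.strip() for cell in stripped[1:-1].split("|")]
            # Skip the header separator row, e.g. |---|:---:|
            if all(cell and set(cell) <= set("-:") for cell in cells):
                continue
            current.append(cells)
        elif current:
            # A non-table line closes the table collected so far.
            tables.append(current)
            current = []
    if current:
        tables.append(current)
    return tables
```

Each returned table is a list of rows, with the header row first, so it can be handed to a CSV writer or similar without further parsing.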

Usage

Extract text from a remote URL:

python scripts/glm_ocr_cli.py --file-url "URL provided by user"

Extract text from a local file:

python scripts/glm_ocr_cli.py --file /path/to/image.jpg

Save the result to a JSON file:

python scripts/glm_ocr_cli.py --file-url "URL" --output result.json

CLI synopsis:

python {baseDir}/scripts/glm_ocr_cli.py (--file-url URL | --file PATH) [--output FILE] [--pretty]
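When the CLI is driven from another Python program, the synopsis above can be mirrored in a small wrapper. This is a sketch under stated assumptions: `build_ocr_command` and `run_ocr` are hypothetical helper names, the script path is taken relative to the working directory, `--output` is assumed to write a JSON document to the given file, and the CLI is assumed to exit with a non-zero status on failure.

```python
import json
import subprocess
import sys
from typing import List, Optional


def build_ocr_command(source: str,
                      output: Optional[str] = None,
                      pretty: bool = False,
                      script: str = "scripts/glm_ocr_cli.py") -> List[str]:
    """Build the argv list for the CLI, following its synopsis."""
    # Remote sources go through --file-url, everything else through --file.
    is_url = source.startswith(("http://", "https://"))
    cmd = [sys.executable, script, "--file-url" if is_url else "--file", source]
    if output:
        cmd += ["--output", output]
    if pretty:
        cmd.append("--pretty")
    return cmd


def run_ocr(source: str, output_path: str) -> dict:
    """Run the CLI, then load the JSON it is assumed to write to output_path."""
    # check=True raises CalledProcessError if the CLI exits non-zero (assumed).
    subprocess.run(build_ocr_command(source, output=output_path), check=True)
    with open(output_path, encoding="utf-8") as handle:
        return json.load(handle)
```

A call such as `run_ocr("/path/to/image.jpg", "result.json")` would then return the parsed result as a dictionary. Passing the arguments as a list rather than a shell string avoids quoting problems with paths or URLs that contain spaces.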

Output format (JSON):

{
  "ok": true,
  "text": "# Extracted text in Markdown...",
  "layout_details": [[...]],
  "result": { "raw_api_response": "..." },
  "error": null,
  "source": "/path/to/file.jpg",
  "source_type": "file"
}
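The JSON result above carries an `ok` flag alongside `text` and `error`, so a caller can check the flag before using the text. The sketch below assumes that a failed run sets `ok` to false and puts a message in `error`; the source only shows the success case. `OcrError` and `extract_text` are hypothetical names introduced for illustration.

```python
from typing import Any, Dict


class OcrError(RuntimeError):
    """Raised when the result reports a failure (ok is false)."""


def extract_text(result: Dict[str, Any]) -> str:
    """Return the Markdown text from a result dict, or raise OcrError."""
    if not result.get("ok"):
        # Assumption: "error" holds a human-readable message on failure.
        raise OcrError(str(result.get("error") or "unknown OCR error"))
    return result.get("text") or ""


sample = {
    "ok": True,
    "text": "# Extracted text in Markdown...",
    "error": None,
    "source": "/path/to/file.jpg",
    "source_type": "file",
}
markdown = extract_text(sample)
```

Raising an exception on `ok: false` keeps error handling in one place instead of checking the flag at every call site.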

If the API key is not configured, the CLI reports:

Error: ZHIPU_API_KEY not configured. Get your API key at: https://www.bigmodel.cn/usercenter/proj-mgmt/apikeys
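A calling program can check for the key up front rather than waiting for the CLI to fail. This is a minimal sketch, assuming the key is read from an environment variable named `ZHIPU_API_KEY` (the source names the key but does not say where it is configured); `get_api_key` is a hypothetical helper.

```python
import os
from typing import Mapping, Optional

API_KEY_URL = "https://www.bigmodel.cn/usercenter/proj-mgmt/apikeys"


def get_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the configured API key, or raise with the setup hint."""
    # Assumption: the key lives in the ZHIPU_API_KEY environment variable.
    env = os.environ if env is None else env
    key = env.get("ZHIPU_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ZHIPU_API_KEY not configured. Get your API key at: " + API_KEY_URL
        )
    return key
```

Accepting the environment as a parameter makes the check easy to exercise in tests without modifying the real process environment.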