Ocr Image To Markdown | Skills Pool


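Because no local OCR tools are available, this skill uses the Agent's multimodal capabilities to view images (PNG, JPG, etc.) and transcribe their content (text, tables, logic diagrams) into formatted Markdown.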

Author: lzfxxx · 0 stars · Updated: Jan 30, 2026

Occupation: Data Entry Keyers
Categories: Documents

Skill Content

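OCR Image Recognition to Markdown

This skill lets you "read" an image and convert its content into editable Markdown text. It is especially useful for extracting data tables, slide content, or document screenshots when no external OCR library can be used.

Usage Guide

Identify the target image:
- Locate the image file you need to process.
- If needed, use list_dir to browse the directory.
View the image:
- Use the view_file tool to "look at" the image content. The system lets you process image data directly.
- Important: you must call view_file on the image path so that your vision model can take in the image.
Transcribe the content:
- Based on what you see, transcribe the text into Markdown.
- Tables: convert any table you see into a standard Markdown table (| Header | ... |).
- Headings: use #, ##, etc. to mark the headings in the image and preserve their hierarchy.
- Text: transcribe paragraphs as plain text.
- Numbers: double-check every number, especially figures in financial statements.
Save the output: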
- Use write_to_file to write the transcribed content to a .md file (for example, ocr_results.md).
- If you are processing multiple images, consider appending the results to the same file or organizing them logically.
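
As an illustration of the target format for the "Tables" rule above, here is a minimal Python sketch that renders a header and rows as a standard Markdown pipe table. It performs no text extraction; the cell values are still expected to come from what the agent sees in the image. The function name to_markdown_table and the sample values are hypothetical and not part of the skill.

```python
def to_markdown_table(header, rows):
    """Render a header and rows as a standard Markdown (pipe) table."""
    def cell(value):
        # Escape pipes so cell text cannot break the column structure.
        return str(value).replace("|", "\\|").strip()

    width = len(header)
    lines = [
        "| " + " | ".join(cell(h) for h in header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    for row in rows:
        # Pad short rows with empty cells so every row has the same width.
        padded = list(row) + [""] * (width - len(row))
        lines.append("| " + " | ".join(cell(c) for c in padded[:width]) + " |")
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative values only, not taken from any real report.
    print(to_markdown_table(["Item", "Q1 Revenue"], [["Product A", "1,200"]]))
```

Numbers are passed as strings here so that their original formatting (thousands separators, currency symbols) is kept exactly as transcribed.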

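Best Practices and Tips

Tables: align rows and columns carefully. Standard Markdown tables do not support merged cells (rowspan/colspan). Expand merged cells according to the logical flow, or leave the extra cells empty.
Complex layouts: if the image has a complex layout (for example, left and right columns), serialize it in logical reading order (top to bottom, left to right).
Charts/graphics: if the image contains a chart, describe the trend, or extract the visible data points as a list or table.
No code execution needed: do not try to write or use Python libraries (such as pytesseract, easyocr, PIL) for text extraction. Use your own vision capabilities directly.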
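
The two options for merged cells (expand them or leave them empty) can be sketched as follows. This is only an illustration of the idea, assuming the transcribed grid marks each cell hidden by a vertical merge as None; it does not extract text from the image, and it does not handle horizontal merges. The function name expand_merged_cells and the sample rows are hypothetical.

```python
def expand_merged_cells(rows, fill_down=True):
    """Replace None cells (hidden by a vertical merge) in a transcribed grid.

    With fill_down=True each None takes the value of the cell above it;
    otherwise it becomes an empty string.
    """
    expanded = []
    for row in rows:
        new_row = []
        for col, value in enumerate(row):
            if value is None:
                has_above = bool(expanded) and col < len(expanded[-1])
                value = expanded[-1][col] if (fill_down and has_above) else ""
            new_row.append(value)
        expanded.append(new_row)
    return expanded


if __name__ == "__main__":
    # "North" spans two rows in the (hypothetical) source image.
    grid = [["North", "Q1", "100"], [None, "Q2", "120"], ["South", "Q1", "90"]]
    for row in expand_merged_cells(grid):
        print(row)
```

Filling down keeps every row self-describing, which helps if the table is later sorted or filtered. Leaving the cell empty stays closer to how the original looks.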

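Example Scenario

Request: "Convert these 3 financial report screenshots to markdown."

Execution:

list_dir to see the files: img1.png, img2.png, img3.png.
view_file to read img1.png.
(Internal processing): recognize the table header "Q1 Revenue" and the table row data.
view_file to read img2.png and img3.png.
write_to_file to create financial_report.md and write the combined content.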
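
The final step, combining the three transcriptions into one file, could be organized as in the sketch below. It assumes each image has already been transcribed to a Markdown string by the agent, and it only assembles the text that would then be passed to write_to_file. The function name combine_transcriptions, the per-image heading layout, and the placeholder strings are assumptions made for illustration.

```python
def combine_transcriptions(sections, title):
    """Join per-image Markdown transcriptions into one document.

    `sections` is a list of (image_name, markdown_text) pairs in reading order.
    """
    parts = [f"# {title}"]
    for image_name, markdown_text in sections:
        # One second-level heading per source image keeps provenance visible.
        parts.append(f"## {image_name}")
        parts.append(markdown_text.strip())
    return "\n\n".join(parts) + "\n"


if __name__ == "__main__":
    # Placeholder strings stand in for the real transcriptions.
    sections = [
        ("img1.png", "(transcription of img1.png)"),
        ("img2.png", "(transcription of img2.png)"),
        ("img3.png", "(transcription of img3.png)"),
    ]
    print(combine_transcriptions(sections, "Financial Report"))
```

If the transcriptions contain their own top-level headings, it may be cleaner to demote them by one level before combining so the hierarchy of the merged file stays consistent.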
