Convert HWP / HWPX / HWPML files to Markdown using kordoc. Extracts text, headings, tables, lists, images, footnotes, and hyperlinks. Use for Korean word processor files (Hangul), government documents, and AI-ready data preparation.
When to use:

- Converting Hangul word processor files (`.hwp`, `.hwpx`, `.hwpml`) to Markdown

When NOT to use:

- Inputs that belong to oma-pdf (OCR + Tagged PDF specialization) or oma-docs

Rules:

- Use `bunx kordoc@latest` to run; no installation is required. Always pass `@latest` (or a pinned version) to avoid using a stale bunx cache.
- Use `resources/flatten-tables.ts` to (a) convert HTML `<table>` blocks into GFM pipe tables and (b) strip Private Use Area characters (Hancom font-specific glyphs that render as blanks without the Hancom font). Merged-cell fidelity is traded for pure-Markdown output; this is the accepted default.
- Follow `resources/execution-protocol.md` step by step.
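The two post-processing steps described above (flattening HTML tables and stripping Private Use Area characters) can be sketched roughly as follows. This is an illustrative sketch only, not the contents of `resources/flatten-tables.ts`; the function names `stripPrivateUse`, `htmlTableToGfm`, and `flattenTables` are hypothetical, and the table handling assumes simple, non-nested tables.

```typescript
// Hypothetical sketch; NOT the actual resources/flatten-tables.ts.

// Remove Private Use Area code points: the BMP block (U+E000-U+F8FF)
// and the two supplementary private-use planes.
function stripPrivateUse(text: string): string {
  return text.replace(/[\u{E000}-\u{F8FF}\u{F0000}-\u{FFFFD}\u{100000}-\u{10FFFD}]/gu, "");
}

// Convert one simple HTML <table> (no nesting) into a GFM pipe table.
// rowspan/colspan are ignored, so merged cells lose their span.
function htmlTableToGfm(html: string): string {
  const rows = [...html.matchAll(/<tr\b[^>]*>([\s\S]*?)<\/tr>/gi)].map((tr) =>
    [...tr[1].matchAll(/<t[hd]\b[^>]*>([\s\S]*?)<\/t[hd]>/gi)].map((cell) =>
      cell[1]
        .replace(/<[^>]+>/g, " ") // drop inline tags inside the cell
        .replace(/\s+/g, " ") // pipe-table cells must stay on one line
        .trim()
        .replace(/\|/g, "\\|"), // escape literal pipes
    ),
  );
  if (rows.length === 0) return html; // nothing recognizable: leave untouched
  const width = Math.max(...rows.map((r) => r.length));
  const line = (cells: string[]): string =>
    "| " + Array.from({ length: width }, (_, i) => cells[i] ?? "").join(" | ") + " |";
  const [header, ...body] = rows;
  const separator: string[] = Array.from({ length: width }, () => "---");
  return [line(header), line(separator), ...body.map(line)].join("\n");
}

// Apply both steps to a whole Markdown document.
function flattenTables(markdown: string): string {
  const flattened = markdown.replace(/<table\b[^>]*>[\s\S]*?<\/table>/gi, (t) =>
    htmlTableToGfm(t),
  );
  return stripPrivateUse(flattened);
}
```

The first row is treated as the header because GFM pipe tables require one. A real implementation would likely need an HTML parser rather than regular expressions to cope with nested tables.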
Without `-o` or `-d`, kordoc prints Markdown to stdout. Always pass an output target to write a file.
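```bash
# Given input.hwp at /path/to/input.hwp, write /path/to/input.md
bunx kordoc@latest /path/to/input.hwp -o /path/to/input.md
# No output target: Markdown goes to stdout
bunx kordoc@latest input.hwp
bunx kordoc@latest input.hwpx
# Multiple files into an output directory
bunx kordoc@latest *.hwp -d ./out/
# Page/section range, or a list of specific pages
bunx kordoc@latest input.hwp -p 1-5
bunx kordoc@latest input.hwp -p 1,3,5
# JSON instead of Markdown
bunx kordoc@latest input.hwp --format json
# Suppress progress messages
bunx kordoc@latest input.hwp --silent
```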
| Flag | Description |
|---|---|
| `-o, --output <path>` | Output file path (single-file mode) |
| `-d, --out-dir <dir>` | Output directory (multi-file mode) |
| `-p, --pages <range>` | Page/section range (e.g., `1-3`, `1,3,5`) |
| `--format <type>` | `markdown` (default) or `json` |
| `--no-header-footer` | Strip PDF headers/footers (PDF only) |
| `--silent` | Suppress progress messages |
| `-V, --version` / `-h, --help` | Standard |
This skill passes kordoc flags through as-is — no translation layer.
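As a rough illustration of passing flags through unchanged, a wrapper could assemble the command line from a small options object. The `KordocOptions` shape and `buildKordocCommand` name are hypothetical, only flags from the table above are emitted, and rejecting `-o` together with `-d` is an assumption of this sketch rather than documented kordoc behavior.

```typescript
// Hypothetical option shape for illustration; field names are not part of kordoc.
interface KordocOptions {
  inputs: string[];
  output?: string; // -o, single-file mode
  outDir?: string; // -d, multi-file mode
  pages?: string; // -p, e.g. "1-3" or "1,3,5"
  format?: "markdown" | "json"; // --format
  silent?: boolean; // --silent
}

// Build the argv for `bunx kordoc@latest ...` without translating any flag.
function buildKordocCommand(opts: KordocOptions): string[] {
  if (opts.inputs.length === 0) {
    throw new Error("at least one input file is required");
  }
  // Assumption of this sketch: -o and -d are treated as mutually exclusive.
  if (opts.output !== undefined && opts.outDir !== undefined) {
    throw new Error("pass either output (-o) or outDir (-d), not both");
  }
  const argv = ["bunx", "kordoc@latest", ...opts.inputs];
  if (opts.output !== undefined) argv.push("-o", opts.output);
  if (opts.outDir !== undefined) argv.push("-d", opts.outDir);
  if (opts.pages !== undefined) argv.push("-p", opts.pages);
  if (opts.format !== undefined) argv.push("--format", opts.format);
  if (opts.silent) argv.push("--silent");
  return argv;
}
```

The returned array can be handed to whatever process-spawning API the runtime offers, with the first element as the command and the rest as its arguments.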
| Format | Extension | Notes |
|---|---|---|
| HWP 5.x binary | .hwp | Full support (incl. DRM-locked via kordoc's rhwp-algorithm port) |
| HWPX | .hwpx | Full support incl. nested tables, merged cells |
| HWPML | .hwp (XML variant) | Auto-detected by signature |
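Signature-based detection can be pictured with a small sniffing function. The sketch below is an assumption about how such a check might look, not kordoc's actual detection logic: it relies on HWP 5.x being stored in an OLE compound file, HWPX being a ZIP container, and HWPML being XML with an `<HWPML>` root element. The name `sniffHwpFamily` and the result labels are hypothetical.

```typescript
// Hypothetical sketch; kordoc's real detection may differ.
type HwpFamily = "hwp5-ole" | "hwpx-zip" | "hwpml-xml" | "unknown";

const OLE_MAGIC = [0xd0, 0xcf, 0x11, 0xe0, 0xa1, 0xb1, 0x1a, 0xe1]; // compound file header
const ZIP_MAGIC = [0x50, 0x4b, 0x03, 0x04]; // "PK\x03\x04" local file header

function hasPrefix(bytes: Uint8Array, magic: number[]): boolean {
  return bytes.length >= magic.length && magic.every((b, i) => bytes[i] === b);
}

// Classify by leading bytes only. A match means "this container type",
// not "definitely a valid document": other formats also use OLE and ZIP.
function sniffHwpFamily(bytes: Uint8Array): HwpFamily {
  if (hasPrefix(bytes, OLE_MAGIC)) return "hwp5-ole";
  if (hasPrefix(bytes, ZIP_MAGIC)) return "hwpx-zip";
  // Read the first 512 bytes as Latin-1; enough to find an ASCII root tag.
  const head = Array.from(bytes.subarray(0, 512), (b) => String.fromCharCode(b))
    .join("")
    .replace(/^\u00EF\u00BB\u00BF/, "") // UTF-8 byte order mark, if present
    .trimStart();
  if (head.startsWith("<") && /<HWPML[\s>]/.test(head)) return "hwpml-xml";
  return "unknown";
}
```

Checking content rather than the extension matters here because, per the table above, the XML variant can also arrive with a `.hwp` extension.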
kordoc also parses PDF / XLSX / DOCX. Those are intentionally outside this skill's scope — see "When NOT to use".
Project-specific settings: config/hwp-config.yaml
See resources/troubleshooting.md.
- `resources/execution-protocol.md`
- `resources/troubleshooting.md`
- `config/hwp-config.yaml`
- `../oma-pdf/SKILL.md` (use for `.pdf` inputs)