Use when the task involves reading, creating, or editing `.docx` documents, especially when formatting or layout fidelity matters; prefer `python-docx` plus the bundled `scripts/render_docx.py` for visual checks.
- Use `soffice` and `pdftoppm` to convert DOCX -> PDF -> PNGs.
- Alternatively, use the bundled `scripts/render_docx.py`.
- Use `python-docx` for edits and structured creation (headings, styles, tables, lists).
- If rendering is not possible, use `python-docx` as a fallback and call out layout risk.
- Use `tmp/docs/` for intermediate files; delete when done.
- Write final outputs to `output/doc/` when working in this repo.

Prefer `uv` for dependency management. Set up a venv before installing packages.
Python packages:
uv pip install python-docx pdf2image
No required environment variables.
DOCX -> PDF:
soffice -env:UserInstallation=file:///tmp/lo_profile_$$ --headless --convert-to pdf --outdir "$OUTDIR" "$INPUT_DOCX"
PDF -> PNGs:
pdftoppm -png "$OUTDIR/$BASENAME.pdf" "$OUTDIR/$BASENAME"
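The two commands above can also be driven from Python. The following is a sketch under stated assumptions, not the contents of `scripts/render_docx.py`: the helper names `soffice_cmd`, `pdftoppm_cmd`, and `render` are hypothetical, and `soffice` and `pdftoppm` are assumed to be on `PATH`. A throwaway LibreOffice profile directory stands in for the `lo_profile_$$` path in the shell version.

```python
import subprocess
import tempfile
from pathlib import Path


def soffice_cmd(docx: Path, outdir: Path, profile_dir: Path) -> list[str]:
    """Build the headless LibreOffice command for DOCX -> PDF."""
    return [
        "soffice",
        # Isolated user profile so concurrent runs do not share state.
        f"-env:UserInstallation={profile_dir.resolve().as_uri()}",
        "--headless",
        "--convert-to", "pdf",
        "--outdir", str(outdir),
        str(docx),
    ]


def pdftoppm_cmd(pdf: Path, prefix: Path) -> list[str]:
    """Build the Poppler command for PDF -> one PNG per page."""
    return ["pdftoppm", "-png", str(pdf), str(prefix)]


def render(docx: Path, outdir: Path) -> list[Path]:
    """Convert a DOCX to PNG pages and return the page image paths."""
    outdir.mkdir(parents=True, exist_ok=True)
    with tempfile.TemporaryDirectory(prefix="lo_profile_") as profile:
        subprocess.run(soffice_cmd(docx, outdir, Path(profile)), check=True)
    pdf = outdir / f"{docx.stem}.pdf"
    subprocess.run(pdftoppm_cmd(pdf, outdir / docx.stem), check=True)
    # pdftoppm names pages <prefix>-<page number>.png
    return sorted(outdir.glob(f"{docx.stem}-*.png"))
```

Passing the commands as argument lists avoids shell quoting problems with paths that contain spaces, and `check=True` raises if either tool exits with an error.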
Bundled helper:
python3 scripts/render_docx.py /path/to/file.docx --output_dir /tmp/docx_pages