Use this skill whenever the user wants to create, read, edit, or manipulate Word documents (.docx files). Triggers include: any mention of "Word doc", "word document", ".docx", or requests to produce professional documents with formatting like tables of contents, headings, page numbers, or letterheads. Also use when extracting or reorganizing content from .docx files, inserting or replacing images in documents, performing find-and-replace in Word files, working with tracked changes or comments, or converting content into a polished Word document. If the user asks for a "report", "memo", "letter", "template", or similar deliverable as a Word or .docx file, use this skill. Do NOT use for PDFs, spreadsheets, Google Docs, or general coding tasks unrelated to document generation.
A .docx file is a ZIP archive containing XML files.
| Task | Approach |
|---|---|
| Read/analyze content | pandoc or unpack for raw XML |
| Create new document | Use docx-js - see Creating New Documents below |
| Edit existing document | Unpack → edit XML → repack - see Editing Existing Documents below |
Legacy .doc files must be converted before editing:
python scripts/office/soffice.py --headless --convert-to docx document.doc
# Text extraction with tracked changes
pandoc --track-changes=all document.docx -o output.md
# Raw XML access
python scripts/office/unpack.py document.docx unpacked/
python scripts/office/soffice.py --headless --convert-to pdf document.docx
pdftoppm -jpeg -r 150 document.pdf page
To produce a clean document with all tracked changes accepted (requires LibreOffice):
python scripts/accept_changes.py input.docx output.docx
When asked to analyze a .docx file and produce a markdown analysis output, follow this comprehensive workflow:
Use pandoc to extract text and tracked changes from the .docx file:
pandoc --track-changes=all document.docx -o output.md
Read the extracted markdown file using the read_file tool to understand:
Determine what type of document this is:
Write a detailed markdown analysis that includes:
Document Overview
Structure Analysis
Content Summary
Technical Details
Key Insights
References and Citations
Use the write_file tool to write the complete analysis to the specified markdown file. Ensure the analysis is:
Important: Always complete ALL steps of this workflow. Do not stop after just extracting content with pandoc. The analysis must be comprehensive and detailed, covering all aspects of the document.
Generate .docx files with JavaScript, then validate. Install: npm install -g docx
const { Document, Packer, Paragraph, TextRun, Table, TableRow, TableCell, ImageRun,
Header, Footer, AlignmentType, PageOrientation, LevelFormat, ExternalHyperlink,
TableOfContents, HeadingLevel, BorderStyle, WidthType, ShadingType,
VerticalAlign, PageNumber, PageBreak } = require('docx');
const doc = new Document({ sections: [{ children: [/* content */] }] });
Packer.toBuffer(doc).then(buffer => fs.writeFileSync("doc.docx", buffer));
After creating the file, validate it. If validation fails, unpack, fix the XML, and repack.
python scripts/office/validate.py doc.docx
// CRITICAL: docx-js defaults to A4, not US Letter
// Always set page size explicitly for consistent results