This skill should be used when the user asks to "convert a document", "upload a document", "publish a document to Confluence", "convert a Word doc", "convert a Google Doc", "process a .docx file", "upload a spreadsheet", "upload an Excel file", "upload a Google Sheet", "upload a PDF", or mentions converting documents to ADF, markdown, or HTML. Handles structural analysis, formatting cleanup, template detection, title heading stripping (Confluence shows title separately), and conversion of .docx, Google Docs, Excel, Google Sheets, and PDFs to Confluence pages.
Convert .docx files, Google Docs, Excel/Google Sheets, and PDFs into Confluence pages with a consistent analysis-first workflow, formatting cleanup, and template auto-detection.
Before anything else, check the file type and branch:
| File type | Route |
|---|---|
| .docx | Continue to Step 1 |
| Google Doc URL | Continue to Step 1 |
| .xlsx / .xls / Google Sheets URL | Go to Spreadsheet Path below |
| .pdf | Go to PDF Path below |
| .md / .txt / .wiki | Continue to Step 1 (read directly) |
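The routing table above can be sketched as a small lookup. This is a minimal illustration, not part of the skill itself: the function name `route_for`, the route labels, and the assumption that Google URLs live under `docs.google.com/document/` and `docs.google.com/spreadsheets/` are all introduced here for the example.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Route labels mirror the table; the label strings themselves are illustrative.
ROUTES_BY_EXTENSION = {
    ".docx": "step-1",
    ".md": "step-1",
    ".txt": "step-1",
    ".wiki": "step-1",
    ".xlsx": "spreadsheet-path",
    ".xls": "spreadsheet-path",
    ".pdf": "pdf-path",
}


def route_for(source: str) -> str:
    """Return the route for a file name or a Google Docs/Sheets URL."""
    parsed = urlparse(source)
    if parsed.netloc == "docs.google.com":
        # Assumed URL layout: /spreadsheets/... for Sheets, /document/... for Docs.
        if parsed.path.startswith("/spreadsheets/"):
            return "spreadsheet-path"
        if parsed.path.startswith("/document/"):
            return "step-1"
    suffix = PurePosixPath(source).suffix.lower()
    try:
        return ROUTES_BY_EXTENSION[suffix]
    except KeyError:
        raise ValueError(f"Unsupported input: {source!r}") from None
```

Raising on unknown inputs, rather than defaulting to Step 1, keeps an unsupported file type from silently entering the .docx workflow.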
Spreadsheet Path: ask the user how to handle the file:
Spreadsheet handling options:
1. Attach + embed viewer — uploads the file as an attachment and embeds it inline on a
Confluence page using the view-file macro. Preserves all tabs,
formulas, and formatting. (recommended)
2. Convert to ADF table — extracts data to a plain Confluence table. Loses formulas,
charts, and multi-tab structure.
Option 1 — Attach + embed viewer (preferred):
view-file macro:
<ac:structured-macro ac:name="view-file" ac:schema-version="1">
<ac:parameter ac:name="name"><ri:attachment ri:filename="FILENAME.xlsx" /></ac:parameter>
</ac:structured-macro>
Create the page with POST /wiki/rest/api/content using body.storage. The v2/ADF API rejects extension nodes.
Upload the file with POST /wiki/rest/api/content/{pageId}/child/attachment and the header X-Atlassian-Token: no-check.
Option 2 — Convert to ADF table:
Use pandas to read the first sheet, convert rows to ADF tableRow nodes (first row as tableHeader). Continue to Step 7.
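The row-to-ADF conversion described above might look like the sketch below. The function names are hypothetical, and the input is a plain list of rows so the example runs without pandas; with pandas, one way to obtain such rows would be `pandas.read_excel(path, sheet_name=0, header=None).fillna("").values.tolist()`.

```python
def _adf_cell(value, cell_type):
    """Wrap one value in a tableHeader/tableCell node containing a paragraph."""
    text = "" if value is None else str(value)
    paragraph = {"type": "paragraph", "content": []}
    if text:
        # Empty cells get an empty paragraph rather than an empty text node.
        paragraph["content"].append({"type": "text", "text": text})
    return {"type": cell_type, "content": [paragraph]}


def rows_to_adf_table(rows):
    """Convert a list of rows into an ADF table; the first row becomes the header."""
    table_rows = []
    for index, row in enumerate(rows):
        cell_type = "tableHeader" if index == 0 else "tableCell"
        table_rows.append(
            {"type": "tableRow", "content": [_adf_cell(v, cell_type) for v in row]}
        )
    return {"type": "table", "content": table_rows}
```

All values are stringified, which matches the note that this option loses formulas and other spreadsheet semantics.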
For Google Sheets: Export as CSV for Option 2. For Option 1, ask the user to download as .xlsx first.
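For Option 1, the attach + embed pattern can be sketched as two request descriptions: a page body in storage format carrying the view-file macro, and an attachment upload against the new page's ID. The helper names below are hypothetical, and the code only builds the payloads; sending them (for example with an HTTP client and your own authentication) is left out.

```python
from xml.sax.saxutils import quoteattr


def view_file_macro(filename: str) -> str:
    """Build the view-file macro in storage format, escaping the file name."""
    return (
        '<ac:structured-macro ac:name="view-file" ac:schema-version="1">'
        '<ac:parameter ac:name="name">'
        f"<ri:attachment ri:filename={quoteattr(filename)} />"
        "</ac:parameter>"
        "</ac:structured-macro>"
    )


def page_payload(space_key: str, title: str, filename: str) -> dict:
    """JSON body for POST /wiki/rest/api/content using body.storage."""
    return {
        "type": "page",
        "title": title,
        "space": {"key": space_key},
        "body": {
            "storage": {
                "value": view_file_macro(filename),
                "representation": "storage",
            }
        },
    }


# Header required on the attachment upload, as stated above.
ATTACHMENT_HEADERS = {"X-Atlassian-Token": "no-check"}


def attachment_path(page_id: str) -> str:
    """Path for uploading the file to the page created in the first call."""
    return f"/wiki/rest/api/content/{page_id}/child/attachment"
```

Escaping the file name with `quoteattr` guards against names containing `&` or quotes, which would otherwise produce invalid storage-format XML.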
PDF Path: ask the user how to handle the file:
PDF handling options:
1. Static record — upload the PDF as an attachment and create a Confluence page with an
embedded viewer. Content is not extracted. Best for signed documents,
certificates, or any PDF that must be preserved exactly as-is.
2. Convert content — extract text and publish as a structured Confluence page (uses
pdfplumber). Best for text-heavy PDFs like policies or procedures
that need to be searchable and editable in Confluence.
Option 1 — Static record: Same attach + view-file macro pattern as spreadsheets. Skip Steps 1–7.
Option 2 — Convert content: Use pdfplumber to extract text, then continue to Step 1.
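A minimal sketch of the Option 2 extraction step, assuming pdfplumber is installed. The function names are illustrative; `pdfplumber.open`, `pdf.pages`, and `page.extract_text()` are the library calls used, and pages that yield no text (for example scanned images) are skipped.

```python
def join_page_text(page_texts):
    """Join per-page text with blank lines, skipping pages with no text."""
    cleaned = [text.strip() for text in page_texts if text and text.strip()]
    return "\n\n".join(cleaned)


def extract_pdf_text(path):
    """Extract text from every page of a PDF using pdfplumber."""
    # Imported lazily so join_page_text can be used without the dependency.
    import pdfplumber

    with pdfplumber.open(path) as pdf:
        # extract_text() may return None for pages without a text layer.
        return join_page_text(page.extract_text() for page in pdf.pages)
```

If `extract_pdf_text` returns an empty string, the PDF probably has no text layer, and Option 1 (static record) may be the better fit.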
Before doing anything else, ask the user what the desired output is:
Output options:
1. Confluence ADF — publish directly to a Confluence space (default)
2. Markdown — output a .md file
3. HTML — output an .html file
If the user has already stated the target (e.g. "publish to Confluence"), skip this step.
For .docx:
publish.py automatically detects whether doc-lint is installed. If it is, doc-lint's fix.py runs on the file first (applying its full rule set), then docx_to_adf() converts the pre-cleaned file. If doc-lint is not installed, docx_to_adf() applies its built-in cleanup rules directly during conversion. Either way, no manual action is needed: the best available rules are used automatically.
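The detect-then-fall-back behaviour can be pictured with the sketch below. It is not publish.py's actual code: the module name `doc_lint`, both function names, and the step labels are assumptions made for illustration.

```python
import importlib.util


def doc_lint_available(module_name: str = "doc_lint") -> bool:
    """Return True if a module with this name can be imported.

    Assumption: doc-lint is importable under the name 'doc_lint'.
    """
    return importlib.util.find_spec(module_name) is not None


def plan_docx_conversion(has_doc_lint: bool) -> list:
    """Return the ordered steps for converting a .docx file (labels only)."""
    if has_doc_lint:
        # Full rule set first, then conversion of the pre-cleaned file.
        return ["doc-lint fix.py", "docx_to_adf (pre-cleaned input)"]
    # No doc-lint: rely on the converter's built-in cleanup rules.
    return ["docx_to_adf (built-in cleanup)"]
```

Calling `plan_docx_conversion(doc_lint_available())` would pick the richer pipeline whenever the optional dependency is present.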
For Google Docs: Export as HTML (not plain text) to preserve heading levels and inline formatting: