Name: Office Pdf
Author: clawuno


Skill Content

Read, generate, and process PDF files. Covers: extracting text and tables from existing PDFs, converting Markdown to styled PDF, creating professional documents (reports, proposals, invoices), and filling PDF forms.

Task Routing

Task	Route	Method
Create new PDF from scratch	CREATE	Dual-engine: HTML cover + ReportLab body
Convert Markdown to PDF	MD-TO-PDF	Markdown → ReportLab typeset PDF via `scripts/md2pdf.py` (no browser required)
Read/extract text from PDF	READ	pdfplumber (tables), pypdf (text)
Fill PDF form fields	FILL	pypdf form filling
Reformat existing content as PDF	REFORMAT	Extract → restructure → CREATE

Route: CREATE

Dual-Engine Architecture

Part	Engine	Why
Cover page	HTML+CSS → Chrome/Playwright → PDF	CSS natively supports gradients, grids, blend modes, custom fonts — hard with PDF drawing APIs
Body	ReportLab	Stable paragraph flow, pagination, headers/footers, page numbers
Merge	pypdf	Combine cover.pdf + body.pdf into final.pdf
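
The merge step in the last row can be sketched with pypdf's `PdfWriter.append`. This is a minimal illustration, not the contents of `scripts/merge.py`; the function name `merge_pdfs` is an assumption. Inputs are checked before pypdf is imported so a missing file is reported first.

```python
from pathlib import Path


def merge_pdfs(parts, output):
    """Concatenate PDF files in the given order and write them to `output`."""
    missing = [str(p) for p in parts if not Path(p).is_file()]
    if missing:
        raise FileNotFoundError("missing input PDF(s): " + ", ".join(missing))

    # Imported here so the input check above runs even without pypdf installed.
    from pypdf import PdfWriter

    writer = PdfWriter()
    for part in parts:
        writer.append(str(part))  # appends every page of the file, in order
    with open(output, "wb") as fh:
        writer.write(fh)
    return Path(output)
```

Calling `merge_pdfs(["cover.pdf", "body.pdf"], "final.pdf")` would place the Chrome-rendered cover ahead of the ReportLab body, assuming both files already exist.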

15 Document Types

Type	Cover Style	Best For
Report	Bold title, date banner	Business reports, quarterly reviews
Proposal	Logo-centric, value proposition	Sales proposals, partnerships
Whitepaper	Academic, abstract box	Technical deep-dives
Manual	Section tabs, version badge	User guides, SOPs
Invoice	Compact header, line items	Billing, quotes
Certificate	Ornamental border, seal area	Awards, credentials
Newsletter	Multi-column, headline	Internal comms, updates
Resume	Sidebar + main content	CVs, professional profiles
Presentation	Full-bleed images, large type	Handouts, summaries
Contract	Numbered clauses, signature blocks	Legal, agreements
Research	Abstract + keywords + citation	Academic papers
Catalog	Grid layout, product cards	Product listings
Brochure	Tri-fold layout, visual-heavy	Marketing materials
Menu	Category sections, pricing	Restaurant, services
Letterhead	Header + footer branding	Official correspondence

Accent Color Selection by Industry

Industry	Recommended Accent	Hex
Finance/Legal	Navy blue	`#1B365D`
Healthcare	Teal	`#008080`
Technology	Electric blue	`#0066FF`
Education	Forest green	`#228B22`
Creative	Coral	`#FF6B6B`
Government	Dark red	`#8B0000`
Consulting	Charcoal + gold	`#333333` / `#D4AF37`
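
The hex values above can go into CSS unchanged, while ReportLab's `Color` takes channel values between 0 and 1. The sketch below shows one possible conversion; the names `ACCENTS`, `hex_to_rgb01`, and `accent_for` are illustrative and are not taken from `scripts/palette.py`.

```python
# Accent colors copied from the table above; Consulting has two.
ACCENTS = {
    "Finance/Legal": ["#1B365D"],
    "Healthcare": ["#008080"],
    "Technology": ["#0066FF"],
    "Education": ["#228B22"],
    "Creative": ["#FF6B6B"],
    "Government": ["#8B0000"],
    "Consulting": ["#333333", "#D4AF37"],
}


def hex_to_rgb01(hex_color):
    """Convert '#RRGGBB' to an (r, g, b) tuple of floats in the range 0..1."""
    value = hex_color.lstrip("#")
    if len(value) != 6:
        raise ValueError(f"expected #RRGGBB, got {hex_color!r}")
    return tuple(int(value[i:i + 2], 16) / 255 for i in (0, 2, 4))


def accent_for(industry):
    """Return the accent colors for an industry as 0..1 RGB tuples."""
    return [hex_to_rgb01(h) for h in ACCENTS[industry]]
```

Each tuple can be unpacked into a ReportLab color, for example `Color(*accent_for("Healthcare")[0])`.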

Content Block Types

Block	Description
heading	Section/subsection titles (H1-H4)
paragraph	Body text with optional indent
bullet_list	Unordered list items
numbered_list	Ordered list items
table	Row/column data with header row
image	Embedded image with caption
code_block	Monospace code with syntax highlighting
blockquote	Indented quotation with attribution
callout	Highlighted box (info, warning, tip)
chart	Bar, line, pie rendered as image
math	LaTeX formula rendered as image
page_break	Force new page
horizontal_rule	Section separator
footnote	Numbered reference at page bottom
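
One way to picture content blocks is as a list of dictionaries, each tagged with one of the types above. The `type` key and the other field names (`level`, `text`, `items`) are hypothetical; the schema that `scripts/render_body.py` actually expects is not shown in this document.

```python
# The 14 block types listed in the table above.
BLOCK_TYPES = {
    "heading", "paragraph", "bullet_list", "numbered_list", "table",
    "image", "code_block", "blockquote", "callout", "chart", "math",
    "page_break", "horizontal_rule", "footnote",
}


def validate_blocks(blocks):
    """Return a list of error strings; an empty list means no problems were found."""
    errors = []
    for index, block in enumerate(blocks):
        kind = block.get("type")
        if kind not in BLOCK_TYPES:
            errors.append(f"block {index}: unknown type {kind!r}")
        elif kind == "heading" and block.get("level") not in (1, 2, 3, 4):
            errors.append(f"block {index}: heading level must be 1-4")
    return errors


# Hypothetical document body using the assumed field names.
document = [
    {"type": "heading", "level": 1, "text": "Quarterly Review"},
    {"type": "paragraph", "text": "Revenue grew in all regions."},
    {"type": "bullet_list", "items": ["EMEA", "APAC", "Americas"]},
    {"type": "page_break"},
]
```

Checking the list before rendering catches a misspelled block type early instead of partway through PDF generation.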

Route: MD-TO-PDF

python3 scripts/md2pdf.py --input report.md --output report.pdf --title "My Report" --theme corporate

Available Themes

Theme	Personality
`corporate`	Professional blue, clean lines
`academic`	Formal, serif, dense
`modern`	Bold gradients, sans-serif
`minimal`	White space, understated
`creative`	Vibrant accents, playful
`nature`	Earth tones, organic
`elegant`	Serif, gold accents
`tech`	Dark theme, monospace accents
`magazine`	Editorial layout, pull quotes
`chinese_formal`	GB/T compliant, SimHei/FangSong

Key Options

# --toc generates a table of contents, --watermark adds a watermark,
# and --header-title sets the running header.
python3 scripts/md2pdf.py \
  --input report.md \
  --output report.pdf \
  --title "Quarterly Report" \
  --subtitle "Q1 2026" \
  --author "Research Team" \
  --theme corporate \
  --toc \
  --watermark "DRAFT" \
  --header-title "Q1 Report"
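
When the conversion is driven from Python rather than a shell, the same options can be assembled as an argument list. The sketch below uses only the flags shown above; the helper name `build_md2pdf_cmd` and its keyword names are assumptions, not part of the skill.

```python
import shlex


def build_md2pdf_cmd(input_md, output_pdf, *, title=None, subtitle=None,
                     author=None, theme="corporate", toc=False,
                     watermark=None, header_title=None,
                     script="scripts/md2pdf.py"):
    """Build the argument list for md2pdf.py from the options listed above."""
    cmd = ["python3", script, "--input", input_md, "--output", output_pdf,
           "--theme", theme]
    optional = {
        "--title": title,
        "--subtitle": subtitle,
        "--author": author,
        "--watermark": watermark,
        "--header-title": header_title,
    }
    for flag, value in optional.items():
        if value is not None:
            cmd += [flag, value]
    if toc:
        cmd.append("--toc")  # boolean flag, takes no value
    return cmd


cmd = build_md2pdf_cmd("report.md", "report.pdf", title="Quarterly Report",
                       toc=True, watermark="DRAFT")
print(shlex.join(cmd))  # shell-quoted form, handy for logging
```

Passing `cmd` to `subprocess.run(cmd, check=True)` runs the conversion without shell quoting or line-continuation pitfalls.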

Route: READ

# Text extraction
from pypdf import PdfReader
reader = PdfReader("file.pdf")
for page in reader.pages:
    print(page.extract_text())

# Table extraction
import pdfplumber
with pdfplumber.open("file.pdf") as pdf:
    for page in pdf.pages:
        tables = page.extract_tables()
        for table in tables:
            for row in table:
                print(row)
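
pdfplumber returns each table as a list of rows, with `None` for empty cells and embedded newlines in wrapped cells. A small cleanup step, sketched below with standard-library code only, makes the rows easier to save or compare; the function names and the sample data are made up for illustration.

```python
import csv
import io


def clean_row(row):
    """Replace None with '' and collapse runs of whitespace inside each cell."""
    return [" ".join(cell.split()) if cell else "" for cell in row]


def table_to_csv(table):
    """Serialize one extracted table (a list of rows) to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in table:
        writer.writerow(clean_row(row))
    return buffer.getvalue()


# Sample rows shaped like pdfplumber output (made-up data).
sample = [["Item", "Qty", None], ["Widget\nLarge", "2", "9.50"]]
print(table_to_csv(sample))
```

In the loop above, `table_to_csv(table)` could replace the inner `print(row)` loop to produce one CSV string per table.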

Route: FILL (Forms)

from pypdf import PdfReader, PdfWriter

reader = PdfReader("form.pdf")
writer = PdfWriter()
writer.append(reader)

# Fill form fields
writer.update_page_form_field_values(
    writer.pages[0],
    {"field_name": "value", "date_field": "2026-04-01"}
)

with open("filled.pdf", "wb") as f:
    writer.write(f)
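
A mistyped field name is easy to miss, so it can help to list the form's real field names before filling. The sketch below uses pypdf's `PdfReader.get_fields()`; the helper names are assumptions and this is not the contents of `scripts/fill_inspect.py`.

```python
def list_form_fields(pdf_path):
    """Return {field_name: field_type} for a PDF form, or {} if it has no fields."""
    from pypdf import PdfReader  # imported lazily; requires pypdf

    fields = PdfReader(pdf_path).get_fields() or {}
    # "/FT" holds the field type, e.g. "/Tx" for text or "/Btn" for buttons.
    return {name: field.get("/FT") for name, field in fields.items()}


def unknown_fields(available, values):
    """Return the keys in `values` that are not field names in the form."""
    return sorted(set(values) - set(available))


# Made-up field names; in practice use list_form_fields("form.pdf").
available = {"field_name": "/Tx", "date_field": "/Tx"}
values = {"field_name": "value", "date_feild": "2026-04-01"}  # note the typo
print(unknown_fields(available, values))
```

An empty result from `unknown_fields` suggests every key in the value dictionary matches a real field.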

Typography Standards

Element	Font	Size	Weight
Title	Sans-serif (Helvetica/Arial)	24-32pt	Bold
Subtitle	Sans-serif	14-18pt	Regular
H1	Sans-serif	18-20pt	Bold
H2	Sans-serif	14-16pt	Bold
Body	Serif (Times/Georgia) or Sans	10-12pt	Regular
Caption	Sans-serif	8-9pt	Italic
Code	Monospace (Courier)	9-10pt	Regular
Footer	Sans-serif	8pt	Regular
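
As a rough sketch, the table can be turned into ReportLab paragraph styles. The sizes below are single values picked from within each range, the fonts are ReportLab's built-in standard fonts, and the 1.2 leading factor is a common default rather than something this document specifies.

```python
# (font name, size in pt) chosen from the ranges in the table above.
TYPOGRAPHY = {
    "Title": ("Helvetica-Bold", 28),
    "Subtitle": ("Helvetica", 16),
    "H1": ("Helvetica-Bold", 18),
    "H2": ("Helvetica-Bold", 14),
    "Body": ("Times-Roman", 11),
    "Caption": ("Helvetica-Oblique", 8.5),
    "Code": ("Courier", 9),
    "Footer": ("Helvetica", 8),
}


def leading_for(size, factor=1.2):
    """Line spacing for a font size; the factor is an assumed default."""
    return round(size * factor, 1)


def build_styles():
    """Create one ReportLab ParagraphStyle per element (requires reportlab)."""
    from reportlab.lib.styles import ParagraphStyle

    return {
        name: ParagraphStyle(name, fontName=font, fontSize=size,
                             leading=leading_for(size))
        for name, (font, size) in TYPOGRAPHY.items()
    }
```

With reportlab installed, `build_styles()["Body"]` can be passed as the style argument of a `Paragraph`.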

Preview Command

# PDF → PNG pages for visual review (pdftoppm does not create the output directory)
mkdir -p /tmp/preview
pdftoppm -png -r 150 output.pdf /tmp/preview/page

Common Pitfalls

Pitfall	Solution
Fonts display wrong on recipient's machine	Embed fonts in ReportLab: `pdfmetrics.registerFont(TTFont(...))`
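Tables split mid-row across pages	Wrap short tables in ReportLab `KeepTogether`; for long tables, set `repeatRows=1` to repeat the header and style the rows around a page break with the `splitfirst`/`splitlast` row indices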
Images blurry	Use original resolution; don't downscale before embedding
Cover CSS renders differently	Test with same Chrome version used for production
Chinese/Japanese text missing	Register CJK fonts (SimSun, MS Gothic) with ReportLab
Large file size	Compress images before embedding; use JPEG for photos, PNG for diagrams

Bundled Resources

Scripts

Script	Purpose
`scripts/palette.py`	Generate design tokens (colors, fonts) for a document type
`scripts/cover.py`	Generate HTML cover page from design tokens
`scripts/render_cover.js`	Render HTML cover to PDF via Chrome/Playwright
`scripts/render_body.py`	Generate body PDF via ReportLab from content blocks
`scripts/merge.py`	Merge cover.pdf + body.pdf into final.pdf
`scripts/fill_inspect.py`	Inspect PDF form field names and types
`scripts/fill_write.py`	Fill PDF form fields programmatically
`scripts/reformat_parse.py`	Extract structure from existing PDF for reformatting
`scripts/make.sh`	End-to-end pipeline: palette → cover → body → merge
`scripts/md2pdf.py`	Convert Markdown to typeset PDF via ReportLab (10 themes, CJK, TOC, bookmarks, zero browser deps)
