Name: Pdf
Author: jitenkr2030


Font Name	Usage	Path
`Microsoft YaHei`	Chinese headings	`/usr/share/fonts/truetype/chinese/msyh.ttf`
`SimHei`	Chinese body text	`/usr/share/fonts/truetype/chinese/SimHei.ttf`
`SarasaMonoSC`	Chinese code blocks	`/usr/share/fonts/truetype/chinese/SarasaMonoSC-Regular.ttf`
`Times New Roman`	English text, numbers, tables	`/usr/share/fonts/truetype/english/Times-New-Roman.ttf`
`Calibri`	English alternative	`/usr/share/fonts/truetype/english/calibri-regular.ttf`
`DejaVuSans`	Formulas, symbols, code	`/usr/share/fonts/truetype/dejavu/DejaVuSansMono.ttf`
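Font registration is likely to fail if any of the files in the table above is absent, so it may help to check the paths first. The following is a minimal sketch using only the standard library; the helper name `missing_fonts` and the `FONT_PATHS` dictionary are hypothetical, and the paths are copied from the table.

```python
import os

# Font name -> file path, copied from the table above.
FONT_PATHS = {
    "Microsoft YaHei": "/usr/share/fonts/truetype/chinese/msyh.ttf",
    "SimHei": "/usr/share/fonts/truetype/chinese/SimHei.ttf",
    "SarasaMonoSC": "/usr/share/fonts/truetype/chinese/SarasaMonoSC-Regular.ttf",
    "Times New Roman": "/usr/share/fonts/truetype/english/Times-New-Roman.ttf",
    "Calibri": "/usr/share/fonts/truetype/english/calibri-regular.ttf",
    "DejaVuSans": "/usr/share/fonts/truetype/dejavu/DejaVuSansMono.ttf",
}


def missing_fonts(font_paths):
    """Return the names of fonts whose files do not exist on disk."""
    return [name for name, path in font_paths.items() if not os.path.isfile(path)]


if __name__ == "__main__":
    # Report every missing file before attempting any registration.
    for name in missing_fonts(FONT_PATHS):
        print(f"Missing font file for {name}: {FONT_PATHS[name]}")
```

Running the check before the registration template gives one clear message per missing file instead of a failure partway through the build.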

from reportlab.pdfbase import pdfmetrics
from reportlab.pdfbase.ttfonts import TTFont
from reportlab.pdfbase.pdfmetrics import registerFontFamily

# Chinese fonts
pdfmetrics.registerFont(TTFont('Microsoft YaHei', '/usr/share/fonts/truetype/chinese/msyh.ttf'))
pdfmetrics.registerFont(TTFont('SimHei', '/usr/share/fonts/truetype/chinese/SimHei.ttf'))
pdfmetrics.registerFont(TTFont("SarasaMonoSC", '/usr/share/fonts/truetype/chinese/SarasaMonoSC-Regular.ttf'))

# English fonts
pdfmetrics.registerFont(TTFont('Times New Roman', '/usr/share/fonts/truetype/english/Times-New-Roman.ttf'))
pdfmetrics.registerFont(TTFont('Calibri', '/usr/share/fonts/truetype/english/calibri-regular.ttf'))

# Symbol/Formula font
pdfmetrics.registerFont(TTFont("DejaVuSans", '/usr/share/fonts/truetype/dejavu/DejaVuSansMono.ttf'))

# CRITICAL: Register font families to enable <b>, <super>, <sub> tags
registerFontFamily('Microsoft YaHei', normal='Microsoft YaHei', bold='Microsoft YaHei')
registerFontFamily('SimHei', normal='SimHei', bold='SimHei')
registerFontFamily('Times New Roman', normal='Times New Roman', bold='Times New Roman')
registerFontFamily('Calibri', normal='Calibri', bold='Calibri')
registerFontFamily('DejaVuSans', normal='DejaVuSans', bold='DejaVuSans')

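from reportlab.lib.enums import TA_JUSTIFY
from reportlab.lib.styles import ParagraphStyle
from reportlab.platypus import Paragraph

body_style = ParagraphStyle(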
    name="BodyStyle",
    fontName="Times New Roman",
    fontSize=10.5,
    leading=18,
    alignment=TA_JUSTIFY,
)
story.append(Paragraph("My name is Lei Shen (<font name='SimHei'>沈磊</font>)", body_style))

import matplotlib.pyplot as plt
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False

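from reportlab.lib.enums import TA_CENTER
from reportlab.lib.styles import ParagraphStyle
from reportlab.platypus import Paragraph, Spacer, PageBreak

# Cover page styles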
cover_title_style = ParagraphStyle(
    name='CoverTitle',
    fontName='Microsoft YaHei',  # or 'Times New Roman' for English
    fontSize=42,
    leading=50,
    alignment=TA_CENTER,
    spaceAfter=36
)

cover_subtitle_style = ParagraphStyle(
    name='CoverSubtitle',
    fontName='SimHei',  # or 'Times New Roman' for English
    fontSize=20,
    leading=28,
    alignment=TA_CENTER,
    spaceAfter=48
)

cover_author_style = ParagraphStyle(
    name='CoverAuthor',
    fontName='SimHei',  # or 'Times New Roman' for English
    fontSize=14,
    leading=22,
    alignment=TA_CENTER,
    spaceAfter=18
)

# Cover page construction
story.append(Spacer(1, 120))  # Push down from top
story.append(Paragraph("报告主标题", cover_title_style))
story.append(Spacer(1, 36))
story.append(Paragraph("副标题或说明文字", cover_subtitle_style))
story.append(Spacer(1, 48))
story.append(Paragraph("作者姓名", cover_author_style))
story.append(Paragraph("所属机构", cover_author_style))
story.append(Spacer(1, 60))
story.append(Paragraph("2025年2月", cover_author_style))
story.append(PageBreak())  # Always page break after cover

from reportlab.lib import colors

TABLE_HEADER_COLOR = colors.HexColor('#1F4E79')  # Dark blue for header
TABLE_HEADER_TEXT = colors.white                  # White text for header
TABLE_ROW_EVEN = colors.white                     # White for even rows
TABLE_ROW_ODD = colors.HexColor('#F5F5F5')        # Light gray for odd rows

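from reportlab.lib import colors
from reportlab.lib.enums import TA_CENTER
from reportlab.lib.styles import ParagraphStyle
from reportlab.platypus import Paragraph, Table, TableStyle

# Define styles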
header_style = ParagraphStyle(
    name='TableHeader',
    fontName='Times New Roman',
    fontSize=11,
    textColor=colors.white,
    alignment=TA_CENTER
)

cell_style = ParagraphStyle(
    name='TableCell',
    fontName='Times New Roman',
    fontSize=10,
    textColor=colors.black,
    alignment=TA_CENTER
)

# For Chinese tables, add wordWrap="CJK"
tbl_center = ParagraphStyle(
    "tbl_center",
    fontName="SimHei",
    fontSize=9,
    leading=12,
    alignment=TA_CENTER,
    wordWrap="CJK",
)

# ✅ CORRECT: All table text content wrapped in Paragraph()
data = [
    [Paragraph('<b>Parameter</b>', header_style), Paragraph('<b>Unit</b>', header_style), Paragraph('<b>Value</b>', header_style)],
    [Paragraph('Temperature', cell_style), Paragraph('°C', cell_style), Paragraph('25.5', cell_style)],
    [Paragraph('Pressure', cell_style), Paragraph('Pa', cell_style), Paragraph('1.01 x 10<super>5</super>', cell_style)],
    [Paragraph('Density', cell_style), Paragraph('kg/m<super>3</super>', cell_style), Paragraph('1.225', cell_style)],
    [Paragraph('H<sub>2</sub>O Content', cell_style), Paragraph('%', cell_style), Paragraph('45.2', cell_style)],
]

# ❌ PROHIBITED: Plain strings
# data = [['<b>Header</b>', 'Value'], ['Pressure', '1.01 x 10<super>5</super>']]

table = Table(data, colWidths=[120, 80, 100])
table.setStyle(TableStyle([
    ('BACKGROUND', (0, 0), (-1, 0), colors.HexColor('#1F4E79')),
    ('TEXTCOLOR', (0, 0), (-1, 0), colors.white),
    # Alternate white / light gray across every body row, not only rows 1 and 2
    ('ROWBACKGROUNDS', (0, 1), (-1, -1), [colors.white, colors.HexColor('#F5F5F5')]),
    ('GRID', (0, 0), (-1, -1), 0.5, colors.grey),
    ('VALIGN', (0, 0), (-1, -1), 'MIDDLE'),
    ('LEFTPADDING', (0, 0), (-1, -1), 8),
    ('RIGHTPADDING', (0, 0), (-1, -1), 8),
    ('TOPPADDING', (0, 0), (-1, -1), 6),
    ('BOTTOMPADDING', (0, 0), (-1, -1), 6),
]))

from reportlab.lib.pagesizes import letter
from reportlab.platypus import SimpleDocTemplate
import os

pdf_filename = "financial_report_2024.pdf"
title_for_metadata = os.path.splitext(pdf_filename)[0]

doc = SimpleDocTemplate(
    pdf_filename,
    pagesize=letter,
    title=title_for_metadata,
    author='Z.ai',
    creator='Z.ai',
    subject='Annual financial analysis and performance metrics'
)

from pypdf import PdfReader, PdfWriter
import os

pdf_filename = "output.pdf"
title_for_metadata = os.path.splitext(os.path.basename(pdf_filename))[0]

reader = PdfReader("input.pdf")
writer = PdfWriter()

for page in reader.pages:
    writer.add_page(page)

# Set metadata: Title, Author, Creator are REQUIRED
writer.add_metadata({
    '/Title': title_for_metadata,  # MUST: Match filename
    '/Author': 'Z.ai',             # MUST: Set to "Z.ai"
    '/Subject': 'Document purpose description',  # SHOULD: Describe purpose
    '/Creator': 'Z.ai'             # MUST: Set to "Z.ai"
})

with open(pdf_filename, "wb") as output:
    writer.write(output)

doc.build(story)
print("PDF built")

python scripts/add_zai_metadata.py output.pdf

print("✓ PDF generated with Z.ai metadata")

from pypdf import PdfReader, PdfWriter

# Read a PDF
reader = PdfReader("document.pdf")
print(f"Pages: {len(reader.pages)}")

# Extract text
text = ""
for page in reader.pages:
    text += page.extract_text()

from pypdf import PdfWriter, PdfReader

writer = PdfWriter()
for pdf_file in ["doc1.pdf", "doc2.pdf", "doc3.pdf"]:
    reader = PdfReader(pdf_file)
    for page in reader.pages:
        writer.add_page(page)

with open("merged.pdf", "wb") as output:
    writer.write(output)

reader = PdfReader("input.pdf")
for i, page in enumerate(reader.pages):
    writer = PdfWriter()
    writer.add_page(page)
    with open(f"page_{i+1}.pdf", "wb") as output:
        writer.write(output)

reader = PdfReader("document.pdf")
meta = reader.metadata
print(f"Title: {meta.title}")
print(f"Author: {meta.author}")
print(f"Subject: {meta.subject}")
print(f"Creator: {meta.creator}")

# Add metadata to a single PDF (in-place)
python scripts/add_zai_metadata.py document.pdf

# Add metadata with custom title
python scripts/add_zai_metadata.py report.pdf -t "Q4 Financial Analysis"

# Batch process multiple PDFs
python scripts/add_zai_metadata.py *.pdf

reader = PdfReader("input.pdf")
writer = PdfWriter()

page = reader.pages[0]
page.rotate(90)  # Rotate 90 degrees clockwise
writer.add_page(page)

with open("rotated.pdf", "wb") as output:
    writer.write(output)

import pdfplumber

with pdfplumber.open("document.pdf") as pdf:
    for page in pdf.pages:
        text = page.extract_text()
        print(text)

with pdfplumber.open("document.pdf") as pdf:
    for i, page in enumerate(pdf.pages):
        tables = page.extract_tables()
        for j, table in enumerate(tables):
            print(f"Table {j+1} on page {i+1}:")
            for row in table:
                print(row)

Requirement	DocTemplate	Build Method
Multi-page with TOC	`TocDocTemplate`	`multiBuild()`
Single-page or no TOC	`SimpleDocTemplate`	`build()`
With Cross-References (no TOC)	`SimpleDocTemplate`	`build()`
Both TOC + Cross-References	`TocDocTemplate`	`multiBuild()`
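The table above reduces to a single decision: only a table of contents changes the template and build method, while cross-references alone do not. A minimal sketch of that rule follows; the function name `choose_doc_template` and its parameter are hypothetical, and it returns names as strings rather than the classes themselves.

```python
def choose_doc_template(has_toc):
    """Map the TOC requirement to (template class name, build method name).

    Cross-references do not affect the choice, so they are not a parameter.
    """
    if has_toc:
        return ("TocDocTemplate", "multiBuild")
    return ("SimpleDocTemplate", "build")


if __name__ == "__main__":
    print(choose_doc_template(has_toc=True))
    print(choose_doc_template(has_toc=False))
```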

from reportlab.platypus import Paragraph
from reportlab.lib.styles import getSampleStyleSheet

styles = getSampleStyleSheet()

# RIGHT: Use HTML-style tags in Paragraph objects with letter 'x' for multiplication
sci_notation = Paragraph("-1.246 x 10<super>8</super>", styles['Normal'])
chemical = Paragraph("H<sub>2</sub>O", styles['Normal'])
unit = Paragraph("W/m<super>2</super>", styles['Normal'])

# PROHIBITED: "K.G. Palepu" may break after "K.G."
text = Paragraph("Professors (K.G. Palepu) proposed...", style)

# RIGHT: Use non-breaking space (U+00A0) to prevent breaking
text = Paragraph("Professors (K.G.\u00A0Palepu) proposed...", style)
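When many names with initials appear in a document, inserting the non-breaking space by hand is easy to forget. The sketch below applies it with a regular expression before the text reaches `Paragraph`. The helper name `protect_initials` is hypothetical, and the pattern is a heuristic: it would also match a sentence that ends in a single capital letter followed by a capitalized word.

```python
import re

NBSP = "\u00A0"

# One or more "X." initials, then a space, then a capitalized word.
_INITIALS = re.compile(r"((?:[A-Z]\.)+) (?=[A-Z])")


def protect_initials(text):
    """Replace the space after initials such as 'K.G.' with a non-breaking space."""
    return _INITIALS.sub(lambda m: m.group(1) + NBSP, text)


if __name__ == "__main__":
    # repr() makes the inserted U+00A0 visible as \xa0.
    print(repr(protect_initials("Professors (K.G. Palepu) proposed...")))
```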

# RIGHT: Add wordWrap='CJK' for proper typography
styles.add(ParagraphStyle(
    name='BodyStyle',
    fontName='SimHei',
    fontSize=10.5,
    leading=18,
    alignment=TA_LEFT,
    wordWrap='CJK'  # Prevents orphaned punctuation
))

# PROHIBITED: Normal newline character does NOT create line breaks
text = Paragraph("Line 1\nLine 2\nLine 3", style)  # Will render as single line!

# RIGHT: Use <br/> tag for line breaks
text = Paragraph("Line 1<br/>Line 2<br/>Line 3", style)

# Alternative: Split into multiple Paragraph objects
story.append(Paragraph("Line 1", style))
story.append(Paragraph("Line 2", style))
story.append(Paragraph("Line 3", style))

from reportlab.lib.pagesizes import letter
from reportlab.pdfgen import canvas

c = canvas.Canvas("hello.pdf", pagesize=letter)
width, height = letter

# Add text
c.drawString(100, height - 100, "Hello World!")
c.drawString(100, height - 120, "This is a PDF created with reportlab")

# Add a line
c.line(100, height - 140, 400, height - 140)

# Save
c.save()

# ❌ PROHIBIT - DO NOT USE
toc_entries = [("1. Title", "5"), ("2. Section", "10")]
for entry, page in toc_entries:
    story.append(Paragraph(f"{entry} {'.'*50} {page}", style))

def add_heading(text, style, level=0):
    """Create heading with bookmark for auto-TOC"""
    p = Paragraph(text, style)
    p.bookmark_name = text
    p.bookmark_level = level
    p.bookmark_text = text
    return p

# Usage:
story.append(add_heading("1. Introduction", styles['Heading1'], 0))
story.append(Paragraph('Content...', styles['Normal']))

from reportlab.lib.pagesizes import letter
from reportlab.platypus import SimpleDocTemplate, Paragraph, PageBreak, Spacer
from reportlab.platypus.tableofcontents import TableOfContents
from reportlab.lib.styles import getSampleStyleSheet, ParagraphStyle
from reportlab.lib.units import inch

class TocDocTemplate(SimpleDocTemplate):
    def __init__(self, *args, **kwargs):
        SimpleDocTemplate.__init__(self, *args, **kwargs)

    def afterFlowable(self, flowable):
        """Capture TOC entries after each flowable is rendered"""
        if hasattr(flowable, 'bookmark_name'):
            level = getattr(flowable, 'bookmark_level', 0)
            text = getattr(flowable, 'bookmark_text', '')
            self.notify('TOCEntry', (level, text, self.page))

# Create document
doc = TocDocTemplate("document.pdf", pagesize=letter)
story = []
styles = getSampleStyleSheet()

# Create Table of Contents
toc = TableOfContents()
toc.levelStyles = [
    ParagraphStyle(name='TOCHeading1', fontSize=14, leftIndent=20,
                   fontName='Times New Roman'),
    ParagraphStyle(name='TOCHeading2', fontSize=12, leftIndent=40,
                   fontName='Times New Roman'),
]
story.append(Paragraph("<b>Table of Contents</b>", styles['Title']))
story.append(Spacer(1, 0.2*inch))
story.append(toc)
story.append(PageBreak())

# Helper function: Create heading with TOC bookmark
def add_heading(text, style, level=0):
    p = Paragraph(text, style)
    p.bookmark_name = text
    p.bookmark_level = level
    p.bookmark_text = text
    return p

# Chapter 1: Introduction
story.append(add_heading("Chapter 1: Introduction", styles['Heading1'], 0))
story.append(Paragraph("This is the introduction chapter with some example content.",
                       styles['Normal']))
story.append(Spacer(1, 0.2*inch))

story.append(add_heading("1.1 Background", styles['Heading2'], 1))
story.append(Paragraph("Background information goes here.", styles['Normal']))


# Chapter 2: Conclusion
story.append(add_heading("Chapter 2: Conclusion", styles['Heading1'], 0))
story.append(Paragraph("This concludes our document.", styles['Normal']))
story.append(Spacer(1, 0.2*inch))

story.append(add_heading("2.1 Summary", styles['Heading2'], 1))
story.append(Paragraph("Summary of the document.", styles['Normal']))

# Build the document (must use multiBuild for TOC to work)
doc.multiBuild(story)

print("PDF with Table of Contents created successfully!")

from reportlab.lib.pagesizes import letter
from reportlab.lib.styles import getSampleStyleSheet, ParagraphStyle
from reportlab.lib.enums import TA_CENTER
from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer, PageBreak
from reportlab.lib import colors
from reportlab.lib.units import inch  # required for 0.2*inch and 2*inch below
from reportlab.platypus import Table, TableStyle


class CrossReferenceDocument:
    """Manages cross-references throughout the document"""

    def __init__(self):
        self.figures = {}
        self.tables = {}
        self.refs = {}
        self.figure_counter = 0
        self.table_counter = 0
        self.ref_counter = 0

    def add_figure(self, name):
        """Add a figure and return its number"""
        if name not in self.figures:
            self.figure_counter += 1
            self.figures[name] = self.figure_counter
        return self.figures[name]

    def add_table(self, name):
        """Add a table and return its number"""
        if name not in self.tables:
            self.table_counter += 1
            self.tables[name] = self.table_counter
        return self.tables[name]

    def add_reference(self, name):
        """Add a reference and return its number"""
        if name not in self.refs:
            self.ref_counter += 1
            self.refs[name] = self.ref_counter
        return self.refs[name]


def build_document():
    doc = SimpleDocTemplate("cross_ref.pdf", pagesize=letter)
    xref = CrossReferenceDocument()
    styles = getSampleStyleSheet()

    # Caption style
    styles.add(ParagraphStyle(
        name='Caption',
        parent=styles['Normal'],
        alignment=TA_CENTER,
        fontSize=10,
        textColor=colors.HexColor('#333333')
    ))

    story = []

    # Step 1: Register all figures, tables, and references FIRST
    fig1 = xref.add_figure('sample')
    table1 = xref.add_table('data')
    ref1 = xref.add_reference('author2024')

    # Step 2: Use them in text
    intro = f"""
    See Figure {fig1} for details and Table {table1} for data<super>[{ref1}]</super>.
    """
    story.append(Paragraph(intro, styles['Normal']))
    story.append(Spacer(1, 0.2*inch))

    # Step 3: Create figures and tables with numbered captions
    story.append(Paragraph(f"<b>Figure {fig1}.</b> Sample Figure Caption",
        styles['Caption']
    ))

    # Table example
    header_style = ParagraphStyle(
        name='TableHeader',
        fontName='Times New Roman',
        fontSize=11,
        textColor=colors.white,
        alignment=TA_CENTER
    )

    cell_style = ParagraphStyle(
        name='TableCell',
        fontName='Times New Roman',
        fontSize=10,
        textColor=colors.black,
        alignment=TA_CENTER
    )

    # All text content wrapped in Paragraph() 
    data = [
        [Paragraph('<b>Item</b>', header_style), Paragraph('<b>Value</b>', header_style)],
        [Paragraph('A', cell_style), Paragraph('10', cell_style)],
        [Paragraph('B', cell_style), Paragraph('20', cell_style)],
    ]
    t = Table(data, colWidths=[2*inch, 2*inch])
    t.setStyle(TableStyle([
        ('BACKGROUND', (0, 0), (-1, 0), colors.HexColor('#1F4E79')),
        ('TEXTCOLOR', (0, 0), (-1, 0), colors.white),
        ('ALIGN', (0, 0), (-1, -1), 'CENTER'),
        ('GRID', (0, 0), (-1, -1), 0.5, colors.grey),
    ]))
    story.append(t)
    story.append(Spacer(1, 6))
    story.append(Paragraph(f"<b>Table {table1}.</b> Sample Data Table",
        styles['Caption']
    ))

    story.append(PageBreak())

    # Step 4: Reference again in discussion
    discussion = f"""
    As shown in Figure {fig1} and Table {table1}, results are clear<super>[{ref1}]</super>.
    """
    story.append(Paragraph(discussion, styles['Normal']))

    # Step 5: Bibliography section
    story.append(PageBreak())
    story.append(Paragraph("<b>References</b>", styles['Heading1']))
    story.append(Paragraph(
        f"[{ref1}] Author, A. (2024). Example Reference. <i>Journal Name</i>.",
        styles['Normal']
    ))

    doc.build(story)
    print("PDF with cross-references created!")


if __name__ == '__main__':
    build_document()

# Extract text
pdftotext input.pdf output.txt

# Extract text preserving layout
pdftotext -layout input.pdf output.txt

# Extract specific pages
pdftotext -f 1 -l 5 input.pdf output.txt  # Pages 1-5

# Merge PDFs
qpdf --empty --pages file1.pdf file2.pdf -- merged.pdf

# Split pages
qpdf input.pdf --pages . 1-5 -- pages1-5.pdf
qpdf input.pdf --pages . 6-10 -- pages6-10.pdf

# Rotate pages
qpdf input.pdf output.pdf --rotate=+90:1  # Rotate page 1 by 90 degrees

# Remove password
qpdf --password=mypassword --decrypt encrypted.pdf decrypted.pdf

# Add metadata to a single PDF (in-place)
python scripts/add_zai_metadata.py document.pdf

# Add metadata to a single PDF (create new file)
python scripts/add_zai_metadata.py input.pdf -o output.pdf

# Add metadata with custom title
python scripts/add_zai_metadata.py report.pdf -t "Q4 Financial Analysis"

# Batch process all PDFs in current directory
python scripts/add_zai_metadata.py *.pdf

# Quiet mode (no output)
python scripts/add_zai_metadata.py document.pdf -q

# Show help
python scripts/add_zai_metadata.py --help

# Requires: pip install pytesseract pdf2image
import pytesseract
from pdf2image import convert_from_path

images = convert_from_path('scanned.pdf')

text = ""
for i, image in enumerate(images):
    text += f"Page {i+1}:\n"
    text += pytesseract.image_to_string(image)
    text += "\n\n"

print(text)

from pypdf import PdfReader, PdfWriter

watermark = PdfReader("watermark.pdf").pages[0]
reader = PdfReader("document.pdf")
writer = PdfWriter()

for page in reader.pages:
    page.merge_page(watermark)
    writer.add_page(page)

with open("watermarked.pdf", "wb") as output:
    writer.write(output)

from pypdf import PdfReader, PdfWriter

reader = PdfReader("input.pdf")
writer = PdfWriter()

for page in reader.pages:
    writer.add_page(page)

writer.encrypt("userpassword", "ownerpassword")

with open("encrypted.pdf", "wb") as output:
    writer.write(output)

# WRONG: "\n" does NOT create line breaks
text = Paragraph("Line 1\nLine 2", style)  # Renders as single line

# RIGHT: Use <br/> for line breaks
text = Paragraph("Line 1<br/>Line 2", style)

from PIL import Image as PILImage
from reportlab.platypus import Image

pil_img = PILImage.open('image.png')
orig_w, orig_h = pil_img.size
target_width = 400
scale = target_width / orig_w
img = Image('image.png', width=target_width, height=orig_h * scale)
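The snippet above scales to a fixed width only, so a tall image could still exceed the page height. A minimal sketch of a helper that caps both dimensions while keeping the aspect ratio follows; the name `fit_size` and its parameters are hypothetical, and it is pure arithmetic with no reportlab dependency.

```python
def fit_size(orig_w, orig_h, max_w, max_h):
    """Scale (orig_w, orig_h) to fit inside (max_w, max_h), keeping aspect ratio.

    The scale factor is capped at 1.0, so small images are never enlarged.
    """
    scale = min(max_w / orig_w, max_h / orig_h, 1.0)
    return orig_w * scale, orig_h * scale


if __name__ == "__main__":
    print(fit_size(800, 600, 400, 1000))
```

The returned pair can be passed as `width` and `height` to `Image(...)` in place of the manually computed values.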

# ✅ CORRECT patterns to verify:
Paragraph('<b>Bold text</b>', style)           # Rich text in Paragraph
Paragraph('H<sub>2</sub>O', style)             # Subscript in Paragraph
Paragraph('10<super>8</super>', style)         # Superscript in Paragraph
data = [[Paragraph('Cell', style), ...], ...]  # Table cells as Paragraph

# ❌ PROHIBITED patterns to catch and fix:
'<b>Bold text</b>'                             # Plain string with tags
['<b>Header</b>', 'Value']                     # Table row with plain strings
f"Value: {x}<super>2</super>"                  # f-string with tags (not in Paragraph)
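The pre-build check above can be partly automated by scanning table data for plain strings that contain markup. This is a minimal sketch under stated assumptions: the helper name `find_unwrapped_cells` and the tag list are hypothetical, and it flags only strings containing tags. A stricter check could flag every plain string cell.

```python
import re

# Tags that render only when the text is wrapped in a Paragraph.
_TAG = re.compile(r"</?(b|i|super|sub|br|font)\b[^>]*>")


def find_unwrapped_cells(data):
    """Return (row, col) positions of plain-string cells that contain markup tags."""
    hits = []
    for r, row in enumerate(data):
        for c, cell in enumerate(row):
            # Paragraph objects are not str instances, so they are skipped.
            if isinstance(cell, str) and _TAG.search(cell):
                hits.append((r, c))
    return hits


if __name__ == "__main__":
    bad = [['<b>Header</b>', 'Value'], ['Pressure', '1.01 x 10<super>5</super>']]
    print(find_unwrapped_cells(bad))
```

An empty result means no tagged plain strings were found; any reported position should be wrapped in `Paragraph()` before the table is built.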

Task	Best Tool	Command/Code
Merge PDFs	pypdf	`writer.add_page(page)`
Split PDFs	pypdf	One page per file
Extract text	pdfplumber	`page.extract_text()`
Extract tables	pdfplumber	`page.extract_tables()`
Create PDFs	reportlab	Canvas or Platypus
Command line merge	qpdf	`qpdf --empty --pages ...`
OCR scanned PDFs	pytesseract	Convert to image first
Fill PDF forms	pdf-lib or pypdf (see forms.md)	See forms.md

User Input	Execution Rule
Explicit count (e.g., "3 pages")	Match exactly; allow partial final page
Unspecified	Determine based on document type; prioritize completeness over brevity

Pdf

PDF Processing Guide

Overview

Core Constraints (Must Follow)

1. Output Language

2. Page Count Control

3. Structure Compliance (Mandatory)

4. Information Sourcing Requirements

CRITICAL: Verify Before Writing

Font Setup (Guaranteed Success Method)

Allowed Fonts Only

Font Registration Template

Font Configuration by Document Type

Chinese Plot PNG Method

Available Font Paths

Guidelines for Output

Layout & Spacing Control

Page Break Rules

Vertical Spacing Standards

Cover Page Specifications

Alignment and Typography

Style Configuration

Table Formatting

Standard Table Color Scheme

Table Cell Paragraph Wrapping (MANDATORY)

PDF Metadata (REQUIRED)

For reportlab (Creating New PDFs)

For pypdf (Modifying Existing PDFs)

Quick Start

Python Libraries

pypdf - Basic Operations

Merge PDFs

Split PDF

Extract Metadata

Set/Update Metadata (Z.ai Branding)

Rotate Pages

pdfplumber - Text and Table Extraction

Extract Text with Layout

Extract Tables

reportlab - Create PDFs

Choosing the Right DocTemplate and Build Method

Handling Scientific Notation and Special Characters

Preventing Unwanted Line Breaks

Basic PDF Creation

Auto-Generated Table of Contents

⚠️ CRITICAL WARNINGS

❌ FORBIDDEN: Manual Table of Contents

Complete TOC Implementation Example

Cross-References (Figures, Tables, Bibliography)

Command-Line Tools

pdftotext (poppler-utils)

qpdf

Common Tasks

Brand PDFs with Z.ai Metadata

Extract Text from Scanned PDFs

Add Watermark

Password Protection

Critical Reminders (MUST Follow)

Font Rules

Rich Text Tags (<b>, <super>, <sub>)

Line Breaks in Paragraph

Body Title & Text content Styles

Table Cell Content Rule (MANDATORY)

Table Style Specifications

Document Structure

Image Handling

Code Quality

Paragraph Wrapping Validation (MANDATORY PRE-BUILD CHECK)

Quick Reference

Next Steps
