Markdown-to-PDF pipeline via Pandoc and LuaLaTeX with emoji rendering, dual output, and print-ready formatting
End-to-end book production pipeline: Markdown → Pandoc → LuaLaTeX → dual PDF (print + digital).
Scope: Inheritable skill. Covers the complete pipeline for producing professional-quality PDF books from Markdown source, including emoji handling, dual format output, and print-ready configuration.
Markdown Source
↓
Pandoc (--pdf-engine=lualatex)
↓
LuaLaTeX Engine
↓
├── Print PDF (twoside, crop marks, ISBN)
└── Digital PDF (oneside, hyperlinks, bookmarks)
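The two branches can be driven from one small script. The sketch below is an illustration under assumptions, not the pipeline's actual build script: the file names, the `pandoc_command` helper, and the choice of template variables are hypothetical. It only builds the argument lists; crop marks and the ISBN page would need template-level additions that are not shown here.

```python
import shlex

# Options shared by both variants (documentclass=book is an assumption).
COMMON = ["--pdf-engine=lualatex", "-V", "documentclass=book"]

def pandoc_command(source, output, variant):
    """Return the pandoc argument list for the 'print' or 'digital' variant."""
    if variant == "print":
        # Two-sided layout; links printed as footnotes since paper has no hyperlinks.
        extra = ["-V", "classoption=twoside", "-V", "links-as-notes=true"]
    elif variant == "digital":
        # One-sided layout with coloured, clickable links.
        extra = ["-V", "classoption=oneside", "-V", "colorlinks=true"]
    else:
        raise ValueError("unknown variant: %r" % variant)
    return ["pandoc", source, "-o", output, *COMMON, *extra]

print_cmd = pandoc_command("book.md", "book-print.pdf", "print")
digital_cmd = pandoc_command("book.md", "book-digital.pdf", "digital")

for cmd in (print_cmd, digital_cmd):
    print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once Pandoc and LuaLaTeX are installed.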
| Tool | Version | Purpose |
|---|---|---|
| Pandoc | 3.x+ | Markdown → LaTeX conversion |
| LuaLaTeX | TeX Live 2024+ | PDF rendering (Unicode-native) |
| Twemoji | Latest | Cross-platform emoji rendering |
| needspace package | LaTeX | Orphan/widow prevention |
Why LuaLaTeX: It supports Unicode natively. XeLaTeX does as well, but LuaLaTeX handles emoji processing more reliably in combination with Lua filters.
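A Zero Width Joiner (ZWJ, U+200D) sequence joins several emoji into one glyph. Shorter sequences are prefixes of longer ones, so the order in which replacements are applied is critical:
| Emoji | Codepoints | Length (code points) |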
|---|---|---|
| 👨👩👧👦 | U+1F468 U+200D U+1F469 U+200D U+1F467 U+200D U+1F466 | 7 |
| 👨👩👧 | U+1F468 U+200D U+1F469 U+200D U+1F467 | 5 |
| 👨👩 | U+1F468 U+200D U+1F469 | 3 |
| 👨 | U+1F468 | 1 |
CRITICAL RULE: The emoji replacement map MUST be sorted by length descending (longest sequences first). If you process 👨 before 👨👩👧👦, the family emoji gets partially replaced and corrupts the output.
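The failure mode is easy to reproduce. The following is a minimal sketch, not the pipeline's own code: `replace_all` and the `<man>` / `<family>` placeholders are hypothetical stand-ins for the image substitution.

```python
ZWJ = "\u200d"
MAN, WOMAN, GIRL, BOY = "\U0001F468", "\U0001F469", "\U0001F467", "\U0001F466"
FAMILY = ZWJ.join([MAN, WOMAN, GIRL, BOY])  # 7 code points

# Hypothetical replacement table: sequence -> placeholder for an image.
replacements = {MAN: "<man>", FAMILY: "<family>"}

def replace_all(text, ordered_sequences):
    """Apply the replacements in exactly the order given."""
    for seq in ordered_sequences:
        text = text.replace(seq, replacements[seq])
    return text

text = "A " + FAMILY + " and a " + MAN

# Wrong: shortest first. The leading U+1F468 of the family is consumed,
# so the remaining ZWJ chain no longer matches FAMILY.
wrong = replace_all(text, sorted(replacements, key=len))

# Right: longest first (length-descending by code point count).
right = replace_all(text, sorted(replacements, key=len, reverse=True))

print(right)  # A <family> and a <man>
```

In the `wrong` result the family is never replaced: a `<man>` placeholder is followed by an orphaned ZWJ and the remaining three emoji.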
Create an explicit emoji-map.json that controls all replacements:
{
"metadata": {
"version": "1.0",
"source": "Twemoji",
"sortOrder": "length-descending"
},
"emojis": [
{
"sequence": "👨👩👧👦",
"codepoints": "1f468-200d-1f469-200d-1f467-200d-1f466",
"image": "1f468-200d-1f469-200d-1f467-200d-1f466.png",
"length": 7
}
]
}
Rule: Never rely on automatic emoji detection. Use an explicit map file that you control and sort.
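One way to keep the map both explicit and correctly ordered is to generate it from a hand-maintained list of sequences. The sketch below assumes the entry shape shown above; `build_entry` and `build_emoji_map` are hypothetical helper names, and the image file naming (hyphen-joined lowercase hex plus `.png`) simply mirrors the example entry rather than any verified Twemoji convention.

```python
import json

def build_entry(sequence):
    """Describe one emoji sequence in the shape used by emoji-map.json."""
    codepoints = "-".join(f"{ord(ch):x}" for ch in sequence)
    return {
        "sequence": sequence,
        "codepoints": codepoints,
        "image": codepoints + ".png",  # assumed naming, mirrors the example entry
        "length": len(sequence),       # number of code points, ZWJ included
    }

def build_emoji_map(sequences):
    """Return the full map with entries sorted longest-first."""
    entries = [build_entry(s) for s in set(sequences)]
    # Length descending; ties broken by codepoints so the output is stable.
    entries.sort(key=lambda e: (-e["length"], e["codepoints"]))
    return {
        "metadata": {
            "version": "1.0",
            "source": "Twemoji",
            "sortOrder": "length-descending",
        },
        "emojis": entries,
    }

man = "\U0001F468"
family = "\u200d".join(["\U0001F468", "\U0001F469", "\U0001F467", "\U0001F466"])
emoji_map = build_emoji_map([man, family])

# ensure_ascii=False keeps the emoji readable in the written file.
print(json.dumps(emoji_map, ensure_ascii=False, indent=2))
```

Sorting at generation time means the filter can trust the file order and never has to re-sort at build time.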
Embed Twemoji images as base64 data URIs in the image sources the filter emits, so the build has no external image file dependencies:
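-- Pandoc Lua filter for emoji replacement.
-- Assumes emoji_map was loaded beforehand and is sorted length-descending;
-- each entry provides .sequence and .base64_data_uri.
local function replace_emoji(text)
  for _, entry in ipairs(emoji_map) do
    -- Plain find (4th argument true): the sequence is not treated as a Lua pattern
    local s, e = text:find(entry.sequence, 1, true)
    if s then
      local img = pandoc.Image({}, entry.base64_data_uri)
      -- Set size to match surrounding text
      img.attributes.height = "1em"
      -- Keep the text around the match and process it for further emoji
      local out = replace_emoji(text:sub(1, s - 1))
      out:insert(img)
      out:extend(replace_emoji(text:sub(e + 1)))
      return out
    end
  end
  -- No emoji left: return the text unchanged (nothing for an empty string)
  return text == "" and pandoc.Inlines({}) or pandoc.Inlines({ pandoc.Str(text) })
end

function Str(elem)
  return replace_emoji(elem.text)
end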
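The filter reads a ready-made `base64_data_uri` from each map entry. A value of that kind could be produced at map-generation time roughly as follows. This is a sketch under assumptions: `png_to_data_uri` is a hypothetical helper, and the 8-byte PNG signature stands in for the contents of a real Twemoji file.

```python
import base64

def png_to_data_uri(png_bytes):
    """Encode raw PNG bytes as a data: URI usable as an image source."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return "data:image/png;base64," + encoded

# Placeholder input: the PNG file signature, not a complete image.
sample = b"\x89PNG\r\n\x1a\n"
uri = png_to_data_uri(sample)

print(uri)  # data:image/png;base64,iVBORw0KGgo=
```

In a real build the bytes would come from reading the image named in the entry's `image` field, and the result would be stored on the entry before the map is handed to the filter.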
Windows' built-in emoji font (Segoe UI Emoji) has no flag glyphs, so flag emoji (🇺🇸, 🇬🇧, etc.) appear broken or missing in most Windows applications. How flags render in each context:
| Context | Result |
|---|---|
| Twemoji replacement in PDF | Full flag rendering |
| HTML output with Twemoji CSS | Full flag rendering |
| Windows terminal/editor | Broken or missing flags |
Rule: Always preview emoji-heavy content in the PDF output, not in the editor.
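Flags fit the same map-based approach because each flag is a pair of Regional Indicator Symbols, one per letter of the country code. The sketch below shows how the pair is derived, which is useful when adding flag entries to the map; `flag_emoji` and `codepoint_name` are hypothetical helper names.

```python
REGIONAL_INDICATOR_A = 0x1F1E6  # U+1F1E6 REGIONAL INDICATOR SYMBOL LETTER A

def flag_emoji(country_code):
    """Build a flag emoji from a two-letter ISO 3166-1 alpha-2 code."""
    code = country_code.upper()
    if len(code) != 2 or not code.isascii() or not code.isalpha():
        raise ValueError("expected a two-letter code, got %r" % country_code)
    # Shift each letter A-Z into the regional indicator range.
    return "".join(chr(REGIONAL_INDICATOR_A + ord(c) - ord("A")) for c in code)

def codepoint_name(sequence):
    """Hyphen-joined lowercase hex, as used for the map's codepoints field."""
    return "-".join(f"{ord(ch):x}" for ch in sequence)

us = flag_emoji("US")
gb = flag_emoji("gb")

print(codepoint_name(us))  # 1f1fa-1f1f8
print(codepoint_name(gb))  # 1f1ec-1f1e7
```

A flag has length 2, so in a length-descending map it sorts after the ZWJ sequences and before single-code-point emoji.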
# pandoc-print.yaml
pdf-engine: lualatex