Run the complete PDF processing pipeline to convert the manual PDF into Next.js application data. This command executes all steps automatically: split, render, extract, translate, build, and manifest.
MANDATORY: You MUST follow the documented translation process EXACTLY as written in this skill. NO EXCEPTIONS.
- Do NOT use Codex (mcp__codex__spawn_agents_parallel or any codex tools) for translation
- Use the Task tool with subagent_type="manual-translator" as documented

This process has been run many times successfully. Trust the documentation and execute it exactly as written.
Run with a manual slug:
/l-pdf-process <slug>
slug: Manual slug (e.g., oxi-one-mk2, oxi-coral)

/l-pdf-process oxi-one-mk2
/l-pdf-process oxi-coral
This will execute all pipeline steps in order:
Phase 1: PDF Processing
- Register the manual in lib/manual-registry.ts for the Next.js build

Phase 2: Verification (AI-Powered)
- Run pnpm build for the production build

The entire process takes approximately 20-40 minutes for a 280-page manual.
BEFORE starting the pipeline, Claude Code MUST perform these steps:
Before any processing, ask the user for manifest metadata using AskUserQuestion:
Required information:
Use AskUserQuestion tool to ask for brand name:
(Run grep '"brand"' public/*/data/manifest.json to find existing brands):
Use AskUserQuestion tool to ask for PDF title:
Environment variable required: TAKAZUDO_MODULAR_REPO_PATH must be set in .env
Read product data from takazudomodular repo:
# Read product slugs from product-master-data.mjs
grep -o "slug: '[^']*'" ${TAKAZUDO_MODULAR_REPO_PATH}/src/data/product-master-data.mjs | \
sed "s/slug: '//g" | sed "s/'//g"
Auto-detect matching product based on manual slug:
(e.g., oxi-e16-manual → oxi-e16)
Ask the user to confirm or select the product:
Example logic:
const fs = require('node:fs');
// Read .env
const envPath = '/Users/takazudo/repos/personal/zmanuals/.env';
const envContent = fs.readFileSync(envPath, 'utf8');
const repoPath = envContent.match(/TAKAZUDO_MODULAR_REPO_PATH=(.+)/)?.[1];
// Read product-master-data.mjs
const productDataPath = `${repoPath}/src/data/product-master-data.mjs`;
const productData = fs.readFileSync(productDataPath, 'utf8');
// Extract all product slugs
const slugMatches = productData.match(/slug: '([^']+)'/g);
const productSlugs = slugMatches?.map(s => s.replace("slug: '", '').replace("'", ''));
// Find matching product for manual slug
const manualSlug = 'oxi-e16-manual'; // from command argument
const baseSlug = manualSlug.replace(/-manual|-quick-start|-guide/g, '');
const matchedProduct = productSlugs?.find(p => baseSlug.includes(p) || p.includes(baseSlug));
Note: All three questions can be asked in a single AskUserQuestion call with multiple questions.
Store all values for use in manifest creation (Step 7).
Example existing manifests:
# Check existing manifests for reference
grep -E '"(brand|title)"' public/*/data/manifest.json
# OXI Instruments - OXI ONE MKII, OXI Coral, OXI E16
# ADDAC System - ADDAC112
Then perform these validation steps:
# 1. Extract slug from command arguments
SLUG=$1
# 2. Validate slug is provided
if [ -z "$SLUG" ]; then
echo "Error: Manual slug required"
echo "Usage: /l-pdf-process <slug>"
echo ""
echo "Examples:"
echo " /l-pdf-process oxi-one-mk2"
echo " /l-pdf-process oxi-coral"
exit 1
fi
# 3. Validate slug format (only lowercase letters, numbers, and hyphens)
if ! [[ "$SLUG" =~ ^[a-z0-9-]+$ ]]; then
echo "Error: Invalid slug format: $SLUG"
echo "Slug must contain only lowercase letters, numbers, and hyphens"
echo ""
echo "Valid examples:"
echo " oxi-one-mk2"
echo " oxi-coral"
exit 1
fi
# 4. Check source directory exists
if [ ! -d "manual-pdf/$SLUG" ]; then
echo "Error: Source directory not found: manual-pdf/$SLUG"
echo ""
echo "Please create the directory and add a PDF file:"
echo " mkdir -p manual-pdf/$SLUG"
echo " cp /path/to/manual.pdf manual-pdf/$SLUG/"
exit 1
fi
# 5. Check if PDF file exists in source directory
PDF_COUNT=$(find "manual-pdf/$SLUG" -maxdepth 1 -name "*.pdf" | wc -l)
if [ "$PDF_COUNT" -eq 0 ]; then
echo "Error: No PDF file found in manual-pdf/$SLUG"
echo ""
echo "Please add a PDF file to the directory:"
echo " cp /path/to/manual.pdf manual-pdf/$SLUG/"
exit 1
fi
# 6. All validations passed - proceed with pipeline
echo "Validation successful"
echo "Processing manual: $SLUG"
echo ""
THEN run the pipeline with the slug parameter:
pnpm run pdf:all --slug "$SLUG"
Translation Quality:
- Paragraph breaks (\n\n) for better readability

The pipeline consists of the following steps. Users should not invoke these individually; they are documented here for Claude Code's internal use only.
ALWAYS run this first to ensure clean state:
- pnpm run pdf:clean --slug <slug> - Remove all generated files for the specified manual

This ensures no stale data from previous runs interferes with the new processing.
These steps can be run directly using pnpm with the --slug parameter:
- pnpm run pdf:split --slug <slug> - Split PDF into parts (30 pages each)
- pnpm run pdf:render --slug <slug> - Render pages to PNG images (150 DPI)
- pnpm run pdf:extract --slug <slug> - Extract text from PDFs

Note: All commands now require the --slug parameter to specify which manual to process.
IMPORTANT: Translation uses Claude Code's Task tool to spawn manual-translator subagents.
DO NOT stop to ask questions during this process. Execute completely as documented.
Key Optimization: Subagents write translation files directly instead of returning the full text to the main agent. This significantly reduces token consumption in the main agent's context.
Workflow:
// Prepare all page file paths
const slug = 'oxi-coral';
const totalPages = 46;
const workers = [];
// Spawn 5 concurrent workers
const MAX_CONCURRENT = 5;
for (let i = 0; i < Math.min(MAX_CONCURRENT, totalPages); i++) {
const pageNum = i + 1;
workers.push(spawnTranslationWorker(slug, pageNum, totalPages));
}
// Continue spawning workers as they complete
let nextPage = MAX_CONCURRENT + 1;
while (workers.some(w => w)) {
for (let i = 0; i < workers.length; i++) {
if (workers[i] && checkCompleted(workers[i])) {
if (nextPage <= totalPages) {
workers[i] = spawnTranslationWorker(slug, nextPage++, totalPages);
} else {
workers[i] = null;
}
}
}
}
// Verify all files exist
const failures = verifyTranslationFiles(slug, totalPages);
// Retry failures
if (failures.length > 0) {
retryFailedPages(failures);
}
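The spawn loop above polls worker state in a tight loop, which is fine as pseudocode but would spin in a real script. As a minimal sketch of the same "at most 5 in flight" idea, the helper below uses promises instead. `runPagePool` and its `task` callback are hypothetical names introduced for illustration; in this skill the real work is done by Task tool subagents, not by a Node.js function.

```javascript
// Hypothetical helper: runs `task(pageNum)` for pages 1..totalPages while
// keeping at most `maxConcurrent` tasks in flight at any time.
async function runPagePool(totalPages, maxConcurrent, task) {
  const results = new Map(); // pageNum -> { ok, value } or { ok, error }
  let nextPage = 1;

  // Each lane claims the next unprocessed page until none remain.
  async function lane() {
    while (nextPage <= totalPages) {
      const pageNum = nextPage++;
      try {
        results.set(pageNum, { ok: true, value: await task(pageNum) });
      } catch (error) {
        // A failed page is recorded, not rethrown, so other lanes keep going.
        results.set(pageNum, { ok: false, error });
      }
    }
  }

  const laneCount = Math.min(maxConcurrent, totalPages);
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
  return results;
}
```

Pages whose entry has `ok: false` can then be fed into the retry step, in addition to the file-existence check.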
CRITICAL: Pass file paths to the subagent, NOT the page content.
<invoke name="Task">
<parameter name="subagent_type">manual-translator</parameter>
<parameter name="description">Translate page 1/46</parameter>
<parameter name="prompt">Translate page 1 of the OXI CORAL manual.
Source text file:
/Users/takazudo/repos/personal/zmanuals/public/oxi-coral/processing/extracted/page-001.txt
Output JSON file:
/Users/takazudo/repos/personal/zmanuals/public/oxi-coral/processing/translations-draft/page-001.json
Page: 1
Total pages: 46
Read the source file, translate the content, and write the JSON result directly to the output file using JSON.stringify() for proper escaping. Return only a brief status message.</parameter>
<parameter name="run_in_background">true</parameter>
</invoke>
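To illustrate the "paths, not content" rule, here is a small sketch that derives both file paths for a page and assembles a prompt from them. `pagePaths`, `buildTranslationPrompt`, and the `projectRoot` parameter are hypothetical names for illustration; the directory layout and zero-padded page numbers follow the example above.

```javascript
const path = require('node:path');

// Hypothetical helper: derives the source and output paths for one page.
function pagePaths(projectRoot, slug, pageNum) {
  const pageStr = String(pageNum).padStart(3, '0'); // 1 -> "001"
  const processingDir = path.join(projectRoot, 'public', slug, 'processing');
  return {
    source: path.join(processingDir, 'extracted', `page-${pageStr}.txt`),
    output: path.join(processingDir, 'translations-draft', `page-${pageStr}.json`),
  };
}

// Builds the worker prompt from paths only; page content is never inlined.
function buildTranslationPrompt(projectRoot, slug, pageNum, totalPages) {
  const { source, output } = pagePaths(projectRoot, slug, pageNum);
  return [
    `Translate page ${pageNum} of the manual.`,
    'Source text file:',
    source,
    'Output JSON file:',
    output,
    `Page: ${pageNum}`,
    `Total pages: ${totalPages}`,
  ].join('\n');
}
```

Because the prompt carries only two short paths, its size stays constant regardless of how much text a page contains.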
After all workers complete, verify and retry:
function verifyTranslationFiles(slug, totalPages) {
const failures = [];
for (let i = 1; i <= totalPages; i++) {
const pageStr = String(i).padStart(3, '0');
const outputFile = `public/${slug}/processing/translations-draft/page-${pageStr}.json`;
if (!fs.existsSync(outputFile)) {
failures.push(i);
}
}
return failures;
}
function retryFailedPages(failures) {
for (const pageNum of failures) {
// Spawn retry worker
spawnTranslationWorker(slug, pageNum, totalPages);
}
}
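An existence check alone would not notice a draft file that was written but is empty or truncated. As an optional, stricter variant (an assumption, not part of the documented process), the sketch below also treats files that fail to parse as JSON as failures. `findInvalidDrafts` and its `draftDir` parameter are hypothetical names.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Returns page numbers whose draft file is missing, empty, or not valid JSON.
function findInvalidDrafts(draftDir, totalPages) {
  const failures = [];
  for (let i = 1; i <= totalPages; i++) {
    const pageStr = String(i).padStart(3, '0');
    const file = path.join(draftDir, `page-${pageStr}.json`);
    try {
      // readFileSync throws if the file is missing; JSON.parse throws on
      // empty or malformed content. Either case marks the page as failed.
      JSON.parse(fs.readFileSync(file, 'utf8'));
    } catch {
      failures.push(i);
    }
  }
  return failures;
}
```

The returned list has the same shape as the output of `verifyTranslationFiles`, so it could be passed to the same retry step.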
Key Benefits:
Key Points:
- Use run_in_background=true for all workers

- pnpm run pdf:build --slug <slug> - Build final JSON files from translation drafts
- pnpm run pdf:manifest --slug <slug> - Create manifest.json

Note: Both commands require the --slug parameter to specify which manual to process.
After manifest creation, update the manifest with brand name, title, productSlug, and updatedAt collected in Step 0:
// Read the manifest
const manifestPath = `public/${slug}/data/manifest.json`;
const manifest = JSON.parse(fs.readFileSync(manifestPath, 'utf8'));
// Update with user-provided values (collected in Step 0)
manifest.title = pdfTitle; // e.g., "OXI E16: Manual"
manifest.brand = brandName; // e.g., "OXI Instruments"
manifest.productSlug = productSlug; // e.g., "oxi-e16" (from takazudomodular product data)
// Add updatedAt with current date in YYYYMMDD format
const today = new Date();
const year = today.getFullYear();
const month = String(today.getMonth() + 1).padStart(2, '0');
const day = String(today.getDate()).padStart(2, '0');
manifest.updatedAt = `${year}${month}${day}`; // e.g., "20260112"
// Write back
fs.writeFileSync(manifestPath, JSON.stringify(manifest, null, 2));
Or use the Edit tool to update all fields:
{
"title": "OXI E16: Manual", // Update this with user-provided title
"brand": "OXI Instruments", // Add this with user-provided brand
"productSlug": "oxi-e16", // Add product slug from takazudomodular
"updatedAt": "20260112", // Add current date in YYYYMMDD format
"version": "1.0.0",
...
}
This step is REQUIRED to ensure:
After manifest update, add the new manual to lib/manual-registry.ts so Next.js can generate pages for it.
This step is CRITICAL - without it, the build will not generate HTML pages for the new manual.
Convert the slug to a camelCase variable name:
// Example: "ai008-matrix-mixer" → "ai008MatrixMixer"
function slugToVarName(slug) {
return slug.replace(/-([a-z0-9])/g, (_, char) => char.toUpperCase());
}
Read lib/manual-registry.ts and check if the manual is already imported:
const registryPath = 'lib/manual-registry.ts';
const content = fs.readFileSync(registryPath, 'utf8');
const isAlreadyRegistered = content.includes(`'${slug}':`);
If already registered, skip this step.
Find the last import block and add new imports after it:
// Import {slug}
import {varName}Manifest from '@/public/{slug}/data/manifest.json';
import {varName}Pages from '@/public/{slug}/data/pages-ja.json';
Example:
// Import ai008-matrix-mixer
import ai008MatrixMixerManifest from '@/public/ai008-matrix-mixer/data/manifest.json';
import ai008MatrixMixerPages from '@/public/ai008-matrix-mixer/data/pages-ja.json';
Use the Edit tool to insert after the last existing import (before export interface ManualRegistryEntry).
Find the closing }; of the MANUAL_REGISTRY object and add new entry before it:
'{slug}': {
manifest: {varName}Manifest as unknown as ManualManifest,
pages: {varName}Pages as unknown as ManualPagesData,
},
Example:
'ai008-matrix-mixer': {
manifest: ai008MatrixMixerManifest as unknown as ManualManifest,
pages: ai008MatrixMixerPages as unknown as ManualPagesData,
},
Use the Edit tool to insert before the closing }; of MANUAL_REGISTRY.
const slug = 'ai008-matrix-mixer';
const varName = 'ai008MatrixMixer'; // converted from slug
// 1. Add imports (find last import, add after it)
const importBlock = `
// Import ${slug}
import ${varName}Manifest from '@/public/${slug}/data/manifest.json';
import ${varName}Pages from '@/public/${slug}/data/pages-ja.json';
`;
// 2. Add registry entry
const registryEntry = ` '${slug}': {
manifest: ${varName}Manifest as unknown as ManualManifest,
pages: ${varName}Pages as unknown as ManualPagesData,
},`;
// Use Edit tool to:
// - Insert importBlock before "export interface ManualRegistryEntry"
// - Insert registryEntry before the closing "};" of MANUAL_REGISTRY
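For reference, the two insertions can also be expressed as a pure string transformation. This is only a sketch: it assumes the registry file contains the text `export interface ManualRegistryEntry` and that the first `};` after `MANUAL_REGISTRY` closes the registry object. `registerManual` is a hypothetical name, and the Edit tool remains the documented way to apply the change.

```javascript
// Same conversion rule as slugToVarName above.
function slugToVarName(slug) {
  return slug.replace(/-([a-z0-9])/g, (_, char) => char.toUpperCase());
}

// Returns the updated registry source, or the input unchanged if the slug
// is already registered. Throws if an expected anchor is missing.
function registerManual(source, slug) {
  if (source.includes(`'${slug}':`)) return source;

  const varName = slugToVarName(slug);
  const entry =
    `  '${slug}': {\n` +
    `    manifest: ${varName}Manifest as unknown as ManualManifest,\n` +
    `    pages: ${varName}Pages as unknown as ManualPagesData,\n` +
    `  },\n`;
  const imports =
    `// Import ${slug}\n` +
    `import ${varName}Manifest from '@/public/${slug}/data/manifest.json';\n` +
    `import ${varName}Pages from '@/public/${slug}/data/pages-ja.json';\n\n`;

  // 1. Insert the entry before the closing "};" of MANUAL_REGISTRY.
  const registryStart = source.indexOf('MANUAL_REGISTRY');
  const closeAt = registryStart === -1 ? -1 : source.indexOf('};', registryStart);
  if (closeAt === -1) throw new Error('MANUAL_REGISTRY closing "};" not found');
  const withEntry = source.slice(0, closeAt) + entry + source.slice(closeAt);

  // 2. Insert the imports before the interface declaration.
  const importAt = withEntry.indexOf('export interface ManualRegistryEntry');
  if (importAt === -1) throw new Error('ManualRegistryEntry interface not found');
  return withEntry.slice(0, importAt) + imports + withEntry.slice(importAt);
}
```

Calling it a second time with the same slug returns the source unchanged, which matches the "if already registered, skip" rule.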
Why this step is REQUIRED:
- generateStaticParams() won't include this manual

After all translation and build steps are complete, execute the verification phase directly (do NOT call /l-verify-translation as a separate skill).
pnpm build
This creates an optimized production build in the /out/ directory.
# Start serve in background
pnpm serve &
# Wait for server to be ready
sleep 3
# Verify server is running (port 8030)
curl -s -o /dev/null -w "%{http_code}" http://localhost:8030/manuals/$SLUG/page/1
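A fixed `sleep 3` can be too short on a slow machine. As a hedged alternative, the sketch below polls until the page responds with HTTP 200 or the attempts run out. `waitForServer` and `probePage` are hypothetical helpers, the retry counts are arbitrary defaults, and `probePage` assumes the global `fetch` available in Node.js 18 and later.

```javascript
// Polls `probe()` until it resolves to 200 or the attempts run out.
// The probe is injected so the retry logic can be used without a live server.
async function waitForServer(probe, { attempts = 10, delayMs = 500 } = {}) {
  for (let i = 1; i <= attempts; i++) {
    try {
      if ((await probe()) === 200) return i; // number of attempts used
    } catch {
      // Connection refused while the server is still starting: try again.
    }
    if (i < attempts) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw new Error(`Server not ready after ${attempts} attempts`);
}

// Example probe for the first page of a manual on the serve port.
const probePage = (slug, port = 8030) => () =>
  fetch(`http://localhost:${port}/manuals/${slug}/page/1`).then((res) => res.status);
```

Usage would look like `await waitForServer(probePage('oxi-coral'))` before starting the capture script.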
Use the lightweight capture script (NOT MCP Playwright):
node .claude/skills/verify-translation/scripts/capture-pages.js \
--slug $SLUG \
--pages $TOTAL_PAGES \
--port 8030
This script:
- Saves captured screenshots to __inbox/verify-{slug}-{date}-{session}/

For EACH captured page, perform visual verification:
Issues to detect:
| Issue | Description |
|---|---|
| Missing header | PDF shows section header but translation starts mid-content |
| Missing paragraphs | PDF has more paragraphs than translation shows |
| Content order wrong | Translation starts from middle of page |
| Extraction failure | Large portions of PDF text not in translation |
Record findings for each page:
{
"pageNum": 49,
"status": "needs_fix",
"issues": ["Missing header: 'Scenes 3'", "Missing paragraph"]
}
For each page flagged as needing fix:
13.1 Regenerate extracted text from PDF image:
Look at the PDF image (left side of screenshot) and extract ALL visible English text in correct reading order.
13.2 Update the extracted text file:
Write to: public/$SLUG/processing/extracted/page-XXX.txt
13.3 Re-translate the page:
<invoke name="Task">
<parameter name="subagent_type">manual-translator</parameter>
<parameter name="description">Re-translate page XXX</parameter>
<parameter name="prompt">Translate page XXX of the manual.
Source: /path/to/extracted/page-XXX.txt
Output: /path/to/translations-draft/page-XXX.json
Page: XXX, Total: YYY</parameter>
</invoke>
If any pages were fixed:
# Copy translations to expected location
mkdir -p public/manuals/$SLUG/processing/translations-draft
cp public/$SLUG/processing/translations-draft/*.json public/manuals/$SLUG/processing/translations-draft/
# Rebuild pages.json
pnpm run pdf:build --slug $SLUG
# Copy back to correct location
cp public/manuals/$SLUG/data/pages.json public/$SLUG/data/pages.json
rm -rf public/manuals/
# Format
pnpm format:fix
lsof -ti:8030 | xargs kill -9 2>/dev/null || true
Output a verification report:
## Translation Verification Report
**Manual:** {slug}
**Total Pages:** {totalPages}
**Date:** {date}
### Verification Results
| Status | Count |
|--------|-------|
| Passed | XX |
| Fixed | XX |
### Pages Fixed
| Page | Issues Found | Fix Applied |
|------|--------------|-------------|
| 35 | Missing header | Regenerated, re-translated |
| 49 | Missing paragraph | Regenerated, re-translated |
### Verification Complete
All pages now match their PDF images.
Manual is ready for deployment.
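The report can be assembled mechanically from the per-page findings recorded earlier. The sketch below is one possible shape, assuming every page with status `needs_fix` has been fixed by the time the report is written and that all other pages passed. `renderVerificationReport` is a hypothetical name, and the fixed "Fix Applied" text is taken from the template above.

```javascript
// Renders the report from findings shaped like { pageNum, status, issues }.
function renderVerificationReport(slug, totalPages, date, findings) {
  const fixed = findings
    .filter((f) => f.status === 'needs_fix')
    .sort((a, b) => a.pageNum - b.pageNum);
  const passed = totalPages - fixed.length;

  // One table row per fixed page, with its issues joined into one cell.
  const rows = fixed.map(
    (f) => `| ${f.pageNum} | ${f.issues.join('; ')} | Regenerated, re-translated |`
  );

  return [
    '## Translation Verification Report',
    `**Manual:** ${slug}`,
    `**Total Pages:** ${totalPages}`,
    `**Date:** ${date}`,
    '### Verification Results',
    '| Status | Count |',
    '|--------|-------|',
    `| Passed | ${passed} |`,
    `| Fixed | ${fixed.length} |`,
    '### Pages Fixed',
    '| Page | Issues Found | Fix Applied |',
    '|------|--------------|-------------|',
    ...rows,
  ].join('\n');
}
```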
Why verification is mandatory:
PDF text extraction (pdf-parse) can fail silently. The manual-translator subagent only sees extracted text, not images, so it cannot detect missing content. This verification step catches those failures.
/l-pdf-process <slug>
This runs everything automatically including verification.
pnpm run pdf:clean --slug <slug> # Clean existing files
pnpm run pdf:split --slug <slug> # Split PDF
pnpm run pdf:render --slug <slug> # Render pages
pnpm run pdf:extract --slug <slug> # Extract text
# Translation via Task tool (manual-translator subagents)
pnpm run pdf:build --slug <slug> # Build JSON
pnpm run pdf:manifest --slug <slug> # Create manifest
pnpm build
pnpm serve &
node .claude/skills/verify-translation/scripts/capture-pages.js --slug <slug> --pages <total>
# Then manually verify captured screenshots
- Source PDF placed in the manual-pdf/{slug}/ directory

manual-pdf/{slug}/ # Source PDF directory
└── *.pdf # Source PDF file
public/{slug}/ # Output directory
├── data/ # Final JSON files (committed)
│ ├── manifest.json
│ └── pages.json
├── pages/ # Rendered PNG images (300 DPI)
│ ├── page-001.png
│ └── ... (page-XXX.png)
└── processing/ # Intermediate files (gitignored)
    ├── extracted/ # Extracted text
    └── translations-draft/ # Translation drafts
Edit pdf-config.json to customize:
__inbox/

Estimated time (280-page manual):
Estimated cost: