Use when initializing a translation project from extraction through glossary, chapter mapping, and progress tracking.
Initialize translation baseline from PDF extraction to chaptered docs, style decisions, glossary, and progress tracker.
Core principle: Build a deterministic, verifiable baseline before any large-scale translation.
Before ANY action, create tasks using TaskCreate:
Run cleanup:
uv run python scripts/clean_sample_data.py --yes
Then resolve source PDF from $ARGUMENTS or ask user in Traditional Chinese. Ensure source is under data/pdfs/.
Before extraction, ask user in Traditional Chinese whether to preserve PDF images in the generated docs:
是否要保留 PDF 內的圖片,並在切分後的 Markdown 文件中保留對應圖片連結?
Record this decision as preserve_images: true/false for the rest of the run.
Verification: Source PDF exists under data/pdfs/; preserve_images decision recorded; cleanup script exited 0.
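The location check in the verification above can be sketched in a few lines of Python. This is a minimal sketch, not part of the project's scripts: the helper name `is_under_pdf_root` is hypothetical, and it only checks the path's location and extension, not that the file exists.

```python
from pathlib import Path

# Directory named in the step above; everything else here is illustrative.
PDF_ROOT = Path("data/pdfs")


def is_under_pdf_root(pdf_path: str, root: Path = PDF_ROOT) -> bool:
    """Return True if pdf_path is a .pdf path located under root.

    Hypothetical helper: resolves both paths so that "../" segments
    cannot escape the root directory.
    """
    candidate = Path(pdf_path).resolve()
    try:
        candidate.relative_to(root.resolve())
    except ValueError:
        # candidate is not inside root
        return False
    return candidate.suffix.lower() == ".pdf"
```

A caller would combine this with `Path(pdf_path).is_file()` before starting extraction.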
Create TaskCreate items for:
Verification: All tasks created with correct descriptions; task list matches the phases above.
Run:
uv run python scripts/extract_pdf.py <pdf_path> --include-images
If preserve_images is false, run instead:
uv run python scripts/extract_pdf.py <pdf_path> --no-include-images
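The choice between the two commands above depends only on the recorded preserve_images decision. A small sketch of that branching, assuming a hypothetical helper `build_extract_command` (the script path and both flags are taken from the commands above):

```python
def build_extract_command(pdf_path: str, preserve_images: bool) -> list[str]:
    """Build the extraction command for the recorded preserve_images decision.

    Hypothetical helper; it only assembles the argument list and does not
    run anything.
    """
    flag = "--include-images" if preserve_images else "--no-include-images"
    return ["uv", "run", "python", "scripts/extract_pdf.py", pdf_path, flag]
```

Returning an argument list rather than a shell string keeps paths with spaces intact when the list is passed to `subprocess.run`.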
Validate outputs:
- data/markdown/<name>.md
- data/markdown/<name>_pages.md
- data/markdown/images/<name>/ (only when preserve_images = true)
Verification: All expected output files exist and are non-empty.
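One way to automate the output check is sketched below. The helper `missing_outputs` is hypothetical, and treating the image directory as valid whenever it exists (even if empty) is an assumption, since a PDF may contain no images.

```python
from pathlib import Path


def missing_outputs(name: str, preserve_images: bool,
                    root: Path = Path("data/markdown")) -> list[str]:
    """Return the expected extraction outputs that are missing or empty.

    Hypothetical helper based on the output list above.
    """
    expected = [root / f"{name}.md", root / f"{name}_pages.md"]
    problems = [str(p) for p in expected
                if not p.is_file() or p.stat().st_size == 0]
    if preserve_images:
        image_dir = root / "images" / name
        # ASSUMPTION: only the directory's existence is checked.
        if not image_dir.is_dir():
            problems.append(str(image_dir))
    return problems
```

An empty return value means the verification passes; any entries name the paths to investigate before moving on.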
Review readability and completeness. If needed, split large source into parts and re-extract until clean.
Verification: Extracted markdown is readable; no garbled text or truncated sections remain.
Summarize content to user in Traditional Chinese:
書本內容概覽:
- 主要內容類型:[規則說明、範例場景、角色選項...]
- 特殊結構:[大量表格、骰表、設計者備註...]
- 建議可使用的格式化元件:[...]
Collect formatting choices (Traditional Chinese):
- Layout profile (auto / single-column / double-column)
- Aside components (note/tip/caution/danger)
Persist via script:
uv run python scripts/style_decisions.py init
uv run python scripts/style_decisions.py set-document-format \
--layout-profile "<auto|single-column|double-column>" \
--aside-note "<note_component>" \
--aside-tip "<tip_component>" \
--aside-caution "<caution_component>" \
--aside-danger "<danger_component>" \
--cards-usage "<cards_usage_note>" \
--tabs-usage "<tabs_usage_note>" \
--tables-convention "<table_note>" \
--dice-tables-convention "<dice_table_note>"
uv run python scripts/validate_style_decisions.py
Verification: style-decisions.json contains document_format section; validate_style_decisions.py exits 0.
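The "contains document_format section" part of the verification can be checked independently of the validator script. A minimal sketch, assuming style-decisions.json is a JSON object with top-level section keys; the helper name `missing_sections` is hypothetical and the file's location is passed in rather than guessed.

```python
import json
from pathlib import Path


def missing_sections(path: str, required: list[str]) -> list[str]:
    """Return the required top-level keys absent from a JSON file.

    Hypothetical helper. ASSUMPTION: the file holds a JSON object whose
    top-level keys are section names such as "document_format".
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        # Not an object at all, so every section is missing.
        return list(required)
    return [key for key in required if key not in data]
```

For this step the call would be `missing_sections(<path to style-decisions.json>, ["document_format"])`, which should return an empty list.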
- If preserve_images = true, ask user to assign extracted images for hero/background/og.
- If preserve_images = true, copy and resize where needed.
- If preserve_images = false, skip extracted image assignment and continue with theme-only setup.
- Apply the theme in docs/src/styles/custom.css and persist style decisions.
uv run python scripts/style_decisions.py set-images --preserve-images <true_or_false>
Ask for site meta in Traditional Chinese (all four fields):
- site.title:首頁 <title> 及 frontmatter title,例:「Rapscallion 遊戲規則」
- site.description:SEO description,一句話
- site.tagline:hero 區塊顯示的一行短語
- site.intro:首頁「內容簡介」段落,一到兩句
Persist via:
uv run python scripts/style_decisions.py set-site \
--title "<USER_INPUT>" \
--description "<USER_INPUT>" \
--tagline "<USER_INPUT>" \
--intro "<USER_INPUT>"
Ask for copyright and credits in Traditional Chinese:
- Copyright text (example: © 2024 Author Name. All rights reserved.)
Persist via:
uv run python scripts/style_decisions.py set-copyright \
--text "<USER_INPUT>" \
--show-on-homepage <true_or_false>
uv run python scripts/style_decisions.py set-credits \
--entry "原作者:..." \
--entry "翻譯:..." \
--show-on-homepage <true_or_false>
uv run python scripts/style_decisions.py add-translation-note \
--key "<short_key>" \
--topic "<optional_topic>" \
--note "<USER_INPUT>"
If the note is specific to one source file or future appended document, use:
uv run python scripts/style_decisions.py add-translation-note \
--document-key "<pdf_stem_or_doc_id>" \
--key "<short_key>" \
--note "<USER_INPUT>"
uv run python scripts/validate_style_decisions.py
generate_nav.py will render these as ## 版權宣告 and ## 製作名單 sections on the homepage. If neither is provided, a generic fallback disclaimer is used.
Verification: validate_style_decisions.py exits 0; style-decisions.json contains site meta, copyright, credits, and image decisions.
Invoke term-decision skill for terminology bootstrap instead of duplicating the workflow here.
Required handoff to term-decision:
- glossary.json
- style-decisions.json
uv run python scripts/term_generate.py --min-frequency 2
uv run python scripts/term_cal_batch.py
uv run python scripts/validate_glossary.py
uv run python scripts/term_read.py --fail-on-missing --fail-on-forbidden
init-doc must not continue to chapter split until the term-decision handoff completes cleanly.
Verification: validate_glossary.py and term_read.py --fail-on-missing --fail-on-forbidden both exit 0.
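The fail-closed behaviour required here (stop at the first non-zero exit) can be sketched with the standard library. The runner `run_fail_closed` is a hypothetical helper; the two commands in `TERM_GATE` are the ones named in the verification above.

```python
import subprocess
import sys

# Commands from the verification above.
TERM_GATE = [
    ["uv", "run", "python", "scripts/validate_glossary.py"],
    ["uv", "run", "python", "scripts/term_read.py",
     "--fail-on-missing", "--fail-on-forbidden"],
]


def run_fail_closed(commands: list[list[str]]) -> bool:
    """Run commands in order; stop and return False at the first failure.

    Hypothetical helper illustrating the fail-closed rule.
    """
    for cmd in commands:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            print(f"gate failed: {' '.join(cmd)} (exit {result.returncode})",
                  file=sys.stderr)
            return False
    return True
```

Calling `run_fail_closed(TERM_GATE)` returns True only when both validators exit 0, which is the condition for continuing to chapter split.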
Invoke chapter-split skill instead of duplicating split logic here.
Required handoff to chapter-split:
- Use the _pages.md file from this init run.
- Pass the preserve_images decision from Step 1.
- Produce chapters.json, split docs output, and regenerate navigation.
- If chapter-split reports unresolved critical issues, stop init-doc and resolve them before continuing.
Verification: chapters.json exists; split docs generated; nav regenerated.
Create data/translation-progress.json from chapters.json:
uv run python scripts/init_create_progress.py --force
Tracker contract:
- All chapters start as not_started
- _meta fields (updated, total_chapters, completed)
Verification: data/translation-progress.json exists; contains all chapters from chapters.json with status not_started; _meta fields present.
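The tracker contract can be expressed as a small check. This is only a sketch: the source does not show the tracker's layout, so the `chapters` mapping keyed by chapter id with a `status` field is an assumption, and `tracker_problems` is a hypothetical helper. The `_meta` field names and the `not_started` status come from the contract above.

```python
# Field names from the tracker contract above.
REQUIRED_META = {"updated", "total_chapters", "completed"}


def tracker_problems(tracker: dict, chapter_ids: list[str]) -> list[str]:
    """Return contract violations for a freshly initialized tracker.

    ASSUMPTION: tracker looks like
    {"_meta": {...}, "chapters": {"<id>": {"status": "..."}}};
    the real file layout may differ.
    """
    problems = []
    missing_meta = REQUIRED_META - set(tracker.get("_meta", {}))
    if missing_meta:
        problems.append(f"_meta missing: {sorted(missing_meta)}")
    chapters = tracker.get("chapters", {})
    for chapter_id in chapter_ids:
        status = chapters.get(chapter_id, {}).get("status")
        if status != "not_started":
            problems.append(f"{chapter_id}: status {status!r}")
    return problems
```

The chapter ids would be read from chapters.json; an empty result means every chapter is present with status not_started and the _meta fields exist.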
Run one-shot handoff gate:
uv run python scripts/init_handoff_gate.py
If any gate fails, stop and fix before completion.
Verification: init_handoff_gate.py exits 0; all tasks marked completed.
digraph init_doc {
rankdir=TB;
cleanup [label="Cleanup &\nsource validation", shape=box];
tasks [label="Create tasks", shape=box];
extract [label="Extract PDF", shape=box];
img_decision [label="Preserve\nimages?", shape=diamond];
extract_img [label="Extract with\n--include-images", shape=box];
extract_noimg [label="Extract with\n--no-include-images", shape=box];
crop [label="Cropping review\n& optional split", shape=box];
format [label="Formatting\ndecisions", shape=box];
theme [label="Images, theme\n& homepage", shape=box];
term [label="Terminology\nbaseline\n(term-decision)", shape=box];
term_ok [label="Term validation\npasses?", shape=diamond];
split [label="Chapter split\n(chapter-split)", shape=box];
split_ok [label="Split\nsucceeded?", shape=diamond];
progress [label="Create progress\ntracker", shape=box];
gate [label="Final handoff\ngate", shape=box];
gate_ok [label="Gate\npasses?", shape=diamond];
done [label="Done →\ntranslate", shape=box];
fix [label="Fix & retry", shape=box];
cleanup -> tasks -> extract;
extract -> img_decision;
img_decision -> extract_img [label="yes"];
img_decision -> extract_noimg [label="no"];
extract_img -> crop;
extract_noimg -> crop;
crop -> format -> theme -> term;
term -> term_ok;
term_ok -> split [label="yes"];
term_ok -> fix [label="no"];
fix -> term;
split -> split_ok;
split_ok -> progress [label="yes"];
split_ok -> fix [label="no"];
progress -> gate;
gate -> gate_ok;
gate_ok -> done [label="yes"];
gate_ok -> fix [label="no"];
}
Stop when:
- chapter-split cannot produce a usable config
Return to earlier steps when:
| Thought | Reality |
|---|---|
| "Validation failed but it's probably fine, keep going" | Fail-closed means stop. Fix the failure before proceeding. |
| "Skip chapter-split handoff and split manually" | Chapter-split skill ensures deterministic structure. Never bypass. |
| "User didn't answer about proper nouns, I'll just pick one" | User confirmation is required for formatting and proper noun policy. |
| "Progress tracker can be created later" | Tracker must be initialized before handoff. No exceptions. |
| "Skip terminology baseline, we can add terms during translation" | Terminology drift across chapters is costly. Bootstrap first. |
| "I'll reuse the old glossary without re-validating" | Glossary changes between runs. Always validate. |
| "One quick formatting change doesn't need style-decisions.json" | All formatting decisions must be persisted. No ad-hoc overrides. |
Continue with /translate or /super-translate.
/init-doc
/init-doc data/pdfs/rulebook.pdf