Guidelines for adding and maintaining cross-references between dictionary entries. Covers reference types, format requirements, and extraction from notes.
When creating or revising entries, add cross-references to related vocabulary. This improves navigation and helps learners understand word relationships.
The dictionary has two structured cross-reference systems (plus inline word links, which are handled separately):
prominent_see_also: Top-of-entry links (HIGH VISIBILITY)
Displayed immediately below the headword, before definitions. These are the first thing a learner sees after the headword. Use for word pairs and groups that are closely related and that learners are likely to want to navigate between.
When to use prominent_see_also:
prominent_see_also, NOT cross_references
When NOT to use prominent_see_also:
cross_references with type synonym)
Format:
"prominent_see_also": [
  {
    "target_id": "00754_shimaru",
    "reading": "しまる",
    "headword": "{閉|し}まる",
    "note": "intransitive"
  }
]
target_id when the target entry exists
note in English (2-4 words) explaining the relationship
cross_references: "Related Words" box at the bottom (STRUCTURED)
A structured array displayed in a "Related Words" box at the bottom of the entry page. These express lexicographic relationships between entries. Two-way linking is encouraged but not as critical as for prominent_see_also.
cross_references)
antonym: Opposites (HIGH PRIORITY)
Use for direct opposites.
{
  "type": "antonym",
  "reading": "あける",
  "headword": "{開|あ}ける",
  "label": "to open"
}
Label: Brief gloss of target word
keigo: Honorific/Humble Forms (HIGH PRIORITY)
Use for formal speech equivalents.
{
  "type": "keigo",
  "reading": "めしあがる",
  "headword": "{召|め}し{上|あ}がる",
  "label": "honorific"
}
Labels: honorific or humble
Common keigo links:
synonym: Similar Meaning (MEDIUM PRIORITY)
Use for words with similar meaning but different nuance.
{
  "type": "synonym",
  "reading": "りかいする",
  "headword": "{理解|りかい}する",
  "label": "formal"
}
Label: Distinguishing characteristic (e.g., "formal", "written", "casual")
contrast: Easily Confused (MEDIUM PRIORITY)
Use for words learners often confuse.
{
  "type": "contrast",
  "reading": "が",
  "headword": "が",
  "label": "subject marking"
}
Especially important for:
homophone: Same Reading, Different Meaning (MEDIUM PRIORITY)
Use for words that share a reading. Note: if the homophones are easily confused, prefer prominent_see_also instead.
related: Semantically Connected (LOW PRIORITY)
Use for derived words, compounds, or thematically related vocabulary.
{
  "type": "related",
  "reading": "たべもの",
  "headword": "{食|た}べ{物|もの}",
  "label": "food (noun)"
}
see_also: General Reference (LOW PRIORITY)
Use for general cross-references that don't fit other categories.
{
  "type": "see_also",
  "reading": "しょくじ",
  "headword": "{食事|しょくじ}",
  "label": null
}
pair: DEPRECATED
Do not use. Transitive/intransitive verb pairs should use prominent_see_also instead. Existing pair-type entries in cross_references should be migrated to prominent_see_also when entries are revised. The type remains technically valid in the schema but should not appear in new entries.
Each cross_references entry requires:
| Field | Required | Description |
|---|---|---|
| type | Yes | One of: synonym, antonym, keigo, related, see_also, contrast, homophone (avoid pair) |
| target_id | No | Hard-coded entry ID for direct resolution (takes priority over reading/headword) |
| reading | Yes | Hiragana reading (fallback lookup key when no target_id) |
| headword | Yes* | Display form with furigana (required for homonym disambiguation) |
| label | No | Short descriptor |
*Headword is required for proper resolution. Without it, cross-references cannot be disambiguated between homonyms.
Each prominent_see_also entry requires:
| Field | Required | Description |
|---|---|---|
| target_id | No | Hard-coded entry ID for direct resolution |
| reading | Yes | Hiragana reading |
| headword | Yes | Display form with furigana |
| note | Yes | Brief English description of the relationship (2-4 words) |
Note: Valid cross-reference types are defined centrally in build/constants.py and shared across the schema, validation, and build scripts.
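The two field tables above can be turned into a small pre-flight check. The following is only a sketch: `VALID_TYPES`, `DEPRECATED_TYPES`, `check_cross_reference`, and `check_prominent_see_also` are hypothetical names, the type set is copied from the table rather than imported from build/constants.py, and the real validator may apply different or additional rules.

```python
from __future__ import annotations

# Hypothetical mirror of the type list in the table above; the authoritative
# list is the one defined in build/constants.py.
VALID_TYPES = {
    "synonym", "antonym", "keigo", "related",
    "see_also", "contrast", "homophone", "pair",
}
DEPRECATED_TYPES = {"pair"}  # still schema-valid, but not for new entries


def check_cross_reference(ref: dict) -> list[str]:
    """Return a list of problems for one cross_references item."""
    problems = []
    ref_type = ref.get("type")
    if ref_type not in VALID_TYPES:
        problems.append(f"unknown type: {ref_type!r}")
    elif ref_type in DEPRECATED_TYPES:
        problems.append(f"deprecated type: {ref_type!r}")
    # reading and headword are required; target_id and label are optional.
    for field in ("reading", "headword"):
        if not ref.get(field):
            problems.append(f"missing {field}")
    return problems


def check_prominent_see_also(ref: dict) -> list[str]:
    """Return a list of problems for one prominent_see_also item."""
    return [
        f"missing {field}"
        for field in ("reading", "headword", "note")
        if not ref.get(field)
    ]
```

An empty list means the item satisfies the required-field rules in the tables; it says nothing about whether the target actually resolves.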
The dictionary uses a hybrid system that supports both:
target_id: Direct reference to an entry ID (unambiguous)
When resolving a cross-reference:
target_id present AND entry exists → resolved (use ID directly)
target_id present AND entry missing → ERROR (stale reference)
No target_id → resolve by reading/headword (may be pending if target doesn't exist)
target_id
Use target_id when:
Don't manually add target_id when:
Instead, use the harden_references.py script to automatically add target_id to resolvable references.
{
  "type": "antonym",
  "target_id": "00754_shimaru",
  "reading": "しまる",
  "headword": "{閉|し}まる",
  "label": "intransitive"
}
{
  "type": "antonym",
  "reading": "ひらく",
  "headword": "{開|ひら}く",
  "label": "to open"
}
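The resolution order described in this section can be sketched as a single function. This is an illustration under assumptions, not the build scripts' actual code: `resolve_reference`, the shape of the `entries` index, the IDs other than 00754_shimaru, and the extra "ambiguous" status are all hypothetical.

```python
from __future__ import annotations

from typing import Optional

# Hypothetical in-memory index: entry ID -> the fields used for lookup.
# Only 00754_shimaru comes from the examples above; the other IDs are made up.
ENTRIES = {
    "00754_shimaru": {"reading": "しまる", "headword": "{閉|し}まる"},
    "90001_kanjou": {"reading": "かんじょう", "headword": "{勘定|かんじょう}"},
    "90002_kanjou": {"reading": "かんじょう", "headword": "{感情|かんじょう}"},
}


def resolve_reference(ref: dict, entries: dict) -> tuple[str, Optional[str]]:
    """Return (status, entry_id) for one reference."""
    target_id = ref.get("target_id")
    if target_id is not None:
        # target_id takes priority over reading/headword.
        if target_id in entries:
            return ("resolved", target_id)
        return ("error", target_id)  # stale reference

    # Fallback: match on reading, narrowed by headword when one is given.
    matches = [
        entry_id
        for entry_id, entry in entries.items()
        if entry["reading"] == ref["reading"]
        and ref.get("headword") in (None, entry["headword"])
    ]
    if len(matches) == 1:
        return ("resolved", matches[0])
    if not matches:
        return ("pending", None)  # target entry may not exist yet
    return ("ambiguous", None)  # several homonyms; this status is the sketch's own
```

With this sample index, the second example above (ひらく, no target_id) comes back as pending because no matching entry exists yet.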
CRITICAL: Many Japanese words share the same reading but have different kanji (homonyms). The headword field is essential for correct resolution.
Example: The reading かんじょう has multiple entries:
Always include the headword to ensure cross-references link to the correct entry.
// CORRECT: specifies headword for disambiguation
{
  "type": "synonym",
  "reading": "かんじょう",
  "headword": "{勘定|かんじょう}",
  "label": "bill, calculation"
}
// INCORRECT: no headword, may link to wrong homonym
{
  "type": "synonym",
  "reading": "かんじょう",
  "label": "bill, calculation"
}
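Because the headword carries its own furigana, a reference can be sanity-checked by comparing the furigana against the reading field. The sketch below assumes the {kanji|reading} markup seen in the examples is the only markup in use; `headword_reading`, `headword_surface`, and `reading_matches` are hypothetical helpers, not part of the build scripts.

```python
import re

# Matches one {kanji|reading} group, e.g. {閉|し}.
FURIGANA = re.compile(r"\{([^|{}]+)\|([^|{}]+)\}")


def headword_reading(headword: str) -> str:
    """Replace each {kanji|reading} group with its reading."""
    return FURIGANA.sub(lambda m: m.group(2), headword)


def headword_surface(headword: str) -> str:
    """Replace each {kanji|reading} group with its kanji."""
    return FURIGANA.sub(lambda m: m.group(1), headword)


def reading_matches(ref: dict) -> bool:
    """True when the furigana in headword spells out the reading field."""
    return headword_reading(ref["headword"]) == ref["reading"]
```

For the CORRECT example above, `headword_surface` gives 勘定 and `reading_matches` is true. A mismatch usually means the headword belongs to a different homonym or contains a typo. Headwords written in katakana would need normalizing to hiragana first, which this sketch does not attempt.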
When creating new entries (e.g., via prompts/newentries.md), add cross-references as part of entry creation:
Always add prominent_see_also for:
Add cross_references for:
Check if the target entry exists using check_duplicate.py or the entries index:
target_id and add a back-link on the target entry
For back-links on existing entries: When you add a cross-reference pointing to an existing entry, also add a reciprocal reference on that target entry pointing back to the new entry.
When adding references to entries, prioritize:
HIGH — Always add if applicable:
prominent_see_also
prominent_see_also
prominent_see_also
cross_references (keigo)
cross_references (antonym)
MEDIUM: Add when natural:
LOW — Add sparingly:
The notes field often contains vocabulary that should be cross-referenced. Look for:
Pair verbs:
Antonyms:
Keigo:
Related words:
Run the extraction script to find potential references:
# Dry run — see proposed changes
python3 build/extract_references.py
# Apply changes
python3 build/extract_references.py --apply
# Single entry
python3 build/extract_references.py --id 00396_taberu
Note: The extraction script now performs immediate resolution. When a target entry exists, the extracted reference automatically includes target_id.
The harden_references.py script scans entries and adds target_id to resolvable cross-references. This "hardens" forward references into direct ID-based references once the target entry exists.
# Dry run — see what would change
python3 build/harden_references.py
# Apply changes
python3 build/harden_references.py --apply
# Single entry
python3 build/harden_references.py --id 00485_shimeru
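Conceptually, hardening is a lookup followed by a non-destructive update. The sketch below illustrates that idea only; `harden_references` here is a hypothetical function, the `(reading, headword)` index is an assumed data shape, and the real script's matching rules may differ.

```python
from __future__ import annotations


def harden_references(refs: list[dict], index: dict) -> list[dict]:
    """Return a copy of refs with target_id added where the target exists.

    index maps (reading, headword) -> entry ID (assumed shape).
    """
    hardened = []
    for ref in refs:
        key = (ref.get("reading"), ref.get("headword"))
        if "target_id" not in ref and key in index:
            # Build a new dict so the input is untouched (dry-run friendly).
            ref = {**ref, "target_id": index[key]}
        hardened.append(ref)
    return hardened
```

Returning new dictionaries rather than mutating in place makes it straightforward to diff the result against the original, which is what a dry run needs.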
When to run:
target_id
Important: You can add references to entries that don't exist yet.
reading as the primary key (required)
headword for display purposes
This allows you to:
After adding references, validate:
python3 build/validate.py --id {entry_id}
The validator checks:
target_id points to a non-existent entry
target_id)
| Type | Meaning | Action |
|---|---|---|
| ERROR: Stale target_id | target_id points to deleted entry | Remove or update target_id |
| WARNING: Hardenable | Reference resolvable but missing target_id | Run harden_references.py --apply |
| WARNING: Homonym mismatch | Headword doesn't match any entry with that reading | Verify correct homonym or wait for entry creation |
prominent_see_also (for verbs)
prominent_see_also (if applicable)
prominent_see_also (if easily confused)
pair type in cross_references (use prominent_see_also instead)
Cross-references should be bidirectional for most relationship types. The table below summarizes when back-links are required vs. optional:
| Relationship | Back-link Required? | Via |
|---|---|---|
| Transitive/intransitive pair | Always | prominent_see_also both ways |
| N/Nする pair | Always | prominent_see_also both ways |
| Homophones (confusable) | Always | prominent_see_also both ways |
| Antonym | Usually | cross_references (antonym) both ways |
| Keigo | Usually | cross_references (keigo) both ways within group |
| Synonym | Case-by-case | cross_references (synonym) — add back-link if genuinely helpful |
| Contrast | Case-by-case | cross_references (contrast) |
| Related | Optional | cross_references (related) |
| See also | Optional | cross_references (see_also) |
Use the asymmetry report to find one-way references:
python3 build/find_merge_candidates.py --asymmetry-only
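What a one-way reference means can be shown with a few lines of Python. This is a simplified sketch, not the logic of find_merge_candidates.py: `find_one_way_links` is a hypothetical function, it only considers references that already carry a target_id, and it ignores relationship type.

```python
from __future__ import annotations


def find_one_way_links(entries: dict) -> list[tuple[str, str]]:
    """Return (source_id, target_id) pairs that have no link back.

    entries maps entry ID -> entry dict with optional
    "prominent_see_also" and "cross_references" lists (assumed shape).
    """
    links = set()
    for source_id, entry in entries.items():
        refs = entry.get("prominent_see_also", []) + entry.get("cross_references", [])
        for ref in refs:
            if "target_id" in ref:
                links.add((source_id, ref["target_id"]))
    # A link is one-way when the reversed pair is absent.
    return sorted((s, t) for s, t in links if (t, s) not in links)
```

Running the hardening step first makes a check like this more complete, since references without target_id are invisible to it.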
Use the cluster linter to find incomplete semantic groups:
python3 build/check_semantic_clusters.py
When fixing symmetry issues, process related entries together as a cluster rather than one at a time. This ensures both sides of a relationship are updated in the same session. See the "Cluster Mode" section in prompts/add_cross-references.md for the detailed workflow.