Guidelines for adding and maintaining cross-references between dictionary entries. Covers reference types, format requirements, and extraction from notes.
When creating or revising entries, add cross-references to related vocabulary. This improves navigation and helps learners understand word relationships.
pair - Transitivity Pairs (HIGH PRIORITY)
Use for verb transitivity pairs ({自動詞|じどうし}/{他動詞|たどうし}).
{
"type": "pair",
"reading": "しまる",
"headword": "{閉|し}まる",
"label": "intransitive"
}
Labels: intransitive or transitive
Common pairs:
antonym - Opposites (HIGH PRIORITY)
Use for direct opposites.
{
"type": "antonym",
"reading": "あける",
"headword": "{開|あ}ける",
"label": "to open"
}
Label: Brief gloss of target word
keigo - Honorific/Humble Forms (HIGH PRIORITY)
Use for formal speech equivalents.
{
"type": "keigo",
"reading": "めしあがる",
"headword": "{召|め}し{上|あ}がる",
"label": "honorific"
}
Labels: honorific or humble
Common keigo links:
synonym - Similar Meaning (MEDIUM PRIORITY)
Use for words with similar meaning but different nuance.
{
"type": "synonym",
"reading": "りかいする",
"headword": "{理解|りかい}する",
"label": "formal"
}
Label: Distinguishing characteristic (e.g., "formal", "written", "casual")
contrast - Easily Confused (MEDIUM PRIORITY)
Use for words learners often confuse.
{
"type": "contrast",
"reading": "が",
"headword": "が",
"label": "subject marking"
}
Especially important for:
related - Semantically Connected (LOW PRIORITY)
Use for derived words, compounds, or thematically related vocabulary.
{
"type": "related",
"reading": "たべもの",
"headword": "{食|た}べ{物|もの}",
"label": "food (noun)"
}
see_also - General Reference (LOW PRIORITY)
Use for general cross-references that don't fit other categories.
{
"type": "see_also",
"reading": "しょくじ",
"headword": "{食事|しょくじ}",
"label": null
}
Each cross-reference object requires:
| Field | Required | Description |
|---|---|---|
| type | Yes | One of: pair, synonym, antonym, keigo, related, see_also, contrast, homophone |
| target_id | No | Hard-coded entry ID for direct resolution (takes priority over reading/headword) |
| reading | Yes | Hiragana reading (fallback lookup key when no target_id) |
| headword | Yes* | Display form with furigana (required for homonym disambiguation) |
| label | No | Short descriptor |
*Headword is required for proper resolution. Without it, cross-references cannot be disambiguated between homonyms.
Note: Valid cross-reference types are defined centrally in build/constants.py and shared across the schema, validation, and build scripts.
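
As an illustration of the field requirements above, here is a minimal Python sketch that checks a single cross-reference object. The function name check_reference and the VALID_TYPES set are assumptions made for this example: the set is copied from the table, while the project's actual definition lives in build/constants.py and the real checks are done by build/validate.py.

```python
from __future__ import annotations

# Assumption: copied from the "type" row of the table above. The source of
# truth in the project is build/constants.py, which is not reproduced here.
VALID_TYPES = {
    "pair", "synonym", "antonym", "keigo",
    "related", "see_also", "contrast", "homophone",
}


def check_reference(ref: dict) -> list[str]:
    """Return a list of problems found in one cross-reference object."""
    problems = []
    if ref.get("type") not in VALID_TYPES:
        problems.append(f"invalid type: {ref.get('type')!r}")
    if not ref.get("reading"):
        problems.append("missing reading")
    if not ref.get("headword"):
        problems.append("missing headword (needed for homonym disambiguation)")
    # target_id and label are optional, so they are not checked here.
    return problems
```

An empty list means the object has every field the table marks as required.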
The dictionary uses a hybrid system that supports both:
- target_id - Direct reference to an entry ID (unambiguous)
- reading/headword - Fallback lookup used when no target_id is set

When resolving a cross-reference:
1. target_id present AND entry exists → resolved (use ID directly)
2. target_id present AND entry missing → ERROR (stale reference)
3. No target_id → resolve by reading/headword (may be pending if target doesn't exist)

Use target_id when:
Don't manually add target_id when:
Instead, use the harden_references.py script to automatically add target_id to resolvable references.
{
"type": "pair",
"target_id": "00754_shimaru",
"reading": "しまる",
"headword": "{閉|し}まる",
"label": "intransitive"
}
{
"type": "antonym",
"reading": "ひらく",
"headword": "{開|ひら}く",
"label": "to open"
}
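
The resolution order used by the hybrid system can be sketched in Python as follows. This is an illustration only, not the build scripts' actual code: the resolve function, the status strings, and the entries mapping (entry ID to a dict with reading and headword) are hypothetical.

```python
def resolve(ref, entries):
    """Resolve one cross-reference against a mapping of entry ID -> entry.

    Returns a (status, entry_id) tuple where status is "resolved",
    "error" (stale target_id), or "pending" (target not created yet).
    """
    target_id = ref.get("target_id")
    if target_id is not None:
        # A target_id takes priority over reading/headword.
        if target_id in entries:
            return "resolved", target_id
        return "error", None  # stale reference

    # No target_id: fall back to matching reading and headword together.
    for entry_id, entry in entries.items():
        if (entry["reading"] == ref.get("reading")
                and entry["headword"] == ref.get("headword")):
            return "resolved", entry_id
    return "pending", None
```

With an index containing only 00754_shimaru, the first example above resolves by ID, while the ひらく reference stays pending until its target entry exists.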
CRITICAL: Many Japanese words share the same reading but have different kanji (homonyms). The headword field is essential for correct resolution.
Example: The reading かんじょう has multiple entries:
If you reference かんじょう without specifying the headword, the system cannot determine which entry you mean.
Always include the headword to ensure cross-references link to the correct entry.
// CORRECT - specifies headword for disambiguation
{
"type": "synonym",
"reading": "かんじょう",
"headword": "{勘定|かんじょう}",
"label": "bill, calculation"
}
// INCORRECT - no headword, may link to wrong homonym
{
"type": "synonym",
"reading": "かんじょう",
"label": "bill, calculation"
}
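
To show why the headword matters, here is a small Python sketch of a lookup that narrows same-reading candidates by headword. It assumes the {kanji|reading} furigana markup seen in the examples, and the helper names strip_furigana and find_targets, along with the entry list layout, are hypothetical and not taken from the build scripts.

```python
import re

# Matches one furigana group such as {勘定|かんじょう}; group 1 is the base text.
FURIGANA_GROUP = re.compile(r"\{([^{}|]+)\|[^{}|]*\}")


def strip_furigana(headword):
    """Drop furigana markup, e.g. '{閉|し}まる' becomes '閉まる'."""
    return FURIGANA_GROUP.sub(r"\1", headword)


def find_targets(ref, entries):
    """Return IDs of entries that match the reference.

    entries is a list of dicts with "id", "reading", and "headword".
    Without a headword every entry sharing the reading is a candidate,
    which is exactly the ambiguity the headword is meant to remove.
    """
    same_reading = [e for e in entries if e["reading"] == ref["reading"]]
    if not ref.get("headword"):
        return [e["id"] for e in same_reading]
    wanted = strip_furigana(ref["headword"])
    return [e["id"] for e in same_reading
            if strip_furigana(e["headword"]) == wanted]
```

A result with more than one ID means the reference is ambiguous; an empty result for a reference that does carry a headword corresponds to the homonym mismatch case.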
Validation detects homonym mismatches: When you specify a headword that doesn't match any existing entry with that reading (e.g., 勘定 when only 感情 exists), the validator will warn you. This indicates either:
When adding references to entries, prioritize:
HIGH - Always add if applicable:
MEDIUM - Add when natural:
LOW - Add sparingly:
The notes field often contains vocabulary that should be cross-referenced. Look for:
Pair verbs:
Antonyms:
Keigo:
Related words:
Run the extraction script to find potential references:
# Dry run - see proposed changes
python3 build/extract_references.py
# Apply changes
python3 build/extract_references.py --apply
# Single entry
python3 build/extract_references.py --id 00396_taberu
Note: The extraction script now performs immediate resolution. When a target entry exists, the extracted reference automatically includes target_id.
The harden_references.py script scans entries and adds target_id to resolvable cross-references. This "hardens" forward references into direct ID-based references once the target entry exists.
# Dry run - see what would change
python3 build/harden_references.py
# Apply changes
python3 build/harden_references.py --apply
# Single entry
python3 build/harden_references.py --id 00485_shimeru
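
Conceptually, hardening amounts to the following Python sketch. It is not the code of harden_references.py: the harden function and the index structure (a dict keyed by (reading, headword) that maps to a list of entry IDs) are assumptions made for illustration.

```python
import copy


def harden(entry, index):
    """Return a copy of entry with target_id added where resolution is unambiguous.

    index maps (reading, headword) -> list of matching entry IDs.
    References that already have a target_id, or that match zero or
    several entries, are left unchanged.
    """
    hardened = copy.deepcopy(entry)  # keep the caller's data intact (dry-run friendly)
    for ref in hardened.get("cross_references", []):
        if "target_id" in ref:
            continue
        candidates = index.get((ref.get("reading"), ref.get("headword")), [])
        if len(candidates) == 1:
            ref["target_id"] = candidates[0]
    return hardened
```

Returning a copy rather than editing in place mirrors the dry-run default of the script: nothing changes until the result is written back.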
When to run:

The script will:
- Add target_id to unambiguously resolvable references
- Report stale target_id references (target no longer exists)

Important: You can add references to entries that don't exist yet.
- Use reading as the primary key (required)
- Include headword for display purposes

This allows you to:
After adding references, validate:
python3 build/validate.py --id {entry_id}
The validator checks:
- target_id points to a non-existent entry
- References that are resolvable but missing target_id

| Type | Meaning | Action |
|---|---|---|
| ERROR: Stale target_id | target_id points to deleted entry | Remove or update target_id |
| WARNING: Hardenable | Reference resolvable but missing target_id | Run harden_references.py --apply |
| WARNING: Homonym mismatch | Headword doesn't match any entry with that reading | Verify correct homonym or wait for entry creation |
Before:
{
"id": "00485_shimeru",
"cross_references": []
}
After:
{
"id": "00485_shimeru",
"cross_references": [
{
"type": "pair",
"reading": "しまる",
"headword": "{閉|し}まる",
"label": "intransitive"
},
{
"type": "antonym",
"reading": "あける",
"headword": "{開|あ}ける",
"label": "to open"
}
]
}