Classify dual-nature entities (hotels, churches, schools of arts, halls, lodges) in historical newspaper text as building-only, business/organisation-only, or both (polyhierarchical) based on contextual linguistic analysis. Use when analysing entity mentions from the Blue Mountains Historical Society Zotero library to determine appropriate Getty AAT facet assignments (Built Environment vs Agents).
Classify mentions of dual-nature entities in 19th-century Australian newspaper text to determine whether each entity should be tagged as a physical building/facility (Built Environment facet), a business/organisation (Agents facet), or both (polyhierarchical).
The skill uses natural language understanding to apply a linguistic heuristic framework, weighing spatial indicators (locational prepositions, events occurring, physical features) against agency indicators (ownership, business operations, services provided).
Invoke this skill when:
Accept formatted input containing:
Input may come from:
scripts/38_classify_entities_with_claude.py) extracts mentions from Zotero and generates formatted prompts

For each mention, analyse the context using the decision framework documented in references/classification_heuristic.md:
Building/Facility indicators:
Business/Organisation indicators:
Both (Polyhierarchical) indicators:
Context Genre Recognition:
Metonymy Handling:
Based on indicator analysis, assign classification:
Assess confidence level:
Default Guidance When Indicators Are Weak:
For each entity mention, return structured classification:
### Entity: [Entity Name]
**Item:** [Article Title] ([Date if available])
**Classification:** building | business | organisation | both
**Confidence:** high | medium | low
**Reasoning:**
[2-3 sentences explaining classification, referencing specific textual evidence and matched indicators]
**Indicators Found:**
- Building: [list matched indicators, or "none"]
- Business/Organisation: [list matched indicators, or "none"]
**Context:**
> [The relevant excerpt with entity mention and surrounding text]
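The template above can be produced programmatically. The following is a minimal sketch, not part of the skill itself: the `MentionClassification` class, the `render_classification` function, and the example values are all hypothetical names and placeholders introduced for illustration. It assumes the four classification values and three confidence levels listed in the template are the only valid ones.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Allowed values, copied from the output template (assumed to be exhaustive).
ALLOWED_CLASSIFICATIONS = {"building", "business", "organisation", "both"}
ALLOWED_CONFIDENCE = {"high", "medium", "low"}


@dataclass
class MentionClassification:
    """Hypothetical container for one classified entity mention."""
    entity: str
    item_title: str
    classification: str
    confidence: str
    reasoning: str
    context: str
    date: Optional[str] = None
    building_indicators: List[str] = field(default_factory=list)
    agent_indicators: List[str] = field(default_factory=list)


def render_classification(result: MentionClassification) -> str:
    """Render one result in the markdown layout shown in the template."""
    if result.classification not in ALLOWED_CLASSIFICATIONS:
        raise ValueError(f"unexpected classification: {result.classification!r}")
    if result.confidence not in ALLOWED_CONFIDENCE:
        raise ValueError(f"unexpected confidence: {result.confidence!r}")
    # The date is optional, so it is appended only when present.
    item = result.item_title + (f" ({result.date})" if result.date else "")
    # Every line of the excerpt becomes part of the blockquote.
    quoted = "\n".join(f"> {line}" for line in result.context.splitlines())
    lines = [
        f"### Entity: {result.entity}",
        f"**Item:** {item}",
        f"**Classification:** {result.classification}",
        f"**Confidence:** {result.confidence}",
        "**Reasoning:**",
        result.reasoning,
        "**Indicators Found:**",
        f"- Building: {', '.join(result.building_indicators) or 'none'}",
        f"- Business/Organisation: {', '.join(result.agent_indicators) or 'none'}",
        "**Context:**",
        quoted,
    ]
    return "\n".join(lines)


# Invented placeholder data, not taken from the Zotero library.
example = MentionClassification(
    entity="Example Hotel",
    item_title="Example article",
    classification="building",
    confidence="medium",
    reasoning="The meeting is described as taking place at the hotel.",
    context="A meeting was held at the Example Hotel.",
    building_indicators=["locational preposition ('at')"],
)
print(render_classification(example))
```

Validating the classification and confidence values before rendering keeps malformed results out of any downstream tag mapping, at the cost of rejecting values the template does not list.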
Adhere to these standards when classifying:
Complete decision framework with detailed indicator definitions, edge case guidance, and extended examples showing building-only, business-only, both, and metonymy cases.
Side-by-side comparison of Claude NLU approach versus regex-based classification, demonstrating where natural language understanding provides superior results (context genre recognition, metonymy handling, confidence calibration).
Python script to automate collection of entity mentions from Zotero library. Fetches items by tag, extracts full text from notes, finds context around mentions, and generates formatted prompts for classification.
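The "finds context around mentions" step could look roughly like the sketch below. This is an illustration only and does not reproduce the actual script: the function name `find_contexts`, the character-based `window` parameter, and its default of 200 are assumptions. It uses only the standard library and leaves Zotero access out entirely.

```python
import re
from typing import List


def find_contexts(text: str, entity: str, window: int = 200) -> List[str]:
    """Return one excerpt per case-insensitive mention of `entity` in `text`.

    Each excerpt spans up to `window` characters either side of the mention,
    with runs of whitespace collapsed to single spaces.
    """
    excerpts = []
    # re.escape ensures names containing punctuation are matched literally.
    for match in re.finditer(re.escape(entity), text, flags=re.IGNORECASE):
        start = max(0, match.start() - window)
        end = min(len(text), match.end() + window)
        excerpts.append(" ".join(text[start:end].split()))
    return excerpts


# Invented sample text, not taken from the library.
sample = "aaa The Example Hotel was crowded. bbb"
print(find_contexts(sample, "example hotel", window=4))
```

A character window is the simplest choice but can cut words or sentences in half; a sentence-based window would give cleaner excerpts at the cost of needing a sentence splitter suited to OCR-quality newspaper text.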
This skill supports the Blue Mountains folksonomy rationalisation project:
data/tag_map_consolidated.csv

This skill applies to any dual-nature entity. Adjust defaults as needed:
When processing new entity types, update default guidance in references and document any entity-specific patterns encountered.