Quality assurance and validation for ontology-mapped ingredients using OAK, OLS, and OWL reasoning
The Review Ingredients skill provides comprehensive quality assurance and validation for ontology-mapped ingredients in MediaIngredientMech. It systematically verifies that:
Technology Stack:
Current Dataset: 1,102 total ingredients
| Scenario | Workflow | Priority |
|---|---|---|
| Post-curation QA | Validate newly mapped ingredients before committing | High |
| Batch validation | Review all 1,034 mapped ingredients | High |
| Pre-export check | Ensure KG export quality before KG-Microbe ingestion | Critical |
| Periodic maintenance | Monthly validation after CultureMech updates | Medium |
| Synonym verification | Cross-check synonyms with ontology data | Medium |
| Chemical enrichment | Populate SMILES, InChI, formulas from OLS | Low |
| Duplicate detection | Find potential merge candidates | Low |
Decision Table:
IF newly mapped batch → Use interactive review
IF full dataset check → Use batch review
IF enrichment needed → Use auto-correct (P3/P4)
IF critical errors → Use batch review with P1 filter
IF synonym issues → Use validate_synonyms.py
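The decision table above can be read as a plain lookup from situation to workflow. The sketch below is illustrative only: the `Situation` enum and `choose_workflow` function are hypothetical names, not part of the project's scripts, and the workflow strings are descriptive labels copied from the table rather than verified commands.

```python
from enum import Enum


class Situation(Enum):
    """Hypothetical labels for the conditions in the decision table."""
    NEW_BATCH = "newly mapped batch"
    FULL_DATASET = "full dataset check"
    ENRICHMENT = "enrichment needed"
    CRITICAL_ERRORS = "critical errors"
    SYNONYM_ISSUES = "synonym issues"


# One entry per row of the decision table, in the same order.
WORKFLOW = {
    Situation.NEW_BATCH: "interactive review",
    Situation.FULL_DATASET: "batch review",
    Situation.ENRICHMENT: "auto-correct (P3/P4)",
    Situation.CRITICAL_ERRORS: "batch review with P1 filter",
    Situation.SYNONYM_ISSUES: "validate_synonyms.py",
}


def choose_workflow(situation: Situation) -> str:
    """Return the workflow the decision table recommends."""
    return WORKFLOW[situation]


if __name__ == "__main__":
    for situation in Situation:
        print(f"IF {situation.value} -> {choose_workflow(situation)}")
```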
Use case: Verify a specific ingredient after manual curation
# Review by preferred term
PYTHONPATH=src python scripts/review_ingredient.py "sodium chloride"
# Review by ontology ID
PYTHONPATH=src python scripts/review_ingredient.py --id CHEBI:26710
# Review with auto-correction suggestions
PYTHONPATH=src python scripts/review_ingredient.py "glucose" --suggest-fixes
Output:
Use case: Validate entire dataset, generate comprehensive report
# Review all mapped ingredients
PYTHONPATH=src python scripts/batch_review.py \
  --output reports/validation_$(date +%Y%m%d) \
  --format md,json,html \
  --threads 4
# Filter by priority
PYTHONPATH=src python scripts/batch_review.py --priority P1,P2
# Filter by ontology source
PYTHONPATH=src python scripts/batch_review.py --source CHEBI
# Limit for testing
PYTHONPATH=src python scripts/batch_review.py --limit 10 --dry-run
Output:
- validation_report.md: Human-readable summary with statistics
- validation_data.json: Machine-readable issues + corrections
- dashboard.html: Interactive sortable/filterable dashboard

Use case: Auto-fix issues that don't require human review
# Dry-run to preview changes
PYTHONPATH=src python scripts/auto_correct.py --dry-run
# Apply safe corrections (P3/P4 only)
PYTHONPATH=src python scripts/auto_correct.py --apply
# Enrich chemical properties only
PYTHONPATH=src python scripts/auto_correct.py --apply --types chemical_properties
# Add missing synonyms from ontologies
PYTHONPATH=src python scripts/auto_correct.py --apply --types synonyms
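The commands above restrict automatic fixes to P3/P4 issues and optionally to selected correction types. A minimal sketch of that gating logic follows, assuming each issue carries a priority and a correction type. The `Issue` dataclass, its field names, and `select_auto_corrections` are hypothetical and are not taken from `auto_correct.py`.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set, Tuple, List

# Per the text, only P3/P4 corrections are applied without human review.
SAFE_PRIORITIES = {"P3", "P4"}


@dataclass
class Issue:
    """Hypothetical record for one validation finding."""
    ingredient: str
    priority: str          # "P1" .. "P4"
    correction_type: str   # e.g. "chemical_properties" or "synonyms"


def select_auto_corrections(
    issues: Iterable[Issue],
    types: Optional[Set[str]] = None,
) -> Tuple[List[Issue], List[Issue]]:
    """Split issues into (auto-applicable, deferred).

    An issue is auto-applicable only if its priority is P3 or P4 and,
    when a type filter is given, its correction type is in that filter.
    Everything else is deferred and left untouched.
    """
    safe, deferred = [], []
    for issue in issues:
        type_ok = types is None or issue.correction_type in types
        if issue.priority in SAFE_PRIORITIES and type_ok:
            safe.append(issue)
        else:
            deferred.append(issue)
    return safe, deferred
```

A dry run in this model would simply print the `safe` list without writing anything, which is one way to preview changes before applying them.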
Safe corrections (auto-applied):
Unsafe corrections (manual review required):
Use case: Interactive validation with Claude's assistance
# Use this skill
/review-ingredients
# Or invoke via Skill tool with specific ingredient
/review-ingredients "calcium chloride dihydrate"
Claude will:
IngredientReviewer

| Level | Description | Action Required | Count Target |
|---|---|---|---|
| P1 | Critical errors blocking KG export | Fix immediately | 0 |
| P2 | High-priority warnings needing review | Manual review | < 5% |
| P3 | Medium-priority enrichment opportunities | Auto-correct when possible | < 20% |
| P4 | Low-priority info/suggestions | Optional improvements | Any |
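The count targets in the table can be checked mechanically once a review has run. The sketch below assumes that each issue has exactly one priority level and that the percentage targets are measured against the number of ingredients reviewed; the source does not state the denominator, so treat that as an assumption. The function name `check_targets` is hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def check_targets(priorities: Iterable[str], total_reviewed: int) -> Dict[str, bool]:
    """Report whether each priority level meets its count target.

    priorities: one entry ("P1".."P4") per issue found.
    total_reviewed: number of ingredients reviewed (assumed denominator).
    """
    counts = Counter(priorities)

    def share(level: str) -> float:
        # Fraction of reviewed ingredients flagged at this level.
        return counts[level] / total_reviewed if total_reviewed else 0.0

    return {
        "P1": counts["P1"] == 0,      # target: 0
        "P2": share("P2") < 0.05,     # target: < 5%
        "P3": share("P3") < 0.20,     # target: < 20%
        "P4": True,                   # target: any
    }
```

For example, under these assumptions a batch review of the 1,034 mapped ingredients with 40 P2 issues meets the P2 target (about 3.9%), whereas 52 P2 issues (about 5.03%) would not.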
Rule P1.1: Ontology Term Existence