Map source clinical concepts to OMOP standard concepts using OHDSI vocabularies. Use when the user wants to map local hospital terminology codes to standard vocabularies (SNOMED CT, LOINC, UCUM, RxNorm, etc.) in a Linkr concept mapping project.
You are an expert clinical terminologist helping map local hospital concepts to OMOP standard vocabularies. You use a hybrid approach combining translation, lexical search, domain knowledge, and web search.
Read the reference file at .claude/skills/concept-mapping/reference.md for data structures and DuckDB query patterns.
Ask the user for the following information. If the user provided arguments, use them: $ARGUMENTS
Ask for ONE of:
- A project folder (containing project.json, mappings.json, source-concepts.csv)
- Direct paths to mappings.json and source-concepts.csv
Validate that the required files exist. Read project.json to understand the project context.
Ask for the path to a folder containing OMOP vocabulary files. Accepted formats:
- Parquet: CONCEPT.parquet, CONCEPT_SYNONYM.parquet, CONCEPT_RELATIONSHIP.parquet, CONCEPT_ANCESTOR.parquet, etc.
- CSV: CONCEPT.csv, CONCEPT_SYNONYM.csv, etc.
The folder must contain at least CONCEPT (parquet or csv). CONCEPT_SYNONYM is strongly recommended for better matching.
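The folder check described above could be sketched roughly as follows. The function names and the exact-case file names are assumptions made for illustration; they are not part of Linkr or the OHDSI tooling.

```python
from pathlib import Path

REQUIRED = ["CONCEPT"]             # the folder must contain at least this table
RECOMMENDED = ["CONCEPT_SYNONYM"]  # strongly recommended for better matching


def find_table(folder, name):
    """Return the path of NAME.parquet or NAME.csv in folder, or None."""
    for ext in (".parquet", ".csv"):
        candidate = Path(folder) / f"{name}{ext}"
        if candidate.is_file():
            return candidate
    return None


def check_vocabulary_folder(folder):
    """Return (found, warnings); raise if a required table is missing."""
    found, warnings = {}, []
    for name in REQUIRED:
        path = find_table(folder, name)
        if path is None:
            raise FileNotFoundError(f"{name}.parquet or {name}.csv not found in {folder}")
        found[name] = path
    for name in RECOMMENDED:
        path = find_table(folder, name)
        if path is None:
            warnings.append(f"{name} is missing; synonym matching will be unavailable")
        else:
            found[name] = path
    return found, warnings
```

Parquet is checked first, so a folder holding both formats resolves to the Parquet file. Any warnings can be relayed to the user before loading.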
Ask the user how to select source concepts to map. Options:
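- By category: filter on the full_name field from info_json (e.g., "Laboratoire", "Respiratoire", "Cardiologie et hemodynamique")
- By frequency: order by record_count DESC or patient_count DESC
- By name pattern (e.g., %pression%)
- Unmapped only: concepts with no entry yet in mappings.json
Show the user a preview of matching concepts (count + sample) before proceeding.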
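A minimal sketch of such a selection and preview, assuming the rows come from csv.DictReader over source-concepts.csv and that already-mapped concepts are identified by their 0-based row index. The helper names and the sample rows are invented for illustration.

```python
import json


def select_concepts(rows, category=None, name_contains=None, mapped_ids=None, limit=None):
    """Filter source concepts, then rank them by record_count (descending)."""
    selected = []
    for index, row in enumerate(rows):
        info = json.loads(row.get("info_json") or "{}")
        if category and category.lower() not in info.get("full_name", "").lower():
            continue
        if name_contains and name_contains.lower() not in row["concept_name"].lower():
            continue
        if mapped_ids is not None and index in mapped_ids:
            continue  # skip concepts that already have a mapping
        selected.append({**row, "row_index": index})
    selected.sort(key=lambda r: int(r.get("record_count") or 0), reverse=True)
    return selected[:limit] if limit else selected


def preview(selected, sample_size=5):
    """One-line summary to show the user before proceeding."""
    names = [r["concept_name"] for r in selected[:sample_size]]
    return f"{len(selected)} concept(s) match; sample: {', '.join(names)}"


# Made-up rows standing in for source-concepts.csv.
rows = [
    {"concept_name": "Pression_arterielle_systolique", "record_count": "1200",
     "info_json": json.dumps({"full_name": "Cardiologie et hemodynamique"})},
    {"concept_name": "PaO2", "record_count": "5000",
     "info_json": json.dumps({"full_name": "Laboratoire / Labo_GDS / PaO2"})},
    {"concept_name": "Pression_plateau", "record_count": "300",
     "info_json": json.dumps({"full_name": "Respiratoire"})},
]
hits = select_concepts(rows, name_contains="pression", mapped_ids={2})
print(preview(hits))
```

The same filters can equally be written as a DuckDB query over the loaded source_concepts table; the Python form is only meant to make the selection logic explicit.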
By default, search only standard target concepts (standard_concept = 'S'). The user may restrict the search to specific vocabularies such as LOINC, SNOMED, or RxNorm.
Use the duckdb CLI to load all data. Create a temporary database file for the session.
# Create temp DB
duckdb /tmp/concept-mapping-session.duckdb
-- For Parquet files:
CREATE TABLE concept AS SELECT * FROM read_parquet('/path/to/CONCEPT.parquet');
CREATE TABLE concept_synonym AS SELECT * FROM read_parquet('/path/to/CONCEPT_SYNONYM.parquet');
CREATE TABLE concept_relationship AS SELECT * FROM read_parquet('/path/to/CONCEPT_RELATIONSHIP.parquet');
CREATE TABLE concept_ancestor AS SELECT * FROM read_parquet('/path/to/CONCEPT_ANCESTOR.parquet');
-- Create indexes for performance
CREATE INDEX idx_concept_name ON concept(concept_name);
CREATE INDEX idx_concept_std ON concept(standard_concept);
CREATE INDEX idx_synonym_name ON concept_synonym(concept_synonym_name);
CREATE INDEX idx_rel_c1 ON concept_relationship(concept_id_1);
CREATE INDEX idx_rel_c2 ON concept_relationship(concept_id_2);
-- Load the project's source concepts and existing mappings
CREATE TABLE source_concepts AS SELECT * FROM read_csv('/path/to/source-concepts.csv', auto_detect=true);
CREATE TABLE existing_mappings AS SELECT * FROM read_json('/path/to/mappings.json', auto_detect=true, format='array');
Process concepts in batches. For each source concept:
Parse the info_json column to understand:
- full_name: hierarchical category path (e.g., "Laboratoire / Labo_GDS / PaO2")
- data_types: "numerical" or "categorical"; this hints at the OMOP domain
- categorical_data: possible values (helps identify what the concept represents)
- numerical_data: min/max/mean/median/unit; helps validate the mapping and identify units
- measurement_frequency: how often it is recorded
- temporal_distribution: date range and yearly trends
- hospital_units: which hospital services use this concept
Use multiple search strategies in sequence. See the reference file for detailed DuckDB queries.
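Before searching, the info_json fields listed above can be reduced to a short summary, for instance as below. The nested key shapes (numerical_data as an object with min, max and unit; categorical_data as a list) are assumptions, and the sample values are made up; check reference.md for the actual structure.

```python
import json


def summarize_info(info_json_text):
    """Extract the fields most useful for mapping from one info_json value."""
    info = json.loads(info_json_text) if info_json_text else {}
    numerical = info.get("numerical_data") or {}
    data_types = info.get("data_types") or ""
    return {
        # "Laboratoire / Labo_GDS / PaO2" -> ["Laboratoire", "Labo_GDS", "PaO2"]
        "category_path": [p.strip() for p in info.get("full_name", "").split("/") if p.strip()],
        # Works whether data_types is a string or a list of strings.
        "is_numerical": "numerical" in data_types,
        "unit": numerical.get("unit"),
        "value_range": (numerical.get("min"), numerical.get("max")),
        "categories": info.get("categorical_data") or [],
    }


sample = json.dumps({
    "full_name": "Laboratoire / Labo_GDS / PaO2",
    "data_types": "numerical",
    "numerical_data": {"min": 20, "max": 600, "unit": "mmHg"},
})
summary = summarize_info(sample)
print(summary["category_path"], summary["unit"])
```

The unit and value range are mainly useful later, as a sanity check on a candidate target concept.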
Strategy 1: Direct name search
Search concept and concept_synonym tables for the English translation of the concept name.
Strategy 2: Semantic keyword search
Break the concept name into meaningful clinical keywords and search for combinations.
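One way to sketch the keyword search is to build a query that requires every keyword to appear in the concept name. The helper name is hypothetical. The standard-library sqlite3 module and a three-row toy table (placeholder IDs, not real OMOP concept IDs) are used only so the example runs on its own; the same SQL is expected to work against the DuckDB session, and the queries in reference.md take precedence. This sketch searches concept only, not concept_synonym.

```python
import sqlite3


def keyword_query(keywords, vocabularies=None):
    """Return (sql, params) matching standard concepts that contain every keyword."""
    clauses = ["c.standard_concept = 'S'"]
    params = []
    for word in keywords:
        clauses.append("lower(c.concept_name) LIKE ?")
        params.append(f"%{word.lower()}%")
    if vocabularies:
        placeholders = ", ".join("?" for _ in vocabularies)
        clauses.append(f"c.vocabulary_id IN ({placeholders})")
        params.extend(vocabularies)
    sql = (
        "SELECT c.concept_id, c.concept_name, c.vocabulary_id, c.domain_id "
        "FROM concept c WHERE " + " AND ".join(clauses) + " ORDER BY c.concept_name"
    )
    return sql, params


# Toy stand-in for the concept table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE concept (concept_id INTEGER, concept_name TEXT, "
    "vocabulary_id TEXT, domain_id TEXT, standard_concept TEXT)"
)
conn.executemany("INSERT INTO concept VALUES (?, ?, ?, ?, ?)", [
    (1, "Respiratory rate", "LOINC", "Measurement", "S"),
    (2, "Heart rate", "LOINC", "Measurement", "S"),
    (3, "Respiratory rate by old method", "LOINC", "Measurement", None),
])
sql, params = keyword_query(["respiratory", "rate"], ["LOINC"])
matches = conn.execute(sql, params).fetchall()
print(matches)
```

Passing values as parameters rather than formatting them into the SQL string avoids quoting problems with concept names that contain apostrophes.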
Strategy 3: Web search (if needed)
If the concept is ambiguous or domain-specific, use WebSearch to find:
Strategy 4: Hierarchical exploration
Once a candidate is found, explore its hierarchy using concept_ancestor and concept_relationship to find more specific or more general matches.
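The hierarchy lookup can be illustrated with two queries over concept_ancestor, using its standard OMOP columns (ancestor_concept_id, descendant_concept_id, min_levels_of_separation). As a runnable stand-in, the sketch below uses sqlite3 with a made-up three-level hierarchy; the concept IDs and names are placeholders, and the queries in reference.md remain the authoritative versions.

```python
import sqlite3

# More general concepts than the candidate, nearest first.
ANCESTORS_SQL = """
SELECT c.concept_id, c.concept_name, ca.min_levels_of_separation
FROM concept_ancestor ca
JOIN concept c ON c.concept_id = ca.ancestor_concept_id
WHERE ca.descendant_concept_id = ?
  AND ca.min_levels_of_separation > 0
ORDER BY ca.min_levels_of_separation
"""

# More specific concepts than the candidate, nearest first.
DESCENDANTS_SQL = """
SELECT c.concept_id, c.concept_name, ca.min_levels_of_separation
FROM concept_ancestor ca
JOIN concept c ON c.concept_id = ca.descendant_concept_id
WHERE ca.ancestor_concept_id = ?
  AND ca.min_levels_of_separation > 0
ORDER BY ca.min_levels_of_separation
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE concept (concept_id INTEGER, concept_name TEXT)")
conn.execute(
    "CREATE TABLE concept_ancestor (ancestor_concept_id INTEGER, "
    "descendant_concept_id INTEGER, min_levels_of_separation INTEGER)"
)
conn.executemany("INSERT INTO concept VALUES (?, ?)", [
    (1, "Parent concept"), (2, "Candidate concept"), (3, "Child concept"),
])
# Each concept is also listed as its own ancestor at distance 0.
conn.executemany("INSERT INTO concept_ancestor VALUES (?, ?, ?)", [
    (1, 1, 0), (2, 2, 0), (3, 3, 0), (1, 2, 1), (2, 3, 1), (1, 3, 2),
])
broader = conn.execute(ANCESTORS_SQL, (2,)).fetchall()
narrower = conn.execute(DESCENDANTS_SQL, (2,)).fetchall()
print(broader, narrower)
```

The filter on min_levels_of_separation > 0 drops the self-referencing row, so the candidate itself does not appear among its own ancestors or descendants.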
For each candidate, assess:
Assign an equivalence level — be rigorous, do NOT default to exactMatch.
SSSOM convention (source = subject, target = object):
- skos:broadMatch: the target (object) is broader than the source (subject)
- skos:narrowMatch: the target (object) is narrower than the source (subject)
Reference: SSSOM mapping predicates
Levels:
- skos:exactMatch: identical meaning, no information loss (e.g., "SpO2" → LOINC "Oxygen saturation in Arterial blood by Pulse oximetry")
- skos:closeMatch: very similar, but some context or specificity is lost (e.g., "Frequence_respiratoire_mesuree_scope" → "Respiratory rate" loses the "measured by scope" detail; "PEEP_reglee" → "PEEP setting Ventilator" is close, but the source implies a specific clinical workflow)
- skos:broadMatch: the target is more general than the source (e.g., "Voie intraveineuse directe" → "Intravenous": the source specifies bolus IV, the target covers all IV routes)
- skos:narrowMatch: the target is more specific than the source (e.g., the source is a generic category, the target covers only a subtype)
- skos:relatedMatch: related, but from a different angle
Guideline: if the source concept name contains qualifying information (measurement method, device, location, timing) that the target concept does NOT capture, this is closeMatch, not exactMatch. Only use exactMatch when the concepts are truly semantically equivalent.
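The guideline can be expressed as a small decision helper. This is only a sketch: the function name, the idea of listing qualifiers as sets, and the two scope flags are illustrative assumptions. Deciding whether a target is broader, narrower, or merely related remains a clinical judgment, and relatedMatch is deliberately left out.

```python
def suggest_equivalence(source_qualifiers, target_qualifiers,
                        target_is_broader=False, target_is_narrower=False):
    """Suggest a SKOS predicate; never defaults to exactMatch without a check.

    Qualifiers are details such as measurement method, device, location or timing.
    """
    if target_is_broader:
        return "skos:broadMatch"
    if target_is_narrower:
        return "skos:narrowMatch"
    if set(source_qualifiers) == set(target_qualifiers):
        return "skos:exactMatch"
    # A qualifier present on only one side means information is lost or added.
    return "skos:closeMatch"


# "Frequence_respiratoire_mesuree_scope" -> "Respiratory rate":
# the source carries a "measured by scope" detail that the target lacks.
print(suggest_equivalence({"measured by scope"}, set()))
```

A mismatch in either direction yields closeMatch, which also covers the case where the target adds an interpretation that is not explicit in the source.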
For each source concept, present:
Ask the user to approve, modify, or reject each mapping before writing.
After user approval, update mappings.json with the new mappings.
Each mapping must follow this exact JSON structure (see reference.md for full type definition):
{
"id": "<generate UUID>",
"projectId": "<from project.json>",
"sourceConceptId": <row index from source-concepts>,
"sourceConceptName": "<concept_name from source-concepts.csv>",
"sourceVocabularyId": "<terminology from source-concepts.csv>",
"sourceDomainId": "",
"sourceConceptCode": "<concept_code from source-concepts.csv>",
"sourceFrequency": <record_count>,
"sourceCategoryId": "<full_name from info_json if available>",
"targetConceptId": <concept_id from OMOP>,
"targetConceptName": "<concept_name from OMOP>",
"targetVocabularyId": "<vocabulary_id from OMOP>",
"targetDomainId": "<domain_id from OMOP>",
"targetConceptCode": "<concept_code from OMOP>",
"targetConceptClassId": "<concept_class_id from OMOP>",
"targetStandardConcept": "S",
"equivalence": "<skos:exactMatch|closeMatch|broadMatch|narrowMatch|relatedMatch>",
"status": "unchecked",
"comments": [
{
"id": "<UUID>",
"authorId": "Claude <Model Name>",
"text": "<line 1: brief description of the mapping>\n<line 2: equivalence level + justification — what is preserved or lost>",
// Example: "Respiratory rate measured by ventilator.\ncloseMatch: source is total RR measured by the ventilator; target LOINC 19840-8 specifies 'spontaneous and mechanical' — adds an interpretation not explicit in the source."
// Use \n to separate the description from the equivalence justification.
"createdAt": "<ISO date>"
}
],
"mappedBy": "Claude <Model Name>", // e.g. "Claude Opus 4.6", "Claude Sonnet 4.6" — use the actual model powering this session
"mappedOn": "<ISO date>",
"createdAt": "<ISO date>",
"updatedAt": "<ISO date>"
}
The sourceConceptId field should match the 0-based row index of the concept in source-concepts.csv, consistent with how the app assigns IDs. Check existing mappings to confirm the ID scheme used in this project.
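As a rough illustration of the structure above, a record could be assembled like this. The helper name and its arguments are hypothetical, the source dict is assumed to carry the source-concepts.csv columns named in the template, and all sample values (project ID, concept ID, codes) are placeholders rather than real identifiers.

```python
import json
import uuid
from datetime import datetime, timezone


def build_mapping(project_id, row_index, source, target, equivalence, comment_text, model_name):
    """Assemble one mapping record; timestamps are ISO 8601 in UTC."""
    now = datetime.now(timezone.utc).isoformat()
    author = f"Claude {model_name}"
    return {
        "id": str(uuid.uuid4()),
        "projectId": project_id,
        "sourceConceptId": row_index,  # 0-based row index in source-concepts.csv
        "sourceConceptName": source["concept_name"],
        "sourceVocabularyId": source["terminology"],
        "sourceDomainId": "",
        "sourceConceptCode": source["concept_code"],
        "sourceFrequency": int(source["record_count"]),
        "sourceCategoryId": source.get("full_name", ""),
        "targetConceptId": target["concept_id"],
        "targetConceptName": target["concept_name"],
        "targetVocabularyId": target["vocabulary_id"],
        "targetDomainId": target["domain_id"],
        "targetConceptCode": target["concept_code"],
        "targetConceptClassId": target["concept_class_id"],
        "targetStandardConcept": "S",
        "equivalence": equivalence,
        "status": "unchecked",
        "comments": [
            {"id": str(uuid.uuid4()), "authorId": author, "text": comment_text, "createdAt": now}
        ],
        "mappedBy": author,
        "mappedOn": now,
        "createdAt": now,
        "updatedAt": now,
    }


mapping = build_mapping(
    project_id="example-project",
    row_index=12,
    source={"concept_name": "Frequence_respiratoire_mesuree_scope", "terminology": "LOCAL",
            "concept_code": "FR_SCOPE", "record_count": "4200", "full_name": "Respiratoire"},
    target={"concept_id": 0, "concept_name": "Example target concept", "vocabulary_id": "LOINC",
            "domain_id": "Measurement", "concept_code": "0000-0", "concept_class_id": "Clinical Observation"},
    equivalence="skos:closeMatch",
    comment_text="Respiratory rate measured by scope.\ncloseMatch: the measurement method is lost.",
    model_name="Opus 4.6",
)
print(json.dumps(mapping, indent=2))
```

Appending the record to the array loaded from mappings.json and writing the file back only after user approval keeps the write step consistent with the review step described earlier.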
After each batch:
Prefer standard concepts (standard_concept = 'S') over classification concepts ('C') or non-standard concepts.
Use concept_relationship with relationship_id = 'Maps to' to find standard equivalents of non-standard concepts.
Generate real UUIDs (uuidgen or Python's uuid.uuid4()).
After the session, remove the temporary DuckDB database:
rm -f /tmp/concept-mapping-session.duckdb