Use the semantic consistency auditor for academic writing workflows that need structured execution, explicit assumptions, and clear output boundaries.
ID: 212
Name: semantic-consistency-auditor
Description: Uses the BERTScore and COMET metrics to evaluate the semantic consistency between AI-generated clinical notes and expert gold standards at the level of semantic entailment.
- scripts/main.py.
- references/ for task-specific guidance.
See ## Prerequisites above for related details.
- Python: 3.10+. Repository baseline for current packaged skills.
- bert_score: unspecified. Declared in requirements.txt.
- comet: unspecified. Declared in requirements.txt.
- dataclasses: unspecified. Declared in requirements.txt.
- numpy: unspecified. Declared in requirements.txt.
- torch: unspecified. Declared in requirements.txt.
- yaml: unspecified. Declared in requirements.txt.
See ## Usage above for related details.
cd "20260318/scientific-skills/Academic Writing/semantic-consistency-auditor"
python -m py_compile scripts/main.py
python scripts/main.py --help
Example run plan:
- Edit the CONFIG block or documented parameters if the script uses fixed settings.
- Run python scripts/main.py with the validated inputs.
See ## Workflow above for related details.
- scripts/main.py.
- references/ contains supporting rules, prompts, or checklists.
Use this command to verify that the packaged script entry point can be parsed before deeper execution.
python -m py_compile scripts/main.py
Use these concrete commands for validation. They are intentionally self-contained and avoid placeholder paths.
python -m py_compile scripts/main.py
python scripts/main.py --help
Semantic Consistency Auditor is a medical AI evaluation tool that assesses the semantic consistency between AI-generated clinical notes and expert-written gold standards. Rather than relying on string matching or bag-of-words overlap, it uses deep learning models to assess semantic entailment, so it can recognize expressions that differ in wording but carry similar meaning.
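BERTScore uses contextual embeddings from a pre-trained BERT model to calculate the similarity between candidate text and reference text. Tokens are matched greedily by cosine similarity, which yields precision, recall, and F1 scores.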
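The greedy matching behind BERTScore can be illustrated with a small, dependency-free sketch. It assumes token embeddings are already available as plain lists of floats; the toy two-dimensional vectors and the function names `cosine` and `greedy_bertscore` are hypothetical and are not taken from scripts/main.py. A real run would obtain embeddings from a BERT model and may apply IDF weighting or baseline rescaling, which this sketch omits.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def greedy_bertscore(
    candidate: List[Vector], reference: List[Vector]
) -> Tuple[float, float, float]:
    """Return (precision, recall, f1) from greedy token matching.

    Precision averages, over candidate tokens, the best similarity to any
    reference token. Recall does the same over reference tokens.
    """
    if not candidate or not reference:
        return 0.0, 0.0, 0.0

    # sim[i][j] = similarity of candidate token i to reference token j
    sim = [[cosine(c, r) for r in reference] for c in candidate]

    precision = sum(max(row) for row in sim) / len(candidate)
    recall = sum(
        max(sim[i][j] for i in range(len(candidate)))
        for j in range(len(reference))
    ) / len(reference)

    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy embeddings (hypothetical): two candidate tokens, one reference token.
cand_tokens = [[1.0, 0.0], [0.0, 1.0]]
ref_tokens = [[1.0, 0.0]]
p, r, f = greedy_bertscore(cand_tokens, ref_tokens)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

In this toy case the single reference token is fully covered (recall 1.0), while only one of the two candidate tokens has a match (precision 0.5). For real evaluations, the `bert_score` package exposes a `score` function that computes these values from model embeddings.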
COMET is a neural evaluation metric originally developed for machine translation evaluation; here it is applied to semantic entailment tasks.
# Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate # Linux/Mac
# Or venv\Scripts\activate # Windows
# Install dependencies
pip install bert-score unbabel-comet transformers torch
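After installation, a quick check that the expected modules can be imported may save a failed run later. The sketch below is an assumption-laden helper, not part of the packaged skill: the module names are taken from the dependency list in this document, and `check_modules` is a hypothetical function name.

```python
from importlib.util import find_spec
from typing import Dict, Iterable

# Import names as listed in this document's prerequisites (assumed complete).
REQUIRED_MODULES = ("bert_score", "comet", "dataclasses", "numpy", "torch", "yaml")


def check_modules(names: Iterable[str] = REQUIRED_MODULES) -> Dict[str, bool]:
    """Map each top-level module name to whether it can be located."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, available in check_modules().items():
        print(f"{name}: {'ok' if available else 'MISSING'}")
```

`find_spec` only locates a module without importing it, so the check stays fast even for heavy packages such as torch.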
Configure in ~/.openclaw/skills/semantic-consistency-auditor/config.yaml:
# BERTScore Configuration