Name: Standardizing Longitudinal Radiology Report
Author: ndpvt-web

Standardizing Longitudinal Radiology Report

Build LLM-based pipelines that automatically detect and classify longitudinal (temporal) changes in radiology reports. Use when the user mentions 'radiology report annotation', 'longitudinal report evaluation', 'temporal change detection in medical text', 'disease progression extraction', 'radiology NLP pipeline', or 'benchmark radiology report generation'.

ndpvt-web · 3 stars · 2026-02-13

Category: Machine Learning

Standardizing Longitudinal Radiology Report Evaluation

This skill enables Claude to build automated annotation pipelines that detect longitudinal (temporal) information in radiology reports and classify disease progression across sequential examinations. The core technique is a two-stage LLM pipeline: the first stage identifies sentences that compare current findings to prior studies, and the second extracts structured disease progression labels (improved / no change / worsened / unmentioned). Prompt-driven LLM classification replaces brittle rule-based and manual-lexicon approaches, with reported F1-score gains of 11.3% on the detection task and 5.3% on the tracking task.

When to Use

When the user asks to build a pipeline that processes radiology reports and extracts temporal changes or comparisons to prior exams
When the user needs to annotate a large corpus of medical reports (e.g., MIMIC-CXR) with structured longitudinal labels for benchmarking
When the user wants to evaluate radiology report generation models on their ability to capture disease progression
When the user asks to classify medical text sentences as longitudinal vs. cross-sectional
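For the evaluation use case, one possible way to score a report generation model is a micro-averaged F1 over (disease, label) pairs. This assumes the reference and generated reports have both already been annotated into report-level dictionaries mapping disease to progression label. The function name `progression_micro_f1` and the choice to ignore `unmentioned` entries are assumptions made for this sketch, not necessarily the metric behind the reported scores.

```python
from typing import Dict, Iterable, Tuple

Annotation = Dict[str, str]  # disease name -> progression label


def progression_micro_f1(pairs: Iterable[Tuple[Annotation, Annotation]]) -> float:
    """Micro-F1 over (disease, label) pairs across a corpus.

    Each item in `pairs` is (reference_annotation, predicted_annotation).
    Entries labelled "unmentioned" are ignored on both sides (an assumption).
    """
    tp = fp = fn = 0
    for reference, predicted in pairs:
        ref = {(d, s) for d, s in reference.items() if s != "unmentioned"}
        pred = {(d, s) for d, s in predicted.items() if s != "unmentioned"}
        tp += len(ref & pred)   # same disease, same label
        fp += len(pred - ref)   # predicted but not in the reference
        fn += len(ref - pred)   # in the reference but missed or mislabelled
    denom = 2 * tp + fp + fn
    # With nothing mentioned on either side, treat the result as perfect agreement.
    return 2 * tp / denom if denom else 1.0


reference = {"edema": "worsened", "pneumonia": "improved", "fracture": "unmentioned"}
predicted = {"edema": "worsened", "pneumonia": "no_change", "fracture": "unmentioned"}
score = progression_micro_f1([(reference, predicted)])
print(score)
```

A mislabelled disease counts as both a false positive and a false negative here, so a wrong progression label is penalized more than an omission.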

    import json
    import re

    DISEASE_VOCAB = [
        "atelectasis", "cardiomegaly", "consolidation", "edema",
        "enlarged_cardiomediastinum", "fracture", "lung_lesion", "lung_opacity",
        "pleural_effusion", "pleural_other", "pneumonia", "pneumothorax",
        "support_devices",
    ]

    # Literal JSON braces are doubled ({{ }}) so that str.format() fills only {sentence}.
    STAGE1_PROMPT = """You are a radiology NLP specialist. Determine whether the following sentence from a chest X-ray report contains longitudinal information, i.e., it compares the current finding to a prior study.

    Return ONLY a JSON object: {{"is_longitudinal": true}} or {{"is_longitudinal": false}}

    Examples:
    - "Cardiac silhouette is stable." -> {{"is_longitudinal": true}}
    - "No acute cardiopulmonary process." -> {{"is_longitudinal": false}}
    - "Pleural effusion has increased since prior exam." -> {{"is_longitudinal": true}}
    - "Lungs are clear." -> {{"is_longitudinal": false}}

    Sentence: {sentence}"""

    STAGE2_PROMPT = """You are a radiology NLP specialist. Given a sentence that contains longitudinal information from a chest X-ray report, extract which diseases are mentioned and their progression status.

    Disease vocabulary: {vocab}
    Progression labels: improved, no_change, worsened

    Return ONLY a JSON object mapping disease names to progression labels. Only include diseases explicitly mentioned in the sentence.

    Sentence: {sentence}"""


    def split_into_sentences(text: str) -> list:
        # Naive placeholder splitter; swap in a medical-domain splitter for real reports.
        return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


    def annotate_report(report_text: str, llm_client) -> dict:
        """Annotate one report; llm_client is any object with query(prompt) -> str."""
        results = []
        for sent in split_into_sentences(report_text):
            # Stage 1: does the sentence compare against a prior study?
            s1 = json.loads(llm_client.query(STAGE1_PROMPT.format(sentence=sent)))
            entry = {"sentence": sent, "is_longitudinal": s1["is_longitudinal"]}
            # Stage 2 (only if longitudinal): extract disease -> progression label.
            if s1["is_longitudinal"]:
                s2_resp = llm_client.query(
                    STAGE2_PROMPT.format(sentence=sent, vocab=", ".join(DISEASE_VOCAB))
                )
                entry["progressions"] = json.loads(s2_resp)
            results.append(entry)

        # Aggregate to report level; a later sentence overwrites an earlier label.
        report_progression = {d: "unmentioned" for d in DISEASE_VOCAB}
        for entry in results:
            for disease, status in entry.get("progressions", {}).items():
                report_progression[disease] = status
        return {"sentences": results, "report_progression": report_progression}
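To exercise the two-stage flow without calling a real model, a keyword-based stand-in for `llm_client` can be handy. The sketch below is hypothetical: the class name `KeywordStubClient`, its cue words, and its constructor are assumptions. It relies only on what the prompts above show, namely that each ends with `Sentence: <text>` and that the Stage 2 prompt contains the phrase `Progression labels`. Its outputs are crude keyword matches, not meaningful annotations.

```python
import json


class KeywordStubClient:
    """Offline stand-in exposing query(prompt) -> str (for smoke tests only)."""

    # Hypothetical cue words; checked in this order.
    CUES = {
        "worsened": ("increased", "worsened", "progressed"),
        "improved": ("decreased", "improved", "resolved"),
        "no_change": ("stable", "unchanged", "no change"),
    }

    def __init__(self, vocab):
        self.vocab = list(vocab)

    def _status(self, sentence):
        for status, words in self.CUES.items():
            if any(word in sentence for word in words):
                return status
        return None  # no change-related cue found

    def query(self, prompt: str) -> str:
        # Take the text after the last "Sentence:" marker in the prompt.
        sentence = prompt.rsplit("Sentence:", 1)[-1].strip().lower()
        status = self._status(sentence)
        if "Progression labels" in prompt:  # Stage 2 style prompt
            found = {
                disease: status or "no_change"
                for disease in self.vocab
                if disease.replace("_", " ") in sentence
            }
            return json.dumps(found)
        # Stage 1 style prompt: longitudinal iff a cue word was seen.
        return json.dumps({"is_longitudinal": status is not None})


client = KeywordStubClient(["pleural_effusion", "cardiomegaly"])
sentence = "Pleural effusion has increased since prior exam."
stage1 = json.loads(client.query(f"Is this longitudinal?\nSentence: {sentence}"))
stage2 = json.loads(
    client.query(f"Progression labels: improved, no_change, worsened\nSentence: {sentence}")
)
print(stage1, stage2)
```

An object like this can be passed wherever the pipeline expects `llm_client`, which makes it possible to test sentence splitting and aggregation logic before spending API calls.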

Error Handling

Issue	Cause	Fix
LLM returns free text instead of JSON	Prompt not constraining output format	Add "Return ONLY valid JSON" instruction; use JSON mode if the API supports it; add a regex-based fallback parser
Disease name not in vocabulary	LLM uses a synonym (e.g., "heart enlargement" vs. "cardiomegaly")	Post-process with a synonym mapping dictionary; normalize all disease names to canonical forms
Sentence splitter breaks mid-finding	Medical abbreviations confuse tokenizer (e.g., "Dr.", "approx.")	Use a medical-domain sentence splitter or add abbreviation exceptions to NLTK's Punkt tokenizer
Stage 2 assigns contradictory labels	Same disease appears in multiple sentences with different statuses	Implement a priority resolution rule: `worsened` > `no_change` > `improved` > `unmentioned`, or flag for manual review
Batch processing fails midway	API timeout or rate limit on large corpus	Implement checkpoint-resume: save progress per report_id in JSONL; skip already-annotated reports on restart
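Several of the fixes in the table can be sketched in a few small helpers. Everything named here is hypothetical: `parse_json_lenient`, `normalize_disease`, `merge_status`, `annotate_corpus`, the `SYNONYMS` entries, and the JSONL record layout (`report_id`, `annotation`) are assumptions for illustration. The priority order follows the rule in the table.

```python
import json
import re
from pathlib import Path

# Hypothetical synonym table; extend it from observed model outputs.
SYNONYMS = {"heart enlargement": "cardiomegaly", "enlarged heart": "cardiomegaly"}

# worsened > no_change > improved > unmentioned
PRIORITY = {"unmentioned": 0, "improved": 1, "no_change": 2, "worsened": 3}


def parse_json_lenient(text: str):
    """Parse a model reply as JSON, falling back to the outermost {...} span."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        match = re.search(r"\{.*\}", text, flags=re.DOTALL)
        if match is None:
            raise ValueError(f"no JSON object found in: {text!r}")
        return json.loads(match.group(0))


def normalize_disease(name: str) -> str:
    """Map a free-text disease name to its canonical vocabulary form."""
    key = name.strip().lower()
    return SYNONYMS.get(key, key.replace(" ", "_"))


def merge_status(current: str, new: str) -> str:
    """Keep whichever label has the higher priority."""
    return new if PRIORITY.get(new, 0) > PRIORITY.get(current, 0) else current


def annotate_corpus(reports: dict, annotate, out_path: Path) -> None:
    """Checkpoint-resume: append one JSONL record per report, skip finished ones."""
    done = set()
    if out_path.exists():
        with out_path.open() as f:
            done = {json.loads(line)["report_id"] for line in f if line.strip()}
    with out_path.open("a") as f:
        for report_id, text in reports.items():
            if report_id in done:
                continue  # already annotated in a previous run
            record = {"report_id": report_id, "annotation": annotate(text)}
            f.write(json.dumps(record) + "\n")
            f.flush()  # hand the record to the OS before the next API call
```

Here `annotate` is any callable that takes report text and returns a JSON-serializable result. Writing one record per line means a crash loses at most the report in flight.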
