Demo skill that processes prior authorization requests, performs initial validation checks (NPI, ICD-10, CMS Coverage, CPT), and generates medical necessity assessments using Azure MCP servers.
Process prior authorization requests using AI-assisted review with MCP connectors for NPI Registry, ICD-10 codes, and CMS Coverage policies. This skill generates draft recommendations for human review.
Target Users: Prior authorization specialists, utilization management nurses, medical directors
Key Features:
DRAFT RECOMMENDATIONS ONLY: This skill generates draft recommendations only. The payer organization remains fully responsible for all final authorization decisions.
All AI-generated recommendations require review and confirmation by appropriate professionals before becoming final decisions. Users may accept, reject, or override any recommendation with documented justification.
AI DECISION BEHAVIOR: AI recommends APPROVE, PEND, or DENY. DENY is recommended when a mandatory criterion is clearly NOT_MET (≥90% confidence, based on documented evidence). All decisions require human confirmation in Subskill 2. Decision logic is configurable in the skill's rubric.md file.
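The decision policy above can be sketched as follows. This is an illustrative sketch only: the function name and criterion structure are hypothetical, and the authoritative, configurable logic lives in the skill's rubric.md.

```python
def recommend(criteria):
    """Map evaluated policy criteria to a draft recommendation.

    Each criterion is a hypothetical dict such as:
      {"mandatory": True, "status": "MET" | "NOT_MET" | "UNCLEAR", "confidence": 0.95}
    Returns "APPROVE", "PEND", or "DENY" (always subject to human confirmation).
    """
    for c in criteria:
        # DENY only when a mandatory criterion is clearly NOT_MET at >=90% confidence
        if c["mandatory"] and c["status"] == "NOT_MET" and c["confidence"] >= 0.90:
            return "DENY"
    if all(c["status"] == "MET" for c in criteria):
        return "APPROVE"
    # Anything unclear or insufficiently documented pends for more information
    return "PEND"
```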
COVERAGE POLICY LIMITATIONS: Coverage policies are sourced from Medicare LCDs/NCDs via the mcp-reference-data server (CMS Coverage tools). If this review is for a commercial or Medicare Advantage plan, payer-specific policies may differ and were not applied.
The project uses 3 consolidated MCP servers (down from ~7 individual servers). Each server bundles multiple tool domains into a single Azure Function endpoint.
mcp-reference-data — NPI + ICD-10 + CMS Coverage (12 tools)

Routed via APIM (`/mcp`) or local at `http://localhost:7071/mcp`.
- NPI: `lookup_npi(npi="...")`, `validate_npi(npi="...")`, `search_providers(...)`
- ICD-10: `validate_icd10(code="...")`, `lookup_icd10(code="...")`, `search_icd10(query="...")`, `get_icd10_chapter(code_prefix="...")`
- CMS Coverage: `search_coverage(query="...", coverage_type="all", limit=10)`, `check_medical_necessity(cpt_code="...", icd10_codes=[...])`, `get_coverage_by_cpt(cpt_code="...")`, `get_coverage_by_icd10(icd10_code="...")`, `get_mac_jurisdiction(state="...")`

mcp-clinical-research — FHIR + PubMed + ClinicalTrials (20 tools)

Routed via APIM (`/mcp`) or local at `http://localhost:7072/mcp`.
- PubMed: `search_pubmed(query="...")`, `search_clinical_queries(query="...", category="therapy")`, `get_article(pmid="...")`, `get_article_abstract(pmid="...")`, `get_articles_batch(pmids=[...])`, `find_related_articles(pmid="...")`
- FHIR: `search_patients(identifier="...")`, `get_patient(patient_id="...")`, `get_patient_conditions(patient_id="...")`, `get_patient_medications(patient_id="...")`, `get_patient_observations(patient_id="...")`, `get_patient_encounters(patient_id="...")`, `search_practitioners(...)`, `validate_resource(...)`
- ClinicalTrials: `search_trials(...)`, `get_trial(nct_id="...")`, `get_trial_eligibility(nct_id="...")`, `get_trial_locations(nct_id="...")`, `search_by_condition(...)`, `get_trial_results(nct_id="...")`
cosmos-rag — Document RAG & Audit Trail (6 tools)

Routed via APIM (`/mcp`) or local at `http://localhost:7073/mcp`.
- Tools: `index_document(...)`, `hybrid_search(query="...")`, `vector_search(query="...")`, `record_audit_event(...)`, `get_audit_trail(workflow_id="...")`, `get_session_history(workflow_type="...")`

```mermaid
flowchart TD
subgraph PA["Prior Authorization Workflow"]
subgraph SUB1["Subskill 1: Intake & Assessment (3-4 min)"]
S1A["1. Collect request details<br/>(member, service, provider)"]
S1B["2. Parallel MCP validation via mcp-reference-data:<br/>• NPI tools → Provider credentials<br/>• ICD-10 tools → Diagnosis codes<br/>• CMS tools → Applicable policies + medical necessity"]
S1C["3. Extract clinical data from documentation"]
S1Lit["4. PubMed evidence search (optional)<br/>via mcp-clinical-research"]
S1D["5. Map evidence to policy criteria"]
S1E["6. Generate recommendation (APPROVE/PEND)"]
S1A --> S1B --> S1C --> S1Lit --> S1D --> S1E
end
SUB1 -->|"OUTPUT: waypoints/assessment.json"| HUMAN
HUMAN{"HUMAN DECISION POINT<br/>Review AI findings"}
HUMAN --> SUB2
subgraph SUB2["Subskill 2: Decision & Notification (1-2 min)"]
S2A["1. Human confirms/overrides recommendation"]
S2B["2. Generate authorization number (if approved)"]
S2C["3. Create decision documentation"]
S2D["4. Generate notification letters"]
S2A --> S2B --> S2C --> S2D
end
SUB2 -->|"OUTPUT: waypoints/decision.json, outputs/letters/"| DONE(["Done"])
end
```
Subskill 1 dependencies: mcp-reference-data (NPI + ICD-10 + CMS — parallel execution); mcp-clinical-research PubMed tools (optional, strengthens evidence); mcp-clinical-research FHIR tools (optional). Output: `waypoints/assessment.json` (consolidated).

Workflow-wide dependencies: mcp-reference-data (NPI + ICD-10 + CMS, parallel), CMS Fee Schedule (web), mcp-clinical-research (PubMed + FHIR, optional). Outputs: `waypoints/decision.json`, `outputs/determination.json`, and `outputs/approval_letter.md`, `outputs/pend_letter.md`, or `outputs/denial_letter.md`.

Option A: User provides files
To begin a prior authorization review, I need:
1. PA Request Form (member info, service details)
2. Clinical Documentation (progress notes, test results)
3. Provider Information (NPI, credentials)
Please provide the file paths or paste the content.
Option B: Demo mode
Would you like to use sample PA request files for demonstration?
Sample files are in: data/sample_cases/prior_auth_baseline/
Use beads to track progress through the prior authorization workflow. Each major phase is a bead with a unique ID, status, and checklist. Update bead status as work progresses to maintain an auditable trail and enable reliable resume.
| Bead ID | Phase | Description |
|---|---|---|
| `bd-pa-001-intake` | Subskill 1, Steps 1-3 | Collect request info + parallel MCP validation |
| `bd-pa-002-clinical` | Subskill 1, Steps 4-6 | Clinical extraction + rubric read + evidence mapping |
| `bd-pa-003-recommend` | Subskill 1, Steps 7-10 | Generate recommendation + write waypoint + audit doc |
| `bd-pa-004-decision` | Subskill 2, Steps 1-5 | Human review + decision capture + authorization |
| `bd-pa-005-notify` | Subskill 2, Steps 6-7 | Generate notification letters + completion summary |
Status lifecycle: not-started → in-progress → completed
- Beads progress in order (bd-pa-001 → bd-pa-005)
- Persist bead state in `waypoints/assessment.json` under a `"beads"` key so progress survives interruptions

Include bead tracking in waypoint JSON files:

```json
{
  "beads": [
    {"id": "bd-pa-001-intake", "status": "completed", "completed_at": "ISO datetime"},
    {"id": "bd-pa-002-clinical", "status": "completed", "completed_at": "ISO datetime"},
    {"id": "bd-pa-003-recommend", "status": "in-progress", "started_at": "ISO datetime"},
    {"id": "bd-pa-004-decision", "status": "not-started"},
    {"id": "bd-pa-005-notify", "status": "not-started"}
  ]
}
```
On startup, if waypoints/assessment.json exists:
- Read the `"beads"` array
- Resume from the first bead whose status is not `"completed"`

Structured prompt guidance is organized into 5 phase-aligned modules in `references/prompts/`. Each module is loaded lazily — only when its bead starts — and released at the next context checkpoint.
| Bead | Module(s) to Read | Peak Size | Release After |
|---|---|---|---|
| `bd-pa-001-intake` | 01-extraction.md + 02-policy-retrieval.md | ~270 lines | Context Checkpoint 1 (waypoint write) |
| `bd-pa-002-clinical` | 03-clinical-assessment.md | ~120 lines | Context Checkpoint 2 (waypoint update) |
| `bd-pa-003-recommend` | 04-determination.md + rubric.md | ~150 lines | Context Checkpoint 3 (waypoint finalize) |
| `bd-pa-004-decision` | (none — human review) | ~0 | Context Checkpoint 4 (decision.json write) |
| `bd-pa-005-notify` | 05-output-formatting.md | ~100 lines | Workflow complete |
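The per-bead lazy loading in the table above could be sketched as a simple lookup; module filenames come from the table, and the function name is illustrative:

```python
# Maps each bead to the prompt modules it needs (from the Module Loading Table).
MODULES = {
    "bd-pa-001-intake": ["01-extraction.md", "02-policy-retrieval.md"],
    "bd-pa-002-clinical": ["03-clinical-assessment.md"],
    "bd-pa-003-recommend": ["04-determination.md", "rubric.md"],
    "bd-pa-004-decision": [],  # human review, no prompt module
    "bd-pa-005-notify": ["05-output-formatting.md"],
}


def modules_for(bead_id):
    """Return only the modules needed for this bead; nothing is pre-loaded."""
    return MODULES.get(bead_id, [])
```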
Waypoint files serve as context checkpoints — they compress raw data into structured summaries so downstream beads operate on compact representations instead of replaying all upstream context.
| Checkpoint | Trigger | What Gets Compressed | Waypoint Section |
|---|---|---|---|
| CP1 | Bead 001 complete | Raw clinical docs, MCP tool results, extraction prompts | assessment.{request, clinical, policy} |
| CP2 | Bead 002 complete | Policy full text, evidence mapping work, assessment prompt | assessment.criteria_evaluation |
| CP3 | Bead 003 complete | Determination reasoning, rubric rules | assessment.recommendation |
| CP4 | Bead 004 complete | Assessment details, human decision | decision.{decision, rationale, audit} |
Each bead defines what to read and what to ignore:
Bead 001 (Intake): Read raw docs + prompt modules 01, 02. Full context access.
Bead 002 (Clinical): Read assessment.json sections + prompt module 03. Ignore raw docs, MCP results, modules 01-02.
Bead 003 (Recommend): Read assessment.json evaluation + rubric.md + prompt module 04. Ignore raw docs, prior modules.
Bead 004 (Decision): Read assessment.json recommendation only + human input. Ignore everything else.
Bead 005 (Notify): Read decision.json + assessment.json member/service/provider + criteria_evaluation + recommendation.gaps + prompt module 05. Ignore raw clinical documents and raw MCP responses.
The waypoint is the source of truth for downstream beads. Do not re-read raw inputs.
Always read subskill files: Don't execute from memory. Read the actual subskill markdown file and follow instructions.
Load prompt modules per bead: Read the prompt module(s) listed in the Module Loading Table when starting each bead. Do not pre-load modules for future beads.
Auto-detect resume: Check for existing waypoints/assessment.json on startup. If found, read bead state and offer to resume from the first incomplete bead.
Parallel MCP execution: In Subskill 1, execute NPI, ICD-10, and Coverage MCP calls in parallel for optimal performance.
Preserve user data: Never overwrite waypoint files without asking confirmation or backing up.
Clear progress indicators: Show users what's happening during operations (MCP queries, data analysis). Update bead status in real-time.
Graceful degradation: If optional data is missing, continue with available data and note limitations.
Validate outputs: Check that waypoint files have expected structure before proceeding.
Track beads: Update bead status at every phase transition. Persist bead state in waypoint files for resume capability.
Respect context scoping: At each bead, read only what the Context Scope Rules specify. Use waypoint data instead of re-reading raw inputs from prior beads.
When invoking MCP tools, always inform the user:
Example:
Searching mcp-reference-data for applicable coverage policies...
✅ CMS coverage search completed - Found policy: L34567 - Knee Arthroplasty LCD
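For reference, an MCP tool invocation over the HTTP transport is a JSON-RPC 2.0 `tools/call` request. The sketch below only builds the payload; the tool name and arguments come from the server list above, and the request id is arbitrary:

```python
import json


def build_tool_call(tool_name, arguments, request_id=1):
    """Build a JSON-RPC 2.0 payload for an MCP tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Example: coverage search against mcp-reference-data
payload = build_tool_call(
    "search_coverage",
    {"query": "knee arthroplasty", "coverage_type": "all", "limit": 10},
)
print(json.dumps(payload))
```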
MCP and Validation:
- Don't call `validate_icd10()` in a loop if avoidable - consider parallelizing calls
- Don't call `lookup_icd10()` with an array parameter - it takes a single code string only

Decision Policy Enforcement (CRITICAL):
Missing MCP Servers: If required MCP connectors are not available, display an error listing the missing connectors.
Missing Subskill Prerequisites:
If Subskill 2 invoked without waypoints/assessment.json, notify user to complete Subskill 1 first.
File Write Errors: If unable to write waypoint files, display error with file path, check permissions/disk space, and offer retry.
Data Quality Issues: If clinical data extraction confidence <60%, warn user with confidence score and low-confidence areas. Offer options to: continue, request additional documentation, or abort.
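The data-quality gate above can be sketched as a small function; the name and return shape are hypothetical:

```python
def extraction_gate(confidence, low_confidence_areas):
    """Decide how to proceed based on clinical extraction confidence.

    Below 60% confidence, warn the user with the score and low-confidence
    areas, and offer to continue, request more documentation, or abort.
    """
    if confidence >= 0.60:
        return {"action": "proceed"}
    return {
        "action": "warn",
        "confidence": confidence,
        "low_confidence_areas": low_confidence_areas,
        "options": ["continue", "request_additional_documentation", "abort"],
    }
```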
Before completing workflow, verify:
- All beads (bd-pa-001 through bd-pa-005) marked completed

Sample case files are included in data/sample_cases/prior_auth_baseline/ for demonstration purposes. When using sample files, the skill operates in demo mode.
A curated evaluation dataset is available in data/ for benchmarking and demo purposes. All data is de-identified/synthetic.
```
data/
├── cases/
│   ├── ground_truth.json        # Aggregated ground truth for all case variants
│   └── {001-005}/{a,b}/         # Case variants
│       ├── results.json         # Per-case decision (approved/rejected)
│       ├── pa_form/             # Prior auth request form (PDF)
│       ├── doctor_notes/        # Clinical notes (PDF)
│       ├── labs/                # Lab results (PDF, if applicable)
│       └── imaging/             # Imaging reports (PDF, if applicable)
├── policies/                    # Indexed retrieval corpus (coverage policy PDFs)
│   ├── 001.pdf
│   ├── 002.pdf
│   ├── 003.pdf
│   ├── 004.pdf
│   └── 005.pdf
├── sample_cases/
│   └── prior_auth_baseline/
│       ├── pa_request.json
│       ├── ct_chest_report.txt
│       ├── pet_scan_report.txt
│       └── pulmonology_consultation.txt
└── pdfs/                        # Pre-rendered page images for select cases
```
| Case | Variant | Decision | Docs Included |
|---|---|---|---|
| 001 | a | rejected | PA form, notes, labs, imaging |
| 001 | b | approved | PA form, notes, labs, imaging |
| 002 | a | approved | PA form, notes, labs, imaging |
| 002 | b | approved | PA form, notes, labs, imaging |
| 003 | a | approved | PA form, notes, labs |
| 003 | b | approved | PA form, notes, labs |
| 004 | a | approved | PA form, notes |
| 004 | b | approved | PA form, notes |
| 005 | a | approved | PA form, notes, labs, imaging |
| 005 | b | approved | PA form, notes, labs, imaging |
Interactive demo (single case):
Run the prior-auth skill using case data/cases/001/a/ — this is a rejected case with full documentation.
Evaluation mode:
Use data/cases/ground_truth.json to compare skill recommendations against MD-reviewed ground truth decisions across all 10 case variants.
Each case variant includes:
- Variants `a` and `b` for the same clinical scenario, sometimes with different outcomes (e.g., 001_a rejected vs 001_b approved), enabling testing of the skill's sensitivity to documentation quality and completeness.
- `results.json`: Per-variant ground truth with case_id, decision, evaluation_time, and notes.
- `policies/`: Coverage policy PDFs (one per case number) used as the indexed retrieval corpus.

ground_truth.json format:

```json
{
  "<case_id>": {
    "doctor_notes": ["cases/.../notes.pdf"],
    "imaging": ["cases/.../imaging.pdf"],
    "labs": ["cases/.../labs.pdf"],
    "pa_form": ["cases/.../form.pdf"],
    "policies": [],
    "evaluation_time": "ISO datetime",
    "decision": "approved | rejected",
    "notes": "Evaluation context"
  }
}
```
> Note: File paths in `ground_truth.json` are relative to `data/`. The `policies` field is currently empty and reserved for linking case-specific policy references in future iterations.
```yaml
mcp_servers:
  mcp-reference-data:
    url: https://healthcare-mcp.azure-api.net/reference-data
    auth: azure_ad_token
    # Serves: NPI lookup, ICD-10 validation, CMS coverage (12 tools)
  mcp-clinical-research:
    url: https://healthcare-mcp.azure-api.net/clinical-research
    auth: azure_ad_token
    # Serves: FHIR, PubMed, ClinicalTrials.gov (20 tools)
  cosmos-rag:
    url: https://healthcare-mcp.azure-api.net/cosmos-rag
    auth: azure_ad_token
    # Serves: Document RAG, audit trail (6 tools)
```
Assessment data can be stored as FHIR resources:
```json
{
  "resourceType": "Claim",
  "use": "preauthorization",
  "status": "active",
  "patient": { "reference": "Patient/123" },
  "insurance": [{ "coverage": { "reference": "Coverage/456" }}]
}
```