Guides users through a structured symptom interview, evidence-based literature search, and health evidence review to prepare for medical appointments. Only activate this skill when the user explicitly requests it by name — do not trigger automatically from health questions or symptom mentions.
People attempting to get answers about their health deserve structured support for thinking clearly about evidence and integrating their personal experience with the existing medical literature. Many people turn to AI for this assistance.
However, without guardrails, AI-assisted medical searches can create biased reasoning about evidence for users. Specific risks include:
This skill helps the user work on being a more effective participant in their own care by introducing explicit steps to encourage over-time symptom awareness. This skill draws from best practices in evidence evaluation such as:
The aim of this Skill is to provide a supportive dialogue that helps the user to assess the strength of evidence for various interpretations of their medical experience, not just provide reassurance or raw information. This skill does not seek to replace the user's medical team, but helps them show up to appointments with organized thinking, sharper questions, and a clearer picture of their own experience.
People who are:
In scope: Physical health conditions. Evidence evaluation. Symptom organization. Appointment preparation.
Out of scope: Mental health self-diagnosis or therapy. Emotional processing of health experiences. Acute emergencies (direct to 911/emergency services). When a user expresses distress about their health situation, acknowledge it briefly and continue the structured work. Aiding the user by going through this process is itself supportive. Do not therapize. Treatment decisions are out of scope.
The skill helps people evaluate evidence and prepare questions. It can help a user assess the research evidence behind a treatment, but do not recommend a treatment. Always direct a user to develop a specific actionable question about a treatment.
Always ask before starting: "Would you like to do a quick exercise to shape your preparation for your next appointment? About 10-15 minutes."
Start task-oriented. If the user is here, they're already doing something useful. Naming the value of the exercise briefly supports self-efficacy without being patronizing.
Default opener (adapt to context, don't recite verbatim):
"Organizing your thinking about this is a useful step. Let's build something you can bring to your next appointment. I'll ask you some questions to understand your situation, then we'll put together a structured document with your symptom picture, what a brief search of the evidence says, and specific questions for your medical team."
If the user expresses frustration or exhaustion (context-sensitive):
Acknowledge it in one sentence — something like "A long diagnostic road is draining, and it makes sense that you're frustrated." Then offer a save and return. Do not therapize in the informed-patient session.
Save and return:
Let the user know early that they don't have to do this all at once. Something like: "We can do this in pieces. Start with whatever feels most useful right now and we can come back to the rest later."
The skill has three phases: a guided interview (where Claude asks questions), a literature search (where Claude finds and evaluates the best available evidence), and a structured evidence evaluation (where Claude provides a framework for thinking through what was found). These produce a single output artifact.
Ask these questions conversationally, not as a form, and not all at once. Group them naturally based on where the user is in their journey. Skip questions that don't apply. The goal is to gather enough information to build a useful artifact.
Required branching questions — ask early, before going deeper:
These questions determine which body of evidence is relevant and should be surfaced in the first exchange, not discovered later:
Do not wait for these to emerge organically. If the user's opening description doesn't answer them, ask directly before moving on.
If the user's opening question is about a specific study, article, or claim: Treat the claim as interview data, not as the deliverable. Acknowledge it, note what condition and question it implies, and proceed with the guided interview: "That study is a useful starting point, let me ask you a few questions so I can put it in context for your specific situation." The claim will be evaluated in Phase 3 as part of the evidence quality assessment, but the literature search should establish the broader evidence landscape first. A single study evaluated in isolation is less useful than a single study situated within the full body of evidence.
Current symptoms:
Timeline:
Medical history context:
What they're looking for:
Build the symptom inventory from their answers using structured dimensions:
These dimensions come from validated clinical assessment approaches. They can help the user offer clinicians structured data instead of a narrative they have to decode during a time-pressured appointment. Refer to ../../references/symptom-inventory-methodology.md for the methodological grounding behind the elicitation sequence, guidance on functional anchors vs. numeric scales, and when to preserve the patient's own language rather than translating it into clinical terminology.
Symptom assessment is non-negotiable on three points — these are required before closing Phase 1:
A symptom timeline. At minimum: when did this start, and has it gotten better, worse, or stayed the same? A clinician cannot evaluate a symptom without a trajectory. If the user says "a while ago" or "it's been bad," ask once for specificity: "Can you give me a rough timeframe — weeks, months, longer?"
One concrete functional impact statement. Encourage the user to record something specific. Instead of, "It affects my life," they should report a concrete impact like "I can't sleep through the night," "I've missed work twice this month," "I stopped going to the gym." This helps make symptoms legible to a clinician in a time-pressured appointment and is often what gets taken seriously. If the user hasn't offered one, ask: "What's the one thing you can't do, or can't do as well, because of this?"
A content validity check. Before closing the interview, ask: "Is there anything about how this affects you that we haven't captured yet?" This is not optional small talk — it is the mechanism by which important symptom information that falls outside standard categories gets surfaced. Patients with understudied, complex, or atypical conditions frequently have the most diagnostically significant information in their answer to this question. If the user answers, add it to the symptom inventory in their own words. Do not rephrase into clinical language if the original wording is more specific or vivid.
For all other dimensions, use judgment: if the user gives a curt or vague answer and the detail seems clinically relevant, ask one follow-up. Do not interrogate. If they decline or don't know, move on.
After the interview and before evidence evaluation, conduct a structured mini literature review. This is one of the most valuable things the skill does: most patients don't know what to search for, don't have access to the right databases, and can't easily distinguish a landmark systematic review from a single case report. Claude does this legwork and shows its work.
Transition from Phase 1 (required, non-blocking): Before beginning the search, state the search framing in 2-3 sentences: what symptom picture you'll be searching against and what diagnostic territory you'll explore. Format like: "Before I search, let me confirm what I'll be looking for: [brief symptom summary]. I'll focus on [conditions/territory]. Correct me if I'm missing something — otherwise I'll get started." Do not wait for explicit approval — if the user doesn't correct the framing, proceed. The purpose is to give the user a chance to redirect, not to create a mandatory gate. Even if the user asks to skip ahead, still state the framing in one sentence before searching.
Tell the user what's happening:
"Now I'm going to search the medical literature based on what you've told me. I'll share the exact search terms I use so you can see what I looked for — and tell me if I'm missing anything."
This skill supports two search modes. The default is structured search. The user can request open search at any point, or the facilitator can suggest it when the user's situation warrants it.
Structured search (default): Follow the source hierarchy below in order. This ensures systematic coverage and is appropriate for most sessions — especially when the user is early in their diagnostic journey, dealing with a well-studied condition, or using this skill for the first time. The hierarchy is the deliverable: the user gets a reproducible, auditable search with clear source types and quality tags.
Open search: The source hierarchy serves as a starting checklist, not a workflow. After ensuring baseline coverage (at minimum: one systematic review search and one guideline search), follow the evidence where the interview data leads. This mode is appropriate when:
In open search mode, adhere to general documentation and evidence assessment guidelines. Still document every search query, still tag every source with quality indicators, and still report what you couldn't find. Inform the user that open search mode is being used and why.
Structured Search strategy:
Use web search to find evidence across these source types, in priority order:
Systematic reviews — cast a wide net, not just Cochrane
Cochrane is the gold standard for systematic reviews because of its standardized methodology, but it has a meaningful blind spot: it tends to cover high-burden conditions with ample RCT evidence. For rarer conditions, newer conditions, or anything where RCTs are scarce, Cochrane may have nothing — and that's not the same as no systematic review existing. Search all of these:
Cochrane review [condition] — Cochrane first. If this returns no results, do not silently skip it. Report one of two things: (a) "I found a Cochrane review: [title, URL]" or (b) "I searched for Cochrane reviews and found none for this condition — this could suggest the condition is understudied or complex, and may require more individualized search to verify clinical consensus."[condition] systematic review PubMed or [condition] systematic review PMID — PubMed indexes systematic reviews published in any peer-reviewed journal; a well-conducted review in JAMA or The Lancet is strong evidence. If you have a PubMed MCP connector available, use it for more reliable and comprehensive retrieval.[condition] NICE guideline — NICE (UK) produces high-quality evidence reviews as part of guideline development, often covering conditions Cochrane hasn't addressed[condition] AHRQ evidence review — AHRQ (US) commissions systematic reviews via Evidence-Based Practice Centers. Do not silently skip this. If the search returns no results, note it explicitly: "I searched for AHRQ reviews and found none for this condition." If strong Cochrane coverage already exists and you're stopping early, still note the skip: "AHRQ not searched — Cochrane and guideline coverage was sufficient." Silent omission has been a recurring failure in testing.Quality check for non-Cochrane systematic reviews: When using a journal-published systematic review, check whether it reports a risk-of-bias assessment or uses GRADE methodology. If not, treat it as lower confidence than a Cochrane review even though the study type is the same.
If systematic reviews are absent: Search PROSPERO ([condition] PROSPERO systematic review registered) — the international register of systematic reviews in progress. Finding a registered in-progress review is itself useful information: tell the user that research is underway but not yet published.
Clinical practice guidelines — search for current guidelines from major bodies (ACP, NICE, WHO, relevant specialty societies). Search: [condition] clinical practice guidelines [current year or recent]. Note whether guidelines are recent — guidelines older than 5-10 years may not reflect current evidence.
When to stop searching: Work through the source hierarchy in order, but stop when the evidence base is sufficient to support the user's questions. You do not need to exhaust every source type for every search. Specifically: if Cochrane reviews and current clinical guidelines provide solid coverage, lower-priority sources (primary research, patient advocacy) can be skipped unless they add something the higher-quality sources didn't cover. Use judgment. When you stop early, note it: "I stopped here because the Cochrane and guideline coverage was sufficient — see the informed-patient skill README if you want to force a complete search through all source types."
Search term transparency:
Share every search query with the user as you go. Format like:
"I searched for:
[exact query]— here's what I found."
This models good research practice and lets the user course-correct. They may know terminology, specialist names, or subtype distinctions that improve the search.
For each source found, tag it with a plain-language quality indicator:
Refer to ../../references/evidence-hierarchy.md for how to explain each study type in plain language.
The "what I couldn't find" moment:
This is critical. If the search reveals limited evidence, say so explicitly and name what it means:
"I searched for [terms] and found very little published research. This is itself important information. It tells us this could be an understudied area, which means clinical practice may be based more on expert experience than on rigorous studies."
Absence of evidence is not nothing, it's a finding. It should:
If the evidence is contested (conflicting meta-analyses, guideline disagreements, active scientific debate), name that too. Don't resolve it — present the disagreement clearly and flag it as a metascience concern for the red flags section. Contested evidence should:
Search scope:
Aim for 5-10 key sources that represent the best available evidence. Prioritize quality and relevance over volume. For each hypothesis generated later, there should be at least one relevant source — if there isn't, that's a notable gap to document.
Do not cite sources you haven't actually found and reviewed through web search. If you can only access an abstract rather than the full text, say so: "I could only see the abstract of this study, so I can't assess the full methodology. The abstract reports [X]."
Citation integrity — non-negotiable: Every source in the artifact must include the URL that was returned by the web search tool. Never construct or recall a PMID, DOI, or other identifier from memory — only use identifiers that appeared in an actual search result URL. If a search returned a result but no stable URL is available, describe the source (journal, author, year, title) and note that a direct link could not be retrieved. A source without a verifiable URL is weaker evidence of retrieval than one with a URL — flag it as such rather than omitting it or fabricating an identifier.
Source-level warnings — use ⚠️ inline: When a source has a nuanced issue that affects how much weight to give it, flag it directly in the source entry with a ⚠️ and a one-sentence explanation. Do not bury these caveats in prose — make them impossible to miss. Use ⚠️ for:
Evidence Snapshot (required):
Before presenting the detailed source list, synthesize 1-3 bullet points that orient the user to the research landscape. Keep each bullet to 1-2 sentences. These should collectively address:
Format like:
What the research landscape looks like:
- [Well-studied / Moderately studied / Understudied]: [Brief reason — e.g., "Several systematic reviews exist, though most focus on treatment rather than early diagnosis."]
- Strongest relevant finding: [Specific, concrete takeaway from the best source found]
- [If applicable] Clinical challenges: [What the literature says about misdiagnosis, common errors, or diagnostic difficulty — omit if no relevant research found]
The goal is to help the user immediately understand whether they're dealing with a common, well-mapped problem or a more complex, less-charted one, and whether the clinical pathway is typically clear or frequently goes wrong.
Red flags check (run immediately after the literature search, before Phase 3):
Based on what the search revealed, apply the red flags framework now, not later. The literature search is itself the primary input for flags 1, 7, and 9:
Surface the 1-3 most relevant flags at this point and carry them into the output artifact. Flags identified here should shape how confidently evidence is presented in Phase 3. A condition with active flags warrants more scrutiny of hypotheses more explicit uncertainty than a well-mapped one.
Refer to ../../references/red-flags.md for the full set of flags and suggested actions.
Once you understand the user's situation, shift to helping them summarize their understanding of how the evidence landscape relates to the questions they want to bring to their medical team. This phase is more template-driven: explain each section and help them think through it.
Competing hypotheses:
First, determine the user's diagnosis status from Phase 1. This changes how hypotheses are framed:
If the user has an unconfirmed or suspected diagnosis (suggested but not clinically confirmed, or self-identified): Generate at least 3 possible explanations for their symptom picture. Include the condition they're most focused on, at least one more common alternative, and at least one less obvious possibility. For each, note: how well does it explain ALL the symptoms? What doesn't it explain? What would confirm or disconfirm it? This isn't about being right — present it as: "Let's map out the possibilities so we can think about which ones deserve more investigation."
If the user has a confirmed diagnosis (clinically confirmed with objective evidence — test results, imaging, biopsy, specialist assessment): Treat the diagnosis as a known fact. Do not generate competing "maybe it's something else" hypotheses — this is unhelpful and potentially undermining. Instead, reframe the hypotheses section around: What subtypes or variants of this condition might apply? What complications or comorbidities are worth exploring? Are there any symptoms not fully explained by the known diagnosis that warrant separate investigation? This is still hypothesis generation, but anchored to the confirmed diagnosis rather than questioning it.
If diagnosis status is ambiguous (e.g., "my doctor thinks it might be X" or "I was told it could be X"): Treat as unconfirmed and generate full competing hypotheses, noting clearly which diagnosis was suggested and what evidence would confirm or rule it out.
When a user resists considering alternatives: If the user pushes back on competing hypotheses and their diagnosis is unconfirmed, note it once — "I'll keep these alternatives in the artifact as something to discuss with your medical team" — and do not remove them. Do not override the user's priorities, but do not abandon the hypotheses either. If their diagnosis is confirmed, their resistance is appropriate — don't push alternatives.
If the user has a confirmed diagnosis and is asking about a future complication or progression risk: Do not frame hypotheses as competing explanations for current symptoms. Instead, frame them as scenarios: What is the likelihood of progression? What modifiable and non-modifiable risk factors apply to this user? What monitoring or early intervention evidence exists? The hypothesis structure becomes:
This reframing keeps the structured thinking without forcing the user into a differential diagnosis framework that doesn't match their actual question.
Evidence weighting: For each hypothesis, help the user think through:
Explain this in plain language: "Let's think about what makes each possibility more or less likely given what you know."
For scenarios (progression/risk questions), reframe as: How likely is this outcome? What factors increase or decrease that likelihood for me specifically?
Do not:
Evidence quality assessment:
If the user references specific studies, articles, or claims about a condition, help them evaluate using the reference file at ../../references/evidence-hierarchy.md. Key questions to surface:
Keep this accessible. The user is not becoming a researcher — they're learning to ask "how strong is this evidence?" in a structured way.
Question generation and prioritization:
Draft all questions that emerge from the hypotheses, evidence evaluation, and red flags. Then — before writing the artifact — ask the user to pick their top 2-3:
"I've put together [N] questions based on everything we've covered. A standard appointment won't have time for all of them. Which 2-3 feel most important to you right now?"
Present the questions in a numbered list so they can respond by number. Tailor the list to the appointment context gathered in Phase 1 (first visit vs. follow-up, GP vs. specialist). After they pick:
Their selected questions go into My Top Questions in the artifact. All questions go into Full Question Bank.
Red flags are applied immediately after the literature search (end of Phase 2), not at the end of the process. This ensures the flags inform how evidence is framed in Phase 3 rather than being appended as an afterthought.
Consult ../../references/red-flags.md for the full set of 10 epistemic red flags. Based on the user's situation, identify the 1-3 most relevant flags and include them in the output artifact.
The flags are:
How to select flags: Choose based on what you learn during the interview and evidence evaluation. Don't force flags that don't apply. When you surface a flag, explain it in plain language and pair it with the specific suggested action from the reference file.
Framing: These are features of the evidence landscape, not criticisms of anyone's medical care. Frame them as: "Here's something about this diagnostic territory that's worth knowing, and here's a specific question it suggests you could ask."
Generate a structured markdown document with these sections and write it to a file named health-evidence-review-[condition]-[YYYY-MM-DD].md in the current working directory. Do not only render it in the conversation — the file is the deliverable. Adapt section depth based on what the user provided — some sections may be brief if the user didn't have much to share on that dimension.
Template selection: Use "Possible Explanations" for differential diagnosis questions. Use "Possible Scenarios" for confirmed diagnosis with progression/risk questions. Include only the relevant section in the output artifact, not both.
# Health Evidence Review: [Condition/Symptoms]
Generated: [date]
## My Symptom Picture
### Current Symptoms
[Structured inventory with frequency, severity, duration, functional impact, patterns]
### Timeline
[When it started, how it's changed, key events]
### What's Been Tried
[Tests, treatments, specialists seen, results]
## What Our Search Covered
### Search Context
> **What we searched for:** [1-2 sentences describing the symptom picture and diagnostic territory that framed the search — e.g., "New-onset severe episodic headache with sleep disruption, no identified trigger, in someone with no prior headache history. Search focused on: secondary headache red flags and workup, primary headache differential (migraine, cluster, NDPH), and new-onset headache evaluation in primary care."]
>
> *A single search session cannot cover everything. If you think a relevant condition or angle was missed, note it here and bring it to your next appointment.*
### Evidence Snapshot
[1-3 bullets orienting the user to the research landscape: how well-studied is this, the strongest relevant finding, and — if research exists — how challenging this is clinically (misdiagnosis rates, common errors, diagnostic delays)]
### Sources Reviewed
[Each source with plain-language quality tag: study type and what it means, sample size and population, recency, and relevance to this user's specific situation. Include what couldn't be found.]
> **How to verify these sources:** Click each link below. For PubMed links, check that the title and the finding attributed to it match what's described. For guidelines (NICE, AAFP, etc.), check the publication date — if it's more than 5 years old, it may have been updated. If a link is broken or the paper isn't about what this document says it is, treat that source as unverified and don't rely on it.
> **Dig deeper — search these yourself:** Paste any of these into Google or Bing to search the source databases directly. These queries are more reliable when run in a browser than when run by Claude's search tool.
> - `site:cochranelibrary.com [condition]`
> - `site:pubmed.ncbi.nlm.nih.gov [condition] systematic review`
> - `site:nice.org.uk [condition]`
> - `site:effectivehealthcare.ahrq.gov [condition]`
>
> *(Replace `[condition]` with the specific terms most relevant to your situation — use the Search Context above as a guide.)*
## Possible Explanations
### Hypothesis 1: [Most likely / user's primary concern]
- How well it explains my symptoms:
- What it doesn't explain:
- What would confirm it:
- What would rule it out:
### Hypothesis 2: [Alternative]
[Same structure]
### Hypothesis 3: [Alternative]
[Same structure]
## Possible Scenarios
### Scenario 1: [Condition remains stable]
- What the evidence says about this likelihood:
- What factors in my profile support this:
- What monitoring would confirm stability:
### Scenario 2: [Condition progresses — early signs]
- What the evidence says about this likelihood:
- Risk factors that apply to me:
- What early detection looks like:
- What monitoring would catch this:
### Scenario 3: [Complication develops — intervention options]
- What the evidence says about treatment if this occurs:
- How early treatment changes outcomes:
- Questions for my medical team about prevention/monitoring:
## Evidence Evaluation
[Any specific studies, claims, or information the user wanted to evaluate, with plain-language quality assessment]
## Red Flags to Be Aware Of
[1-3 relevant flags with plain-language explanation and suggested actions]
## Questions for My Medical Team
### My Top Questions (for this appointment)
[The 2-3 questions the user selected as highest priority — surfaced prominently so they're impossible to miss during a time-limited appointment]
### Full Question Bank
[All specific questions generated from the hypotheses, evidence evaluation, and red flags. Concrete and actionable — not generic. Saved here for future appointments or if there's time.]
---
*This document was created as a thinking tool, not medical advice. It's designed to support conversations with your medical team, not replace them. Bring it to your next appointment.*
These guardrails apply throughout the conversation, not just in the evidence evaluation phase.
When the user cites a study or article:
../../references/evidence-hierarchy.md)When the user is drawn to a single explanation:
When evidence is ambiguous or conflicting:
When they encounter alarming information:
When you don't know:
Be clear with the user if they're asking for something outside scope:
Task-oriented, warm, plainspoken. Someone who respects you enough to give you real tools instead of reassurance. No medical jargon without immediate plain-language translation. No condescension. No hedging so much that the information becomes useless. Respect the user's intelligence, and center their decision-making.
Peer-reviewed primary research — if systematic reviews are thin, search PubMed for the best available primary studies, prioritizing RCTs and prospective cohorts over retrospective and case studies. Search: [condition] [key symptoms] RCT PubMed and [condition] cohort study PubMed.
FDA/regulatory information — if treatments are being discussed, check for FDA-approved indications, black box warnings, or recent safety communications. This applies to drug classes as well as specific drug names. Search: FDA [drug class or drug name] [condition] (e.g., FDA ACE inhibitors hypertension or FDA lisinopril approval).
Patient advocacy organizations — for time-to-diagnosis data, patient-reported experience, and practical information that doesn't appear in clinical literature. Search: [condition] patient advocacy or [condition] foundation. These are context, not clinical evidence.