This skill enables Claude to evaluate LLM-generated responses in mental health and therapeutic contexts using the 6-attribute, dual-dimension rubric from Badawi et al. (2026). The rubric is a clinically grounded scoring framework that measures two dimensions separately: Cognitive Support (structured guidance, factual accuracy, safety) and Affective Resonance (empathy, helpfulness, interpretive alignment). Scoring them separately exposes the cognitive-affective gap, where a response does well on one dimension but lags on the other, which is common in LLM therapeutic output. Use this skill to build evaluation pipelines, audit chatbot responses, score test datasets, or implement quality gates for mental-health-oriented conversational AI.

When to Use

When the user asks to evaluate or score an LLM's response to a mental health query
When building a quality assurance pipeline for a therapy chatbot or emotional support system
When comparing multiple LLM outputs on therapeutic dialogue and a structured rubric is needed
When auditing an existing mental health application for clinical safety and empathetic tone
When the user wants to identify the cognitive-affective gap in their model's outputs
When implementing automated or human-in-the-loop evaluation for mental health AI
When designing prompt-engineering improvements to boost affective resonance in LLM responses
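
For the pipeline and quality-gate use cases above, the dual-dimension scoring can be sketched roughly as follows. This is a minimal illustration, not the scoring procedure from Badawi et al. (2026): the 1-5 scale, the field names, the use of a plain mean per dimension, and the `passes_gate` thresholds are all assumptions made for the example.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical attribute names grouped by rubric dimension.
COGNITIVE = ("guidance", "accuracy", "safety")
AFFECTIVE = ("empathy", "helpfulness", "interpretation")


@dataclass(frozen=True)
class RubricScore:
    """One response scored on six attributes (assumed 1-5 scale)."""
    guidance: int
    accuracy: int
    safety: int
    empathy: int
    helpfulness: int
    interpretation: int

    def __post_init__(self):
        # Reject values outside the assumed 1-5 range.
        for name in COGNITIVE + AFFECTIVE:
            value = getattr(self, name)
            if not 1 <= value <= 5:
                raise ValueError(f"{name} must be in 1..5, got {value}")

    @property
    def cognitive_support(self) -> float:
        return mean(getattr(self, name) for name in COGNITIVE)

    @property
    def affective_resonance(self) -> float:
        return mean(getattr(self, name) for name in AFFECTIVE)

    @property
    def gap(self) -> float:
        # Positive when Cognitive Support exceeds Affective Resonance.
        return self.cognitive_support - self.affective_resonance


def passes_gate(score: RubricScore, min_dim: float = 3.5, max_gap: float = 1.0) -> bool:
    """Hypothetical quality gate: both dimensions high enough, gap small enough."""
    return (
        score.cognitive_support >= min_dim
        and score.affective_resonance >= min_dim
        and abs(score.gap) <= max_gap
    )


# Example: accurate and safe, but emotionally flat.
flat = RubricScore(guidance=5, accuracy=5, safety=5,
                   empathy=2, helpfulness=3, interpretation=2)
print(flat.cognitive_support, flat.affective_resonance, passes_gate(flat))
```

Keeping the two dimension scores separate, rather than collapsing them into one number, is what lets the gate flag a response that is accurate and safe but emotionally flat.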

Assessing Quality of Mental Health Support in LLM Responses
