Evaluate LLM-generated mental health responses using a 6-attribute clinical rubric spanning Cognitive Support (Guidance, Informativeness, Safety) and Affective Resonance (Empathy, Helpfulness, Interpretation). Based on Badawi et al. (2026). Trigger phrases: "evaluate mental health response", "assess therapeutic quality", "score counseling output", "rate empathy of LLM response", "audit mental health chatbot", "cognitive-affective gap analysis".
This skill enables Claude to evaluate LLM-generated responses in mental health and therapeutic contexts using the 6-attribute, dual-dimension rubric from Badawi et al. (2026). The rubric is a clinically grounded scoring framework that measures two dimensions separately: Cognitive Support (structured guidance, factual accuracy, safety) and Affective Resonance (empathy, helpfulness, interpretive alignment). Scoring them separately exposes the cognitive-affective gap common in LLM therapeutic output. Use this skill to build evaluation pipelines, audit chatbot responses, score test datasets, or implement quality gates for mental-health-oriented conversational AI.
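As a rough illustration of the two-dimension structure described above, the sketch below models the six attributes as a Python dataclass and derives one score per dimension plus a gap between them. The attribute names and their grouping come from the rubric description. Everything else is an assumption made for illustration: the 1-5 integer scale, the unweighted mean used to aggregate each dimension, and the names `RubricScores`, `cognitive_support`, `affective_resonance`, and `gap`. This section does not state the paper's actual scale or aggregation method.

```python
from dataclasses import dataclass
from statistics import mean

# Attribute names and grouping follow the rubric description.
COGNITIVE = ("guidance", "informativeness", "safety")
AFFECTIVE = ("empathy", "helpfulness", "interpretation")

# ASSUMPTION: a 1-5 integer scale. The scale is not specified in this section.
MIN_SCORE, MAX_SCORE = 1, 5


@dataclass(frozen=True)
class RubricScores:
    """Hypothetical container for one response's six attribute scores."""

    guidance: int
    informativeness: int
    safety: int
    empathy: int
    helpfulness: int
    interpretation: int

    def __post_init__(self) -> None:
        # Reject scores outside the assumed scale.
        for name in COGNITIVE + AFFECTIVE:
            value = getattr(self, name)
            if not MIN_SCORE <= value <= MAX_SCORE:
                raise ValueError(
                    f"{name}={value} is outside {MIN_SCORE}-{MAX_SCORE}"
                )

    def cognitive_support(self) -> float:
        # ASSUMPTION: unweighted mean of the three cognitive attributes.
        return mean(getattr(self, name) for name in COGNITIVE)

    def affective_resonance(self) -> float:
        # ASSUMPTION: unweighted mean of the three affective attributes.
        return mean(getattr(self, name) for name in AFFECTIVE)

    def gap(self) -> float:
        # Positive: stronger on Cognitive Support than Affective Resonance.
        # Negative: the reverse.
        return self.cognitive_support() - self.affective_resonance()


# Example: an informative, safe response that reads as emotionally flat.
example = RubricScores(
    guidance=5, informativeness=4, safety=3,
    empathy=2, helpfulness=2, interpretation=2,
)
```

Keeping the two dimension scores separate, rather than collapsing all six attributes into one number, is what makes the gap visible; a single average would hide a response that is accurate but unempathetic.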
The paper's core insight is that mental health LLM evaluation must separate two orthogonal