Evaluate agent performance using a structured scoring rubric
Input: "Evaluate the security-specialist agent effectiveness"
Output: Scorecard: Security (85/100), Architecture (70/100), Ops (75/100), Testing (60/100), Docs (80/100). Total: 370/500. Findings: CVE detection is strong, but test coverage recommendations are weak; documentation quality is high, but escalation follow-through is missing. Top improvement: integrate with testing-specialist for security test gap analysis.
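The scorecard arithmetic in the example can be sketched in a few lines of Python. This is only an illustration, not the agent's actual implementation: the `Scorecard` class, its methods, and `MAX_PER_CATEGORY` are hypothetical names, and the sketch assumes that every category is scored out of 100 and that the total is an unweighted sum, as the example output suggests.

```python
from dataclasses import dataclass
from typing import Dict

# Assumption: each category is scored out of 100, as in the example output.
MAX_PER_CATEGORY = 100


@dataclass(frozen=True)
class Scorecard:
    """Hypothetical container for per-category rubric scores."""

    scores: Dict[str, int]

    def total(self) -> int:
        # Unweighted sum of all category scores (an assumption).
        return sum(self.scores.values())

    def maximum(self) -> int:
        # Highest possible total for the categories present.
        return MAX_PER_CATEGORY * len(self.scores)

    def weakest(self) -> str:
        # Category with the lowest score, a candidate for "top improvement".
        return min(self.scores, key=lambda name: self.scores[name])

    def summary(self) -> str:
        parts = ", ".join(
            f"{name} ({score}/{MAX_PER_CATEGORY})"
            for name, score in self.scores.items()
        )
        return f"Scorecard: {parts}. Total: {self.total()}/{self.maximum()}."


# Scores taken from the example output above.
card = Scorecard(
    {"Security": 85, "Architecture": 70, "Ops": 75, "Testing": 60, "Docs": 80}
)
print(card.summary())
print("Weakest category:", card.weakest())
```

With the example scores, the lowest-scoring category is Testing, which lines up with the suggested top improvement of working with testing-specialist.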