Name: Qual Doctor
Author: octavehq

AskUserQuestion({
  questions: [{
    question: "How should I run qualification?",
    header: "Run mode",
    options: [
      { label: "Saved agent (Recommended)", description: "Use a specific qualification agent — tests exact production config including which sections are active" },
      { label: "Raw qualify tool", description: "Use qualify_person/qualify_company directly — tests against your full library" }
    ],
    multiSelect: false
  }]
})

List qualification agents for BOTH types:

list_agents({ type: "QUALIFY_COMPANY" })
list_agents({ type: "QUALIFY_PERSON" })

Present the combined list — the agent type determines person vs company mode.
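
Merging the two lists and deriving the mode from each agent's type can be sketched as follows. This is a minimal illustration, not the tool's actual response format: the `Agent` shape and its `name` field are assumptions, and only `oId` and the two type values appear in the calls above.

```typescript
// Hypothetical agent shape; the real list_agents response may differ.
type AgentType = "QUALIFY_COMPANY" | "QUALIFY_PERSON";
type Mode = "company" | "person";

interface Agent {
  oId: string;
  name: string; // assumed field
  type: AgentType;
}

interface AgentChoice extends Agent {
  mode: Mode;
}

// Combine both lists and tag each agent with the mode its type implies.
function combineAgents(company: Agent[], person: Agent[]): AgentChoice[] {
  return [...company, ...person].map((agent) => {
    const mode: Mode = agent.type === "QUALIFY_COMPANY" ? "company" : "person";
    return { ...agent, mode };
  });
}
```

Once the user picks an entry, its `mode` decides whether the rest of the session runs company or person qualification.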

User picks one. Then fetch full config:

get_agent({ oId: "<selected_agent_id>" })

AskUserQuestion({
  questions: [{
    question: "Are you tuning qualification for people or companies?",
    header: "Qual type",
    options: [
      { label: "Company", description: "Tune how companies are scored (qualify_company)" },
      { label: "Person", description: "Tune how individuals are scored (qualify_person)" }
    ],
    multiSelect: false
  }]
})

QUAL DOCTOR SETUP
=================
Mode:       Company qualification
Run via:    "Qualify Company Agent" (ca_xxx)
Model:      PULSE
Sections:
  Product  → BEST_MATCH (scores, contributes to overall)
  Segment  → BEST_MATCH (scores, contributes to overall)
  Playbook → BEST_MATCH (scores, DOES NOT contribute to overall)
  Persona  → OFF

AskUserQuestion({
  questions: [{
    question: "Which section(s) do you want to tune today?",
    header: "Sections",
    options: [
      { label: "Product/Offering", description: "Tune product-fit scoring (does this company need our product?)" },
      { label: "Segment", description: "Tune segment matching + scoring" },
      { label: "Playbook", description: "Tune playbook ICP matching + scoring" },
      { label: "All active sections", description: "Tune all sections that are enabled" }
    ],
    multiSelect: true
  }]
})

ENTITIES IN THIS SECTION
=========================
Personas (3 active):
  1. VP of Sales — "Sales leader focused on forecast accuracy and team performance..."
  2. RevOps Leader — "Revenue operations professional responsible for GTM infrastructure..."
  3. SDR Manager — "Frontline manager coaching outbound reps on messaging and pipeline..."

Tuning mode: ROUTING + SCORING
  - We'll test whether the right persona gets matched for each test case
  - AND whether the score is correct once matched

Current Qualifying Questions for "Your Product" (product)
====================================================

GOOD FIT questions (should answer YES for good fits):
  #1  [HIGH]    "Is the company operating in a B2B motion..."
  #2  [HIGH]    "Does the company have multiple GTM motions..."
  #3  [MEDIUM]  "Does the company actively run outbound..."
  ...

BAD FIT questions (should answer YES for bad fits):
  #12 [INSTANT_DISQUALIFIER] "Is the company an AI tool for GTM..."
  #13 [HIGH]    "Is the company primarily focused on B2C..."
  ...

Summary: 11 GOOD fit, 8 BAD fit (19 active total)
Weights: 4 HIGH, 6 MEDIUM, 5 LOW, 1 INSTANT_DISQUALIFIER
Archived: 3 questions
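
The summary lines above can be derived mechanically from an entity's question list. A minimal sketch, assuming each question carries `fitType`, `weight`, and an `archived` flag; the `Question` type and the `archived` field name are assumptions for illustration, not the documented structure.

```typescript
type Weight = "HIGH" | "MEDIUM" | "LOW" | "INSTANT_DISQUALIFIER";

// Hypothetical shape; only the fitType and weight values appear in this document.
interface Question {
  text: string;
  fitType: "GOOD" | "BAD";
  weight: Weight;
  archived: boolean; // assumed field name
}

interface QuestionSummary {
  good: number;
  bad: number;
  archived: number;
  byWeight: Record<Weight, number>;
}

// Count active questions by fit type and weight; archived ones are tallied separately.
function summarizeQuestions(questions: Question[]): QuestionSummary {
  const summary: QuestionSummary = {
    good: 0,
    bad: 0,
    archived: 0,
    byWeight: { HIGH: 0, MEDIUM: 0, LOW: 0, INSTANT_DISQUALIFIER: 0 },
  };
  for (const q of questions) {
    if (q.archived) {
      summary.archived += 1;
      continue;
    }
    if (q.fitType === "GOOD") summary.good += 1;
    else summary.bad += 1;
    summary.byWeight[q.weight] += 1;
  }
  return summary;
}
```

A useful sanity check before presenting the summary is that `good + bad` equals the sum of the `byWeight` counts.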

AskUserQuestion({
  questions: [{
    question: "How do you want to build the test set?",
    header: "Test cases",
    options: [
      { label: "I have companies/people in mind", description: "Provide names/domains with expected score bands" },
      { label: "Help me find test cases", description: "I'll search for good and bad fit examples" },
      { label: "Mix of both", description: "I have some, help me find the rest" }
    ],
    multiSelect: false
  }]
})

I need test cases in three bands to diagnose your scoring:

1. GOOD FIT (should score 8-10): "If I had 10 more of these, life would be great"
2. BORDERLINE (should score 4-6): "Could go either way"
3. BAD FIT (should score 1-3): "We'd waste each other's time"

For company qual: name + domain
For person qual: name + company + domain (and job title if known)

I also need to know which [persona/segment/playbook] each test case should be
routed to. For each, tell me:
- The expected entity match (which one SHOULD be selected)
- The expected score band (how well they should score against that entity)

Example format:
  Jane Doe (VP Sales @ Snowflake)    → VP of Sales persona,    8-10
  Bob Smith (RevOps @ Shopify)       → RevOps Leader persona,  8-10
  Lisa Chen (SDR Mgr @ Notion)       → SDR Manager persona,    4-6
  Mark Lee (Engineer @ DoorDash)     → None / bad fit,          1-3

| Component | Credits | How to check |
|---|---|---|
| Base (includes product/offering) | 1 | Always included |
| + Segment section | +1 | `entities.segment.strategy === "BEST_MATCH"` |
| + Persona section | +1 | `entities.persona.strategy === "BEST_MATCH"` |
| + Playbook section | +1 | `entities.playbook.strategy === "BEST_MATCH"` |
| + High effort mode | +4 | `tools.highEffortMode.enabled === true` |
| + Deep research | +8 | `tools.parallelWebSearch.enabled === true` |
| + CRM activity | +10 | `tools.crmActivity.enabled === true` |
| + Custom task | +5 | `tools.customTask.enabled === true` |

Ready to run N test cases.
Cost per run: X credits (base 1 + [active sections/tools])
Total for this round: X × N = Y credits
Proceed?
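
The cost estimate can be computed directly from the agent config returned by `get_agent`. The sketch below follows the field paths in the "How to check" column; the `AgentConfig` type itself is an assumption about the response shape, and the credit values are the ones listed in the table.

```typescript
// Assumed response shape; only the field paths come from the cost table.
interface Section { strategy?: string }
interface Tool { enabled?: boolean }
interface AgentConfig {
  entities?: { segment?: Section; persona?: Section; playbook?: Section };
  tools?: {
    highEffortMode?: Tool;
    parallelWebSearch?: Tool;
    crmActivity?: Tool;
    customTask?: Tool;
  };
}

// Credits for a single qualification run.
function creditsPerRun(config: AgentConfig): number {
  let credits = 1; // base, includes product/offering
  const sections = [
    config.entities?.segment,
    config.entities?.persona,
    config.entities?.playbook,
  ];
  for (const section of sections) {
    if (section?.strategy === "BEST_MATCH") credits += 1;
  }
  if (config.tools?.highEffortMode?.enabled === true) credits += 4;
  if (config.tools?.parallelWebSearch?.enabled === true) credits += 8; // deep research
  if (config.tools?.crmActivity?.enabled === true) credits += 10;
  if (config.tools?.customTask?.enabled === true) credits += 5;
  return credits;
}

// Total for one round: X credits per run times N test cases.
function roundCost(config: AgentConfig, testCases: number): number {
  return creditsPerRun(config) * testCases;
}
```

For an agent with Segment and Playbook on `BEST_MATCH` and no extra tools, this gives 3 credits per run, so a four-case round costs 12 credits.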

Ready to run N test cases. Proceed?

run_qualify_company_agent({ agent: "<agent_oId>", company: { domain: "...", name: "..." } })

run_qualify_person_agent({ agent: "<agent_oId>", person: { firstName: "...", lastName: "...", jobTitle: "...", companyDomain: "..." } })

qualify_company({ companyDomain: "..." })

qualify_person({ person: { firstName: "...", lastName: "...", jobTitle: "...", companyDomain: "..." } })

Running qualification...
  Test 1: Snowflake (snowflake.com)... done (product sub-score: 9, expected: 8-10) OK
  Test 2: Acme Corp (acme.com)... done (product sub-score: 8, expected: 4-6) TOO HIGH ←
  Test 3: Mom's Pizza (momspizza.com)... done (product sub-score: 2, expected: 1-3) OK

RESULTS (Product Fit Sub-Score)
===============================
#   Company              Score   Expected   Verdict
1   Snowflake              9     8-10       OK
2   Acme Corp              8     4-6        TOO HIGH ←
3   Mom's Pizza            2     1-3        OK
4   DataDog                7     8-10       LOW ←
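
The Verdict column is a comparison of each score against its expected band. A minimal sketch of that check, using the verdict labels from the grid above; the `Band` type and function names are illustrative assumptions.

```typescript
type Verdict = "OK" | "TOO HIGH" | "LOW";

interface Band {
  min: number;
  max: number;
}

// Parse a band written as "8-10" into numeric bounds.
function parseBand(text: string): Band {
  const match = /^(\d+)\s*-\s*(\d+)$/.exec(text.trim());
  if (match === null) throw new Error(`Unrecognized band: ${text}`);
  return { min: Number(match[1]), max: Number(match[2]) };
}

// Compare a score with its expected band.
function verdict(score: number, band: Band): Verdict {
  if (score > band.max) return "TOO HIGH";
  if (score < band.min) return "LOW";
  return "OK";
}
```

For example, `verdict(8, parseBand("4-6"))` yields `"TOO HIGH"`, matching the Acme Corp row.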

RESULTS (Persona Fit — Routing + Score)
========================================
#   Person              Matched Persona    Score   Expected Match      Exp. Score   Verdict
1   Jane Doe            VP of Sales          9     VP of Sales         8-10         OK
2   Bob Smith           VP of Sales          7     RevOps Leader       8-10         WRONG MATCH ←
3   Lisa Chen           SDR Manager          3     SDR Manager         4-6          LOW ←
4   Mark Lee            VP of Sales          2     None / bad fit      1-3          OK (low score = correct)
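
In routing mode the verdict has to consider the match before the score. The sketch below reproduces the four verdicts in the grid above under two assumptions: a wrong match takes precedence over any score mismatch, and a test case expected to be "None / bad fit" is judged on score alone. The `PersonResult` type is hypothetical.

```typescript
interface PersonResult {
  matched: string;              // persona the agent selected
  score: number;
  expectedMatch: string | null; // null stands for "None / bad fit"
  band: { min: number; max: number };
}

function routingVerdict(r: PersonResult): string {
  // Expected bad fits have no correct persona, so only the score is checked.
  if (r.expectedMatch !== null && r.matched !== r.expectedMatch) {
    return "WRONG MATCH";
  }
  if (r.score > r.band.max) return "TOO HIGH";
  if (r.score < r.band.min) return "LOW";
  return "OK";
}
```

Checking the match first keeps a wrong-persona result from being reported as a plain score problem, since the score was computed against the wrong question set.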

#3 Lisa Chen — scored 3, you expected 4-6:
  → Why should this be higher?

#2 Bob Smith — matched "VP of Sales" but you expected "RevOps Leader":
  → What makes Bob a RevOps fit rather than VP of Sales?
  → Is the line between these two personas clear to you, or is it fuzzy?

WHY Acme Corp scored 8 (you expected 4-6):
==========================================

GOOD fit questions pushing the score UP:
  #1 [HIGH]   "500+ employees?"           → YES (HIGH confidence)
  #5 [HIGH]   "Dedicated security team?"   → YES (MEDIUM confidence)

BAD fit questions that SHOULD have pulled it down but didn't:
  #12 [MEDIUM] "Fewer than 500 employees?" → NO — correct, they're large

WHAT'S MISSING: You said Acme Corp should be lower because "they use a competitor."
  → No existing question checks for competitor tool usage.
  → RECOMMENDATION: Add BAD fit question [HIGH weight]:
    "Does the company currently use a direct competitor product in the same category?"
  → Expected impact: drops competitor-using companies by ~1.5-2 points

ENTITY DESCRIPTION CHECK: The description says nothing about competitive landscape.
  → Adding competitive context to the description would help edge case interpretation.

WHY Bob Smith matched "VP of Sales" instead of "RevOps Leader":
================================================================

The agent scored Bob against BOTH personas and picked the higher score:
  VP of Sales:    7  ← selected (higher)
  RevOps Leader:  5  ← expected match

VP of Sales scored higher because:
  #1 [HIGH] "Manages quota-carrying reps?"  → YES (MEDIUM) — Bob manages 2 SDRs
  #3 [HIGH] "Owns pipeline number?"         → YES (MEDIUM) — Bob reports on pipeline

RevOps Leader scored lower because:
  #2 [HIGH] "Owns tech stack decisions?"    → NO (LOW) — couldn't verify
  #4 [HIGH] "Builds/manages dashboards?"    → NO (LOW) — no evidence found

ROOT CAUSE: Bob has a hybrid role (RevOps + some SDR management). The VP of Sales
persona's questions are too broad — managing 2 SDRs shouldn't qualify as "manages
quota-carrying reps" the way a VP with 50 reps does.

RECOMMENDATIONS:
  1. Sharpen VP of Sales Q1: "Manages a team of 5+ quota-carrying sales reps?"
     → Stops hybrid ops roles from matching VP of Sales
  2. Add RevOps Q: "Is the person's primary function building/maintaining revenue
     systems and processes rather than directly managing sellers?"
     → Gives RevOps a stronger signal to win the routing contest
  3. Update RevOps description to mention: "RevOps leaders may manage small teams
     of SDRs or analysts alongside their systems responsibilities"
     → Gives the agent context for hybrid roles
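
The routing contest described above amounts to scoring the person against every persona and keeping the highest score. A minimal sketch of that selection, plus the gap the expected persona has to close; how the agent breaks ties is not stated, so first-wins here is an assumption, as are the type and function names.

```typescript
interface PersonaScore {
  persona: string;
  score: number;
}

// Pick the highest-scoring persona; on a tie the earlier entry wins (assumed).
function pickBestMatch(scores: PersonaScore[]): PersonaScore | null {
  let best: PersonaScore | null = null;
  for (const s of scores) {
    if (best === null || s.score > best.score) best = s;
  }
  return best;
}

// How many points separate the selected persona from the expected one.
function routingGap(scores: PersonaScore[], expected: string): number | null {
  const best = pickBestMatch(scores);
  const target = scores.find((s) => s.persona === expected);
  if (best === null || target === undefined) return null;
  return best.score - target.score;
}
```

For Bob Smith the gap is 2 points, which is why the recommendations work from both sides: they lower the VP of Sales score and raise the RevOps Leader score.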

RECOMMENDATIONS (ranked by expected impact)
============================================

1. [HIGH IMPACT] Add question: competitive tool usage
   Type: New BAD fit question, weight HIGH
   Fixes: Acme Corp (#2), SimilarCo (#5)
   Expected effect: Drops competitor-users by 1-2 points

2. [MEDIUM IMPACT] Update entity description: add competitive landscape
   Type: Entity description change
   Fixes: Supports recommendation #1, improves edge case interpretation
   Expected effect: Better context for all competitive questions

3. [MEDIUM IMPACT] Archive Q7: "50+ employees"
   Type: Remove non-differentiating question
   Fixes: Reduces noise across all cases
   Expected effect: Slightly lowers scores for very large companies

4. [LOW IMPACT] Reweight Q4: "Uses no-code automation tools" → MEDIUM
   Type: Weight change
   Fixes: Reduces score volatility from low-confidence answers

AskUserQuestion({
  questions: [{
    question: "Which changes should I apply?",
    header: "Changes",
    options: [
      { label: "Apply all", description: "Make all recommended changes at once" },
      { label: "Let me pick", description: "I'll choose which changes to apply" },
      { label: "None", description: "Just the diagnosis — I'll make changes manually" }
    ],
    multiSelect: false
  }]
})

update_entity({
  entityType: "product",
  oId: "px_xxx",
  instructions: "Add a new BAD fit qualifying question: 'Does the company currently use a direct competitor product in the same category?' with weight HIGH and fitType BAD. Rationale: 'Companies already using a direct competitor are less likely to switch. Check for mentions of competitor tools on their website, job postings, or integration pages.'",
  keyContext: "Testing revealed borderline prospects who use competitor tools score identically to good fits because no existing question captures competitive tool usage."
})

update_entity({
  entityType: "product",
  oId: "px_xxx",
  instructions: "Update the entity description to mention that companies already using a direct competitor in the same category are lower priority prospects. Add this context naturally into the existing description without removing anything.",
  keyContext: "Multiple test cases showed the agent has no context about competitive landscape, leading to inflated scores for prospects using rival tools."
})

Applied change 1 of 3: Added BAD fit question about competitor tools
Applied change 2 of 3: Updated entity description with competitive context
Applied change 3 of 3: Archived Q7 "50+ employees"

BEFORE / AFTER (Product Fit Sub-Score)
======================================
#   Company         Before   After   Expected    Change
1   Snowflake         9       9      8-10        — stable
2   Acme Corp         8       5      4-6         ↓3 FIXED
3   Mom's Pizza       2       1      1-3         — stable
4   DataDog           7       9      8-10        ↑2 FIXED
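
The Change column can be derived from the before and after scores plus the expected band. In this sketch, "stable" and "FIXED" follow the grid above, while `REGRESSED` and `STILL OFF` are labels assumed here for the two outcomes the example does not show.

```typescript
interface BeforeAfter {
  before: number;
  after: number;
  min: number; // lower bound of the expected band
  max: number; // upper bound of the expected band
}

function changeLabel(r: BeforeAfter): string {
  const inBand = (score: number): boolean => score >= r.min && score <= r.max;
  const delta = r.after - r.before;
  // In band both times: small movements inside the band still count as stable.
  if (inBand(r.before) && inBand(r.after)) return "stable";
  if (!inBand(r.before) && inBand(r.after)) {
    const arrow = delta > 0 ? "↑" : "↓";
    return `${arrow}${Math.abs(delta)} FIXED`;
  }
  if (inBand(r.before) && !inBand(r.after)) return "REGRESSED"; // assumed label
  return "STILL OFF"; // assumed label
}
```

Flagging regressions explicitly matters in the verification round, because a change that fixes one mismatch can push a previously correct case out of its band.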

BEFORE / AFTER (Persona Fit — Routing + Score)
===============================================
#   Person          Before Match → After Match    Before → After   Expected              Verdict
1   Jane Doe        VP Sales → VP Sales             9 → 9          VP Sales, 8-10        stable
2   Bob Smith       VP Sales → RevOps Leader         7 → 8          RevOps Leader, 8-10   ROUTING FIXED
3   Lisa Chen       SDR Mgr → SDR Mgr                3 → 5          SDR Mgr, 4-6          SCORE FIXED
4   Mark Lee        VP Sales → VP Sales               2 → 2          bad fit, 1-3          stable

AskUserQuestion({
  questions: [{
    question: "Scores are closer but not perfect. Want another round?",
    header: "Next",
    options: [
      { label: "Another round", description: "Diagnose again with the updated questions" },
      { label: "Good enough", description: "Scores are acceptable — wrap up" }
    ],
    multiSelect: false
  }]
})

QUAL DOCTOR — COMPLETE
======================
Entity:    "Your Product" (product)
Section:   Offering qualification
Mode:      Score-only (single entity)
Changes:   3 applied (1 archived, 1 added, 1 description update)
Rounds:    2 (initial diagnosis + verification)

Score Improvement:
  Good fits:     9 → 9  (stable)
  Borderlines:   8 → 5  (moved into 4-6 band — fixed)
  Bad fits:      2 → 1  (stable)

Questions: 19 → 19 (1 archived, 1 added)
  GOOD fit: 11 → 10
  BAD fit:  8 → 9

QUAL DOCTOR — COMPLETE
======================
Section:   Persona qualification
Mode:      Routing + Scoring (3 personas)
Entities:  VP of Sales, RevOps Leader, SDR Manager
Changes:   5 applied across 3 entities
Rounds:    2 (initial diagnosis + verification)

Routing Improvement:
  Correct matches:   3/4 → 4/4  (1 routing fix)

Score Improvement:
  In-band scores:    2/4 → 4/4  (2 score fixes)

Entity Changes:
  VP of Sales:     1 question sharpened
  RevOps Leader:   2 questions added, 1 description update
  SDR Manager:     1 question reweighted

ADDITIONAL INSIGHTS
===================
- LIBRARY GAP: The agent picked "RevOps Leader" for a marketing person.
  You may need a more distinct persona for marketing ops roles.

- SECTION INCONSISTENCY: Product scores a company at 9 but Playbook scores it
  at 2 (too large for playbook ICP). Intentional or misaligned?

- DEEP RESEARCH: 3/5 test cases had LOW confidence on tool-usage questions.
  Enabling deep research on the agent would improve answer quality.

/qual-doctor

Qual Doctor

/qual-doctor - Qualification Agent Tuner

Instructions

Phase 1: Setup

1a: Resolve MCP Server

1b: Determine Execution Mode

1c: Display Agent/Tool Config

1d: Select Sections to Tune

1e: Identify Target Entities + Determine Tuning Mode

1f: Review Current Questions

Phase 2: Collect Test Cases

Option A: User provides test cases

Option B: Help find test cases

Both paths converge on:

Phase 3: Run + Annotate

Present Results Grid

Collect Mismatch Annotations

Phase 4: Diagnose + Fix

4a: Per-Mismatch Deep Dive

4b: Cross-Case Pattern Analysis

4c: Entity Description Analysis

4d: Ranked Recommendations Summary

4e: Apply Changes

Phase 5: Verify

Wrap Up

Cost Reference

MCP Tools Used

Read

Execute

Write

Entity qualifyingQuestions Structure

Qualification Response Structure

Agent Configuration Reference

Error Handling

Examples

Related Skills
