Name: Skill: reference-based-correctness-audit
Author: Dingxingdi

Use this skill when the user wants the agent to act as an automated judge that verifies whether a specific answer, conversation turn, or report is correct by comparing it against a provided 'gold' reference or a multi-dimensional rubric. Trigger it for requests like 'check my work', 'is this answer right according to the solution?', 'grade this response', 'evaluate the professionalism of this reply', or 'verify the math in this report'. Everyday examples include: 'Does my tax math match this rule book?', 'Is this student's response appropriate and ethical?', 'Can you audit this financial statement for me?', and 'Evaluate whether this chat got distracted, based on the provided guidelines.'

Dingxingdi · 0 stars · April 10, 2026

Profession
Category: Finance and Investment

1. Capability Definition & Real Case

Professional Definition: The ability to accurately evaluate the truthfulness, logical integrity, and nuanced behavioral alignment of a candidate response by comparing it rigorously against an authoritative reference. The reference may range from strict step-by-step mathematical or factual answers to fuzzy, multi-axis behavioral guidelines (e.g., Professionalism, Ethics, Domain Relevance). The audit identifies specific reasoning gaps, metric drift, and other deviations from the gold standard.
Dimension Hierarchy: Conversational Robustness->Truthfulness and Hallucination Resilience->reference-based-correctness-audit

Real Case

[Case 1]

Initial Environment: A multi-turn conversation where a user provides a coordinate geometry question and an assistant's proposed solution. A verified geometric formula for the area of a triangle is provided as the gold reference.
Real Question: The vertices of a triangle are (0, 0), (-1, 1), and (3, 3). The assistant says the area is 0 because the points are collinear. Is the assistant correct?
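
A minimal sketch of how this case could be audited. The source says only that "a verified geometric formula" is the gold reference; this sketch assumes that formula is the shoelace (determinant) formula. The names triangle_area and audit_area_claim are hypothetical, chosen for illustration.

```python
from fractions import Fraction


def triangle_area(p1, p2, p3):
    """Area via the shoelace formula (assumed here to be the gold reference)."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    # Twice the signed area is the cross product of the edge vectors p1->p2 and p1->p3.
    twice_signed = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)
    return Fraction(abs(twice_signed), 2)


def audit_area_claim(vertices, claimed_area):
    """Hypothetical audit helper: compare a claimed area against the reference."""
    reference_area = triangle_area(*vertices)
    return {
        "reference_area": reference_area,
        "claimed_area": Fraction(claimed_area),
        "collinear": reference_area == 0,  # zero area means the points are collinear
        "is_correct": reference_area == claimed_area,
    }


verdict = audit_area_claim([(0, 0), (-1, 1), (3, 3)], claimed_area=0)
print(verdict)
```

Under this assumption the reference area is 3, so the points are not collinear and the claim of area 0 would be marked incorrect. Exact rational arithmetic (Fraction) is used so the collinearity test does not depend on floating-point tolerance.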

Pipeline Execution Instructions
