Name: Observability Agent Runs
Author: lhuasheng

Observability: Agent Run Evidence

Run summary: task type, outcome, hooks fired.
Anomalies: each false positive or false negative, with a brief description.
Threshold recommendation: if tuning is warranted, state current value, proposed value, and rationale.
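
As an illustration, the three-part report above could be modeled and rendered roughly as follows. This is a minimal sketch, not a prescribed schema: the class names, field names, and `render_report` are hypothetical, and it assumes the threshold recommendation is omitted when no tuning is warranted.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ThresholdRecommendation:
    # Hypothetical container for the three parts a recommendation must state.
    current_value: float
    proposed_value: float
    rationale: str


@dataclass
class RunReport:
    # Hypothetical field names; the source only lists what the report covers.
    task_type: str                      # e.g. "bugfix"
    outcome: str                        # e.g. "pass" or "fail"
    hooks_fired: List[str] = field(default_factory=list)
    anomalies: List[str] = field(default_factory=list)   # one description each
    threshold: Optional[ThresholdRecommendation] = None  # None: no tuning warranted


def render_report(report: RunReport) -> str:
    """Render the report as plain-text lines in the order listed above."""
    hooks = ", ".join(report.hooks_fired) or "none"
    lines = [f"Run summary: {report.task_type}, {report.outcome}, hooks fired: {hooks}"]
    lines.append("Anomalies: " + ("; ".join(report.anomalies) or "none"))
    if report.threshold is not None:
        t = report.threshold
        lines.append(
            f"Threshold recommendation: current {t.current_value}, "
            f"proposed {t.proposed_value}, rationale: {t.rationale}"
        )
    return "\n".join(lines)
```

Keeping the recommendation optional means a routine run produces only the first two lines, so a threshold line appears only when someone has actually argued for a change.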

When to Use

After completing a task, to record hook outcomes and quality signals.
When a hook fired incorrectly (false positive) or missed a real problem (false negative).
When reviewing aggregate quality trends across a batch of tasks.
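
For the batch-review case, one possible way to summarize trends is sketched below. The `RunRecord` class and `summarize` function are hypothetical; their fields are modeled on the record fields listed in this document, and the chosen metrics (first-pass rate, mean retries, anomaly counts) are assumptions about what "quality trends" might mean here.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RunRecord:
    # Hypothetical record; one instance per completed task.
    task_type: str
    first_pass: bool       # True if the run passed before reviewer handoff
    retry_count: int       # fix-validate cycles before acceptance
    false_positives: int   # hooks that denied valid work
    false_negatives: int   # real problems that no hook caught


def summarize(runs: List[RunRecord]) -> Dict[str, float]:
    """Aggregate a batch of run records into a few trend metrics."""
    n = len(runs)
    if n == 0:
        # Avoid division by zero on an empty batch.
        return {"runs": 0, "first_pass_rate": 0.0, "mean_retries": 0.0,
                "false_positives": 0, "false_negatives": 0}
    return {
        "runs": n,
        "first_pass_rate": sum(r.first_pass for r in runs) / n,
        "mean_retries": sum(r.retry_count for r in runs) / n,
        "false_positives": sum(r.false_positives for r in runs),
        "false_negatives": sum(r.false_negatives for r in runs),
    }
```

A rising false-positive count alongside a stable first-pass rate would be one signal that a hook threshold is too strict rather than that the work itself is getting worse.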

Field	Description
Task type	Feature, bugfix, refactor, docs, config
First-pass result	Pass or fail before reviewer handoff
Retry count	Number of fix-validate cycles before acceptance
Hooks fired	Which PreToolUse hooks triggered and whether correctly
False positives	Hooks that denied valid work (note which and why)
False negatives