Validates knowledge candidates against quality rules before vault insertion. Stage B step 3. Two-layer verification: deterministic rule checks (schema, R3, R5) followed by LLM-based semantic judgment (R1 evidence sufficiency, R6 duplicate detection, R7 directly-derivable heuristic).
Invocation:

- Run by the `/knowledge-distillery:batch-refine` orchestrator after `/knowledge-distillery:extract-candidates` completes
- Invoked directly as `/knowledge-distillery:quality-gate`

Prerequisites:

- `knowledge-gate` CLI available (resolve path as described in the knowledge-gate skill: local dev path if available, else `${CLAUDE_PLUGIN_ROOT}`)
- `.knowledge/vault.db` accessible via CLI only (no direct reads)
- Output of `/knowledge-distillery:extract-candidates` available in-memory

Tools:

- `<knowledge-gate> query-domain`: existing entries for semantic comparison
- `<knowledge-gate> get`: full entry details for conflict analysis
- `<knowledge-gate> search`: keyword search for duplicate detection
- `<knowledge-gate> domain-resolve-path`: resolve file paths to domains for R7 artifact inspection
- Read / Grep / Glob: inspect repo artifacts for R7 derivability verification (scoped to claim verification, not general exploration)

| Field | Source | Format |
|---|---|---|
| Candidates | In-memory from /knowledge-distillery:extract-candidates | Array of candidate objects per Candidate Required Schema |
If the candidate array is empty, return an empty verdict array immediately. This is not an error.
Array of verdict objects, one per input candidate:

    {
      "candidate_id": "<matches candidate.id>",
      "verdict": "pass | fail",
      "rejection_codes": [],
      "curation_queue_entry": null,
      "notes": "<human-readable explanation>"
    }

When `verdict` is `"fail"`:

- `rejection_codes` contains one or more codes from the Rejection Codes table
- `notes` explains each specific failure

When `verdict` is `"pass"` and the candidate conflicts with an existing vault entry, `curation_queue_entry` is set:

    {
      "type": "conflict",
      "related_id": "<existing vault entry ID>",
      "reason": "<why these entries may conflict>"
    }

Verification proceeds in two layers. Both layers run for every candidate. Layer 1 failures are immediate FAILs, but Layer 2 still runs to provide complete feedback.
Layer 1 checks are mechanical: they inspect the candidate structure without interpretation. No LLM judgment is needed.
For each candidate, check ALL of the following:
`SCHEMA_INVALID` (Schema Conformance):

- Required fields present: `id`, `type`, `title`, `claim`, `body`, `applies_to`, `evidence`, `considerations`
- `evidence` is an empty array → FAIL
- `claim` is empty or exceeds 200 characters → FAIL
- `body` is missing `## Background` or `## Details` section headers → FAIL

`R3_NO_ALTERNATIVE` (Anti-pattern Requires Alternative):

- `type == "anti-pattern"` AND (`alternative` is null OR `alternative` is empty string) → FAIL

`R5_UNCONSIDERED` (Considerations Must Not Be Empty):

- `considerations` is null OR empty string OR equals `"none"` (case-insensitive) → FAIL

Any Layer 1 failure → immediate `"fail"` verdict. Layer 2 still runs (to provide complete feedback) but cannot override a Layer 1 failure.
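The Layer 1 rules above are mechanical enough to sketch in a few lines of Python. This is a minimal illustration, not the pipeline's actual implementation: the function name `layer1_codes` and the helper `_has_header` are hypothetical, candidates are assumed to be plain dicts, and the `body` rule is read as "fail if either header is missing".

```python
import re

REQUIRED_FIELDS = ("id", "type", "title", "claim", "body",
                   "applies_to", "evidence", "considerations")


def _has_header(body: str, header: str) -> bool:
    # Hypothetical helper: the header must sit on a line of its own.
    return re.search(rf"^{re.escape(header)}\s*$", body, re.MULTILINE) is not None


def layer1_codes(candidate: dict) -> list[str]:
    """Return every Layer 1 rejection code that applies (empty list = clean)."""
    codes = []

    claim = candidate.get("claim")
    body = candidate.get("body")
    schema_ok = (
        all(field in candidate for field in REQUIRED_FIELDS)
        and candidate.get("evidence") != []
        and isinstance(claim, str) and 0 < len(claim) <= 200
        and isinstance(body, str)
        # Assumption: both section headers are required.
        and _has_header(body, "## Background")
        and _has_header(body, "## Details")
    )
    if not schema_ok:
        codes.append("SCHEMA_INVALID")

    # R3: an anti-pattern must name an alternative.
    if candidate.get("type") == "anti-pattern" and candidate.get("alternative") in (None, ""):
        codes.append("R3_NO_ALTERNATIVE")

    # R5: considerations must not be null, empty, or "none" in any casing.
    considerations = candidate.get("considerations")
    if considerations is None or considerations == "" or (
        isinstance(considerations, str) and considerations.lower() == "none"
    ):
        codes.append("R5_UNCONSIDERED")

    return codes
```

All three rules are evaluated without short-circuiting, so a candidate that breaks several of them gets every applicable code in one pass.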
Layer 2 checks require semantic understanding and comparison against existing vault content.
R1 (evidence sufficiency): evaluate whether the candidate's claim is adequately supported by the cited evidence. If the evidence is weak, ambiguous, or does not support the claim → `R1_EVIDENCE_INSUFFICIENT`
Borderline R1 decisions: Err on the side of rejecting. Better to miss a valid candidate than to insert an unsupported claim into the vault.
R6 (duplicate detection): compare the candidate against existing vault entries in the same domains.

1. Fetch existing entries for each domain in `applies_to.domains`:

       <knowledge-gate> query-domain "<domain>"

2. If the candidate has a `conflict_check` value referencing an existing entry, also fetch that entry:

       <knowledge-gate> get "<conflict_check_id>"

3. For each existing entry, classify the relationship:

| Classification | Definition | Verdict |
|---|---|---|
| `duplicate` | Semantically identical claim: same rule, same scope | FAIL (`R6_DUPLICATE`) |
| `conflict` | Related but contradictory: different conclusion for overlapping scope | PASS + `curation_queue_entry` |
| `unrelated` | Different topic or non-overlapping scope | PASS (no action) |
If the candidate has a _vault_feedback annotation, use the feedback signals as additional context when classifying the relationship:
| Signal | Classification bias | Usage |
|---|---|---|
| `outdated` | Strengthens `conflict` | Include in `curation_queue_entry.reason` |
| `conflicted` | Strengthens `conflict` | Include in `curation_queue_entry.reason` |
| `insufficient` | None | Explanatory context in `curation_queue_entry.reason` only, no influence on duplicate vs conflict decision |
Include the vault feedback context in curation_queue_entry.reason when generating curation queue entries. Preserve _vault_feedback annotations on candidates that pass — downstream stages (batch-refine) consume them for reporting.
Duplicate: Both entries would give an agent the same behavioral guidance. Wording differs but intent is identical.
Conflict: Both address the same scope but prescribe different behavior. Human must decide which is correct.
When unsure: Classify as conflict (safer — routes to human review) rather than duplicate (auto-reject).
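Once the LLM has classified a relationship, turning it into an outcome is a plain lookup. The sketch below shows one way to do that, including the "when unsure, prefer conflict" rule. The function name `apply_r6` and the `confident` flag are hypothetical; the document does not say how uncertainty is represented.

```python
from typing import Optional


def apply_r6(
    classification: str,
    related_id: str,
    reason: str,
    confident: bool = True,
) -> tuple[list[str], Optional[dict]]:
    """Map one R6 classification to (rejection_codes, curation_queue_entry)."""
    # An uncertain duplicate is downgraded to a conflict so a human decides.
    if classification == "duplicate" and not confident:
        classification = "conflict"

    if classification == "duplicate":
        return ["R6_DUPLICATE"], None
    if classification == "conflict":
        entry = {"type": "conflict", "related_id": related_id, "reason": reason}
        return [], entry
    if classification == "unrelated":
        return [], None
    raise ValueError(f"unknown classification: {classification!r}")
```

For example, `apply_r6("duplicate", "no-ar-callback-api", "same rule, same scope")` yields the `R6_DUPLICATE` code, while the same call with `confident=False` yields a curation queue entry instead.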
R7 is a primary derivability verifier with file-reading capability. It serves as the safety net for candidates that passed extract-candidates criterion 4d — verify claims against actual repo artifacts rather than relying on the candidate's self-description.
Verification procedure:
Locate artifacts. From the candidate's applies_to.domains, evidence references, and claim content, identify the relevant files. Use Grep or Glob to find the code, config, or doc that the claim describes. Read the relevant sections.
Q1 — Derivability (artifact-verified): Is the claim's content visible in the artifacts you just read? Any of the following makes Q1=yes:
Q2 — Residual value: Does the entry's body preserve knowledge that a developer could not infer from the artifacts themselves?
| Q1 | Q2 | Verdict |
|---|---|---|
| yes | no | FAIL (R7_DIRECTLY_DERIVABLE) |
| yes | yes | PASS — artifact shows what, entry preserves why |
| no | — | PASS — knowledge is not visible in artifacts |
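The Q1/Q2 table reduces to a small decision function once both questions have been answered. A minimal sketch follows, with the hypothetical name `r7_codes`; answering Q1 and Q2 still needs artifact reading and judgment, which this code does not attempt.

```python
from typing import Optional


def r7_codes(q1_derivable: bool, q2_residual_value: Optional[bool] = None) -> list[str]:
    """Apply the R7 decision table to already-answered Q1 and Q2."""
    if not q1_derivable:
        # Knowledge is not visible in artifacts; Q2 is irrelevant.
        return []
    if q2_residual_value is None:
        raise ValueError("Q2 must be answered when Q1 is yes")
    # Derivable and nothing extra preserved: reject.
    return [] if q2_residual_value else ["R7_DIRECTLY_DERIVABLE"]
```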
Q2 strictness — red flags for false residual value:
Fact-type check: A fact that merely restates what the code does ("X uses Y") without explaining why or defining a boundary → R7 fail. A fact whose "why" is self-evident from the implementation pattern → also R7 fail. A fact that carries genuinely non-obvious rationale or constraint ("When touching X, keep Y because Z" where Z is a past incident, policy decision, or non-obvious tradeoff) → R7 pass.
Borderline R7 decisions: Err on the side of rejecting. The vault should contain only knowledge that is invisible to artifact readers or that preserves reasoning they would otherwise lose. Having rationale text in the entry body is insufficient — the rationale itself must be non-obvious.
For each candidate, produce a verdict:
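- Any rejection code: `verdict: "fail"`
- No rejection codes, conflict found: `verdict: "pass"` with `curation_queue_entry`
- No rejection codes, no conflict: `verdict: "pass"` with `curation_queue_entry: null`
- Always include `notes` explaining the outcome in human-readable form. For failures, explain each rejection code.

| Code | Layer | Description |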
|---|---|---|
| `SCHEMA_INVALID` | 1 (Rule) | Candidate does not conform to required schema |
| `R3_NO_ALTERNATIVE` | 1 (Rule) | Anti-pattern entry missing required alternative |
| `R5_UNCONSIDERED` | 1 (Rule) | Considerations field empty or trivially dismissed |
| `R1_EVIDENCE_INSUFFICIENT` | 2 (LLM) | Claim not adequately supported by cited evidence |
| `R6_DUPLICATE` | 2 (LLM) | Semantically identical to existing vault entry |
| `R7_DIRECTLY_DERIVABLE` | 2 (LLM, artifact-verified) | Candidate content is visible in repo artifacts with no residual value |
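Combining the codes from both layers into a verdict object could look like the sketch below. The name `build_verdict` is hypothetical, and dropping the curation queue entry on a failed candidate is an assumption drawn from the examples in this document, where every failing verdict carries `null`.

```python
from typing import Optional


def build_verdict(
    candidate_id: str,
    rejection_codes: list[str],
    conflict: Optional[dict] = None,
    notes: str = "",
) -> dict:
    """Assemble one verdict object from the collected codes and conflict."""
    failed = len(rejection_codes) > 0
    return {
        "candidate_id": candidate_id,
        "verdict": "fail" if failed else "pass",
        # Copy so the caller's list is not shared with the verdict.
        "rejection_codes": list(rejection_codes),
        # Assumption: a failed candidate is never queued for curation.
        "curation_queue_entry": None if failed else conflict,
        "notes": notes,
    }
```

Because Layer 1 and Layer 2 codes land in the same list, a Layer 2 pass cannot override a Layer 1 failure: any code at all produces `"fail"`.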
| Failure Mode | Behavior |
|---|---|
| `knowledge-gate` CLI unavailable | Cannot perform R6 duplicate check. Skip R6; candidates pass or fail on the other rules. Log warning in `notes`. |
| Empty candidate array | Return empty verdict array []. Not an error. |
| Single candidate has multiple failures | Report ALL rejection codes, not just the first. |
| Borderline R1 judgment | Err on the side of rejecting. |
| Borderline R6 duplicate | Classify as conflict (human review) rather than duplicate (auto-reject). |
| `query-domain` returns error for a domain | Log warning. Skip R6 check for that domain only. Continue with other domains. |
| `get` returns error for `conflict_check` entry | Log warning. Skip that specific conflict comparison. Continue. |
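The per-domain degradation in the table can be sketched as a loop that records a warning and moves on. In this illustration `fetch_entries_by_domain` is a hypothetical name, and `query_domain` is a caller-supplied callable standing in for whatever wraps `<knowledge-gate> query-domain`; how the CLI is actually invoked and what it returns are not specified here.

```python
from typing import Callable


def fetch_entries_by_domain(
    domains: list[str],
    query_domain: Callable[[str], list[dict]],
) -> tuple[dict[str, list[dict]], list[str]]:
    """Fetch entries per domain, skipping R6 only for domains that error."""
    entries: dict[str, list[dict]] = {}
    warnings: list[str] = []
    for domain in domains:
        try:
            entries[domain] = query_domain(domain)
        except Exception as exc:  # broad on purpose: any failure degrades, none aborts
            warnings.append(f"R6 skipped for domain '{domain}': {exc}")
    return entries, warnings
```

The returned warnings are meant to be appended to the verdict's `notes`, so a skipped check stays visible to whoever reads the result.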
Pass:

    {
      "candidate_id": "payment-service-object-pattern",
      "verdict": "pass",
      "rejection_codes": [],
      "curation_queue_entry": null,
      "notes": "All checks passed. Evidence shows team consensus in LIN-456."
    }

Fail (`R3_NO_ALTERNATIVE`):

    {
      "candidate_id": "no-direct-db-access",
      "verdict": "fail",
      "rejection_codes": ["R3_NO_ALTERNATIVE"],
      "curation_queue_entry": null,
      "notes": "Anti-pattern entries require an alternative approach. What should developers do instead?"
    }

Fail (`R6_DUPLICATE`):

    {
      "candidate_id": "no-api-in-callbacks",
      "verdict": "fail",
      "rejection_codes": ["R6_DUPLICATE"],
      "curation_queue_entry": null,
      "notes": "Semantically identical to existing entry 'no-ar-callback-api'. Same rule, same scope."
    }

Pass with conflict:

    {
      "candidate_id": "use-controller-payment-logic",
      "verdict": "pass",
      "rejection_codes": [],
      "curation_queue_entry": {
        "type": "conflict",
        "related_id": "payment-service-object-pattern",
        "reason": "Prescribes controller-level payment logic, contradicting existing rule that mandates Service Objects for payments"
      },
      "notes": "Passes quality checks but conflicts with existing entry. Queued for human curation."
    }

Fail (`R7_DIRECTLY_DERIVABLE`):

    {
      "candidate_id": "curate-workflow-branch-prefix-gate",
      "verdict": "fail",
      "rejection_codes": ["R7_DIRECTLY_DERIVABLE"],
      "curation_queue_entry": null,
      "notes": "Claim describes a branch naming convention visible in workflow YAML. Q1=yes (derivable from current artifact), Q2=no (no rationale, boundary, or failure context preserved)."
    }

Fail (multiple codes):

    {
      "candidate_id": "bad-candidate",
      "verdict": "fail",
      "rejection_codes": ["R3_NO_ALTERNATIVE", "R5_UNCONSIDERED", "R1_EVIDENCE_INSUFFICIENT"],
      "curation_queue_entry": null,
      "notes": "Multiple failures: (1) Anti-pattern missing alternative, (2) Empty considerations, (3) Evidence is a single commit with no review or discussion."
    }

- Access the vault through the `knowledge-gate` CLI only
- Do not call `_pipeline-insert`, `_pipeline-archive`, `_pipeline-update`, or `_changeset-apply`: these mutate `vault.db`. Verdicts are returned in-memory to the orchestrator.
- When unsure about a duplicate, classify as `conflict` (human review) rather than auto-rejecting

Before returning verdicts, verify:

- Is `duplicate` (FAIL) distinguished from `conflict` (PASS + curation queue)?
- Is `knowledge-gate` CLI unavailability handled gracefully?