Validate bibliography entries against citations in all lecture files. Structural checks (missing/unused entries, malformed fields) by default; `--semantic` adds citation-drift detection, DOI verification, and style-consistency checks.
Cross-reference citations in lecture files against bibliography entries. Two modes:

- Structural (default): missing/unused entries and malformed fields.
- `--semantic`: adds citation-drift detection (duplicate entries for the same paper), DOI verification via crossref, and citation-style consistency within each file.

Report saved to `quality_reports/bib_audit_[structural|semantic].md`.
1. Read the bibliography file and extract all citation keys.
2. Scan lecture files for citation keys:
   - `.tex`: `\cite{`, `\citet{`, `\citep{`, `\citeauthor{`, `\citeyear{`, `\textcite{`, `\parencite{`
   - `.qmd` / `.md`: `@key`, `[@key]`, `[@key1; @key2]`
3. Cross-reference:
   - Keys cited in lecture files but missing from `.bib`.
   - Entries in `.bib` but never cited.
   - Cited keys that nearly match a `.bib` key (e.g., `Smith2020` vs `Smth2020`).
4. Check entry quality:
   - `doi` field normalized (no leading `https://doi.org/`).
5. Write report to `quality_reports/bib_audit_structural.md`.
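The cross-reference step above can be sketched in Python as follows. This is a minimal illustration, not the skill's actual implementation: the function names, the regular expressions, and the 0.85 similarity cutoff for near-miss keys are all assumptions chosen for the example.

```python
import difflib
import re

# Entry keys in lines such as "@article{Smith2020,".
BIB_KEY = re.compile(r"@\w+\s*\{\s*([^,\s]+)\s*,")
# The LaTeX commands listed above, allowing optional [..] arguments.
TEX_CITE = re.compile(
    r"\\(?:cite|citet|citep|citeauthor|citeyear|textcite|parencite)"
    r"(?:\[[^\]]*\])*\{([^}]*)\}"
)
# Pandoc-style @key; the lookbehind skips e-mail addresses.
# Caveat: Quarto cross-references such as @fig-x would also match.
QMD_CITE = re.compile(r"(?<![\w.])@(\w+(?:[:.\-]\w+)*)")


def bib_keys(bib_text: str) -> set[str]:
    """Return every entry key defined in a .bib file's text."""
    return set(BIB_KEY.findall(bib_text))


def cited_keys(text: str, suffix: str) -> set[str]:
    """Return the keys cited in one lecture file (.tex, .qmd, or .md)."""
    if suffix == ".tex":
        keys = set()
        for group in TEX_CITE.findall(text):
            keys.update(k.strip() for k in group.split(",") if k.strip())
        return keys
    return set(QMD_CITE.findall(text))


def cross_reference(defined: set[str], cited: set[str]) -> dict:
    """Classify keys as missing, unused, or a likely typo of a defined key."""
    missing = sorted(cited - defined)
    unused = sorted(defined - cited)
    typos = {}
    for key in missing:
        # Hypothetical cutoff: 0.85 similarity counts as a near-miss.
        close = difflib.get_close_matches(key, sorted(defined), n=1, cutoff=0.85)
        if close:
            typos[key] = close[0]
    return {"missing": missing, "unused": unused, "typos": typos}
```

A key reported under `typos` also appears under `missing`, so the report can show the suggested correction next to the missing-key entry.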
Slides/*.tex
Quarto/*.qmd
guide/*.qmd
master_supporting_docs/**/*.tex
Bibliography_base.bib at repo root by default; override via CLAUDE.md.
Mode 2 (`--semantic`): everything in Mode 1, plus:
Multiple .bib entries describing the same paper under different keys. Symptoms:
- `Smith2020` + `Smith2020a` with identical DOI or title.
- `CallawaySantAnna2021` + `CS2021` both pointing to the same paper.
- The same paper duplicated across `.bib` files.

Detection heuristics (any → FLAG):

| Check | Signal |
|---|---|
| Same DOI across keys | Hard-duplicate (CRITICAL) |
| Same title (case-insensitive, punct-stripped) | Likely duplicate (CRITICAL) |
| Same author+year+journal | Probable duplicate (MEDIUM) |
| Title Jaccard > 0.85 on tokens ≥ 4 chars | Soft-duplicate (LOW) |
For each flagged pair: list both keys, where each is cited, and recommend a canonical key (prefer most-cited, then alphabetically first).
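The heuristics in the table and the canonical-key rule could be sketched like this. It assumes each entry has already been parsed into a dict with `title`, `author`, `year`, `journal`, and `doi` fields; that layout and the function names are hypothetical, chosen only for illustration.

```python
import re


def normalize_title(title: str) -> str:
    """Lower-case, replace punctuation with spaces, collapse whitespace."""
    return " ".join(re.sub(r"[^\w\s]", " ", title.lower()).split())


def title_jaccard(a: str, b: str) -> float:
    """Jaccard similarity over title tokens of at least 4 characters."""
    ta = {t for t in normalize_title(a).split() if len(t) >= 4}
    tb = {t for t in normalize_title(b).split() if len(t) >= 4}
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def classify_pair(e1: dict, e2: dict):
    """Return (signal, severity) for the first matching heuristic, else None."""
    doi1 = e1.get("doi", "").strip().lower()
    doi2 = e2.get("doi", "").strip().lower()
    if doi1 and doi1 == doi2:
        return ("Hard-duplicate", "CRITICAL")
    t1 = normalize_title(e1.get("title", ""))
    t2 = normalize_title(e2.get("title", ""))
    if t1 and t1 == t2:
        return ("Likely duplicate", "CRITICAL")
    fields = ("author", "year", "journal")
    v1 = [e1.get(f, "").strip().lower() for f in fields]
    v2 = [e2.get(f, "").strip().lower() for f in fields]
    if all(v1) and v1 == v2:
        return ("Probable duplicate", "MEDIUM")
    if title_jaccard(e1.get("title", ""), e2.get("title", "")) > 0.85:
        return ("Soft-duplicate", "LOW")
    return None


def canonical_key(keys, citation_counts: dict) -> str:
    """Prefer the most-cited key, then the alphabetically first."""
    return min(keys, key=lambda k: (-citation_counts.get(k, 0), k))
```

Checking the heuristics from strongest to weakest means each pair is reported once, under its most severe signal.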
For each entry with a doi, fetch https://api.crossref.org/works/{doi} and compare:
Severity:
Rate limit: cap 50 lookups per run, 0.5s delay between calls. Cache in quality_reports/.doi_cache.json.
Opt-out: --skip-doi for offline or no-WebFetch environments.
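A minimal sketch of the rate-limited, cached lookup loop is shown below. It uses only the Python standard library. The function names are hypothetical, and two behaviors are assumptions: cached DOIs do not count toward the 50-lookup cap, and the crossref response carries the work metadata under a `message` field.

```python
import json
import time
import urllib.parse
import urllib.request
from pathlib import Path

CACHE_PATH = Path("quality_reports/.doi_cache.json")
MAX_LOOKUPS = 50     # cap per run
DELAY_SECONDS = 0.5  # pause between network calls


def load_cache(path: Path = CACHE_PATH) -> dict:
    """Load earlier lookups, or start empty if no cache file exists."""
    return json.loads(path.read_text()) if path.exists() else {}


def save_cache(cache: dict, path: Path = CACHE_PATH) -> None:
    """Persist lookups so later runs can skip the network."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(cache, indent=2))


def fetch_crossref(doi: str) -> dict:
    """Fetch one work record; assumes metadata sits under 'message'."""
    url = "https://api.crossref.org/works/" + urllib.parse.quote(doi)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)["message"]


def lookup_dois(dois, cache, fetch=fetch_crossref,
                delay=DELAY_SECONDS, limit=MAX_LOOKUPS) -> dict:
    """Fill the cache for uncached DOIs, respecting the cap and the delay."""
    lookups = 0
    for doi in dois:
        if doi in cache:
            continue  # assumption: cache hits do not count toward the cap
        if lookups >= limit:
            break
        if lookups:
            time.sleep(delay)
        cache[doi] = fetch(doi)
        lookups += 1
    return cache
```

Passing `fetch` as a parameter keeps the loop testable offline, which also suits a `--skip-doi` style environment where no network call should be made.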
For each file, count citation commands (\citet vs \citep vs \cite; @key vs [@key]). FLAG files with mixed styles without an obvious pattern (e.g., 20× \citep and 3× \cite in the same deck). Low-severity.
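Counting citation styles per file might look like the sketch below. The rule for "no obvious pattern" is not specified above, so the 20% minority-share threshold here is purely an assumption for illustration, as are the function names.

```python
import re
from collections import Counter

# \b keeps \cite from matching inside \citeauthor or \citeyear.
TEX_STYLE = re.compile(r"\\(citet|citep|cite)\b")
BRACKETED = re.compile(r"\[[^\]]*@[^\]]*\]")
BARE = re.compile(r"(?<![\w.])@\w+")


def count_tex_styles(text: str) -> Counter:
    """Count \\citet, \\citep, and \\cite commands in a .tex file."""
    return Counter(TEX_STYLE.findall(text))


def count_qmd_styles(text: str) -> Counter:
    """Count bracketed [@key] groups and bare @key citations."""
    counts = Counter()
    bracketed = len(BRACKETED.findall(text))
    bare = len(BARE.findall(BRACKETED.sub("", text)))
    if bracketed:
        counts["[@key]"] = bracketed
    if bare:
        counts["@key"] = bare
    return counts


def stray_styles(counts: Counter, threshold: float = 0.2) -> list:
    """Return styles used in under `threshold` of a file's citations."""
    total = sum(counts.values())
    if total == 0:
        return []
    return sorted(s for s, n in counts.items() if 0 < n / total < threshold)
```

For the example above (20× `\citep` and 3× `\cite`), `\cite` accounts for about 13% of citations and would be listed as a stray style under this threshold.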
Gated behind --cite-claim. For the top-10 most-cited works per file, WebFetch the crossref abstract and surface it beside the in-text context. No auto-judgment — humans decide if the claim matches.
Report (`quality_reports/bib_audit_semantic.md`):

# Bibliography Semantic Audit
**Date:** YYYY-MM-DD
**Bibliography:** Bibliography_base.bib (N entries)
**Files scanned:** [list]
## Summary
| Check | Critical | Medium | Low |
|---|---|---|---|
| Structural | | | |
| Citation drift | | | |
| DOI verification | | | |
| Style consistency | 0 | 0 | |
## Critical Issues
### Duplicate entries
| Keys | Signal | Citations | Recommended canonical |
|---|---|---|---|
### DOI mismatches
| Key | Field | .bib value | crossref value |
|---|---|---|---|
## Medium / Low issues
…
## Next steps
1. Resolve duplicates — pick canonical key, update citations, remove orphans.
2. Fix DOI mismatches — verify paper in crossref or strip the wrong DOI.
3. Review style-consistency notes.
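- `.claude/skills/review-paper/SKILL.md`: pair with it for a full pre-submission review.
- `.claude/skills/audit-reproducibility/SKILL.md`: numeric-claims counterpart.

This skill does not judge whether a cited work supports the claim made (`--cite-claim` surfaces abstracts but does not judge), and it does not modify the `.bib` file; all edits are recommendations.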