Conducts project-grounded technical research and writes durable reference material into `context/references/`. Use when asked to research an algorithm, architecture, implementation strategy, tooling option, protocol, subsystem pattern, or comparison such as `event sourcing for this project`, `Redis vs Postgres for this repo`, or `research best practices for implementing X here`. Reads the project's `context/` first, inspects relevant code to verify current reality, performs real external research through WebSearch and WebFetch against primary sources (not repo-only synthesis), and produces one paper or a small reference folder when scope warrants it. Not for quick summaries, generic web lookups, release notes, or bug fixing.
Maintain context/references/ as the repository's durable research layer. The goal is not generic topic explanation, and the goal is not "write a thorough-sounding paper from context/ alone." The goal is a real research pass that starts from project reality, expands into external literature and real-world implementations through actual WebSearch and WebFetch calls, then returns with repository-relevant guidance that remains useful after the session ends.
Do not invent a repo-only research mode as the default. Internal inspection is mandatory grounding, not the main research surface.
Read the reference files in this order.
Always read first (mandatory-core):
references/operating-modes.md — the three supported workflows, the default scope rules, and the two-layer sufficiency model (floors + ceiling).references/source-priority-and-evidence.md — how to combine project evidence, code inspection, and external sources without blurring them, including the contrasting-source obligation.references/project-grounding-workflow.md — how to anchor the research in and targeted code inspection, including the 4.25 pattern-breaking checkpoint and the 4.5 motivated-reasoning check at the grounding → external-research boundary.context/references/tool-asymmetry-and-investigation.md — why this skill's pretraining bias points away from its own core tools (WebSearch, WebFetch) and how to correct for it.references/motivated-reasoning-defence.md — the reasoning signatures that precede skipped external research and the adversarial-sweep move that defuses them.Read when relevant:
references/decomposition-rules.md — when to write one paper, when to create a topic folder.references/section-patterns.md — the highest-value sections and how to use them selectively.references/formatting-patterns.md — richer markdown structures for comprehension.references/anti-patterns.md — the failure modes that make research look impressive while saying little.references/examples.md — worked examples of stronger output shapes.references/script-fallbacks.md — what to do when a bundled script cannot run.This skill is not a generic summariser, search assistant, or literature-dump generator.
It exists to produce project-grounded external research that:
context/references/ as durable project memory.A research artefact should leave a reader with a complete understanding of the topic as it relates to this project — the mechanism, the trade-offs, the alternatives, the relevant evidence, and the concrete recommendations.
It must reject:
When trade-offs exist, optimise in this order:
The "do real external research" obligation is not a tone, adverb, or judgement call. It is a set of tool-call floors. An artefact is not complete until all of the following are satisfied, and the completion report must cite each by tool name plus specific query or URL:
WebSearch calls against topic-specific queries (not paraphrases of each other). List exact query strings in "External Research Trail" in the artefact.WebFetch calls against primary sources identified by the searches. "Primary source" means at least one of: foundational paper, official documentation, strong reference implementation, peer-reviewed benchmark. A set of blog aggregators alone does not satisfy this.references/source-priority-and-evidence.md.These are floors, not ceilings. For comparative or high-stakes decision research, 6–10 of each is typical.
If you genuinely cannot perform the floor (no network, tool unavailable), state this explicitly at the top of the artefact in a block titled "Research Floor Unmet — Reasons", naming which obligations were unmet and why. Silent incompleteness is not permitted.
See references/tool-asymmetry-and-investigation.md for why this floor is written as counts rather than as "do substantial research": the pretraining bias against WebSearch and WebFetch reliably defeats exhortation-level standards. Only a counted floor survives it.
The floor above is enforced at instruction level and through the validator script. For automated or subagent use where an external circuit-breaker is wanted, references/hook-integration.md documents an optional Claude Code Stop-hook shape that checks the floor at the transcript level. Hooks are complementary to the validator (transcript behaviour vs artefact structure) and optional in interactive use.
Write into context/references/ using stable topic ownership.
Default patterns:
context/
└── references/
├── event-sourcing.md
├── rendering-pipeline.md
├── storage-engine-comparison.md
└── caching-strategies/
├── overview.md
├── application-layer-caching.md
└── redis-vs-memcached.md
Prefer one file when the topic has one stable question and one coherent answer.
Prefer a topic folder when:
Do not create filler files just because a folder exists.
Prefer lowercase hyphenated filenames and folder names.
Use the shortest stable topic name that remains unambiguous in context/references/.
Good examples:
event-sourcing.mdredis-vs-memcached.mdecs-scheduling.mdnetwork-stack/network-stack/retry-strategy.mdAvoid:
research.md,Choose one of these modes:
Project-grounded external research
Default. Research a topic across primary sources, then translate into project-specific guidance.Project-grounded comparative decision research
Use when the core question is a comparison, trade-off, or selection for this repository.Update / merge / extend existing research
Use when refining, consolidating, cross-referencing, or restructuring the existing research corpus in context/references/. Subject to the self-refinement guardrail in references/operating-modes.md §3 — an Update pass requires new external evidence, new repository state, or a named factual correction. Pure rewording is not a valid Update.Treat the repository as the source of truth for what is implemented now.
Use this evidence ladder:
context/ for current durable project memory,WebSearch and WebFetch) for broader research,Never blur these together. A good artefact makes it obvious:
This skill is script-backed. The bundled scripts are mandatory parts of the workflow.
scripts/init_research_artifact.py before writing a new paper or new topic folder. Capture its stdout for the obligation audit.scripts/validate_research_artifact.py before presenting any completed creation or update. Capture its stdout for the obligation audit. Fix any hard failures it reports.references/script-fallbacks.md and perform the equivalent work manually, stating explicitly in the handoff what the script would have done and what you did instead. "Genuine fallback" is not an escape hatch; it is a documented manual equivalent.When this skill is triggered, follow this sequence.
Step 0 — Define Success Criteria (evaluations-first). Before starting research, write down:
Return this block to the user before proceeding to step 1. It is cheap, it prevents scope drift, and it becomes the target the obligation audit will check against.
Step 1. Determine the operating mode from the user's request and the current state of context/references/.
Step 2. Read the project's context/ folder first, especially files that already touch the topic or subsystem.
Step 3. Identify the smallest set of code paths needed to verify current implementation reality.
Step 4. Inspect those files before forming project-specific claims.
Step 4.25 / 4.5 — Checkpoints. At the boundary between project grounding and external research, run the pattern-break checkpoint (for long runs) and the motivated-reasoning check defined in references/project-grounding-workflow.md. Pre-commit in writing to the next external-research action.
Step 5. Perform external research through real tool calls, satisfying the floor in "External Research Floor" above:
WebSearch for topic-specific queries (at least 3 distinct),WebFetch for primary sources (at least 3 distinct, across ≥2 source classes),Step 6. Decide whether the result belongs in one paper, a topic folder, or an update to existing research.
Step 7. Run scripts/init_research_artifact.py to scaffold the output location.
Step 8. Write or update the artefact using the section patterns that best fit the topic, populating the External Research Trail, Pre-Completion Obligation Audit, and What I Did Not Do sections that the scaffold creates.
Step 9. Cross-reference related material already in context/references/ when the relationship is meaningful.
Step 10. Run scripts/validate_research_artifact.py.
Step 11. Fix any hard failures and review warnings with judgment.
Step 12 — Adversarial sweep. Before presenting, adopt a skeptical-reviewer persona per references/motivated-reasoning-defence.md. Ask: "A reviewer who suspects I skipped the hard parts — what would they look for?" Perform the tool calls that would satisfy the reviewer, not merely the acknowledgement of their gaps.
Step 13. Present the created or updated research artefact, its placement, the completion report (see below), and any remaining uncertainties.
The message that accompanies the artefact must include a Pre-Completion Obligation Audit block that cites each obligation by tool name plus specific query or URL, matching the audit table inside the artefact. Minimum contents:
WebSearch query run, in order, with the tool name,WebFetch URL retrieved, in order, with the tool name and source class,scripts/init_research_artifact.py and scripts/validate_research_artifact.py (or, if fallback was used, the manual steps that replaced each),Silent completion is not permitted. If the floor is unmet, say so under "Research Floor Unmet — Reasons" and do not present the artefact as complete.
Write like a technically serious engineer-teacher, not like a dry journal abstract and not like a casual blog post.
The target style is:
The target artefact should read like a durable synthesis paper for the repository, not like notes from a search session.
Output is rich, expressive, and analytical — tables for comparisons and gap analyses, matrices for multi-dimensional analysis, trees for topic-folder decomposition, ASCII diagrams for flows and architectures, ASCII visualisations for magnitudes and distributions, short template blocks when a structure communicates faster than prose. The full expressive range of markdown is available; reach for whatever sharpens the analysis. See references/formatting-patterns.md for the specific patterns this skill uses.
Supportive multi-format presentation inside one artefact is encouraged when it helps scanning and reasoning — present the same insight in two forms when each form adds something the other does not. Canonical ownership across artefacts is not duplicated: if one file is the home for a stable topic, related files should reference it rather than re-explaining it.
Quality is checked in two halves. The first half is observable from the transcript and artefact; every item must be verifiable by a reviewer without trusting the author's self-assessment. The second half is self-assessed and secondary — useful, but not a substitute for the observable items.
WebSearch called at least 3 distinct times, with the exact query strings listed in the artefact's External Research Trail and in the completion report.WebFetch called at least 3 distinct times against primary sources, URLs listed in the External Research Trail and in the completion report, covering at least 2 source classes.context/ files were read before project-specific claims; those paths are listed in the obligation audit.scripts/init_research_artifact.py was run; its stdout is captured in the completion report (or the script-fallbacks.md manual equivalent is documented).scripts/validate_research_artifact.py was run; its stdout is captured in the completion report and any hard failures were fixed (or the fallback equivalent is documented).context/references/ and uses a stable topic name.The observable half is the floor. The self-assessed half becomes meaningful once the observable half is satisfied — not before.