Trove collection and normalization for swain-design artifacts. Collects sources from the web, local files, and media (video/audio), normalizes them to markdown, and caches them in reusable troves. Use when researching a topic for a spike, ADR, vision, or any artifact that needs structured research. Also use to refresh stale troves or extend existing ones with new sources. Triggers on: 'research X', 'gather sources for', 'compile research on', 'search for sources about', 'refresh the trove', 'find existing research on X', or when swain-design needs research inputs for a spike or ADR.
Collect, normalize, and cache source materials into reusable troves that swain-design artifacts can reference.
| Signal | Mode |
|---|---|
| No trove exists for the topic, or user says "research X" / "gather sources" | Create — new trove |
| Trove exists and user provides new sources or says "add to" / "extend" | Extend — add sources to existing trove |
| Trove exists and user says "refresh" or sources are past TTL | Refresh — re-fetch stale sources |
| User asks "what troves do we have" or "find sources about X" | Discover — search existing troves by tag |
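The table above can be read as a small decision function. The following is a minimal sketch, not the skill's actual logic: `Mode`, `detect_mode`, and the three boolean parameters are hypothetical names, the phrase matching is deliberately naive, and the precedence between rows (Discover first, then Refresh, then Extend, then Create) is an assumption, since the table does not say which signal wins when several apply.

```python
from enum import Enum


class Mode(Enum):
    CREATE = "create"
    EXTEND = "extend"
    REFRESH = "refresh"
    DISCOVER = "discover"


def detect_mode(request: str, trove_exists: bool,
                has_new_sources: bool = False,
                has_stale_sources: bool = False) -> Mode:
    """Map a user request plus trove state to a mode, following the table."""
    text = request.lower()

    # Discover: the user is asking about what already exists.
    if "what troves" in text or "find sources about" in text:
        return Mode.DISCOVER

    if trove_exists:
        # Refresh: explicit request, or sources are past their TTL.
        if "refresh" in text or has_stale_sources:
            return Mode.REFRESH
        # Extend: new sources supplied, or an "add to" / "extend" request.
        if has_new_sources or "add to" in text or "extend" in text:
            return Mode.EXTEND

    # Create: no trove yet, or a plain "research X" / "gather sources" request.
    return Mode.CREATE
```

In practice the signals come from conversation context rather than substring checks, so treat the string tests as placeholders for that judgment.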
Build a new trove from scratch.
Ask the user (or infer from context) for:
- A slug for the trove (e.g., `websocket-vs-sse`)
- Tags (e.g., `real-time`, `websocket`, `sse`)

If invoked from swain-design (e.g., a spike entering Active), the artifact context provides the topic, tags, and sometimes initial sources.
For each source, use the appropriate capability. Read skills/swain-search/references/normalization-formats.md for the exact markdown structure per source type.
Web search queries:
Web page URLs:
- If the fetch fails, set the `failed: true` flag and move on

Video/audio URLs:
Local files:
Forum threads / discussions:
Repositories:
- `sources/<source-id>/`
- `selective: true` in the manifest entry
- `highlights` array with paths to the most important files (relative to the source-id directory)

Documentation sites:
- `sources/<source-id>/`
- `selective: true`
- `highlights` array with paths to the most important pages

Each normalized source gets a slug-based source ID and lives in a directory-per-source layout:
- `sources/<source-id>/<source-id>.md`
- `sources/<source-id>/` with the original tree mirrored inside

Source ID generation:
- A slug (e.g., `mdn-websocket-api`, `strangeloop-2025-realtime`)
- `__word1-word2`, using two random words from `skills/swain-search/references/wordlist.txt`
- `__` followed by 4 hex characters (e.g., `__a3f8`) as a fallback

Create `manifest.yaml` following the schema in `skills/swain-search/references/manifest-schema.md`. Include:
Compute content hashes as bare hex SHA-256 digests (no prefix) of the normalized markdown content:
`shasum -a 256 sources/mdn-websocket-api/mdn-websocket-api.md | cut -d' ' -f1`
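Where a shell is not available, the same digest can be computed with Python's standard library. This is a minimal sketch; the function name `content_hash` is hypothetical and not part of the skill. It hashes the raw bytes of the normalized file and returns bare hex with no prefix, which is what the `shasum` pipeline produces for the same file.

```python
import hashlib
from pathlib import Path


def content_hash(path: Path) -> str:
    """Return the bare hex SHA-256 digest of a file's bytes (no prefix)."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read in chunks so large normalized files need not fit in memory.
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For example, `content_hash(Path("sources/mdn-websocket-api/mdn-websocket-api.md"))` returns a 64-character lowercase hex string suitable for the manifest's hash field.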
Create synthesis.md — a structured distillation of key findings across all sources.
Structure the synthesis by theme, not by source. Group related findings together, cite sources by ID, and surface:
Keep it concise. The synthesis is a starting point, not a comprehensive report — the user or artifact author will refine it.
Use the dual-commit pattern (same as swain-design lifecycle stamps) to give the trove a reachable commit hash.
Before Commit A — append a history entry to manifest.yaml with a -- placeholder for the commit hash: