Manages corpora of 10,000+ documents (PDF, DOCX, images). Builds hierarchical indexes, performs incremental updates, and routes queries through structured summaries before opening files. Use when the user asks to find, analyze, compare, index, or extract information from dossiers or documents in a corpus. Do NOT activate for general file operations unrelated to dossier management.
Unified corpus management skill: index, update, and search document corpora of 10,000+ mixed-type files with 65-90% token reduction via progressive disclosure.
Determine the user's intent from their message:
| Intent | Trigger phrases | Action |
|---|---|---|
| BUILD | "index", "reindex", "rebuild index", "indexer", "scanner le corpus" | Go to Path A |
| UPDATE | "update index", "mise a jour", "sync", "new files added" | Go to Path B |
| SEARCH | Any dossier/document query (default) | Go to Path C |
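The routing in the table can be sketched as a small classifier. This is a minimal illustration, not the skill's actual logic: the function name `detect_intent` and the case-insensitive substring matching are assumptions; only the trigger phrases come from the table.

```python
# Trigger phrases copied from the intent table; the matching strategy
# (case-insensitive substring search) is an assumption for illustration.
UPDATE_PHRASES = ("update index", "mise a jour", "sync", "new files added")
BUILD_PHRASES = ("reindex", "rebuild index", "indexer", "scanner le corpus", "index")


def detect_intent(message: str) -> str:
    """Return "UPDATE", "BUILD", or "SEARCH" for a user message."""
    text = message.lower()
    # Check UPDATE first: "update index" also contains the BUILD phrase "index".
    if any(phrase in text for phrase in UPDATE_PHRASES):
        return "UPDATE"
    if any(phrase in text for phrase in BUILD_PHRASES):
        return "BUILD"
    # Any other dossier/document query falls through to the default.
    return "SEARCH"


print(detect_intent("Please update index, new files added"))
print(detect_intent("Compare the two supplier dossiers"))
```

Substring matching is deliberately crude (a word such as "async" would match "sync"), so in practice the table is better read as guidance for judgment than as a strict keyword filter.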
If the user provides a corpus path explicitly, use it. Otherwise, check whether
the current working directory contains dossier-index.json; if it does, use that file's parent directory as the corpus path.
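A minimal sketch of this resolution rule, assuming a hypothetical helper named `resolve_corpus_path` (not part of the skill's scripts) that returns `None` when neither source yields a path:

```python
from pathlib import Path
from typing import Optional

INDEX_FILENAME = "dossier-index.json"


def resolve_corpus_path(explicit_path: Optional[str] = None,
                        cwd: Optional[Path] = None) -> Optional[Path]:
    """Hypothetical helper: choose the corpus root as described above."""
    # An explicit path from the user always wins.
    if explicit_path:
        return Path(explicit_path)
    # Otherwise look for the index file in the working directory.
    base = cwd if cwd is not None else Path.cwd()
    index_file = base / INDEX_FILENAME
    if index_file.is_file():
        return index_file.parent
    # Nothing found: the caller should ask the user for a path.
    return None
```

Returning `None` rather than guessing keeps the fallback explicit; what to do in that case (for example, asking the user) is not specified by the text above.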
Run this when the user wants to create or rebuild the full index.
python ${SKILL_DIR}/scripts/build_index.py "<corpus_path>"
If errors > 0, list which file types failed.
Run this when the user wants to sync the index with filesystem changes.
python ${SKILL_DIR}/scripts/update_index.py "<corpus_path>"
Follow this sequence strictly for every document query. Never skip to file reads.
python ${SKILL_DIR}/scripts/search_index.py "<user_query>" --corpus "<corpus_path>" --limit 10
If the user mentions a specific category, add --category <name>.
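One way to assemble this command without shell-quoting problems is to build it as an argument list. The sketch below assumes a hypothetical helper `build_search_command` and that the skill directory is passed in as a plain string; only the script path and the `--corpus`, `--limit`, and `--category` flags come from the text above.

```python
import os
from typing import List, Optional


def build_search_command(skill_dir: str, query: str, corpus_path: str,
                         category: Optional[str] = None,
                         limit: int = 10) -> List[str]:
    """Hypothetical helper: argument list for scripts/search_index.py."""
    command = [
        "python",
        os.path.join(skill_dir, "scripts", "search_index.py"),
        query,
        "--corpus", corpus_path,
        "--limit", str(limit),
    ]
    # Only add the filter when the user named a specific category.
    if category:
        command += ["--category", category]
    return command


print(build_search_command("/skills/dossier", "contrats 2023", "/data/corpus",
                           category="legal"))
```

Passing the resulting list to `subprocess.run` (without `shell=True`) hands the user's query to the script as a single argument, even when it contains quotes or spaces.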
Review the ranked dossier list and their 3-sentence summaries.
Summary sufficient: Answer the user's question directly from the index results. Zero file reads. This is the preferred outcome.
Need 1-3 specific dossiers: Open ONLY those files with the Read tool for detailed analysis. Never open more than 3 files in a single query.
Cross-corpus reasoning required (comparing 5+ dossiers across categories): Spawn one subagent per relevant category. Each subagent receives only its category's search results. Main session merges the structured summaries from each subagent.
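The three outcomes above can be sketched as a planning function. This is an illustration under stated assumptions: the result fields `name`, `category`, and `path` are hypothetical (the real layout is defined by the index schema), and because the text leaves the case of exactly 4 dossiers unspecified, the sketch routes anything above 3 to subagents.

```python
from collections import defaultdict
from typing import Dict, List

MAX_DIRECT_READS = 3  # "Never open more than 3 files in a single query."


def group_by_category(results: List[dict]) -> Dict[str, List[dict]]:
    """One batch per category, so each subagent sees only its own results."""
    grouped = defaultdict(list)
    for result in results:
        grouped[result["category"]].append(result)
    return dict(grouped)


def plan_query(results: List[dict], needed: List[str]) -> dict:
    """needed: names of dossiers whose index summaries are not enough."""
    if not needed:
        # Preferred outcome: answer from summaries, zero file reads.
        return {"mode": "summary", "files": [], "batches": {}}
    selected = [r for r in results if r["name"] in needed]
    if len(needed) <= MAX_DIRECT_READS:
        return {"mode": "read", "files": [r["path"] for r in selected],
                "batches": {}}
    # Assumption: more than 3 dossiers means fan-out, one subagent per category.
    return {"mode": "subagents", "files": [],
            "batches": group_by_category(selected)}
```

The main session would then merge the structured summaries returned for each batch, as described above.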
Present findings to the user. Always cite the dossier name and category. If answering from summaries, note that a deeper analysis is available if needed.
resources/index-schema.md, resources/query-router.md, resources/ux-messages.md