Index repository for deep codebase understanding
Build comprehensive codebase understanding through systematic analysis. This runs in the background when the agent is idle, maintaining an up-to-date picture of the repository structure, dependencies, and patterns.
This is a builtin skill (handler: type: builtin). When index_repo is invoked, the executor runs analysis scripts against the repository and writes structured results to the agent's memory directory. The tool schema in tool.yaml defines the external contract; the executor handles the scanning and output directly.
Interruptibility: Indexing is low-priority. If a higher-priority task arrives (e.g., a review request), indexing pauses and resumes once the agent is idle again.
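One way to get this pause-and-resume behavior is to check for interruption between files. The sketch below is only an illustration: it assumes a hypothetical scheduler that clears a `threading.Event` when a higher-priority task arrives and sets it again when the agent is idle. `index_files` and `analyze` are made-up names, not part of the builtin executor.

```python
import threading
from typing import Callable, Iterable, List


def index_files(
    paths: Iterable[str],
    analyze: Callable[[str], None],
    idle: threading.Event,
) -> List[str]:
    """Analyze paths one at a time, waiting whenever `idle` is cleared."""
    done: List[str] = []
    for path in paths:
        # Blocks here while the (hypothetical) scheduler has cleared the event,
        # and continues from the same file once the event is set again.
        idle.wait()
        analyze(path)
        done.append(path)
    return done
```

Pausing only at file boundaries keeps each file's analysis atomic, at the cost of a short delay before a higher-priority task can start.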
Before indexing, check if the existing index is still valid:
- `.last_indexed` timestamp from agent memory

Scan all source files and record:
Exclude configured paths (.venv, node_modules, .git, __pycache__).
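A minimal sketch of such a scan, using `os.walk` and pruning excluded directories before descending into them. The extension-to-language table and the recorded fields (`path`, `language`, `size_bytes`) are assumptions for illustration; the actual metadata schema is defined by the skill, not by this example.

```python
import os
from typing import Dict, Iterable, List

EXCLUDED_DIRS = {".venv", "node_modules", ".git", "__pycache__"}
# Assumed mapping for illustration; a real index may cover more languages.
SOURCE_EXTENSIONS = {".py": "python", ".ts": "typescript"}


def scan_source_files(
    root: str, excluded: Iterable[str] = EXCLUDED_DIRS
) -> List[Dict[str, object]]:
    """Return one inventory entry per source file under `root`."""
    excluded = set(excluded)
    inventory: List[Dict[str, object]] = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into excluded directories.
        dirnames[:] = sorted(d for d in dirnames if d not in excluded)
        for name in sorted(filenames):
            ext = os.path.splitext(name)[1]
            if ext not in SOURCE_EXTENSIONS:
                continue
            full = os.path.join(dirpath, name)
            inventory.append(
                {
                    "path": os.path.relpath(full, root).replace(os.sep, "/"),
                    "language": SOURCE_EXTENSIONS[ext],
                    "size_bytes": os.path.getsize(full),
                }
            )
    return inventory
```

Pruning `dirnames` in place means excluded trees are never walked at all, which is considerably cheaper than filtering their files afterwards.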
For each source file, extract dependencies:
- `import` and `from ... import` statements
- `import` and `require` statements

Record which files depend on which, enabling impact analysis.
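For Python sources, the extraction and the reverse lookup used for impact analysis might look roughly like this. It is a sketch under assumptions: `extract_python_imports` and `dependents_of` are hypothetical helper names, and the graph is assumed to be a plain mapping from file path to imported module names.

```python
import ast
from typing import Dict, List


def extract_python_imports(source: str) -> List[str]:
    """Return the sorted module names imported by a Python source string."""
    tree = ast.parse(source)
    modules = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # Relative imports keep their leading dots; module may be None.
            modules.add("." * node.level + (node.module or ""))
    return sorted(modules)


def dependents_of(graph: Dict[str, List[str]], module: str) -> List[str]:
    """Impact analysis: which files import `module`?"""
    return sorted(path for path, deps in graph.items() if module in deps)
```

Parsing with `ast` rather than regular expressions avoids false matches on imports that appear inside strings or comments.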
For each implementation file, find its corresponding test file:
- Naming conventions (`test_{name}.py`, `{name}_test.py`, `{name}.test.ts`)
- Directory mapping (`src/foo.py` → `tests/test_foo.py`)
- `missing_test` for files with no matching test

Scan for common patterns and potential problems:
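One such pattern, the async-function count reported in the index summary, could be gathered roughly as follows. This is a minimal sketch that assumes Python sources; `count_async_functions` is a hypothetical helper, not part of the skill.

```python
import ast


def count_async_functions(source: str) -> int:
    """Count `async def` definitions, including methods and nested functions."""
    tree = ast.parse(source)
    return sum(isinstance(node, ast.AsyncFunctionDef) for node in ast.walk(tree))
```

Summing the per-file results over the file inventory would give the repository-wide figure.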
Output structured files to agent memory:
| File | Content |
|---|---|
| file_inventory.yaml | All source files with metadata |
| import_graph.yaml | Dependency relationships |
| test_mappings.yaml | Implementation-to-test file mapping |
| detected_patterns.yaml | Patterns and potential issues |
| index_summary.md | Human-readable summary |
| .last_indexed | Timestamp for freshness checking |
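The `.last_indexed` marker can be written and checked with a few lines of standard-library code. The sketch below assumes the timestamp is stored as an ISO 8601 string and that freshness is decided by a maximum age; neither detail is specified here, and a real check might instead compare against file modification times or commit history. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import Optional


def write_last_indexed(memory_dir: Path, now: Optional[datetime] = None) -> None:
    """Record when indexing finished (assumed format: ISO 8601, UTC)."""
    now = now or datetime.now(timezone.utc)
    (memory_dir / ".last_indexed").write_text(now.isoformat(), encoding="utf-8")


def index_is_fresh(
    memory_dir: Path, max_age: timedelta, now: Optional[datetime] = None
) -> bool:
    """Return True if a marker exists and is no older than `max_age`."""
    marker = memory_dir / ".last_indexed"
    if not marker.exists():
        return False  # never indexed, so a full index is needed
    now = now or datetime.now(timezone.utc)
    indexed_at = datetime.fromisoformat(marker.read_text(encoding="utf-8").strip())
    return now - indexed_at <= max_age
```

Writing the marker last, after all other index files, means an interrupted run is treated as stale rather than as a valid index.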
## Repository Index
### Statistics
- Source files: [count]
- Languages: [list]
- Test files: [count]
- Files missing tests: [count]
- Async functions: [count]
### Patterns Detected
- [pattern]: [count] occurrences
- [potential issue]: [count] locations
### Test Coverage Gaps
- [file]: missing test
- [file]: missing test
### Index Files Written
- [path]: [description]
- Excluded paths (`.venv` or `node_modules` entries)

- Including `node_modules`, `.venv`, or `__pycache__` in the index creates enormous, useless files.
- Assuming `test_{name}.py` when the project uses `{name}_test.py` or a different convention. Check actual test files first.
- Not writing `.last_indexed` means the next session will re-index unnecessarily.

Repository indexing is complete when: