Analyze a codebase and produce an educational brief covering architecture, technology choices, design patterns, and gaps. Use when the user asks to "educate me on this codebase", "analyze this project for learning", "explain the architecture", "what can I learn from this code", or "codebase educator".
Analyze a source and produce a structured educational brief as an Obsidian-compatible
vault. The vault lives on the Windows filesystem at /mnt/c/Users/horse/Obsidian/educator-briefs/
and is symlinked to ~/.claude/educator-briefs/. Always use the symlink path for
file operations -- the safety hook allowlists the resolved path.
Arguments: $ARGUMENTS
This skill uses a state machine for restartability and disk registers for
context efficiency. Registers are YAML files in /tmp/educator-<name>/ that
replace implicit context-window memory with explicit, shareable data structures.
INIT -> GATHERING -> GATHERED -> ASSESSED -> WRITING -> SECTIONS_DONE
-> SWEPT -> CONCEPTS_DONE -> COMMITTED -> COMPLETE
Transition rules and the full schema are in references/registers/state.md.
Write the state file after every transition. On resume, read the state file
and skip completed phases.
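The resume rule above can be sketched in a few lines of Python. This is a minimal illustration that assumes the states form a strictly linear order as drawn; the authoritative transition rules are in references/registers/state.md, and the helper names `is_reached` and `remaining_states` are hypothetical.

```python
from typing import List

# Ordered states, copied from the state machine above.
# Assumption: transitions are strictly linear in this order.
STATES: List[str] = [
    "INIT", "GATHERING", "GATHERED", "ASSESSED", "WRITING",
    "SECTIONS_DONE", "SWEPT", "CONCEPTS_DONE", "COMMITTED", "COMPLETE",
]


def is_reached(current: str, target: str) -> bool:
    """Return True if `current` is at or past `target` in the state order."""
    return STATES.index(current) >= STATES.index(target)


def remaining_states(current: str) -> List[str]:
    """Return the states still ahead of `current` when resuming a run."""
    return STATES[STATES.index(current) + 1:]
```

On resume, a phase whose end state is already reached (for example `is_reached(saved, "GATHERED")` for Phase 1) would be skipped.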
| Register | Written by | Purpose | Schema |
|---|---|---|---|
| state.yaml | Orchestrator | Workflow progress + restartability | references/registers/state.md |
| gather.yaml | Phase 1 | Structured codebase data | references/registers/gather.md |
| url-index.yaml | Phase 1 | Technology link lookup table | references/registers/url-index.md |
| quality.yaml | Phase 1.5 | Quality assessment + tone guidance | references/registers/quality.md |
| sections.yaml | Phase 2 | Per-section metadata (concepts, URLs, counts) | references/registers/sections-manifest.md |
All registers live in /tmp/educator-<name>/. Read register schemas on first
use, not upfront -- only load the schema you need for the current phase.
Load only what the current phase needs:
| Phase | Load these references | Load these registers |
|---|---|---|
| 1 (Gather) | nothing extra | -- (creating them) |
| 1.5 (Quality + Depth) | quality-assessment.md | gather.yaml |
| 2 (Write) | sections/_shared.md + the ONE section template | gather.yaml, url-index.yaml, quality.yaml |
| 2.5 (Sweep) | nothing extra | sections.yaml |
| 3 (Concepts) | concept-template.md | sections.yaml, registry + connections + vault-state on disk |
| 4-5 (Commit/Report) | nothing extra | state.yaml, sections.yaml |
Never load all 13 section templates at once. Each section writer loads
references/sections/_shared.md (shared rules) + its own template file only.
$ARGUMENTS may contain one or more sources, space-separated. Parse each
token using:
| Input | Type | Method |
|---|---|---|
| No argument | Local project | Analyze current working directory |
| /path, ./path, ~/path | Local path | Analyze directory at that path |
| https://github.com/... | GitHub repo | git clone --depth 1 to /tmp/educator-<name> |
| https://... (non-GitHub) | Website | Discover repo first (see Website Sources below), fall back to external observation |
| npm:<package> | npm package | npm pack to /tmp/educator-<name>, extract |
| pypi:<package> | PyPI package | pip download --no-deps --no-binary :all: to /tmp/educator-<name>, extract |
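As an illustration of the token rules in the table above, here is a minimal Python sketch. The type labels (`"github"`, `"local-path"`, and so on) and the function names are hypothetical, chosen only for this example, and anything the table does not cover is reported as `"unknown"` rather than guessed.

```python
from typing import List, Tuple


def classify_source(token: str = "") -> str:
    """Map one $ARGUMENTS token to a source type, following the table above."""
    if not token:
        return "local-project"      # no argument: current working directory
    if token.startswith("npm:"):
        return "npm"
    if token.startswith("pypi:"):
        return "pypi"
    if token.startswith("https://github.com/"):
        return "github"             # must be checked before the generic https case
    if token.startswith("https://"):
        return "website"
    if token.startswith(("/", "./", "~/")):
        return "local-path"
    return "unknown"                # not covered by the table


def parse_arguments(arguments: str) -> List[Tuple[str, str]]:
    """Split space-separated sources and classify each one."""
    tokens = arguments.split()
    if not tokens:
        return [("", "local-project")]
    return [(token, classify_source(token)) for token in tokens]
```

The order of the checks matters: the GitHub prefix is tested before the generic `https://` prefix so that repositories are not treated as websites.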
Resolve the base URL once during source detection:
| Source type | Base URL | File link pattern |
|---|---|---|
| GitHub repo | Input URL | <base>/blob/<branch>/<filepath> |
| Local with GitHub remote | git remote get-url origin converted | Same blob pattern |
| npm/PyPI with repository | From package manifest | Same if GitHub |
| Local, no remote | null | Relative paths only |
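The base URL and file link pattern can be sketched as follows. This is an illustration under stated assumptions: the table only says the remote URL is "converted", so the exact conversion shown here (SSH or HTTPS GitHub remotes, optional `.git` suffix) is an assumption, and both function names are hypothetical.

```python
import re
from typing import Optional

# Assumption: only github.com remotes in SSH or HTTPS form are handled.
_GITHUB_REMOTE = re.compile(
    r"^(?:git@github\.com:|https://github\.com/)([^/]+)/(.+?)(?:\.git)?/?$"
)


def github_base_from_remote(remote: str) -> Optional[str]:
    """Convert `git remote get-url origin` output to an https base URL, or None."""
    match = _GITHUB_REMOTE.match(remote.strip())
    if match is None:
        return None
    owner, repo = match.group(1), match.group(2)
    return f"https://github.com/{owner}/{repo}"


def file_link(base: Optional[str], branch: str, filepath: str) -> str:
    """Build <base>/blob/<branch>/<filepath>, or a relative path when base is None."""
    if base is None:
        return filepath
    return f"{base}/blob/{branch}/{filepath.lstrip('/')}"
```

Resolving the base once and passing it to `file_link` keeps every section writer on the same link pattern.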
When multiple sources are provided:
| Phase | Scope |
|---|---|
| 1, 1.5, 2, 2.5 | Per source sequentially |
| 3 (Concepts) | Once across all sources |
| 4 (Commit) | Once -- single branch, single commit |
| 5 (Report) | Once -- combined report |
Clean up each source's /tmp/ directory after its Phase 2 completes.
Build a shared URL index across sources -- check existing entries before
making new lookups.
State transition: INIT -> GATHERING -> GATHERED
Resolve source -- detect type, acquire code if needed
Scan structure -- ls top-level, read key files:
Map dependency graph -- read imports in key files, trace module structure
Sample depth -- read files from different codebase areas:
| Project size | Sample target | Strategy |
|---|---|---|
| Small (<20 files) | Read most files | Near-complete coverage |
| Medium (20-100) | 8-15 files | Cover every layer/module |
| Large (100+) | 15-25 files | Glob to discover, sample each subsystem |
Batch Read calls. Prioritize: hub modules, architectural boundaries, unusual files. For each file, write a structured summary (not raw content) into the gather register, noting key snippets by line range.
Read test files -- 2-3 real test files for core logic (not trivial utils)
Check history (if git) -- git log --oneline -20 for activity;
git log --diff-filter=A --name-only --format="" | head -30 for file creation order
Build URL index -- For every significant technology, resolve:
WebFetch budget: Only verify URLs when docs site isn't obvious from the name or registry metadata. Never WebFetch well-known tech URLs. When uncertain, link to registry page.
Minimum coverage: language, framework, database, major libraries (3+ imports), build tools, test framework.
Write registers:
- gather.yaml following references/registers/gather.md schema
- url-index.yaml following references/registers/url-index.md schema
- state.yaml: state -> GATHERED

State transition: GATHERED -> ASSESSED
Load references/quality-assessment.md for the full rubric.
- gather.yaml for codebase observations
- deep, standard, light, or skip. See "Depth Profiling" and
  "Section Skip Rules" in references/quality-assessment.md for criteria.
- quality.yaml following references/registers/quality.md schema
  (each section-note now includes both note and depth)
- state.yaml: state -> ASSESSED

State transition: ASSESSED -> WRITING -> SECTIONS_DONE
If ~/.claude/educator-briefs/_index.md doesn't exist, create the vault
structure. See references/vault-bootstrap.md for templates:
- _index.md (Map of Content)
- _concepts/ directory
- _concepts/_registry.yaml (empty with header comments, v2 enriched schema)
- _concepts/_vault-state.yaml (empty with header comments)
- _concepts/_connections.yaml (empty with header comments)

If the project subfolder already exists, ask the user whether to overwrite or create a timestamped version.
For each section, the writer:
- references/sections/_shared.md (shared rules)
- references/sections/<section>.md (that section's template only)
- gather.yaml, url-index.yaml, quality.yaml
- sampled-files in gather register)
- ~/.claude/educator-briefs/<project-name>/
- sections.yaml (concepts linked, URLs used, counts)
- state.yaml section status: done

Track A (implementation-heavy -- these need more source file reads):
- architecture.md
- design-patterns.md
- key-decisions.md
- testing-strategy.md
- gaps-vulnerabilities.md

Track B (analytical -- these draw more from registers than raw source):
- technology-choices.md (can start after architecture.md completes)
- dependencies.md
- evolution.md
- if-starting-over.md
- learning-path.md
- glossary.md

Sequencing rule: architecture.md is always written first (it
establishes structural vocabulary). After that, both tracks can proceed.
The following are always last, in order:
- resources.md (collects from all other sections -- read sections.yaml)
- _<project-name>_overview.md (summarizes everything)

Parallel dispatch (when using Agent tool):
If dispatching agents for Track A and Track B:
- sections.yaml (append-only, no conflicts)
- Bash with heredoc for new file writes
  (background agents may get Write denied for vault paths)
- resources.md and overview.md last

If running single-threaded, write in this order:
architecture -> technology-choices -> design-patterns ->
key-decisions -> dependencies -> evolution -> testing-strategy ->
gaps-vulnerabilities -> if-starting-over -> learning-path ->
glossary -> resources -> overview
After writing all sections, validate every Mermaid diagram:
- graph TD, graph LR, etc.)
- subgraph has a matching end
- A["Label (parens)"])
- references/diagram-guide.md for syntax patterns.

After all sections are written, verify:
- technology-choices.md stack table has clickable links in every row
- dependencies.md has registry + docs links for every dependency
- glossary.md has docs links for framework-specific jargon
- gaps-vulnerabilities.md links to guides for addressing each gap
- resources.md contains every unique URL from all sections

Update state.yaml: state -> SECTIONS_DONE
State transition: SECTIONS_DONE -> SWEPT
Scan for missed concept wikilink opportunities.
Read sections.yaml to see what's already linked
Grep section files for ## and ### headers
Apply the transferability test to each header naming a technique/pattern:
Would this concept page make sense with zero projects in "Seen In"? Can you write "How It Works" and "Trade-Offs" without referencing any project?
Yes -> it's a concept, add [[concept-name]] wikilink near the header.
No -> it's a project detail, leave it.
Priority targets: named patterns in design-patterns.md, architectural styles in architecture.md, named practices in testing-strategy.md, anti-patterns, missing patterns
Add missing wikilinks inline. Update sections.yaml with new concepts.
Typical yield: 2-5 new concepts. If 15+, threshold is too low.
Update state.yaml: state -> SWEPT, concept-sweep-done -> true
State transition: SWEPT -> CONCEPTS_DONE
Load references/concept-template.md for the concept page template.
Uses _concepts/_registry.yaml with enriched entries that include both
category and projects:
strategy-pattern:
category: pattern
projects:
- expressjs--express
- pallets--flask
If the registry doesn't exist (older vault), build it by scanning
_concepts/*.md for "## Seen In" entries and frontmatter category.
If the registry exists but uses the v1 flat format (concept -> list),
migrate in-place: read each concept page's frontmatter for category,
rewrite as v2.
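The v1-to-v2 migration described above can be sketched as a pure function over already-loaded data. This is a minimal illustration: the `categories` mapping stands in for the step of reading each concept page's frontmatter, the function name is hypothetical, and leaving `category` as `None` when no frontmatter value is found is an assumption rather than a documented rule.

```python
from typing import Any, Dict, List, Optional


def migrate_registry(
    registry: Dict[str, Any],
    categories: Dict[str, str],
) -> Dict[str, Dict[str, Any]]:
    """Rewrite v1 entries (concept -> list of projects) as v2 entries.

    `categories` is a stand-in for the per-concept frontmatter lookup.
    Entries that are already v2 mappings are copied through unchanged.
    """
    migrated: Dict[str, Dict[str, Any]] = {}
    for concept, value in registry.items():
        if isinstance(value, dict):
            migrated[concept] = value  # already v2
            continue
        projects: List[str] = list(value)
        category: Optional[str] = categories.get(concept)  # None if unknown (assumption)
        migrated[concept] = {"category": category, "projects": projects}
    return migrated
```

Because v2 entries pass through untouched, running the migration on a partly migrated registry is safe.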
Phase 3 also reads and updates two vault-level metadata files:
- _concepts/_connections.yaml -- Pre-computed project-to-project concept
  overlaps. Read at start of step 4 to skip discovery; updated with new
  project's connections at end of step 4.
- _concepts/_vault-state.yaml -- Per-project tracking. Updated after
  commit (Phase 4) with last-educator-run and concept-count.

If either file is missing, create it from references/vault-bootstrap.md.
Load registry into memory
Collect concept list from sections.yaml -- union of all
concepts-linked across sections. Deduplicate.
Process concepts -- for each:
- references/concept-template.md,
  add to registry with category (from frontmatter) and this project
  as first entry
- projects list

Backlink format: [[<project>/_<project>_overview|<project>]]
(bare [[project]] resolves to nothing -- it's a folder).
Batch writes: Group new concepts and write in quick succession. Batch Edit calls for existing concept backlinks.
Cross-project connections -- Use the connection index + registry:
- _connections.yaml for existing connection data
- projects list contains both
  the new project and at least one other project
- _connections.yaml: add new project's entries and add the
  new project to each connected project's entry (both directions)

Concepts shared with [[other/_other_overview|Display Name]]:
- **Concept** -- how this project uses it; how the other uses it.
Write registry back to disk (v2 enriched format)
Update _index.md -- add project row, update Concepts by Category
(use registry category field rather than reading concept frontmatter)
Quick link checks:
Update state.yaml: state -> CONCEPTS_DONE, populate concepts-created/updated/connections
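The cross-project connection step in this phase (concepts whose projects list contains both the new project and at least one other, recorded in both directions) can be sketched as below. The layout of `_connections.yaml` is not shown in this document, so the nested mapping used for `index` is an assumption, as are both function names.

```python
from typing import Any, Dict, List


def find_connections(
    registry: Dict[str, Dict[str, Any]],
    new_project: str,
) -> Dict[str, List[str]]:
    """Return {other_project: [shared concepts]} using v2 registry entries."""
    connections: Dict[str, List[str]] = {}
    for concept, entry in registry.items():
        projects = entry["projects"]
        if new_project not in projects:
            continue
        for other in projects:
            if other != new_project:
                connections.setdefault(other, []).append(concept)
    return connections


def update_connection_index(
    index: Dict[str, Dict[str, List[str]]],
    new_project: str,
    connections: Dict[str, List[str]],
) -> Dict[str, Dict[str, List[str]]]:
    """Record each connection in both directions (hypothetical index layout)."""
    for other, concepts in connections.items():
        shared = sorted(concepts)
        index.setdefault(new_project, {})[other] = shared
        index.setdefault(other, {})[new_project] = list(shared)
    return index
```

Writing both directions in one pass means a later run for the other project can read its connections without rescanning the registry.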
State transition: CONCEPTS_DONE -> COMMITTED
- cd ~/.claude/educator-briefs
- _vault-state.yaml -- set this project's entry:
  - last-educator-run: current ISO timestamp
  - concept-count: number of concepts linked to this project (from registry)
  - last-audit: leave unchanged (or null for new projects)
- git checkout -b brief/<project-name>
  (multi-source: brief/batch-YYYY-MM-DD)
- git add <project>/ _concepts/ _index.md
  Include any existing project overviews updated with reciprocal connections.
  The _concepts/ glob catches registry, connections, and vault-state.
- git commit -m "feat(<project>): add educational brief"
- git push -u origin brief/<project-name>
- gh pr create --title "feat(<project>): add educational brief" --body "..." --fill
- gh pr merge --squash --delete-branch
- git checkout main && git pull
- state.yaml: state -> COMMITTED, populate branch/sha/pr

State transition: COMMITTED -> COMPLETE
Present to the user:
- ~/.claude/educator-briefs/)
- /educator-audit <project-name> to
  validate links, registry consistency, and section quality. The /tmp/
  registers are still available for debugging if issues are found."

Update state.yaml: state -> COMPLETE
Do NOT clean up /tmp/educator-<name>/ automatically after Phase 5.
The registers in /tmp/ hold structured state (gather, quality, sections)
that is useful for debugging if the follow-up audit discovers issues.
- /educator-audit (or explicitly declines), then
  clean up: rm -rf /tmp/educator-<name> and check exit code
- /tmp/educator-<name>/ still exists and will be cleaned on reboot
- /tmp/ without the user knowing

Use one WebFetch call on the main URL. Look for GitHub/GitLab/Bitbucket links in
footer, header, meta tags. Try https://github.com/<domain-parts> for obvious cases.
Clone as GitHub source. Run full analysis. Note discovery in overview:
**Source:** [owner/repo](url) *(discovered via [domain](url))*
Use WebFetch on main URL + 2-3 subpages. Analyze observable tech choices.
Skip evolution.md and testing-strategy.md. Mark as "external observation only."
Omit code examples and relevant files headers. Compensate with deeper analytical
prose, Mermaid diagrams, and external resource links.
- my-app)
- owner--repo (e.g., expressjs--express)
- stripe.com)
- fastify)

If subfolder exists, ask user: overwrite or timestamped version?
<project-name>/
+-- _<project-name>_overview.md
+-- architecture.md
+-- technology-choices.md
+-- design-patterns.md
+-- key-decisions.md
+-- gaps-vulnerabilities.md
+-- dependencies.md
+-- evolution.md
+-- testing-strategy.md
+-- if-starting-over.md
+-- learning-path.md
+-- glossary.md
+-- resources.md
On invocation, before source detection:
- /tmp/educator-*/state.yaml files
- COMPLETE, offer to resume:
  "Found in-progress analysis for <source> (state: <state>). Resume?"

- _<project-name>_overview.md -- never _overview.md or overview.md.
- dependency-injection.md not Dependency-Injection.md. #1 cause of orphaned links.
- Write may be denied for
  vault paths. Use cat > "$file" << 'CONTENT' as primary write method.
  Edit works for modifications to existing files.
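
The lowercase, hyphenated filename form for concept pages (dependency-injection.md, not Dependency-Injection.md) can be enforced with a small normalizer. This is a minimal sketch: the function name is hypothetical, and collapsing every run of non-alphanumeric characters into a single hyphen is an assumption about how names with spaces or punctuation should be handled.

```python
import re


def concept_filename(name: str) -> str:
    """Normalize a concept name to a lowercase, hyphenated .md filename."""
    # Assumption: any run of characters outside a-z and 0-9 becomes one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    return f"{slug}.md"
```

Using the same function for both the filename and the wikilink target keeps the two from drifting apart.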