A CLI agent skill based on Karpathy's LLM Wiki: create and maintain a persistent, interconnected Markdown knowledge base by ingesting sources, enabling queries over compiled knowledge, and ensuring consistency through linting.
Continuously update and grow a persistent knowledge base composed of interlinked Markdown files. Humans curate the raw contents, and the agent uses this skill to collect, dedupe, cross-reference, and summarize them into structured Markdown files. This skill is activated when the user wants:
Below is a sample wiki directory structure:
llm-wiki/
├── SCHEMA.md # Layer 3: A document for User and LLM to co-evolve the wiki conventions, structure of wiki and tag taxonomy
├── index.md # Always exists, regardless of SCHEMA.md definition. Catalog of everything, organized by categories (entities, concepts, etc.), each page listed with a link and a one-line summary
├── log.md # Always exists, regardless of SCHEMA.md definition. Chronological action log (append-only, rotated yearly)
├── raw/ # Layer 1: Always exists, regardless of SCHEMA.md definition. Immutable content curated by humans
│ ├── documents/ # Web articles, clippings, PDFs
│ └── assets/ # Images, diagrams referenced by sources
├── generated/ # Layer 2: Always exists, regardless of SCHEMA.md definition. LLM-generated directories and markdown files
│ ├── entities/ # Always exists, regardless of SCHEMA.md definition. Entity pages (people, orgs, products, models)
│ ├── topics/ # Always exists, regardless of SCHEMA.md definition. Topic pages (concepts, terms)
│ ├── comparisons/ # Side-by-side analysis (between entities or between topics)
The three layers are explained in the next section. The wiki (the knowledge base) is built on the structure above. Because agents have scoped file-system permissions, this skill can ONLY be applied when the CLI agent (Codex, Claude Code, Gemini, etc.) is invoked at the llm-wiki/ root level, which means SCHEMA.md, index.md, log.md and the raw/ and generated/ folders must sit directly underneath. If the user invokes the CLI agent from inside any wiki subfolder, this skill aborts.
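The root-level requirement above can be checked before the skill does anything else. The following is a minimal sketch, not the skill's actual implementation; the function name `missing_wiki_entries` is hypothetical, and it assumes the required entries are exactly the three files and two folders named in the paragraph above.

```python
from pathlib import Path

# Entries the text above says must exist directly under the wiki root.
REQUIRED_FILES = ("SCHEMA.md", "index.md", "log.md")
REQUIRED_DIRS = ("raw", "generated")


def missing_wiki_entries(root: Path) -> list[str]:
    """Return the required files and folders that are absent under root."""
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    missing += [f"{name}/" for name in REQUIRED_DIRS if not (root / name).is_dir()]
    return missing
```

An agent could call `missing_wiki_entries(Path.cwd())` and abort with the returned names whenever the list is not empty.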
The index is sectioned by type. Each entry is one line: wikilink + summary.
# Wiki Index
> Format: `## Last Updated: [YYYY-MM-DDThh:mm:ss] | subject | Total pages: N`
> Subject: the term (entity/topic/comparison etc)
> Total pages: the number of pages in which the term is mentioned
## Entities
<!-- Alphabetical within section -->
## Topics
## Comparisons
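To illustrate the "one line per entry, alphabetical within section" rule of the index, here is a small sketch that renders one section. The function name `render_index_section` and the `- [[page]]: summary` separator are assumptions made for the example; the template above only requires a wikilink followed by a summary.

```python
def render_index_section(title: str, pages: dict[str, str]) -> str:
    """Render one index section: a heading plus one line per page.

    `pages` maps a page name (without .md) to its summary.
    """
    lines = [f"## {title}"]
    # Alphabetical within the section, ignoring case.
    for name in sorted(pages, key=str.lower):
        # Collapse whitespace so every entry stays on a single line.
        summary = " ".join(pages[name].split())
        lines.append(f"- [[{name}]]: {summary}")
    return "\n".join(lines)
```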
# Log File
> Format: `## [YYYY-MM-DDThh:mm:ss] action | subject | files`
> Actions: ingest, update, lint, archive, delete
> Subject: a summary of what happened, in at most 300 characters
> Files: related files, such as raw document locations and generated wiki pages
> When log.md exceeds 500 entries, rotate: rename to log-YYYY.md, start fresh.
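The log format and the rotation rule above can be sketched as two small helpers. This is an illustrative sketch under stated assumptions: the names `format_log_entry` and `needs_rotation` are hypothetical, files are joined with commas, and an "entry" is taken to be any line that starts with `## [`.

```python
from datetime import datetime

ACTIONS = {"ingest", "update", "lint", "archive", "delete"}
MAX_ENTRIES = 500  # rotation threshold from the log template


def format_log_entry(when: datetime, action: str, subject: str, files: list[str]) -> str:
    """Build one log line: ## [timestamp] action | subject | files."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    if len(subject) > 300:
        raise ValueError("subject must be at most 300 characters")
    stamp = when.strftime("%Y-%m-%dT%H:%M:%S")
    return f"## [{stamp}] {action} | {subject} | {', '.join(files)}"


def needs_rotation(log_text: str) -> bool:
    """True when the log holds more than MAX_ENTRIES entries."""
    entries = [line for line in log_text.splitlines() if line.startswith("## [")]
    return len(entries) > MAX_ENTRIES
```

When `needs_rotation` returns true, the agent would rename log.md to log-YYYY.md and start a fresh file, as described in the template.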
Adapt to the user's preference. The schema constrains agent behavior and ensures consistency:
# Wiki Schema
## Conventions
- Raw files can be broken down into several markdown files that live in `entities/`, `topics/` etc.
- Only create a wiki page when an entity/topic is mentioned in 2+ sources or is central to one source.
- File names: lowercase, hyphens, no spaces (e.g., `transformer-architecture.md`)
- Every wiki page starts with YAML frontmatter (see below)
- Use `[[wikilinks]]` to cross-link between pages (minimum 2 outbound links per page)
- Every new or updated page must link to at least 2 other pages via `[[wikilinks]]`.
- When updating a page, always bump the `updated` date
- When new information conflicts with existing content:
  1. Check the dates: newer sources generally supersede older ones
  2. If genuinely contradictory, note both positions with dates and sources
  3. Mark the contradiction in frontmatter: `contradictions: [page-name]`
  4. Flag for user review
## Frontmatter
> ---
> title: Page Title
> created: YYYY-MM-DDThh:mm:ss
> updated: YYYY-MM-DDThh:mm:ss
> type: entity | topic | comparison
> sources: [raw/documents/source-name.md]
> ---
## Layer 1 (User can specify, otherwise default to this file)
### Documents
Any single document, in any format, that the user puts in
### Assets
Any media files, images, video links etc.
## Layer 2 (User can specify, otherwise default to this file)
### Entities
One markdown page per notable entity. Include:
- Overview / what it is
- Key facts and dates
- Relationships to other entities ([[wikilinks]])
- Source references
### Topics
One markdown page per concept or topic. Include:
- Definition / explanation
- Current state of knowledge
- Open questions or debates
- Related concepts ([[wikilinks]])
### Comparison Pages
Side-by-side analysis in markdown. Include:
- What is being compared and why
- Dimensions of comparison (table format preferred)
- Verdict or synthesis
- Sources
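The file-name and frontmatter conventions in the schema above can be sketched as follows. Both helper names (`page_filename`, `render_frontmatter`) are hypothetical, and the sketch assumes that any run of characters other than lowercase letters and digits becomes a single hyphen.

```python
import re
from datetime import datetime

PAGE_TYPES = {"entity", "topic", "comparison"}
STAMP = "%Y-%m-%dT%H:%M:%S"


def page_filename(title: str) -> str:
    """Lowercase, hyphens, no spaces, e.g. transformer-architecture.md."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{slug}.md"


def render_frontmatter(title: str, page_type: str, sources: list[str],
                       created: datetime, updated: datetime) -> str:
    """Render the YAML frontmatter block that starts every wiki page."""
    if page_type not in PAGE_TYPES:
        raise ValueError(f"unknown page type: {page_type}")
    return "\n".join([
        "---",
        f"title: {title}",
        f"created: {created.strftime(STAMP)}",
        f"updated: {updated.strftime(STAMP)}",  # bump on every update
        f"type: {page_type}",
        f"sources: [{', '.join(sources)}]",
        "---",
    ])
```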
Three Layers
Layer 1 — Raw Contents: Immutable directory and files. The agent can read but can never modify them.
Layer 2 — The Wiki or knowledge base: Agent-owned directories and markdown files. Created, updated, and
cross-referenced by the agent.
Layer 3 — The Schema: SCHEMA.md defines the user's preferences for the llm-wiki/ conventions and tag taxonomy.
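The Layer 1 rule (the agent may read raw/ but never modify it) can be enforced with a guard in front of every write. This is a sketch only; the helper name `ensure_outside_raw` is hypothetical, and it assumes paths are given relative to the wiki root.

```python
from pathlib import Path


def ensure_outside_raw(wiki_root: Path, target: Path) -> Path:
    """Return the resolved write target, refusing anything under raw/."""
    raw_dir = (wiki_root / "raw").resolve()
    # Resolving first means "generated/../raw/x.md" is caught as well.
    resolved = (wiki_root / target).resolve()
    if resolved == raw_dir or raw_dir in resolved.parents:
        raise PermissionError(f"raw/ is immutable (Layer 1): {target}")
    return resolved
```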
- SCHEMA.md with user preferences based on 2 and 3.
- index.md based on the index.md template with sectioned header.
- log.md based on the log.md template with creation entry.

1. For raw/, use the command below; the output is a list of each file's metadata with its mtime.
   `python3 $HOME/.agents/skills/llm-wiki-skill/scripts/ingest.py --collect ./raw`
2. From log.md, retrieve the "date time", then use the command below to retrieve its mtime:
   `python3 $HOME/.agents/skills/llm-wiki-skill/scripts/ingest.py --iso-to-mtime "date time"`
   Then find the files listed in step 1 that were added after the "date time".
3. Based on the conventions section of SCHEMA.md, summarize each newly added file.
4. Find the files that were added on or before the "date time" to prepare for steps 5 and 6.
5. For file content conflicts, refer to the conventions section defined in SCHEMA.md.
6. Under generated/, create new files and update existing ones, adding cross-links in both cases.
7. Update index.md with sectioned header.
8. Update log.md with creation entry.
9. List all newly added or changed files for the user.
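The mtime comparison in the steps above amounts to splitting the collected files around a cutoff. The sketch below does not reproduce ingest.py, whose internals are not shown here; `timestamp_to_epoch` and `split_by_cutoff` are hypothetical names, and the log timestamp is assumed to be in local time.

```python
from datetime import datetime


def timestamp_to_epoch(stamp: str) -> float:
    """Convert a log timestamp (YYYY-MM-DDThh:mm:ss, local time) to epoch seconds."""
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S").timestamp()


def split_by_cutoff(files: dict[str, float], cutoff: float) -> tuple[list[str], list[str]]:
    """Split {path: mtime} into (added after cutoff, added on or before cutoff)."""
    new = sorted(path for path, mtime in files.items() if mtime > cutoff)
    existing = sorted(path for path, mtime in files.items() if mtime <= cutoff)
    return new, existing
```

The first list holds the files to summarize; the second holds the files to check for conflicts and updates.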
- index.md.
- SCHEMA.md using the format like "Based on [[page-1]], [[page-2]], ...".
- queries/ folder if that does not exist
- comparisons/, otherwise put in queries/
- index.md with sectioned header.
- log.md with creation entry.

Note: Below are the 4 rules we lint. After they are checked, add an entry in log.md with the lint action; the subject should be "Rules passed: N, rules failed: M, which rule failed and a brief summary within 200 characters".
1. SCHEMA.md first, find any violations and unresolved contradictions in the wiki.
2. `python3 $HOME/.agents/skills/llm-wiki-skill/scripts/links.py --orphan ./generated`
3. `python3 $HOME/.agents/skills/llm-wiki-skill/scripts/links.py --broken ./generated`
4. generated/ should be listed in index.md; flag any pages that are not listed.

|-------Your request has been taken care of!---------|
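The orphan, broken-link, and index-listing checks can be pictured with the sketch below. It does not reproduce links.py; all names are hypothetical, and it assumes an orphan is a page with no inbound wikilinks from other pages and that `[[page|alias]]` and `[[page#section]]` both point to `page`.

```python
import re

# Capture the target of [[target]], [[target|alias]] or [[target#section]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def extract_links(text: str) -> set[str]:
    """Return the set of wikilink targets found in text."""
    return {target.strip() for target in WIKILINK.findall(text)}


def lint_links(pages: dict[str, str]) -> tuple[list[str], list[tuple[str, str]]]:
    """Return (orphan pages, broken links as (source page, missing target))."""
    inbound = {name: 0 for name in pages}
    broken = []
    for name, text in pages.items():
        for target in extract_links(text):
            if target not in pages:
                broken.append((name, target))
            elif target != name:  # self-links do not count as inbound
                inbound[target] += 1
    orphans = sorted(name for name, count in inbound.items() if count == 0)
    return orphans, sorted(broken)


def unlisted_pages(pages: dict[str, str], index_text: str) -> list[str]:
    """Return pages that index.md does not link to."""
    return sorted(set(pages) - extract_links(index_text))
```

Here `pages` maps each page name (file name without `.md`) to its markdown text.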
- raw/: sources are immutable. Corrections go in wiki pages.
- log-YYYY.md and start fresh. The agent should check log size during lint.