Build and maintain a persistent markdown wiki that an LLM updates on the user's behalf, usually inside an Obsidian vault or git-tracked notes repo. Use when raw sources such as web articles, papers, meeting notes, transcripts, screenshots, or past analyses need to be turned into an interlinked knowledge base with immutable source files, LLM-written wiki pages, `index.md`, `log.md`, schema rules in `AGENTS.md` or `CLAUDE.md`, source summaries, query notes, and recurring lint passes. Triggers on: llm-wiki, personal wiki, obsidian wiki, research vault, knowledge base, source ingest, persistent notes, wiki maintenance, source summaries, query filing.
Keyword: llm-wiki · obsidian wiki · research vault · knowledge base
Use this skill when the user wants knowledge to accumulate as a maintained markdown artifact, not be rediscovered from raw files on every query.
llm-wiki turns Andrej Karpathy's gist into an operational workflow. The core pattern is simple: keep raw sources immutable, let the LLM own the wiki layer, and encode the maintenance contract in AGENTS.md or CLAUDE.md.
Use index.md and log.md as navigational primitives before adding heavier search infrastructure.
Create the wiki skeleton first:
bash scripts/bootstrap-vault.sh /path/to/vault
The bootstrap creates:
- raw/sources/ for immutable source markdown or copied files
- raw/assets/ for downloaded images and attachments
- wiki/sources/ for per-source summaries
- wiki/entities/ and wiki/concepts/ for durable synthesis pages
- wiki/queries/ and wiki/reports/ for filed answers and higher-value outputs
- index.md, log.md, and AGENTS.md
Do not skip the schema file. The schema is what makes the agent act like a disciplined maintainer instead of a generic assistant. See references/architecture.md and references/schema-playbook.md.
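For readers who want to see the layout as code, here is a minimal Python sketch of the same skeleton. It is not the contents of bootstrap-vault.sh: the function name `create_skeleton` and the one-line placeholder written into each root file are assumptions made for illustration. Only the directory and file names come from the list above.

```python
from pathlib import Path

# Layout taken from the bootstrap description; everything else is illustrative.
DIRECTORIES = [
    "raw/sources",
    "raw/assets",
    "wiki/sources",
    "wiki/entities",
    "wiki/concepts",
    "wiki/queries",
    "wiki/reports",
]
ROOT_FILES = ["index.md", "log.md", "AGENTS.md"]


def create_skeleton(vault: Path) -> list[Path]:
    """Create missing directories and root files; return the paths created."""
    created = []
    for rel in DIRECTORIES:
        path = vault / rel
        if not path.exists():
            path.mkdir(parents=True)
            created.append(path)
    for name in ROOT_FILES:
        path = vault / name
        if not path.exists():
            # Never overwrite: an existing schema may have been refined by hand.
            path.write_text(f"# {name}\n", encoding="utf-8")
            created.append(path)
    return created
```

Running it a second time on the same vault creates nothing, so it is safe to use as a quick check that no part of the skeleton has gone missing.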
Treat AGENTS.md or CLAUDE.md as the operating contract
The schema should encode the rules that stay true across sessions:
- raw/ is the source of truth and must stay immutable
- wiki/ plus index.md and log.md are LLM-owned working artifacts
- every ingest updates wiki/sources/, index.md, and log.md
- durable answers are filed in wiki/queries/ or wiki/reports/
Keep the schema short and enforceable. A small contract that the agent actually follows is better than a giant policy file nobody will reread. The starter schema from bootstrap-vault.sh is intentionally minimal; refine it with references/schema-playbook.md.
If the source is already local, place it in raw/sources/ and ask the agent to process it. If the source is a URL, use the Scrapling-powered helper:
bash scripts/ingest-url.sh /path/to/vault "https://example.com/article"
bash scripts/ingest-url.sh /path/to/vault "https://app.example.com/post" --mode fetch --wait-selector article
bash scripts/ingest-url.sh /path/to/vault "https://protected.example.com/post" --mode stealth --solve-cloudflare
Expected ingest touch points:
- raw/sources/
- wiki/sources/
- index.md
- log.md
Prefer one-source-at-a-time ingest when starting. It forces the human to inspect what the wiki changed, surface missing conventions, and refine the schema. Once the workflow is reliable, batching is fine. Use references/ingest-playbook.md for the operating checklist.
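The bookkeeping half of an ingest (the index.md and log.md touch points) can be sketched in a few lines of Python. The entry formats below are assumptions, not conventions defined by this skill: one bullet per page in index.md and one dated, append-only line per operation in log.md. The function name `record_ingest` and the `slug` parameter are hypothetical as well; whatever format you settle on belongs in AGENTS.md.

```python
from datetime import date
from pathlib import Path


def record_ingest(vault: Path, slug: str, title: str, day: date) -> None:
    """Append one index entry and one log entry for a newly ingested source."""
    # Assumed convention: one bullet per page, written as an Obsidian-style wiki link.
    index_line = f"- [[wiki/sources/{slug}]] {title}\n"
    # Assumed convention: log.md is append-only, one dated line per operation.
    log_line = (
        f"- {day.isoformat()} ingest: "
        f"raw/sources/{slug}.md -> wiki/sources/{slug}.md\n"
    )
    with (vault / "index.md").open("a", encoding="utf-8") as index:
        index.write(index_line)
    with (vault / "log.md").open("a", encoding="utf-8") as log:
        log.write(log_line)
```

Appending rather than rewriting keeps the diff for each ingest small, which makes the one-source-at-a-time review described above easier to do in git.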
When answering questions:
- Read index.md first
Create a reusable note stub for high-value answers:
bash scripts/new-query-note.sh /path/to/vault "How does Company A differ from Company B?" --question "How does Company A differ from Company B?"
bash scripts/new-query-note.sh /path/to/vault "Q2 product thesis" --section reports --citation "[[wiki/concepts/product-thesis]]"
Use wiki/queries/ for question-shaped outputs and wiki/reports/ for more durable synthesized artifacts such as memos, comparisons, or presentation backbones. More detail lives in references/query-and-filing.md.
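Reading index.md first can be as simple as a keyword scan. The sketch below is one crude way to do it, assuming index.md holds one page per line with an Obsidian-style `[[...]]` link; the function name `candidate_pages` and the "words longer than three characters" filter are illustrative choices, not part of the skill's scripts.

```python
import re

# Captures the link target of [[target]], [[target|alias]], or [[target#heading]].
WIKI_LINK = re.compile(r"\[\[([^\]|#]+)")


def candidate_pages(index_text: str, question: str) -> list[str]:
    """Return link targets from index lines that share a word with the question."""
    # Very short words are dropped as a rough substitute for a stopword list.
    terms = {w for w in re.findall(r"[a-z0-9]+", question.lower()) if len(w) > 3}
    hits: list[str] = []
    for line in index_text.splitlines():
        words = set(re.findall(r"[a-z0-9]+", line.lower()))
        if terms & words:
            for target in WIKI_LINK.findall(line):
                target = target.strip()
                if target not in hits:
                    hits.append(target)
    return hits
```

The result is only a reading list: the agent still opens the candidate pages and cites them in the filed answer.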
Run the local health check:
python3 scripts/lint-wiki.py /path/to/vault
python3 scripts/lint-wiki.py /path/to/vault --format json
This script focuses on structure, not truth. It checks required files and directories, broken wiki links, and orphan pages. Use the lint output as the starting point for a human-guided cleanup pass.
See references/maintenance-and-scaling.md for the higher-level lint checklist.
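To make "broken wiki links" and "orphan pages" concrete, here is a small Python sketch of both checks. It is not the implementation of lint-wiki.py. It assumes links are written as vault-relative paths without the .md extension (as in `[[wiki/concepts/product-thesis]]`), and it defines an orphan as a page under wiki/ that no other page links to. The name `lint_links` is hypothetical.

```python
import re
from pathlib import Path

# Captures the link target of [[target]], [[target|alias]], or [[target#heading]].
WIKI_LINK = re.compile(r"\[\[([^\]|#]+)")


def lint_links(vault: Path) -> tuple[list[tuple[str, str]], list[str]]:
    """Return (broken_links, orphans) for the markdown pages in a vault."""
    # Key each page by its vault-relative path without the .md suffix.
    pages = {
        p.relative_to(vault).with_suffix("").as_posix(): p
        for p in vault.rglob("*.md")
    }
    broken: list[tuple[str, str]] = []
    linked: set[str] = set()
    for name, path in sorted(pages.items()):
        for target in WIKI_LINK.findall(path.read_text(encoding="utf-8")):
            target = target.strip()
            if target == name:
                continue  # a self-link does not rescue a page from being an orphan
            if target in pages:
                linked.add(target)
            else:
                broken.append((name, target))
    # Assumption: only pages under wiki/ are expected to have inbound links.
    orphans = sorted(n for n in pages if n.startswith("wiki/") and n not in linked)
    return broken, orphans
```

The sketch ignores Obsidian's resolution of links by bare file name and treats embeds such as `![[image.png]]` as ordinary links, so expect false positives in vaults that use either.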
index.md plus log.md is enough for small-to-medium vaults. Add heavier tools later.
Do not force embeddings, MCP, or browser automation on day one. The point of this workflow is that a simple markdown repo already compounds knowledge surprisingly well.
bash scripts/bootstrap-vault.sh ~/vaults/company-research
bash scripts/ingest-url.sh ~/vaults/company-research "https://example.com/article"
bash scripts/ingest-url.sh ~/vaults/company-research "https://app.example.com/dashboard" --mode fetch --network-idle
bash scripts/new-query-note.sh ~/vaults/company-research "Why this market is consolidating" \
--question "Why is this market consolidating?" \
--citation "[[wiki/concepts/market-structure]]"
bash scripts/new-query-note.sh ~/vaults/company-research "Q3 diligence memo" --section reports
python3 scripts/lint-wiki.py ~/vaults/company-research
- Keep raw/ immutable. Corrections belong in wiki pages or follow-up source notes, not in rewritten raw captures.
- Update index.md and log.md on every ingest, query filing, and lint pass. If these drift, the whole workflow gets harder to navigate.
- Read index.md before doing expensive retrieval work. It is the simplest useful search system for moderate vault sizes.
- Refine AGENTS.md when you notice repeated drift or ambiguity.