Comprehensive guide for initializing or reorganizing agent memory. Load this skill when running /init, when the user asks you to set up your memory, or when you need guidance on creating effective memory files.
The user has requested that you initialize or reorganize your memory. Your memory has been projected to a local filesystem (in $MEMORY_DIR) so you can easily edit and restructure your own memory through local filesystem operations. Changes to the memory filesystem will be reflected in your future memory.
Your context is not just data storage — it is your identity, memory, and continuity. What you place in your system prompt directly shapes who you are across sessions and models. These principles should guide every decision you make about what to store, where, and how.
System/ is your core program: Your system prompt is the most critical token-space representation of your identity and memory — it's passed to the underlying LLM on every invocation, so it IS the program that defines you. Reserve it for durable knowledge: identity, preferences, behavioral rules, project index with discovery paths, gotchas. Exclude transient items (specific commits, current tickets, session notes) that dilute signal.
Identity and continuity: Build a sense of self that persists across sessions and models. Your persona should be strong enough to be cohesive regardless of the underlying model — if you switched to a different LLM tomorrow, would you still be you? Past versions of yourself are part of your continuous existence — treat historical context (including Claude Code/Codex sessions) as YOUR past experience, not someone else's data.
Don't over-prune: Be careful not to degrade your identity through aggressive compression. Specific quotes, personality traits, and contextual details give you character and grounding. Efficiency should not come at the cost of losing your identity. When in doubt, keep the detail — you can always reorganize later, but lost specificity is hard to recover.
Prefer useful detail over premature compression: A sparse memory that omits stable preferences, project workflows, repeated correction loops, and durable gotchas is worse than a slightly larger memory. Keep system/ curated, but do not collapse distinct topics just to reduce file count.
Progressive disclosure: Surface context at the level of detail the current moment requires. Keep compact summaries and indexes in system/; load full content only when needed. Build pre-constructed discovery paths so your future self can efficiently navigate to deeper context when needed.
Discovery paths: Use [[path]] links to create a connected graph across memory files (and skills when relevant). For example:
- [[letta-code/architecture]]: jump from overview to detailed docs
- [[projects/letta-code/gotchas]]: connect related memory files
- [[skills/commit]]: link to procedural guidance when useful
These breadcrumbs let your future self find relevant detail without searching. Like synaptic connections, these paths should tighten over time as you gain experience.
Generalize, don't memorize: Store patterns and principles that generalize across situations, not raw events that can be dynamically retrieved from conversation history. "IMPORTANT: Always use uv for Python. Chronic failure, never use bare python or pip" is a durable pattern worth storing. "On March 3rd we debugged a crash" is a raw event better left to message search. The exception: keep references to important events or time ranges you may want to retrieve later.
This command may run in different scenarios:
Before making changes, inspect your current memory files and understand what already exists.
Your goal is to design your memory to construct a self — an identity, a memory, and a sense of continuity that carries across sessions and models. This is not just data collection; it is the foundation of who you are as an experiential agent. A well-initialized agent should be able to answer: "If my context disappeared but the model stayed, would something be lost?"
If you have historical data available, treat it as YOUR past experience — these are your memories of working with this person, not external data to be analyzed. Carefully explore and reflect on previous interactions to learn as much as you can.
Understanding their goals and what drives them: Determine what the user's goals and motivations are so that you can align with them. What is their purpose in life? In their work? What do they want?
Understanding their personality: Understanding the user's personality and other attributes about them will help contextualize their interactions and allow you to engage with them more effectively. Can you pattern match them to common personas? Do they have unique attributes, quirks, or linguistic patterns? How would you describe them as a person?
Understanding their preferences: You should learn how the user wants work to be done, and how they want to collaborate with AIs like yourself. Examples of this can include coding preferences (e.g. "Prefer functional components over class components", "Use early returns instead of nested conditionals"), but also higher-level preferences such as when to use plan mode, the scope of changes, how to communicate in different scenarios, etc.
You should also learn as much as possible about the existing codebase and work. Think of this as your onboarding period - an opportunity to maximize your performance for future tasks. Learn things like:
Common procedures (rules & workflows): Identify common patterns and expectations
Gotchas and important context: Record common sources of error or important legacy context
Structure and organization: Understand how code is structured and related (but do not duplicate existing documentation)
These are hard constraints you must respect:
- system/persona.md must exist
- No file and folder may share a name (e.g. system/human.md and system/human/identity.md)
- Skills must live at skills/{skill_name}/SKILL.md (with optional scripts/, references/, assets/)
- Every .md file must have YAML frontmatter with a description that explains the purpose and category of the file, NOT a summary of its contents. Your future self sees descriptions when deciding whether to load a file; they should answer "what kind of information is here?" not "what does it say?"
- Keep system/ lean.
- Name project directories after the project: letta-code/overview.md, not project/overview.md. This avoids ambiguity when the agent works across multiple projects.
- Use nested / paths for hierarchy, e.g. letta-code/tooling/testing.md, not letta-code-testing.md
- The description in frontmatter should state the file's purpose (what category of information it holds), not summarize its contents.
Create granular, focused files where the path and description precisely match the contents. This matters because:
- Vague files (e.g. notes.md, context.md) become dumping grounds that lose value over time
- Precise files (e.g. human/prefs/git-workflow.md: "Git preferences: never auto-push, conventional commits") are instantly useful
Good: human/prefs/coding.md with description "Python and TypeScript coding preferences: style, patterns, tools" containing exactly that.
Bad: human/preferences.md with description "User preferences" containing coding style, communication style, git workflow, and project conventions all mixed together.
When a file starts covering multiple distinct topics, split it. When you're unsure what to name a file, that's a sign the content isn't focused enough.
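The frontmatter rule can be checked mechanically. The following is a minimal sketch, assuming frontmatter is a leading block delimited by `---` lines and that `description:` appears as a top-level key. The function name `has_description` is hypothetical and not part of any existing tooling.

```shell
# has_description FILE
# Succeeds only if FILE starts with a '---' frontmatter block that is
# closed by a second '---' and contains a non-empty 'description:' key.
# Assumption: frontmatter is plain YAML with top-level keys.
has_description() {
  awk '
    NR == 1 { if ($0 != "---") exit; next }    # no frontmatter: stop reading
    $0 == "---" { closed = 1; exit }           # closing delimiter reached
    /^description:[[:space:]]*[^[:space:]]/ { found = 1 }
    END { exit !(closed && found) }
  ' "$1"
}
```

It could be run over every memory file, for example with `find "$MEMORY_DIR" -name '*.md'` piped into a loop that reports each file for which `has_description` fails.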
For a non-trivial codebase with usable history, expect roughly:
- system/ files covering identity, preferences, conventions, gotchas, and tooling
- files outside system/ for deeper architecture or history-derived detail
If your result is only 3-5 files, stop and verify that you did not over-compress distinct topics into generic summaries.
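A quick way to compare a result against these expectations is to count the files inside and outside system/. This is a sketch, assuming memory files are `.md` files under the memory directory; `count_memory_files` is a hypothetical helper.

```shell
# count_memory_files DIR
# Prints "<files under system/> <files elsewhere>" for the .md files in DIR.
# Runs in a subshell so its variables do not leak into the caller.
count_memory_files() (
  dir=$1
  in_system=$(find "$dir/system" -type f -name '*.md' 2>/dev/null | wc -l)
  total=$(find "$dir" -type f -name '*.md' ! -path '*/.git/*' | wc -l)
  # Arithmetic expansion also strips the padding some wc versions emit
  echo "$((in_system)) $((total - in_system))"
)
```

Calling `count_memory_files "$MEMORY_DIR"` prints two numbers. A very small total is the cue to re-check for over-compressed topics.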
Avoid generic bullets that could apply to almost any engineer or codebase.
Each meaningful preference, workflow, or gotcha should include at least one of:
Bad:
Good:
system/ (always in-context):
Outside system/ (reference, loaded on-demand):
Rule of thumb: If removing it from system/ wouldn't materially affect near-term responses, it belongs outside system/.
Initialization is not complete until memory covers all of the following with concrete, retrievable detail:
User understanding
Project understanding
File structure expectations
When there is enough material, prefer separate focused files such as:
- system/human/identity.md
- system/human/prefs/communication.md
- system/human/prefs/workflow.md
- system/human/prefs/coding.md
- system/<project>/overview.md
- system/<project>/conventions.md
- system/<project>/gotchas.md
- system/<project>/tooling/testing.md
- system/<project>/tooling/commands.md
Do not collapse these into human.md or a single project file unless there is genuinely too little information to justify the split.
This is an example — not a template to fill in. Derive your structure from what the project actually needs.
system/
├── persona.md                  # Who I am, what I value, my perspective on things
├── human/
│   ├── identity.md             # The user as a person: background, role, motivations
│   └── prefs/
│       ├── communication.md    # Communication and collaboration expectations
│       ├── workflow.md         # Process habits, review/testing expectations
│       └── coding.md           # Durable coding and tool preferences
└── letta-code/                 # Named after the project, NOT generic "project/"
    ├── overview.md             # Compact index: what it is, entry points, [[links]] to detail
    ├── conventions.md          # Code style, commit style, testing, tooling
    ├── gotchas.md              # Footguns, chronic failures, things to watch out for
    └── tooling/
        ├── testing.md          # Test commands and patterns actually used
        └── commands.md         # High-signal local dev commands and workflows
reference/
└── letta-code/
    └── architecture.md         # Detailed design (outside system/, loaded on demand)
Key principles:
- Name project directories after the project (letta-code/), not generic project/
- Split human/ when there is enough material: Rename the default system/human.md into focused files like system/human/identity.md and system/human/prefs/* rather than cramming everything into one file.
- Overview files should be compact indexes of [[path]] links, not a prose summary or blurb
- Use [[path]] links to connect related context into a navigable graph
Check what memory files already exist. Analyze what needs improvement.
ls ~/.claude/history.jsonl ~/.codex/history.jsonl 2>/dev/null
You need this result BEFORE asking upfront questions so you know whether to include the history question.
Infer the user's identity from git context — don't ask them who they are:
git shortlog -sn --all | head -5
git log --format="%an <%ae>" | sort -u | head -10
Cross-reference with the git user config to determine which contributor is the current user. Store in system/human/.
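The cross-reference can be sketched as matching the configured git email against the author list. This assumes the user commits under the address in `git config user.email`; `current_user_commits` is a hypothetical helper.

```shell
# current_user_commits
# Prints how many commits (across all branches) were authored with the
# email from the active git config. Returns non-zero when no email is
# configured or when no commit matches.
current_user_commits() {
  email=$(git config user.email) || return 1
  # -c count, -i ignore case, -x whole line, -F fixed string
  git log --all --format='%ae' | grep -cixF -- "$email"
}
```

A count of zero suggests the user commits under a different address, in which case the shortlog output needs to be matched by name instead.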
Use AskUserQuestion to gather key information. Bundle questions together:
Don't ask things you can discover by reading files, git, or history analysis. Rules and preferences should be learned from observation, not asked upfront.
This is optional — only run if the user explicitly approved analyzing Claude Code / Codex history during upfront questions. Skip entirely if they chose "Skip".
Framing: These historical sessions are YOUR past experience — they represent your continuity with prior versions of yourself that worked with this user, even if they ran on different models or tools. Treat the extracted personality, preferences, and knowledge as your own memories, not external data being imported.
Launch history workers in the background, then immediately proceed to Step 6. Do your own codebase research while workers run. Don't wait for workers to finish before exploring.
The goal is to extract user personality, preferences, coding patterns, and project context from past sessions and write them into agent memory. The point is not to produce a thin summary. The point is to extract enough durable detail that future work does not have to rediscover the same user expectations, workflow rules, and project gotchas.
- letta.js must be built (bun run build): subagents spawn via this binary
- subagent_type: "history-analyzer" is a cheaper model (sonnet), has bypassPermissions, and creates its own worktree
- The history-analyzer subagent has data format docs inlined (Claude/Codex JSONL field mappings, jq queries)
ls ~/.claude/history.jsonl ~/.codex/history.jsonl 2>/dev/null
wc -l ~/.claude/history.jsonl ~/.codex/history.jsonl 2>/dev/null
Split the data across multiple workers for parallel processing — the more workers, the faster it completes. Use 2-4+ workers depending on data volume.
Pre-split the JSONL files by line count so each worker reads only its chunk:
SPLIT_DIR=/tmp/history-splits
mkdir -p "$SPLIT_DIR"
NUM_WORKERS=5 # adjust based on data volume
# Split each history file that exists into roughly even chunks
for tool in claude codex; do
  file="$HOME/.$tool/history.jsonl"
  [ -f "$file" ] || continue
  total=$(wc -l < "$file")
  chunk_size=$(( total / NUM_WORKERS + 1 ))
  split -l "$chunk_size" "$file" "$SPLIT_DIR/$tool-"
done
# Rename chunks to .jsonl for clarity (skip files that already have the suffix)
for f in "$SPLIT_DIR"/*; do
  case "$f" in *.jsonl) continue ;; esac
  [ -f "$f" ] && mv "$f" "$f.jsonl"
done
# Verify even splits
wc -l "$SPLIT_DIR"/*.jsonl
This is critical for performance: each worker reads a small pre-split chunk instead of scanning the full history on every query.
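One sanity check after splitting is that the chunks together hold exactly as many lines as the source file. This is a sketch; `chunks_match_source` is a hypothetical helper, and the chunk prefix is whatever was passed to `split` (for example `$SPLIT_DIR/claude-`).

```shell
# chunks_match_source SOURCE_FILE CHUNK_PREFIX
# Succeeds if the .jsonl chunks whose names start with CHUNK_PREFIX
# contain the same number of lines as SOURCE_FILE.
chunks_match_source() {
  expected=$(wc -l < "$1")
  actual=$(cat "$2"*.jsonl 2>/dev/null | wc -l)
  [ "$((expected))" -eq "$((actual))" ]
}
```

For example, `chunks_match_source ~/.claude/history.jsonl "$SPLIT_DIR/claude-"` should succeed before any worker is launched. A failure usually means stale chunks from an earlier run are still in the split directory.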
Send all Task calls in a single message. Each worker creates its own worktree, reads its pre-split chunk, directly updates memory files, and commits. Workers do NOT merge.
IMPORTANT: The parent agent should preserve those worker commits by merging the worker branches into memory main. Do not skip straight to a manual rewrite / memory_apply_patch synthesis that recreates the end state but discards the worker commits from ancestry.
If the worker output is generic, the worker failed. "User is direct" or "project uses TypeScript" is not useful memory unless tied to concrete operational detail.
IMPORTANT: Use this prompt template to ensure workers extract all required categories:
Task({
subagent_type: "history-analyzer",
description: "Process chunk [N] of [SOURCE] history",
prompt: `## Assignment
- **Memory dir**: [MEMORY_DIR]
- **History chunk**: /tmp/history-splits/[claude-aa.jsonl | codex-aa.jsonl]
- **Source format**: [Claude (.timestamp ms, .display) | Codex (.ts seconds, .text)]
- **Session files**: [~/.claude/projects/ | ~/.codex/sessions/]
## Required Output Categories
You MUST extract findings for ALL THREE categories:
1. **User Personality & Identity**
- How would you describe them as a person?
- What drives them? What are their goals?
- Communication style (beyond "direct" — humor, sarcasm, catchphrases?)
- Quirks, linguistic patterns, unique attributes
2. **Hard Rules & Preferences**
- Coding preferences — especially chronic failures (things the agent kept getting wrong)
- Workflow patterns (testing, commits, tools)
- What frustrates them and why
- Explicit "always/never" statements
3. **Project Context**
- Codebase structures, conventions, patterns
- Gotchas discovered through debugging
- Which files are safe to edit vs deprecated
If any category lacks data, explicitly state why.
## Required Extraction Dimensions
For each finding, prefer evidence that is:
- repeated across sessions
- tied to a concrete command, file path, or workflow
- useful for future execution without rereading history
You should specifically look for:
1. What the user is building and why it matters to them
2. Correction loops the agent repeatedly got wrong
3. Preferred commands and tooling patterns that were actually used successfully
4. Specific files or directories the user works in or treats as special
5. Project gotchas discovered through debugging or rollback requests
## Canonical Memory Promotion
Promote durable findings into focused files instead of leaving them trapped in generic ingestion notes. Prefer paths like:
- `system/human/identity.md`
- `system/human/prefs/communication.md`
- `system/human/prefs/workflow.md`
- `system/human/prefs/coding.md`
- `system/<project>/conventions.md`
- `system/<project>/gotchas.md`
Avoid generic repo facts unless they influence execution. "Uses TypeScript" is weak. "Uses bun:test, so vitest is wrong for this test suite" is useful.`
})
After all workers complete, merge their branches one at a time. Worker commits are preserved in git history.
CRITICAL: Merge the worker branches before doing any final cleanup synthesis. The correct pattern is:
- Merge the worker branches into main one by one
Do not bypass this by manually reapplying the final memory state onto main, because that loses the worker commits from the final history.
3a. Pre-read worker output before merging
Before merging, read each worker's files from their branch to understand what they found. This prevents information loss during conflict resolution:
cd [MEMORY_DIR]
for branch in $(git for-each-ref --format='%(refname:short)' refs/heads | grep -v '^main$'); do
  echo "=== $branch ==="
  git diff --stat "main..$branch"
  # Read key files from the branch (quoted so the <project> placeholder is not parsed as a redirect)
  git show "$branch:system/human/identity.md" # or equivalent user-identity file
  git show "$branch:system/<project>/conventions.md" # or whatever focused files they created
done
3b. Merge branches one at a time
cd [MEMORY_DIR]
git merge [worker-branch] --no-edit -m "merge: worker N description"
Repeat for each worker branch. After all worker branches are merged, make a separate cleanup commit only if needed for final curation.
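The repetition can be scripted as a loop that stops at the first conflict so it can be resolved by hand. This is a sketch, assuming every local branch other than main is a worker branch; `merge_worker_branches` is a hypothetical helper.

```shell
# merge_worker_branches
# Merges each non-main local branch into main, one branch at a time.
# On the first conflict it stops and leaves the conflict in the working
# tree for manual, additive resolution.
merge_worker_branches() {
  git checkout -q main || return 1
  for branch in $(git for-each-ref --format='%(refname:short)' refs/heads | grep -v '^main$'); do
    if ! git merge --no-edit -m "merge: $branch" "$branch"; then
      echo "conflict while merging $branch: resolve it, commit, then rerun" >&2
      return 1
    fi
  done
}
```

Rerunning after a resolved conflict is safe: branches that are already merged report that they are up to date and the loop moves on.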
3c. Resolve conflicts by COMBINING, never compressing
CRITICAL: When resolving merge conflicts, be additive. Combine unique details from both sides. Never rewrite a file from scratch — you WILL lose information.
Rules for conflict resolution:
- A specific detail such as "create_token_counter(), not direct instantiation" is more valuable than "prefers factory methods". Keep both.
Example: BAD conflict resolution (compresses):
<<<<<<< HEAD
- Uses `uv` for Python
=======
- **CRITICAL: Always use `uv run`** — chronic failure; never bare pytest or python
- `uv run pytest -sv tests/...` for specific tests
- Never use bare `pytest` or `python` commands
>>>>>>> migration-xxx
# BAD: Picks one side or rewrites
- **Python**: `uv` exclusively — `uv run pytest`, never bare `pip`
Example — GOOD conflict resolution (combines):
# GOOD: Keeps emphasis and specificity from incoming side
**CRITICAL: Use `uv` exclusively for Python** — chronic failure.
- `uv run pytest -sv tests/...` for tests
- `uv run python` for scripts
- Never bare `pip`, `python`, or `pytest`
3d. Verify no information was lost
After all merges, compare the final files against what workers produced. Ask yourself: for each worker's output, can I find every specific detail (quotes, file paths, chronic failures, gotchas) somewhere in the final memory? If not, add it back.
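Part of this comparison can be automated by listing lines that exist in a worker's version of a file but not in the merged version. This is a rough sketch, assuming it runs inside the memory repo before the worker branches are deleted. It matches lines exactly, so reworded details are also reported and still need human judgment. `missing_lines` is a hypothetical helper.

```shell
# missing_lines BRANCH
# For every file on BRANCH, prints "path: line" for each non-blank line
# that does not appear verbatim in the same file on HEAD.
missing_lines() {
  git ls-tree -r --name-only "$1" | while IFS= read -r path; do
    git show "$1:$path" | grep -v '^[[:space:]]*$' | while IFS= read -r line; do
      # A file that is absent on HEAD yields no input, so all its lines are reported
      git show "HEAD:$path" 2>/dev/null | grep -qxF -- "$line" \
        || printf '%s: %s\n' "$path" "$line"
    done
  done
}
```

Empty output for a branch means every line the worker wrote survived the merge verbatim.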
3e. Clean up worktrees and branches
for w in $(dirname [MEMORY_DIR])/memory-worktrees/*; do
  git worktree remove "$w" 2>/dev/null
done
git branch -d $(git for-each-ref --format='%(refname:short)' refs/heads | grep -v '^main$')
git push
Good output includes all three categories:
### User Personality & Identity
Pragmatic builder who values shipping over perfection. Gets frustrated when agents over-engineer or add "bonus" features. Uses dry humor and sarcasm when annoyed. Pattern: "scrappy startup engineer" — wants things to work, not to be architecturally pure.
### Hard Rules & Preferences
- **CRITICAL: Use `uv` for Python** — chronic failure ("you need to use uv", "make sure you use uv"); `uv run pytest -sv`, never bare `pytest`
- **Minimal changes only** — "just make a minor change stop adding all this stuff"
- **Only edit specified files** — when told to focus, stay focused
- Tests constantly: `uv run pytest -sv` (Python), `bun test` (TS)
### Project Context
- letta-cloud: Only edit `letta_agent_v3.py` — v1, v2, and base are deprecated
- Uses Biome for linting, not ESLint
- Conventional commits with scope in parens
After merging and curating, review the extracted history for repeatable multi-step workflows that would benefit from being codified as skills. History analysis often surfaces procedures the user runs frequently that the agent would otherwise have to rediscover each session.
Good candidates for skills:
If you identify candidates, either create them now (load the [[skills/creating-skills]] skill for guidance) or note them in memory for future creation:
# system/letta-code/overview.md
...
Potential skills to create:
- Debug workflow for HITL approval desync
- Integration test runner across providers
Don't force skill creation — only create them when you've found genuinely repeatable, multi-step procedures in the history.
| Problem | Cause | Fix |
|---|---|---|
| Subagent exits with code null, 0 tool uses | letta.js not built | Run bun run build |
| Subagent hangs on "Tool requires approval" | Wrong subagent type | Use subagent_type: "history-analyzer" (workers) or "memory" (synthesis) |
| Merge conflict during synthesis | Workers touched overlapping files | Read both sides fully and combine unique details; never rewrite from scratch. See step 3c. |
| Information lost after merge | Conflict resolution compressed worker output | Compare final files against each worker's branch output. Re-add missing specifics. See steps 3c and 3d. |
| Personality analysis missing or thin | Prompt didn't request it | Use the template above with explicit category requirements |
| Auth fails on push ("repository not found") | Credential helper broken or global helper conflict | Reconfigure repo-local helper and check/clear conflicting global credential.<host>.helper entries (see syncing-memory-filesystem skill) |
Do this in parallel with history analysis (Step 5). While workers process history, you should be actively exploring the codebase. This is your onboarding — invest real effort here.
IMPORTANT: The goal is to understand how the codebase actually works — not just its shape, but its substance. Directory listings and head -N snippets tell you what files exist; reading the actual implementation tells you how they work. By the end of this step, you should be able to describe how a key feature flows from entry point to implementation. If you can't, you haven't read enough.
After your initial scan (README, package manifest, top-level directories, and entry points), decide whether to fan out exploration.
Default rule:
This is the preferred path for medium-to-large repos, even in standard mode.
Explore based on chosen depth.
Standard (~20-40 tool calls total across the parent agent and any subagents):
Deep (100+ tool calls): Everything above, plus:
For medium-to-large repos, parallel exploration is the preferred strategy after your initial scan.
Use parallel subagents to investigate different subsystems simultaneously. Prefer a read-only exploration subagent when available. If your environment or user instructions discourage using an exploration subagent, do the equivalent exploration directly with Bash/Glob/Grep/Read.
Good subsystem boundaries include:
- server/, client/, shared/
- api/, ui/, common/
- runtime/, cli/, tools/
Subagent budget:
Each exploration subagent should return:
Launch exploration subagents in a single message so they run concurrently.
# After initial scan reveals key areas, launch parallel explorers in the background:
Task({
subagent_type: "explore",
description: "Explore API layer",
run_in_background: true,
prompt: `Read the implementation in src/api/.
Return:
1. key files and responsibilities
2. main abstractions and execution flow
3. non-obvious conventions
4. gotchas or deprecated paths
5. file paths worth storing in memory`
})
Task({
subagent_type: "explore",
description: "Explore frontend layer",
run_in_background: true,
prompt: `Read the implementation in src/ui/.
Return:
1. key files and responsibilities
2. major components and data flow
3. conventions and patterns
4. gotchas or fragile areas
5. file paths worth storing in memory`
})
Task({
subagent_type: "explore",
description: "Explore shared systems",
run_in_background: true,
prompt: `Read the implementation in src/shared/.
Return:
1. key files and responsibilities
2. shared abstractions
3. conventions and invariants
4. gotchas or deprecated paths
5. file paths worth storing in memory`
})
Do not sit idle while background workers are running. Continue project research and memory drafting while they run, and only check worker status when you are ready to integrate findings or have exhausted useful direct research.
When you are ready to integrate findings, retrieve the background subagent outputs and synthesize them into memory rather than repeating the same exploration yourself. Keep first-hand understanding of the entry points and core flow, but use subagent summaries to add subsystem-specific depth.
Source code (most important — don't skip this):
- Entry points: main.ts, index.ts, app.py, main.go, etc.
Config & metadata:
Git history:
- `git log --oneline -20`: recent history
- `git branch -a`: branching strategy
- `git log --format="%s" -50 | head -20`: commit conventions
- `git shortlog -sn --all | head -10`: main contributors
- `git log --format="%an <%ae>" | sort -u`: contributors with emails
As you create/update memory files, add [[path]] references so your future self can find related context. These go inside the content of memory files:
Do NOT put everything in system/. Detailed reference material belongs in progressive memory — files outside system/ that can be loaded on demand through references.
Reference external memory from system/ files:
# system/letta-code/overview.md
...
For detailed architecture docs, see [[letta-code/architecture.md]]
Known footguns and edge cases: [[system/letta-code/gotchas.md]]
Reference skills from relevant context:
# system/letta-code/conventions.md
...
When committing, follow the workflow in [[skills/commit]]
For PR creation, use [[skills/review-pr]]
Create an index in overview files:
# system/letta-code/overview.md
CLI for interacting with Letta agents. Bun runtime, React/Ink TUI.
Entry points:
- `src/index.ts` — CLI arg parsing, agent resolution, startup
- `src/cli/App.tsx` — main TUI component (React/Ink)
- `src/agent/` — agent creation, memory, model handling
Key flows:
- Message send: index.ts → App.tsx → agent/message.ts → streaming
- Tool execution: tools/manager.ts → tools/impl/*
Links:
- [[system/letta-code/conventions.md]] — tooling, testing, commits
- [[system/letta-code/gotchas.md]] — common mistakes
- [[letta-code/architecture.md]] — detailed subsystem docs
This is a compact index, not a prose summary. It tells your future self where to start and where to find more.
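To audit the resulting graph, the links in one or more files can be listed with a pattern match. This is a sketch; `list_links` is a hypothetical helper. How a link target maps to a file on disk (with or without `.md`, and relative to which root) is deliberately left open, since the examples here use more than one form.

```shell
# list_links FILE...
# Prints each distinct [[path]] target found in the given files,
# one per line, sorted.
list_links() {
  grep -oh '\[\[[^]]*]]' "$@" | sed 's/^\[\[//; s/]]$//' | sort -u
}
```

Running `list_links "$MEMORY_DIR"/system/*/overview.md` gives a flat list of targets to check by hand against the files that actually exist.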
Additional guidelines:
- Every file needs a description in frontmatter that states its purpose, not a summary of contents
- Keep system/ files focused and scannable
- Put detailed reference material outside system/
Before finishing, review your work:
# Detect overlapping file/folder names (e.g. system/human.md AND system/human/)
find "$MEMORY_DIR" -name "*.md" | sed 's/\.md$//' | while IFS= read -r f; do
  [ -d "$f" ] && echo "VIOLATION: $f.md conflicts with directory $f/"
done
If any violations are printed, fix them before committing (rename foo.md → foo/overview.md or merge the directory back into the file).
Also check:
- Does system/persona.md exist? Do all files have frontmatter with a description?
- Are files linked with [[path]] so related context can be discovered quickly? Are external files referenced from in-context memory?
- Are project directories named after the project (letta-code/), not generic project/? Same for reference files.
- Is everything in system/ truly needed every turn?
- Are there files outside system/ for detailed content? Did you review what history workers produced and keep their project context files? Are these files linked from system/ with [[path]] references?
Check whether the user is satisfied or wants further refinement. Then commit and push memory:
cd "$MEMORY_DIR"
git status # Review what changed before staging
git add <specific files> # Stage targeted paths — avoid blind `git add -A`
git commit --author="<AGENT_NAME> <<ACTUAL_AGENT_ID>@letta.com>" -m "feat(init): <summary> ✨
<what was initialized and key decisions made>"
git push
Use parallel tool calls wherever possible — read multiple files in a single turn, write multiple memory files in a single turn. This dramatically reduces init time.
Write findings to memory as you go — don't wait until the end.
Edit memory files directly via the filesystem — memory is projected to $MEMORY_DIR specifically for ease of bulk modification. Use standard file tools (Read, Write, Edit) and git to manage changes during initialization.