Extract memory candidates from past conversation sessions and let the user choose what to save
Scan past Claude Code sessions for this project, extract memory-worthy information, and present candidates for the user to choose from.
/distill-sessions # today's sessions (default)
/distill-sessions --all # all sessions for this project
/distill-sessions --today # today's sessions only
/distill-sessions --session <id> # specific session by ID
Before doing anything else, run the setup script:
bash ~/.claude/skills/distill-sessions/scripts/setup-hook.sh
This script will check whether ~/.claude/settings.json already has the memory-gate hook registered, and register it if not.

If the script reports "status":"newly_registered":
Display this message:
"Memory-gate hook has been newly registered in settings.json. A session restart is required. Please exit with
/exit, then relaunch and re-run/distill-sessions. (Without the hook, memory writes cannot be validated.)"
Then STOP. Do not continue.
If the script reports "status":"already_registered", proceed normally.
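The status check can be scripted; a minimal sketch, assuming the setup script prints a single JSON object with a status field on stdout:

```shell
# Run setup and branch on the reported status (assumes output of the
# form {"status": "..."}).
STATUS=$(bash ~/.claude/skills/distill-sessions/scripts/setup-hook.sh \
  | python3 -c 'import json, sys; print(json.load(sys.stdin)["status"])')

if [[ "$STATUS" == "newly_registered" ]]; then
  echo "Hook newly registered; restart the session before continuing." >&2
  exit 0
fi
```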
Find session files for the current project directory.
# Project session directory pattern:
# ~/.claude/projects/{encoded-cwd}/*.jsonl
PROJECT_DIR=$(echo "$PWD" | sed 's|/|-|g; s|^-||')
SESSION_DIR="$HOME/.claude/projects/-${PROJECT_DIR}"
List all .jsonl files in that directory (excluding /subagents/).
Also cross-reference with ~/.claude/sessions/*.json to get metadata (pid, startedAt, name).
Filtering:
- --today (default): only sessions from today
- --all: all sessions found
- --session <id>: match the specific session ID

If no sessions are found, inform the user and stop.
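The discovery step and the default today-only filter can be sketched as follows (the find -newermt predicate is one way to do the date cut; -maxdepth 1 keeps subagent transcripts out):

```shell
# Resolve this project's session directory and list today's .jsonl files,
# skipping the subagents/ subdirectory.
PROJECT_DIR=$(echo "$PWD" | sed 's|/|-|g; s|^-||')
SESSION_DIR="$HOME/.claude/projects/-${PROJECT_DIR}"

SESSIONS=$(find "$SESSION_DIR" -maxdepth 1 -name '*.jsonl' \
  -newermt "$(date +%Y-%m-%d)" 2>/dev/null)

if [[ -z "$SESSIONS" ]]; then
  echo "No sessions found for this project today." >&2
  exit 0
fi
```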
For each discovered session, try the fast --resume path first, and fall
back to lossless chunking if the session is too large to load.
Pass 1 — --resume (fast path). Works for small/medium sessions where
the entire conversation fits in the model's context window. If this fails
with Prompt is too long (or similar size error), proceed to Pass 2.
# Pass 1: try --resume with sonnet 5 attempts → haiku 5 attempts on transient errors
MODEL="sonnet"
MAX_RETRIES=5
RESULT=""
TOO_LONG=0
for attempt in $(seq 1 $MAX_RETRIES); do
RESULT=$(claude -p --resume <session-id> \
--permission-mode default \
--allowedTools "Read Grep Glob" \
--model $MODEL \
"<extraction prompt below>" 2>&1) && break
if [[ "$RESULT" == *"Prompt is too long"* ]]; then
TOO_LONG=1
break
fi
echo "sonnet attempt $attempt failed, retrying..." >&2
sleep 2
done
if [[ $TOO_LONG -eq 0 ]] && [[ -z "$RESULT" || "$RESULT" == *"overloaded"* || "$RESULT" == *"Error"* ]]; then
MODEL="haiku"
for attempt in $(seq 1 $MAX_RETRIES); do
RESULT=$(claude -p --resume <session-id> \
--permission-mode default \
--allowedTools "Read Grep Glob" \
--model $MODEL \
"<extraction prompt below>" 2>&1) && break
if [[ "$RESULT" == *"Prompt is too long"* ]]; then
TOO_LONG=1
break
fi
echo "haiku attempt $attempt failed, retrying..." >&2
sleep 2
done
fi
Pass 2 — chunking (lossless fallback). When TOO_LONG=1, chunk the
raw .jsonl losslessly and extract per chunk with a sliding-window summary:
if [[ $TOO_LONG -eq 1 ]]; then
  # 1) Chunk the session (lossless: keeps everything except file-history-snapshot,
  #    splits oversized single messages with [LARGE MESSAGE k/N] markers).
  CHUNKS_DIR=$(mktemp -d)
  python3 ~/.claude/skills/distill-sessions/scripts/chunk_and_extract.py \
    "$SESSION_JSONL" --out-dir "$CHUNKS_DIR" --max-chars 80000 --overlap 2

  # 2) Extract candidates per chunk with cumulative summary.
  RESULT=$(bash ~/.claude/skills/distill-sessions/scripts/extract-from-chunks.sh \
    "$CHUNKS_DIR" sonnet)
  rm -rf "$CHUNKS_DIR"
fi
The chunker enforces lossless splitting: every message is kept except file-history-snapshot entries, and oversized single messages are split into [LARGE MESSAGE k/N] parts (no truncation).

extract-from-chunks.sh calls claude -p per chunk in order, prepending a
running 2K-char summary of prior chunks so the model keeps cross-chunk
continuity without re-reading earlier content. Output is a merged JSON
array of candidates from all chunks.
Extraction prompt:
Analyze this conversation and extract ONLY information worth remembering
for future sessions. Focus on:
1. **user**: Role, preferences, knowledge level, work style
2. **feedback**: Corrections ("don't do X"), confirmations ("yes, exactly like that")
3. **project**: Non-obvious context about goals, deadlines, decisions, stakeholders
4. **reference**: Pointers to external systems (Linear projects, Slack channels, dashboards)
Do NOT extract:
- Code changes, file paths, or architecture (derivable from code)
- Git history or debugging solutions (derivable from git)
- Anything already in CLAUDE.md
- Ephemeral task details
For each candidate, output as JSON array:
[
{
"type": "user|feedback|project|reference",
"title": "short title",
"content": "the memory content",
"why": "why this is worth remembering"
}
]
If nothing is worth remembering, return: []
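Model output can be malformed, so it's worth validating each per-session result against the expected shape before merging. A minimal sketch, not part of the skill's scripts, assuming $RESULT holds one session's output:

```shell
# Fail unless stdin is a JSON array of objects with exactly the keys
# type/title/content/why.
validate_candidates() {
  python3 -c '
import json, sys
data = json.load(sys.stdin)
assert isinstance(data, list)
for c in data:
    assert set(c) == {"type", "title", "content", "why"}, c
'
}

if ! echo "$RESULT" | validate_candidates 2>/dev/null; then
  echo "Skipping session: malformed extraction output" >&2
fi
```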
Run sessions in parallel where possible (up to 3 concurrent).
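The concurrency cap can be sketched with background jobs; extract_session here is a hypothetical wrapper around the Pass 1/Pass 2 logic above:

```shell
# Run at most 3 extractions at a time; each session's candidates land
# in their own results file.
MAX_JOBS=3
mkdir -p results
for sid in "${SESSION_IDS[@]}"; do
  # Wait until a job slot frees up before launching the next one.
  while (( $(jobs -rp | wc -l) >= MAX_JOBS )); do
    sleep 1
  done
  extract_session "$sid" > "results/${sid}.json" &
done
wait  # block until every background extraction finishes
```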
Chunked extraction (Pass 2) often produces many near-duplicate candidates because each chunk is processed independently and the same fact may surface across several chunks. Run an LLM-based dedup pass to collapse them before presenting to the user:
# Write all candidates from every session into a single JSON array file,
# then run the dedup pass. The script returns only the merged result and
# never calls the LLM when the input is empty.
ALL_CANDS=$(mktemp --suffix=.json)
DEDUPED=$(mktemp --suffix=.json)
echo "$COMBINED_CANDIDATES_JSON" > "$ALL_CANDS"
python3 ~/.claude/skills/distill-sessions/scripts/dedup_candidates.py \
  --input "$ALL_CANDS" \
  --output "$DEDUPED" \
  --model sonnet
RESULT=$(cat "$DEDUPED")
rm -f "$ALL_CANDS" "$DEDUPED"
After the LLM dedup pass, also drop candidates that duplicate information already saved in memory (memory/MEMORY.md and ~/.claude/CLAUDE.md) — this check stays on the Claude side, not the dedup script, because it requires reading local memory files.

The dedup script merges semantically duplicate candidates, keeps the richest phrasing for each field, and outputs a JSON array of objects with exactly the keys type, title, content, why. It's a pure post-processing step — the input file is never mutated.
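A crude literal pre-filter can catch exact repeats before the semantic check; this is only an illustrative sketch (the candidate title shown is hypothetical), since the real comparison is semantic and done by Claude after reading the memory files:

```shell
# Return success if the title string appears verbatim (case-insensitive)
# in any of the given memory files. Missing files are ignored.
already_saved() {
  local title=$1
  shift
  grep -qiF -- "$title" "$@" 2>/dev/null
}

if already_saved "prefers exhaustive search" memory/MEMORY.md ~/.claude/CLAUDE.md; then
  echo "candidate already saved; dropping" >&2
fi
```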
Display the merged candidate list, grouped by type, numbered for selection:
## Memory candidates from N sessions
### user
1. [title] — one-line summary
2. [title] — one-line summary
### feedback
3. [title] — one-line summary
4. [title] — one-line summary
### project
5. [title] — one-line summary
(N candidates total)
Then use AskUserQuestion to ask how the user wants to proceed.
If the user picks "Pick", ask them to list the numbers (e.g., "1, 3, 5").
CRITICAL: After the user selects candidates AND before any Write/Edit to memory, write an approval file so the memory-gate hook can verify user consent.
# Build approval file with selected filenames
# This file is checked by validate-memory.sh — without it, all memory writes are blocked.
cat > /tmp/memory-approved.json <<EOF
{
  "approved_at": "$(date -u +%Y-%m-%dT%H:%M:%SZ)",
  "session_id": "<current-session-id>",
  "files": [
    "feedback_exhaustive_search.md",
    "user_work_style.md"
  ]
}
EOF
The files array must contain the exact filenames (basename only) that will be written.
Also include "CLAUDE.md" if global saves are selected.
After all writes are complete, clean up:
rm -f /tmp/memory-approved.json
For each selected candidate, decide where to save:
- Global (~/.claude/CLAUDE.md) — if the candidate applies across all projects (e.g., general work style preferences, universal feedback)
- Project memory (memory/) — if the candidate is specific to this project (e.g., project decisions, project-specific conventions)

Use AskUserQuestion to confirm the split.

For global saves (~/.claude/CLAUDE.md):
For project memory saves:
Write .md files to the project memory directory with frontmatter:
---