Progressive memory recall with 4 scope layers and 3 depth layers. Scope: identity > project > room > deep. Depth: IDs only > summary > full content. 10-50x token savings through a fetch-on-confirmation pattern.
A progressive memory system with two orthogonal dimensions of lazy loading: scope (which layer of memory to consult) and depth (how much of each entry to fetch).
Combined savings: 10-50x tokens vs. eager loading.
Instead of loading full memory entries upfront, agents fetch at 3 depths:

- **Depth 1: IDs only** (~10 tokens per match). The agent decides which matches are worth investigating.
- **Depth 2: Summary** (~50 tokens per match). Room, type, and a preview (first 80 chars); the agent confirms relevance.
- **Depth 3: Full content** (~500+ tokens per match). Fetched only for confirmed matches.
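A minimal sketch of the three fetch depths. The `fetch` function, the `STORE` dict, and the field names are illustrative assumptions, not the real API; the per-depth payloads mirror the list above.

```python
# Hypothetical in-memory store; the real system would back this with a DB.
STORE = {
    "d-abc123": {"room": "authentication", "type": "decision",
                 "content": "Chose JWT with 15-minute access tokens and "
                            "rotating refresh tokens stored in httpOnly cookies."},
}

def fetch(ids, depth):
    """Return progressively more detail per entry as depth increases."""
    if depth == 1:                      # Depth 1: IDs only (~10 tokens each)
        return [{"id": i} for i in ids]
    if depth == 2:                      # Depth 2: summary (~50 tokens each)
        return [{"id": i,
                 "room": STORE[i]["room"],
                 "type": STORE[i]["type"],
                 "preview": STORE[i]["content"][:80]}
                for i in ids]
    return [{"id": i, **STORE[i]} for i in ids]   # Depth 3: full content

print(fetch(["d-abc123"], 2)[0]["preview"])
```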
Example flow:
1. Agent searches "auth refresh token"
2. Depth 1 returns 8 IDs: d-abc123, d-def456, ...
3. Agent requests Depth 2 for IDs 1-3
4. Sees room=authentication, type=decision, preview="Chose JWT..."
5. Agent confirms IDs 1,3 are relevant
6. Requests Depth 3 only for those 2 entries
7. Gets full content for ~1000 tokens instead of 4000+
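The arithmetic in step 7 can be checked with a small cost model. The per-depth token figures come from the depth list above; `progressive_cost` is a hypothetical helper for illustration.

```python
DEPTH_COST = {1: 10, 2: 50, 3: 500}  # approx tokens per match at each depth

def progressive_cost(n_matches, n_summarized, n_confirmed):
    """Tokens spent fetching IDs for all matches, summaries for a subset,
    and full content only for confirmed entries."""
    return (n_matches * DEPTH_COST[1]
            + n_summarized * DEPTH_COST[2]
            + n_confirmed * DEPTH_COST[3])

eager = 8 * DEPTH_COST[3]             # fetch all 8 matches at full depth
lazy = progressive_cost(8, 3, 2)      # the flow above: 8 IDs, 3 summaries, 2 full
print(eager, lazy)                    # 4000 vs 1230, i.e. "~1000 instead of 4000+"
```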
- **Layer 1: Identity** (always loaded, ~200 tokens). Who is the user? What are their preferences?
- **Layer 2: Critical Facts** (per-project, ~500 tokens). Hard constraints, active decisions, blockers.
- **Layer 3: Room Recall** (on-demand, ~1-2K tokens). Relevant memories for the current task domain.
- **Layer 4: Deep Search** (when needed, ~2-5K tokens). Full semantic search across all memories.
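The four scope layers can be encoded as data, which makes the worst-case budget easy to audit. This is a sketch: the `Layer` dataclass and field names are assumptions, and the budgets are the upper ends of the figures above.

```python
from dataclasses import dataclass

@dataclass
class Layer:
    name: str
    max_tokens: int     # upper end of the budget quoted above
    trigger: str

LAYERS = [
    Layer("identity",       200,  "session start"),
    Layer("critical_facts", 500,  "project detected"),
    Layer("room_recall",    2000, "task domain classified"),
    Layer("deep_search",    5000, "explicit query / insufficient context"),
]

print(sum(l.max_tokens for l in LAYERS))  # worst-case total: 7700, i.e. ~8K
```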
**Layer 1: Identity.** Loaded at every session start. Contains:
Source: `~/.claude/projects/*/memory/user_*.md`
**Layer 2: Critical Facts.** Loaded when entering a project directory. Contains:
Source: `~/.claude/projects/*/memory/project_*.md` + `thoughts/CONTEXT.md`
**Layer 3: Room Recall.** Loaded when a task domain is detected (auth, database, deploy, etc.). Contains:
Source: memory palace rooms + `mature-instincts.json`, filtered by domain
Trigger: intent classifier detects a domain (e.g., "fix the login bug" -> room: authentication)
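A stand-in for the intent classifier, reduced to keyword matching for illustration. The room-to-keyword map and `classify_room` are invented here; the real `intent-classifier` component presumably does something richer.

```python
# Hypothetical keyword sets per room; real classification would be smarter.
ROOM_KEYWORDS = {
    "authentication": {"login", "auth", "token", "jwt", "password"},
    "database": {"query", "schema", "migration", "index"},
    "deploy": {"deploy", "release", "rollout", "ci"},
}

def classify_room(prompt):
    """Map a user prompt to a memory room, or None if no domain matches."""
    words = set(prompt.lower().split())
    for room, keywords in ROOM_KEYWORDS.items():
        if words & keywords:
            return room
    return None

print(classify_room("fix the login bug"))  # -> authentication
```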
**Layer 4: Deep Search.** Loaded only when explicitly needed or when Layers 1-3 don't provide enough context. Contains:
Source: PostgreSQL vector search + palace cross-wing search
Trigger: agent queries explicitly, or the user asks "have we done this before?"
Session Start
-> Load Layer 1 (identity)
-> Detect project -> Load Layer 2 (facts)
-> User sends prompt
-> Classify intent/domain -> Load Layer 3 (room)
-> If insufficient context -> Load Layer 4 (deep)
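The session flow above can be sketched as a driver function. Every function here (`load_identity`, `classify_intent`, `sufficient`, etc.) is a hypothetical stub returning `(label, token_count)` pairs, just to show the conditional loading order.

```python
# Stubs standing in for the real loaders; labels and budgets are invented.
def load_identity(): return ("identity", 200)
def load_project_facts(p): return (f"facts:{p}", 500)
def classify_intent(prompt): return "authentication" if "login" in prompt else None
def load_room(room): return (f"room:{room}", 1500)
def sufficient(ctx, prompt): return sum(t for _, t in ctx) >= 700  # toy heuristic
def deep_search(prompt): return ("deep", 3000)

def run_session(prompt, project=None):
    """Load context layers lazily, in the order of the flow above."""
    context = [load_identity()]                       # Layer 1: always
    if project:
        context.append(load_project_facts(project))   # Layer 2: per project
    room = classify_intent(prompt)                    # Layer 3: on domain match
    if room:
        context.append(load_room(room))
    if not sufficient(context, prompt):               # Layer 4: last resort
        context.append(deep_search(prompt))
    return context

print([name for name, _ in run_session("fix the login bug", project="acme")])
```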
| Layer | Tokens | When |
|---|---|---|
| L1 | ~200 | Always |
| L2 | ~500 | Per project |
| L3 | ~1-2K | Per task domain |
| L4 | ~2-5K | On demand |
| Total max | ~8K | Worst case |
vs. loading everything eagerly: ~30-50K tokens
Savings: 4-6x even in the worst case; the 10-50x headline figure combines this with depth-based fetch-on-confirmation
- `instinct-loader` -> feeds Layer 2 and Layer 3
- `smart-memory-recall` -> implements Layer 3 scoring
- `intent-classifier` -> triggers Layer 3 room selection
- `graph-indexer` -> powers Layer 4 deep search