Visual memory system — split storage, component forgetting, state merging, transition dedup.
Each app/site stores memory in four independent files (not one monolithic profile):
memory/apps/<appname>/
├── meta.json # Metadata: detect_count, forget_threshold, img_size
├── components.json # Component registry with activity tracking
├── states.json # States defined by component sets
├── transitions.json # State transitions (dict, deduped by key)
├── components/ # Template images (cropped UI elements)
└── pages/ # Full page screenshots
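As a rough illustration of the split layout above, the four files can be read and written independently. This is a minimal sketch, not the project's actual loader: the helper names load_app_memory and save_memory_file are hypothetical, and treating a missing file as an empty dict is an assumption.

```python
import json
from pathlib import Path

# File names follow the directory layout above.
MEMORY_FILES = ("meta", "components", "states", "transitions")


def load_app_memory(app_dir):
    """Read the four JSON files of one app/site directory into a dict.

    Assumption: a missing file is treated as empty memory.
    """
    app_dir = Path(app_dir)
    memory = {}
    for name in MEMORY_FILES:
        path = app_dir / f"{name}.json"
        if path.exists():
            memory[name] = json.loads(path.read_text(encoding="utf-8"))
        else:
            memory[name] = {}
    return memory


def save_memory_file(app_dir, name, data):
    """Rewrite a single file, leaving the other three untouched."""
    if name not in MEMORY_FILES:
        raise ValueError(f"unknown memory file: {name}")
    app_dir = Path(app_dir)
    app_dir.mkdir(parents=True, exist_ok=True)
    text = json.dumps(data, indent=2, ensure_ascii=False)
    (app_dir / f"{name}.json").write_text(text, encoding="utf-8")
```

Because each file is written on its own, updating transitions does not require rewriting the component registry, which is the main practical benefit of the split format.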
meta.json:
{
"app": "chromium",
"domain": "united.com",
"detect_count": 47,
"last_updated": "2026-03-23 15:30:00",
"img_size": [1920, 1080],
"forget_threshold": 15
}
- detect_count: total number of times detect_all has been called for this app/site
- forget_threshold: number of consecutive misses before a component is auto-deleted (default 15)

components.json:
{
"Travel_info": {
"type": "text",
"source": "ocr",
"rel_x": 661, "rel_y": 188,
"w": 80, "h": 20,
"icon_file": "components/Travel_info.png",
"label": "Travel info",
"confidence": 0.95,
"page": "homepage",
"learned_at": "2026-03-23 02:20:00",
"last_seen": "2026-03-23 15:30:00",
"seen_count": 12,
"consecutive_misses": 0
}
}
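The activity fields in the entry above (last_seen, seen_count, consecutive_misses), together with detect_count and forget_threshold from meta.json, are enough to express the forgetting rule used by learn_from_screenshot(). The following is a hedged sketch only: the function name update_activity is hypothetical, incrementing detect_count inside it is an assumption, and deleting the template image from components/ is left out.

```python
from datetime import datetime


def update_activity(meta, components, states, detected_names, now=None):
    """Apply one detection pass and return the names of forgotten components."""
    now = now or datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    # Assumption: each pass counts as one detect_all call.
    meta["detect_count"] = meta.get("detect_count", 0) + 1
    threshold = meta.get("forget_threshold", 15)
    detected = set(detected_names)

    forgotten = []
    for name, comp in components.items():
        if name in detected:
            comp["last_seen"] = now
            comp["seen_count"] = comp.get("seen_count", 0) + 1
            comp["consecutive_misses"] = 0
        else:
            comp["consecutive_misses"] = comp.get("consecutive_misses", 0) + 1
            if (meta["detect_count"] > threshold
                    and comp["consecutive_misses"] >= threshold):
                forgotten.append(name)

    for name in forgotten:
        del components[name]

    # Remove forgotten components from states; delete states left empty.
    for state_id in list(states):
        defining = states[state_id].get("defining_components", [])
        remaining = [c for c in defining if c not in forgotten]
        if len(remaining) == len(defining):
            continue  # this state did not reference a forgotten component
        if remaining:
            states[state_id]["defining_components"] = remaining
        else:
            del states[state_id]
    return forgotten
```

The detect_count > forget_threshold guard means nothing can be forgotten until the app has been observed more often than the threshold, so a freshly learned app is not pruned prematurely.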
Activity tracking fields (auto-managed; the LLM does not set these):
- last_seen: last time this component was detected on screen
- seen_count: total number of times detected
- consecutive_misses: number of consecutive detect_all calls that missed this component (resets to 0 on detection)

states.json:
{
"s_a3f2c1": {
"name": "homepage",
"description": "United Airlines homepage with booking form",
"defining_components": ["nav_bar", "book_button", "Travel_info"],
"visible_texts": ["United", "Book", "Travel info"],
"first_seen": "2026-03-23 02:20:00",
"last_seen": "2026-03-23 15:30:00",
"visit_count": 5
}
}
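State IDs such as s_a3f2c1 are derived from the component set, and incoming screens are matched to known states by Jaccard similarity. The sketch below shows one way this could work. Several details are assumptions rather than documented behavior: the hash function (SHA-256 over the sorted names), the 0.7 match threshold, restricting the stable set to currently detected components, and the function names themselves.

```python
import hashlib


def state_id_for(component_names):
    """Derive an s_ + 6-hex-char ID from a component set (order-independent)."""
    canonical = "|".join(sorted(set(component_names)))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return "s_" + digest[:6]


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b|; two empty sets count as 0.0."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def identify_state(states, components, detected_names, now, threshold=0.7):
    """Match stable detected components to a known state, or create one."""
    stable_set = {
        name for name in detected_names
        if components.get(name, {}).get("seen_count", 0) >= 2
    }
    if not stable_set:
        return None  # nothing stable enough to define a state

    best_id, best_score = None, 0.0
    for state_id, state in states.items():
        score = jaccard(stable_set, state.get("defining_components", []))
        if score > best_score:
            best_id, best_score = state_id, score

    if best_id is not None and best_score >= threshold:
        states[best_id]["visit_count"] = states[best_id].get("visit_count", 0) + 1
        states[best_id]["last_seen"] = now
        return best_id

    new_id = state_id_for(stable_set)
    states[new_id] = {
        "defining_components": sorted(stable_set),
        "first_seen": now,
        "last_seen": now,
        "visit_count": 1,
    }
    return new_id
```

Sorting the names before hashing keeps the ID stable regardless of detection order, which is what lets the ID stand in for the set itself.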
- States are identified by their defining_components set, not by naming convention
- State ID: s_ + 6-char hex hash of the component set

transitions.json:
{
"s_a3f2c1|click:Travel_info|s_b7d4e2": {
"from_state": "s_a3f2c1",
"action": "click:Travel_info",
"to_state": "s_b7d4e2",
"count": 3,
"last_used": "2026-03-23 15:30:00",
"success_rate": 1.0
}
}
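Because each entry is stored under a key built from from_state, action, and to_state, recording a transition reduces to an upsert on a dict. A minimal sketch follows; the function name record_transition is hypothetical, and success_rate is left out because how it is computed is not specified here.

```python
def record_transition(transitions, from_state, action, to_state, now):
    """Insert or update one transition, keyed by from_state|action|to_state."""
    key = f"{from_state}|{action}|{to_state}"
    entry = transitions.get(key)
    if entry is None:
        # First occurrence: create the entry.
        transitions[key] = {
            "from_state": from_state,
            "action": action,
            "to_state": to_state,
            "count": 1,
            "last_used": now,
        }
    else:
        # Repeat occurrence: bump the counter instead of adding a duplicate.
        entry["count"] += 1
        entry["last_used"] = now
    return key
```

Calling it twice with the same arguments leaves a single entry whose count is 2.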
- Key: from_state|action|to_state, so identical transitions collapse into one entry (natural dedup)
- Repeated transitions update count and last_used instead of creating duplicates

These run inside learn_from_screenshot(). The LLM just calls the function; everything below is automatic.
Every time learn_from_screenshot() runs:
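1. Component matched: last_seen = now, seen_count += 1, consecutive_misses = 0
2. Component not matched: consecutive_misses += 1
3. If detect_count > forget_threshold AND consecutive_misses >= forget_threshold:
   - The component is deleted and removed from every state's defining_components
   - If a state's defining_components becomes empty → delete the state

After component activity update: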
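1. Collect stable components (seen_count >= 2) → stable_set
2. Compare stable_set against each state's defining_components via Jaccard similarity
3. Match found: update that state's visit_count and last_seen
4. No match: create a new state with an s_ + hash ID

After state identification: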
- Overlapping states are merged into the state with the higher visit_count, taking the union of their defining_components

Browsers host many websites, each with its own UI. The browser itself keeps memory for browser-level UI (toolbar, settings). Each website gets its own nested directory with the same 4-file structure:
memory/apps/chromium/
├── meta.json # Browser-level metadata
├── components.json # Browser UI components (toolbar, etc.)
├── states.json
├── transitions.json
├── components/
├── pages/
└── sites/ # ⭐ Each website = same structure
├── united.com/
│ ├── meta.json
│ ├── components.json
│ ├── states.json
│ ├── transitions.json
│ ├── components/
│ └── pages/
├── delta.com/
│ └── ...
└── ...
Rules:
- Each website gets sites/<domain>/ with the 4 files
- The domain is normalized (united_com, not www.united.com/en/us)

Old profile.json files are automatically migrated to the new split format on first load:
- profile.json → split into meta.json + components.json + states.json + transitions.json
- The original is kept as profile.json.bak
- Existing components gain the activity fields last_seen, seen_count, consecutive_misses

# Learn (detect + save + auto-forget + auto-state)
python3 scripts/agent.py learn --app AppName
# List components
python3 scripts/agent.py list --app AppName
# Rename unlabeled
python3 scripts/app_memory.py rename --app AppName --old unlabeled_xxx --new actual_name
# Delete (privacy, dynamic content)
python3 scripts/app_memory.py delete --app AppName --component name
# View state graph
python3 scripts/app_memory.py transitions --app AppName
# Find navigation path
python3 scripts/app_memory.py path --app AppName --component from_state --contact to_state
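How the path command searches the state graph is not documented here. One plausible approach, given the transitions.json layout, is a breadth-first search over from_state → to_state edges, which returns a path with the fewest actions. The sketch below is an illustration under that assumption; find_path is a hypothetical name, not the script's actual function.

```python
from collections import deque


def find_path(transitions, start, goal):
    """Return the shortest list of actions from start to goal, or None."""
    # Build an adjacency list: state -> [(action, next_state), ...]
    graph = {}
    for t in transitions.values():
        graph.setdefault(t["from_state"], []).append((t["action"], t["to_state"]))

    queue = deque([(start, [])])
    visited = {start}
    while queue:
        state, actions = queue.popleft()
        if state == goal:
            return actions
        for action, next_state in graph.get(state, []):
            if next_state not in visited:
                visited.add(next_state)
                queue.append((next_state, actions + [action]))
    return None  # goal is not reachable from start
```

A weighted search that prefers transitions with a high count or success_rate would be a natural refinement, but that goes beyond what the stored fields require.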
- Dynamic content: auto_cleanup_dynamic
- Unlabeled components: image tool → rename or delete