[Beta] Cross-session analysis of accumulated .learnings/ files. Reads all entries, groups by pattern_key, computes recurrence across sessions, and outputs ranked promotion candidates. This is the outer loop's inspect step — it turns raw learning data into actionable gap reports. Use on a regular cadence (weekly, before major tasks, or at session start for critical projects). Can be invoked manually or scheduled.
Reads accumulated .learnings/ files across all sessions, finds patterns, and produces a ranked list of promotion candidates. This is the outer loop's inspect step.
Without this skill, .learnings/ is a write-only log. Patterns accumulate but nobody synthesizes them. The same gap resurfaces two weeks later because no one looked.
promotion_ready — verify the flag with full contextA gap report — a ranked list of patterns that have crossed (or are approaching) the promotion threshold, with evidence and recommended actions.
Read these files in .learnings/:
| File | Contains |
|---|---|
LEARNINGS.md | Corrections, knowledge gaps, best practices, recurring patterns |
ERRORS.md | Command failures, API errors, exceptions |
FEATURE_REQUESTS.md | Missing capabilities |
Parse each entry's metadata:
Pattern-Key — the stable deduplication keyRecurrence-Count — how many times this pattern has been seenFirst-Seen / Last-Seen — date rangePriority — low / medium / high / criticalStatus — pending / promotion_ready / promoted / dismissedArea — frontend / backend / infra / tests / docs / configRelated Files — which parts of the codebase are affectedSource — conversation / error / user_feedback / simplify-and-hardenTags — free-form labelsGroup entries by Pattern-Key. For each group:
For entries without a Pattern-Key, use conservative grouping only:
Area AND at least 2 identical TagsRelated Files path (exact path match, not substring)Flag ungrouped entries separately with a recommendation to assign a Pattern-Key. Ungrouped entries are common and expected — they may be one-off issues or genuinely novel problems.
An entry is promotion-ready when:
Recurrence-Count >= 3 across the group>= 2 distinct tasks30-day windowAn entry is approaching when:
Recurrence-Count >= 2 orPriority: high/critical with any recurrenceFor each promotion candidate, classify the gap type:
| Gap Type | Signal | Fix Target |
|---|---|---|
| Knowledge gap | Agent didn't know X | Update project instruction files (CLAUDE.md, AGENTS.md, .github/copilot-instructions.md) |
| Tool gap | Agent improvised around missing capability | Add or update MCP tool / script |
| Skill gap | Same behavior pattern keeps failing | Create or update a skill (use /skill-creator, validate with quick_validate.py, register skill-check eval) |
| Ambiguity | Conflicting interpretations of spec/prompt | Tighten instructions or add examples |
| Reasoning failure | Agent had the knowledge but reasoned wrong | Add explicit decision rules or constraints |
Output a structured report:
## Learning Aggregator: Gap Report
**Scan date:** YYYY-MM-DD
**Period:** [since date] to [now]
**Entries scanned:** N
**Patterns found:** N
**Promotion-ready:** N
**Approaching threshold:** N
### Promotion-Ready Patterns
#### 1. [Pattern-Key] — [Summary]
- **Recurrence:** N times across M tasks
- **Window:** First-Seen → Last-Seen
- **Priority:** high
- **Gap type:** knowledge gap
- **Area:** backend
- **Related files:** path/to/file.ext
- **Evidence:**
- [LRN-YYYYMMDD-001] Summary of first occurrence
- [LRN-YYYYMMDD-002] Summary of second occurrence
- [ERR-YYYYMMDD-001] Summary of related error
- **Recommended action:** Add rule to project instruction files (CLAUDE.md, AGENTS.md, .github/copilot-instructions.md): "[concise prevention rule]"
- **Eval candidate:** Yes — [description of what to test]
#### 2. ...
### Approaching Threshold
#### 1. [Pattern-Key] — [Summary]
- **Recurrence:** 2 times across 1 task
- **Needs:** 1 more recurrence or 1 more distinct task
- ...
### Ungrouped Entries (no Pattern-Key)
- [LRN-YYYYMMDD-005] "Summary" — needs pattern_key assignment
- ...
### Dismissed / Stale
- Entries with Last-Seen > 90 days ago and Status: pending → recommend dismissal
The gap report feeds into:
--since YYYY-MM-DD — only scan entries after this date--min-recurrence N — raise the promotion threshold--area AREA — filter to a specific area (frontend, backend, etc.)--deep — also analyze session traces via Entire (see Session Trace Analysis below)The outer loop reads from two complementary sources:
| Source | What it is | Cadence | Cost |
|---|---|---|---|
.learnings/ | Explicit entries written by self-improvement during sessions. Agent's own reflections: corrections, knowledge gaps, recurring patterns it noticed. | Every session (hot path) | Near-zero |
| Session traces | Full session transcripts captured by Entire: prompts, tool calls, outputs, files modified, token usage, checkpoints. | Weekly or on-demand (cold path) | Expensive — only run at cadence |
The default mode reads .learnings/ and produces a gap report from what the agent explicitly logged. The --deep mode also analyzes session traces and merges findings from both sources.
.learnings/ captures what the agent noticed and chose to log — a curated subset. Session traces capture everything that happened, including patterns the agent worked around, retried, or never recognized as failures.
Examples of patterns visible in traces but absent from .learnings/:
These patterns are high-value for the outer loop because the agent can't self-report them. Session traces are the only source.
Trace analysis is not per-session. It's cadenced:
/learning-aggregator --deep --since 7dRunning trace analysis per-session would burn tokens without producing new signal — cross-session patterns only emerge over multiple sessions.
When --deep is requested, the skill uses the entire CLI to query shadow branch data:
# Check availability
entire --version
# List recent checkpoints as JSON (id, date, session_id, message, tool_use_id)
entire rewind --list
# Read a checkpoint's full transcript
entire explain --checkpoint <id> --full --no-pager
# Or raw JSONL
entire explain --checkpoint <id> --raw-transcript --no-pager
# Filter to one session
entire explain --session <session-id-prefix>
# Generate AI summary (expensive, use sparingly)
entire explain --checkpoint <id> --generate
If entire is not installed or the current repo doesn't have Entire enabled, --deep falls back to .learnings/-only mode and reports the limitation in the gap report.
For each checkpoint within the time window, parse the raw transcript and look for:
retry-loop.<tool>correction.<area>error-detector.sh (error, failed, Traceback, etc.) → Pattern-key: error.<category>drift.<signal>approach-switch.<domain>cost.<task-type>Each finding is normalized to the same taxonomy as self-improvement (harden.input_validation, simplify.dead_code, etc.) where possible.
When --deep runs, each pattern in the gap report gets a sources field:
promotion_ready:
- pattern_key: "harden.input_validation"
recurrence_count: 5
sources:
- .learnings/LEARNINGS.md (3 entries)
- entire:traces (5 occurrences across 4 sessions)
confidence: high # appears in both sources
evidence:
- "LRN-20260401-001: Missing bounds check on pagination"
- "entire:1ca16f9b: Retry loop on /api/search — pageSize rejected 4 times"
- "entire:8bf2e4cd: User correction 'validate before DB query'"
entire_checkpoints:
- 1ca16f9bb3801ee2a02f2384f31355a54b81ea00
- 8bf2e4cd63d01040b38df07c43f73e0f15d05ac9
A pattern in both sources is higher confidence than one from either alone. A pattern only in .learnings/ might be over-logged by a diligent agent. A pattern only in traces might be noise. The overlap is where the signal is strongest.
The default implementation targets Entire (v0.5.4+) via the entire rewind --list and entire explain commands. The concept is source-agnostic — any session capture tool that exposes:
...can serve as a trace source. Adapters for other capture tools can be added in scripts/ or via gh-aw mcp-scripts.
Reads .learnings/ from the working directory. This is the only persistence mode — the skill does not integrate with external memory backends in interactive sessions. For CI-side durable storage across workflow runs, see learning-aggregator-ci, which can optionally back its state with gh-aw's repo-memory (git-branch persistence). The resulting branch is a normal git branch and can be fetched locally if desired, but the interactive skill itself only reads local files.
Each promotion candidate in the gap report includes a tracker field set to the pattern-key. This tracker propagates through the full chain: harness-updater embeds it as a comment in project instruction files, eval-creator references it in eval cases. To audit the full lifecycle of a pattern, search for tracker:[pattern-key] across the repo and GitHub.
.learnings/ files (read-only analysis)--deep trace analysis per-session — only on cadence or explicit invocation.learnings/-only mode when trace source is unavailable