Run agent-driven VS Code performance or memory investigations. Use when asked to launch Code OSS, automate a VS Code scenario, run the Chat memory smoke runner, capture renderer heap snapshots, take workflow screenshots, compare run summaries, or drive a repeatable scenario before heap-snapshot analysis.
Drive a repeatable VS Code scenario, collect memory/performance artifacts, verify that the scenario actually happened, then hand the resulting heap snapshots to the generic heap-snapshot-analysis skill when object-level investigation is needed.
The output is summary.json, renderer heap samples, and targeted .heapsnapshot files for one scenario.

Do not use this skill when snapshots already exist and the user only wants heap object/retainer analysis. Use heap-snapshot-analysis directly.
Verify each run with summary.json and screenshots. Do not analyze a run that shows a failed login, trust prompt, stuck progress row, or wrong UI state.

The scripts/ folder contains stable, generic runners. Use them directly or as templates for scratchpad scripts:
Use the bundled Chat memory smoke runner when the scenario is Chat-specific or can be expressed as repeated Chat prompts. It launches Code OSS, opens Chat, sends prompts, waits for responses, writes screenshots and summary.json, samples renderer heap, and can take selected heap snapshots.
Fast health check:
node .github/skills/auto-perf-optimize/scripts/chat-memory-smoke.mts --iterations 3 --no-heap-snapshots
Targeted post-warmup snapshots:
node .github/skills/auto-perf-optimize/scripts/chat-memory-smoke.mts --iterations 8 --heap-snapshot-label 03-iteration-01 --heap-snapshot-label 03-iteration-08
User-described Chat scenario:
node .github/skills/auto-perf-optimize/scripts/chat-memory-smoke.mts --iterations 8 --message 'For memory investigation iteration {iteration}, summarize the active workspace in one paragraph.' --heap-snapshot-label 03-iteration-01 --heap-snapshot-label 03-iteration-08
Important runner behavior:
- The default user data dir is .build/auto-perf-optimize/user-data so auth can be reused by all runners in this skill.
- Use --temporary-user-data only if a clean profile is part of the scenario.
- Use --seed-user-data-dir <path> to copy a logged-in profile into a fresh target profile before launch. The target profile may contain auth secrets; keep it inside ignored local .build/... folders and never attach it to issues or PRs.

Safety: chat runs execute on the real machine. The Code OSS instance launched by these runners is a full VS Code with Copilot auth on the user's actual computer, not a sandbox. Chat prompts you craft will be sent to a real LLM, and any tool calls the agent makes (terminal commands, file edits, etc.) will execute for real. Be responsible:
- Use --workspace <scratch-folder> pointing to a temporary or gitignored directory (e.g., the runner's scratchpad subfolder, or a folder under .build/). The default workspace in checked-in runners is the repo root for convenience, but scratchpad runners for Chat scenarios should always override it to avoid accidental file modifications in the source tree.
- Keep any tool calls you prompt for harmless (e.g., touch /tmp/foo, git log --oneline, ls). Never instruct the agent to delete files, run destructive commands, or modify the user's workspace.
- Use --keep-open when the user needs to log in or watch the window, then close the window before the next automated run unless intentionally reusing it.
- Use --reuse only when attaching to a Code window that was launched with --enable-smoke-test-driver and the chosen remote-debugging port.

Prefer the shared persistent performance profile for routine runs:
node .github/skills/auto-perf-optimize/scripts/chat-memory-smoke.mts --keep-open --iterations 1 --no-heap-snapshots
If Chat asks for auth, let the user sign in once, close the Code window, then rerun the fast smoke without --keep-open. The same profile is reused by the bundled Chat runner and by other runners that follow this skill's profile convention.
To bootstrap the shared performance profile from an older logged-in automation profile, copy it once into the default target:
node .github/skills/auto-perf-optimize/scripts/chat-memory-smoke.mts --seed-user-data-dir .build/chat-memory-smoke/user-data --keep-open --iterations 1 --no-heap-snapshots
To run a fresh disposable copy of a logged-in seed:
node .github/skills/auto-perf-optimize/scripts/chat-memory-smoke.mts --temporary-user-data --seed-user-data-dir .build/auto-perf-optimize/user-data --iterations 3 --no-heap-snapshots
Seed-copy rules:
- To force a fresh copy, pass a different --user-data-dir, use --temporary-user-data, or delete the local target deliberately.
- Use --user-data-dir <fresh-path> --seed-user-data-dir <seed-path> when you want to keep the copied profile after the run. A user-provided --user-data-dir is never deleted by the runner.

The first version of an automation runner is rarely correct. Treat the runner as a test you are developing: run a cheap scenario, observe the live workbench, adjust one selector or wait condition, and repeat. Do not collect heap snapshots until the runner is boringly reliable.
New runners go in the scratchpad folder (gitignored). Checked-in scripts in scripts/ are stable, generic runners — don't modify them for a one-off investigation. Instead, copy patterns from them into a scratchpad script.
Organize scratchpad work into dated subfolders named YYYY-MM-DD-short-description/ (e.g., 2026-04-09-chat-scroll-leak/). Each subfolder should contain:
- All scenario scripts for that investigation (.mts, .mjs, etc.)
- A findings.md file documenting the full investigation: all ideas considered, which ones led to changes and which were rejected (and why), before/after measurements, and a summary of the outcome. This lets the user review the agent's reasoning, decide which changes to keep, and follow up on deferred ideas.

Start fresh. Ignore any existing scratchpad subfolders from previous investigations. They belong to earlier sessions, and their context, scripts, and findings are not relevant to your current task. Always create a new dated subfolder for your investigation.
Import path depth: Scripts in dated subfolders are 6 levels below the repo root (.github/skills/auto-perf-optimize/scratchpad/YYYY-MM-DD-name/script.mts), not 4 like the checked-in scripts/*.mts runners. Adjust relative imports accordingly — use 5 .. segments to reach the repo root from a dated subfolder (e.g., '../../../../../src/vs/base/common/stopwatch.ts'), and '../../scripts/userDataProfile.mts' to reach sibling checked-in scripts.
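The depth arithmetic can be sanity-checked with node:path (the dated folder name below is just an example):

```typescript
import path from 'node:path';

// A dated scratchpad script sits 5 directories below the repo root,
// so reaching the root takes five '..' segments.
const scratchDir = '.github/skills/auto-perf-optimize/scratchpad/2026-04-09-chat-scroll-leak';

const toRepoRoot = path.relative(scratchDir, '.');
const toScripts = path.relative(scratchDir, '.github/skills/auto-perf-optimize/scripts');

console.log(toRepoRoot); // ../../../../..
console.log(toScripts);  // ../../scripts
```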
Suggested watch loop for the bundled Chat runner:
node .github/skills/auto-perf-optimize/scripts/chat-memory-smoke.mts --keep-open --iterations 1 --no-heap-snapshots --port 9224 --output .build/chat-memory-smoke/watch-chat
While that Code window is open, inspect it with agent-browser from the repo root:
npx agent-browser connect 9224
npx agent-browser tab
npx agent-browser snapshot -i
npx agent-browser screenshot .build/chat-memory-smoke/watch-chat/agent-browser-observation.png
Agent-browser checkpoints:
- Run tab first. If the selected target is about:blank or a webview instead of the workbench, switch targets before trusting snapshots.
- Use snapshot -i to rediscover buttons, textboxes, list rows, webviews, and current accessible names. Prefer discovered state over stale selectors.
- Save observation screenshots into the run's .build/... folder. Do not use /tmp for screenshots you expect the user to review.
- On failure, read summary.json before killing the window. The last submitted turn and last screenshot usually identify the missing wait condition.
- If Chat needs auth, rerun with --keep-open, let the user sign in once in the persistent default profile, close the window, then rerun the fast smoke.

When editing a scenario runner:
- Keep the output contract: summary.json, checkpoint screenshots, heap samples, optional heap/*.heapsnapshot files, and an error field on failure.
- Validate with --no-heap-snapshots first. A broken runner plus a 2GB heap snapshot wastes time and hides the real failure.
- Iterate against a live window with --keep-open or --reuse.

Read the run's summary.json before opening heap snapshots. Check:
- error is absent
- chatTurns has the expected count (unless the run used --skip-send)
- analysis.postFirstTurnUsedBytes and analysis.postFirstTurnUsedBytesPerTurn are present for multi-turn memory probes
- the expected snapshot files exist under heap/

Prefer a warmed-up baseline such as 03-iteration-01.heapsnapshot over startup snapshots. Startup, Chat opening, login, extension activation, and first-use model loads are expected allocations.
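Those checks can be scripted as a pre-flight gate before any snapshot work. The field names below (error, chatTurns, analysis.postFirstTurnUsedBytes) follow the checklist; the full summary shape is whatever the runner's output contract defines:

```typescript
// Sketch of a summary.json pre-flight check for a chat-memory-smoke run.
interface RunSummary {
	error?: string;
	chatTurns?: unknown[];
	analysis?: {
		postFirstTurnUsedBytes?: number;
		postFirstTurnUsedBytesPerTurn?: number;
	};
}

function summaryIssues(summary: RunSummary, expectedTurns: number): string[] {
	const issues: string[] = [];
	if (summary.error) {
		issues.push(`run failed: ${summary.error}`);
	}
	const turns = summary.chatTurns?.length ?? 0;
	if (turns !== expectedTurns) {
		issues.push(`expected ${expectedTurns} chat turns, got ${turns}`);
	}
	if (summary.analysis?.postFirstTurnUsedBytes === undefined) {
		issues.push('missing analysis.postFirstTurnUsedBytes');
	}
	return issues;
}
```

An empty result means the run is worth analyzing; anything else means fix the runner first.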
After capture, use heap-snapshot-analysis. A minimal scratchpad comparison script looks like this:
```ts
import path from 'node:path';
import { compareSnapshots, printComparison } from '../helpers/compareSnapshots.ts';

// Point RUN at the output directory of a chat-memory-smoke run.
const runDir = process.env.RUN;
if (!runDir) {
	throw new Error('Set RUN to a chat-memory-smoke output directory');
}

// Diff the warmed-up baseline against the final iteration snapshot.
const before = path.join(runDir, 'heap', '03-iteration-01.heapsnapshot');
const after = path.join(runDir, 'heap', '03-iteration-08.heapsnapshot');
printComparison(compareSnapshots(before, after));
```
Run it from the heap-snapshot-analysis skill folder:
cd .github/skills/heap-snapshot-analysis
RUN=../../../.build/chat-memory-smoke/<run-folder> node --max-old-space-size=16384 scratchpad/compare-chat-run.mjs
When the user describes a non-Chat scenario, ask only for the missing essentials: what action starts the scenario, what counts as one repeatable iteration, what indicates the UI is settled, and whether the profile should be persistent or temporary.
Write new scenario runners in the scratchpad folder. This folder is gitignored — use it freely for one-off investigation scripts. If a runner proves generally useful, promote it to scripts/ with documentation and validation.
Put each investigation in a dated subfolder (see "Develop and Watch a Runner" for the naming convention).
Example scratchpad workflow:
# Create a dated investigation folder
mkdir -p .github/skills/auto-perf-optimize/scratchpad/2026-04-09-editor-tab-leak
# Write a runner inside it
cat > .github/skills/auto-perf-optimize/scratchpad/2026-04-09-editor-tab-leak/scenario.mts << 'EOF'
// ... your scenario using patterns from the checked-in scripts
EOF
# Validate without snapshots first
node .github/skills/auto-perf-optimize/scratchpad/2026-04-09-editor-tab-leak/scenario.mts \
--iterations 3 --no-heap-snapshots --skip-prelaunch \
--user-data-dir .build/chat-memory-smoke/user-data
# Then capture targeted snapshots
node .github/skills/auto-perf-optimize/scratchpad/2026-04-09-editor-tab-leak/scenario.mts \
--iterations 10 --heap-snapshot-label baseline --heap-snapshot-label final \
--skip-prelaunch --user-data-dir .build/chat-memory-smoke/user-data
# Write findings.md when the investigation concludes
Reuse these patterns from the checked-in scripts (chat-memory-smoke.mts, chat-session-switch-smoke.mts):
- Launch via scripts/code.sh or scripts/code.bat
- Pass --enable-smoke-test-driver, --disable-workspace-trust, a known --remote-debugging-port, explicit --user-data-dir, explicit --extensions-dir, --skip-welcome, and --skip-release-notes
- Open a scratch workspace (--workspace <scratch-folder>) instead of the repo root to prevent Chat tool calls from modifying real source files
- Attach with chromium.connectOverCDP
- Wait for globalThis.driver?.whenWorkbenchRestored?.()
- Use the Performance and HeapProfiler domains for memory sampling and snapshots
- Write summary.json incrementally, especially before long waits
- Support --no-heap-snapshots and targeted snapshot labels so validation stays fast

Keep scenario-specific UI selectors and wait logic in the scenario runner. Avoid making the Chat runner a generic abstraction unless multiple proven scenarios share the exact same lifecycle.
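A scratchpad runner can assemble those launch flags with a small helper. The flag names come from the checklist above; the helper itself (buildLaunchArgs) is hypothetical scaffolding, not an existing API:

```typescript
// Sketch: build the argv for scripts/code.sh from scenario options.
interface LaunchOptions {
	userDataDir: string;
	extensionsDir: string;
	workspace: string;          // point at a scratch folder, not the repo root
	remoteDebuggingPort: number;
}

function buildLaunchArgs(opts: LaunchOptions): string[] {
	return [
		'--enable-smoke-test-driver',
		'--disable-workspace-trust',
		`--remote-debugging-port=${opts.remoteDebuggingPort}`,
		`--user-data-dir=${opts.userDataDir}`,
		`--extensions-dir=${opts.extensionsDir}`,
		'--skip-welcome',
		'--skip-release-notes',
		// The workspace path goes last, as a positional argument.
		opts.workspace,
	];
}
```

Keeping this as a pure function makes the flag set easy to validate before spawning anything.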
Use heap-snapshot-analysis when you need to:
- break down and compare .heapsnapshot files by constructor/object group

The output of this workflow is evidence: run summaries, screenshots, heap samples, targeted snapshots, comparison output, and retainer paths. Use that evidence to form a concrete leak hypothesis, then fix the product code and verify the fix with another run.
A surface-level observation ("this Map is growing") is not a diagnosis. Before writing a fix, understand why the code is structured the way it is:
- Run git blame and git log on the leaking code. Read the commit message, the PR description, and any linked issues. A guard like if (this._isDisposed) return may exist because removing it once caused crashes; understand the original intent before changing it.
- If a call such as disposeContext() is silently dropped, ask: why is the parent disposed before the child? Is the disposal order wrong, or is the guard wrong? The answer determines whether you fix the guard, fix the disposal order, or add a different cleanup path.
- A bounded structure (e.g., UriIdentityService._canonicalUris with its 2^16 limit) is a cache, not a leak. Don't "fix" caches unless they lack any eviction policy.
- When a registry grows because of a missing delete call, ask whether the registration should happen at all for transient objects, or whether an intermediate scoped registry should exist.

The goal is to fix the cause, not paper over the effect. A fix that adds cleanup code without understanding why cleanup was missing will often introduce new bugs or re-break a previous fix.
The goal of this workflow is to ship fixes, not produce reports. After identifying leaks:
Make product-code changes that address the root cause. Common patterns:
- Pools that clear() idle items but leave _inUse orphaned; also dispose _inUse on clear
- Global registries (ContextKeyService._contexts, HoverService._managedHovers, UriIdentityService._canonicalUris) that grow because transient objects register but never unregister
- Scoped services where _register(service.createScoped(...)) is correct but the parent dispose() is never called
- Observable state (autorun subscriptions, IterableDelta lastValues) that retains stale model references

Verify the fix by rerunning the same scenario with the same snapshot labels. Compare the postFirst*UsedBytes trend and the snapshot diff. A successful fix should show flat or decreasing memory in the iteration phase.
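The pool pattern (clear() disposing only idle items and orphaning in-use ones) can be sketched in isolation. All names here (_idle, _inUse, WidgetPool) are illustrative, not actual VS Code code:

```typescript
// Minimal disposable contract for the sketch.
interface IDisposable { dispose(): void; }

class WidgetPool<T extends IDisposable> {
	private readonly _idle: T[] = [];
	private readonly _inUse = new Set<T>();

	acquire(create: () => T): T {
		const item = this._idle.pop() ?? create();
		this._inUse.add(item);
		return item;
	}

	release(item: T): void {
		this._inUse.delete(item);
		this._idle.push(item);
	}

	clear(): void {
		// The fix: dispose in-use items too, not just the idle list.
		// Disposing only _idle here leaks everything still checked out.
		for (const item of this._inUse) { item.dispose(); }
		this._inUse.clear();
		for (const item of this._idle) { item.dispose(); }
		this._idle.length = 0;
	}
}
```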
Run all tests. Before finishing, run all unit tests and integration tests for any files you changed. Unit tests in this repo are expected to be stable — any unit test failure is very likely caused by your changes and must be fixed. Integration tests are slightly more prone to flakiness, but failures should still be investigated.
Document results in the scratchpad findings.md and session memory before declaring done: what leaked, what was fixed, before/after measurements.
Do not stop at analysis. If you have evidence of a leak, attempt a fix. If the fix is unclear or risky, explain why and propose alternatives.