Name: Chat Perf
Author: microsoft

Chat Performance Testing

When to use

Before/after modifying chat rendering code (chatListRenderer.ts, chatInputPart.ts, markdown rendering)
When changing the streaming response pipeline or SSE processing
When modifying disposable/lifecycle patterns in chat components
To compare performance between two VS Code releases
In CI to gate PRs that touch chat UI code

Quick start

# Run perf regression test (compares local dev build vs VS Code 1.115.0):
npm run perf:chat -- --scenario text-only --runs 3

# Run all scenarios with no baseline (just measure):
npm run perf:chat -- --no-baseline --runs 3

# Compare two local builds (apples-to-apples):
npm run perf:chat -- --build /path/to/build-A --baseline-build /path/to/build-B --runs 5

# Build a local production package and compare against a release:
npm run perf:chat -- --production-build --baseline-build 1.115.0 --runs 5

# Run memory leak check (10 messages in one session):
npm run perf:chat-leak

# Run leak check with more messages for accuracy:
npm run perf:chat-leak -- --messages 20 --verbose

Key flags (perf regression test)

| Flag | Default | Description |
| --- | --- | --- |
| `--runs <n>` | `5` | Runs per scenario. More = more stable. Use 5+ for CI. |
| `--scenario <id>` / `-s` | all | Scenario to test (repeatable). See `common/perf-scenarios.js`. |
| `--build <path\|ver>` / `-b` | local dev | Build to test. Accepts path or version (`1.110.0`, `insiders`, commit hash). |
| `--baseline <path>` | (none) | Compare against a previously saved baseline JSON file. |
| `--baseline-build <path\|ver>` | `1.115.0` | Version or local path to benchmark as baseline. |
| `--no-baseline` | (none) | Skip baseline comparison entirely. |
| `--save-baseline` | (none) | Save results as the new baseline (requires `--baseline <path>`). |
| `--resume <path>` | (none) | Resume a previous run, adding more iterations to increase confidence. |
| `--threshold <frac>` | `0.2` | Regression threshold (0.2 = flag if 20% slower). |
| `--production-build` | (none) | Build a local bundled package via `gulp vscode` for comparison against a release baseline. |
| `--no-cache` | (none) | Ignore cached baseline data, always run fresh. |
| `--force` | (none) | Skip build mode mismatch confirmation prompt. |
| `--ci` | (none) | CI mode: write Markdown summary to `ci-summary.md` (implies `--no-cache`). |
| `--setting <k=v>` | (none) | Set a VS Code setting override for all builds (repeatable). |
| `--test-setting <k=v>` | (none) | Set a VS Code setting override for the test build only. |
| `--baseline-setting <k=v>` | (none) | Set a VS Code setting override for the baseline build only. |
| `--verbose` | (none) | Print per-run details including response content. |
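
To make the `--threshold` fraction concrete, here is a minimal sketch of how a relative slowdown could be compared against it. The function name `isRegression` and its parameters are hypothetical and not taken from the tool's source. The sketch also assumes that a slowdown exactly equal to the threshold is not flagged; the real tool may treat that case differently and may apply additional statistical checks.

```javascript
// Hypothetical helper: illustrates the threshold arithmetic only,
// not the perf tool's actual implementation.
// Returns true when the test build is slower than the baseline by more
// than `threshold`, expressed as a fraction (0.2 means 20%).
function isRegression(baselineMs, testMs, threshold = 0.2) {
  if (!(baselineMs > 0)) {
    throw new RangeError('baselineMs must be a positive number');
  }
  // Positive values mean the test build took longer than the baseline.
  const relativeChange = (testMs - baselineMs) / baselineMs;
  return relativeChange > threshold;
}

console.log(isRegression(100, 125)); // 25% slower than baseline
console.log(isRegression(100, 110)); // 10% slower than baseline
```

With the default of `0.2`, the first call is above the threshold and the second is within it.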

Build modes

| Mode | Source | Characteristics |
| --- | --- | --- |
| `dev` | `.build/electron/` (local dev) | Unbundled sources, `VSCODE_DEV=1`, `NODE_ENV=development`. Higher memory and startup overhead. |
| `production` | `../VSCode-<platform>-<arch>/` (from `gulp vscode`) | Bundled JS, no dev flags. Matches release characteristics but uses local source. |
| `release` | `.vscode-test/` (downloaded via `@vscode/test-electron`) | Official published build. |
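
The build modes above can be told apart by where a build lives on disk, which is what makes a mismatch check possible. The following is a rough sketch of that idea, assuming the mode can be guessed from the path fragments in the table. `guessBuildMode` and `modesMismatch` are hypothetical names, and the tool's real detection logic may work differently.

```javascript
// Hypothetical sketch: guesses the build mode from the path fragments
// listed in the build modes table. Not the tool's actual logic.
function guessBuildMode(buildPath) {
  // Normalize Windows separators so one set of checks covers all platforms.
  const normalized = buildPath.replace(/\\/g, '/');
  if (normalized.includes('.build/electron')) return 'dev';
  if (normalized.includes('.vscode-test')) return 'release';
  // Matches a path segment shaped like VSCode-<platform>-<arch>.
  if (/(^|\/)VSCode-[^/]+-[^/]+(\/|$)/.test(normalized)) return 'production';
  return 'unknown';
}

// Comparing a dev build against a release build is not apples-to-apples,
// so a caller could warn when the two modes differ.
function modesMismatch(testPath, baselinePath) {
  return guessBuildMode(testPath) !== guessBuildMode(baselinePath);
}

console.log(guessBuildMode('/repo/.build/electron/Code'));
console.log(modesMismatch('/repo/.build/electron/Code', '/repo/.vscode-test/build'));
```

A mismatch like the one in the last call is the situation in which the tool asks for confirmation unless `--force` is passed.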

Key flags (memory leak check)

| Flag | Default | Description |
| --- | --- | --- |
| `--messages <n>` / `-n` | `10` | Number of messages to send. More = more accurate slope. |
| `--build <path\|ver>` / `-b` | local dev | Build to test. |
| `--threshold <MB>` | `2` | Max per-message heap growth in MB. |
| `--setting <k=v>` | (none) | Set a VS Code setting override (repeatable). |
| `--verbose` | (none) | Print per-message heap/DOM counts. |
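
The "slope" mentioned for `--messages` can be read as per-message heap growth. As an illustration, the sketch below fits a least-squares line through heap samples taken after each message and compares its slope with the MB threshold. This is an assumption about how such a slope might be computed: `heapGrowthPerMessage` and `exceedsLeakThreshold` are hypothetical names, and the sample values are made up for the example.

```javascript
// Hypothetical sketch: estimates per-message heap growth as the
// least-squares slope of heap size (MB) against message index.
function heapGrowthPerMessage(heapMb) {
  const n = heapMb.length;
  if (n < 2) {
    throw new RangeError('need at least two heap samples');
  }
  // Message indices are 0..n-1, so their mean is (n - 1) / 2.
  const meanX = (n - 1) / 2;
  const meanY = heapMb.reduce((sum, y) => sum + y, 0) / n;
  let covariance = 0;
  let variance = 0;
  heapMb.forEach((y, x) => {
    covariance += (x - meanX) * (y - meanY);
    variance += (x - meanX) ** 2;
  });
  return covariance / variance; // MB gained per message
}

// Mirrors the documented default of 2 MB per message.
function exceedsLeakThreshold(heapMb, thresholdMb = 2) {
  return heapGrowthPerMessage(heapMb) > thresholdMb;
}

// Made-up samples: heap grows by 3 MB after every message.
console.log(heapGrowthPerMessage([100, 103, 106, 109]));
console.log(exceedsLeakThreshold([100, 103, 106, 109]));
```

More samples make the fitted slope less sensitive to a single noisy measurement, which is consistent with the advice that more messages give a more accurate result.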
