Name: Sglang Auto Benchmark
Author: sgl-project

This skill is for repeatable, AI-driven SGLang performance tuning.

The preferred workflow is:

The implementation lives in:

python -m sglang.auto_benchmark
canonical dataset loader in python -m sglang.bench_serving --dataset-name autobench
cookbook-derived LLM reference configs in .claude/skills/sglang-auto-benchmark/references/cookbook-llm/

Preconditions

SGLang can already launch and serve the target model in this environment.
The model path exists, or the model is otherwise launchable.
The goal is clear:
- benchmark a fixed QPS list, or
- search the maximum QPS that satisfies max_ttft_ms / max_tpot_ms.

If those are not true yet, fix them before running a large search.

Environment consistency check:

This skill is for repeatable, AI-driven SGLang performance tuning.

The preferred workflow is:

The implementation lives in:

python -m sglang.auto_benchmark
canonical dataset loader in python -m sglang.bench_serving --dataset-name autobench
cookbook-derived LLM reference configs in .claude/skills/sglang-auto-benchmark/references/cookbook-llm/

SGLang can already launch and serve the target model in this environment.
The model path exists, or the model is otherwise launchable.
The goal is clear:
- benchmark a fixed QPS list, or
- search the maximum QPS that satisfies max_ttft_ms / max_tpot_ms.

If those are not true yet, fix them before running a large search.

Environment consistency check:

Sglang Auto Benchmark