Use when benchmarking denoise latency or profiling a diffusion bottleneck in SGLang.
Use this skill when measuring denoise performance, finding the slow op, checking whether an existing fast path can solve it, or verifying that a hotspot is real before any kernel work in sglang.multimodal_gen.
This skill is diagnosis-first. It owns:
- torch.profiler trace capture and quick hotspot ranking

This skill does not own low-level kernel authoring or standalone Nsight workflows.
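The capture-and-rank step can be sketched as follows. This is a minimal illustration, not one of the skill's checked-in scripts: `step_fn` is a stand-in for whatever callable drives one denoise step.

```python
import torch
from torch.profiler import ProfilerActivity, profile

def rank_hotspots(step_fn, top_k: int = 10) -> str:
    """Profile one call of step_fn and return a table of the top ops.

    On a GPU run you would sort by self CUDA time instead; this sketch
    sorts by self CPU time so it runs anywhere.
    """
    activities = [ProfilerActivity.CPU]
    if torch.cuda.is_available():
        activities.append(ProfilerActivity.CUDA)
    with profile(activities=activities, record_shapes=True) as prof:
        step_fn()
    # key_averages() aggregates events by op name; table() gives a quick ranking
    return prof.key_averages().table(sort_by="self_cpu_time_total",
                                     row_limit=top_k)
```

The returned table is the "quick hotspot ranking": the top rows name the ops to investigate before any kernel work.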
Before running any benchmark, profiler, or kernel-validation command:
- Run `scripts/diffusion_skill_env.py` to derive the repo root from `sglang.__file__`.
- Set `HF_TOKEN` before using gated Hugging Face models such as `black-forest-labs/FLUX.*`.
- Set `FLASHINFER_DISABLE_VERSION_CHECK=1`.

All diffusion benchmark and profiling results owned by this skill must come from the native SGLang diffusion backend.
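A sketch of what that preflight amounts to. The real logic lives in `scripts/diffusion_skill_env.py`; the directory layout assumed here (package two levels below the repo root) is a guess, as is the helper's shape.

```python
import importlib.util
import os
from pathlib import Path

def derive_repo_root(module: str = "sglang") -> Path:
    """Derive a repo root from an importable module's location.

    Assumes an editable-install layout like <repo>/python/<module>/__init__.py;
    adjust parents[...] for your checkout.
    """
    spec = importlib.util.find_spec(module)
    if spec is None or not spec.origin:
        raise RuntimeError(f"{module} is not importable; activate the repo environment")
    return Path(spec.origin).resolve().parents[2]

def check_env() -> None:
    """Warn about missing HF_TOKEN and pin the FlashInfer version-check flag."""
    if "HF_TOKEN" not in os.environ:
        print("warning: HF_TOKEN unset; gated models such as "
              "black-forest-labs/FLUX.* will fail to download")
    os.environ.setdefault("FLASHINFER_DISABLE_VERSION_CHECK", "1")
```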
Treat any of the following as a hard stop condition:
- "Falling back to diffusers backend"
- "Using diffusers backend"
- "Loaded diffusers pipeline"

If any benchmark, perf-dump, or torch.profiler command prints one of those signals, stop: the run did not use the native SGLang backend, so its numbers do not count.
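Checking captured command output for those signals can be as simple as the sketch below; the signal strings are exactly the three listed above.

```python
DIFFUSERS_SIGNALS = (
    "Falling back to diffusers backend",
    "Using diffusers backend",
    "Loaded diffusers pipeline",
)

def scan_for_fallback(log_text: str):
    """Return the first hard-stop signal found in a benchmark/profiler log,
    or None if the run stayed on the native SGLang backend."""
    for line in log_text.splitlines():
        for signal in DIFFUSERS_SIGNALS:
            if signal in line:
                return signal
    return None
```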
- torch.profiler workflow; uses the checked-in nightly-aligned presets, plus LTX-2, LTX-2.3 one-stage, and LTX-2.3 two-stage benchmark recipes
- QK norm + RoPE, and distributed overlap patterns before proposing new code
- sglang.__file__, write-access probe, benchmark/profile output directories, idle GPU selection
- sglang generate; pins --backend=sglang, supports --no-torch-compile, and saves perf dumps by label for compare_perf.py

Before calling a diffusion hotspot "new", first classify it with existing-fast-paths.md.
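A toy version of the labeled before/after comparison. The real tool is compare_perf.py; the flat `{op_name: milliseconds}` layout here is purely an assumption for illustration.

```python
def compare_perf(before: dict, after: dict, top_k: int = 5):
    """Rank per-op time deltas between two labeled perf dumps.

    A positive delta means the op got slower in the "after" dump.
    Ops missing from one dump are treated as 0 ms there.
    """
    deltas = {op: after.get(op, 0.0) - before.get(op, 0.0)
              for op in set(before) | set(after)}
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
```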
Always rule out these existing families first:
- QK norm + RoPE
- torch.compile compute / communication reorder

If the user explicitly requires torch.compile to stay off, do not use the default benchmark preset invocation unchanged. Either pass the checked-in benchmark helper its no-compile switch or run the equivalent manual command without `--enable-torch-compile`.
For FLUX-family manual profiling runs with a quantized transformer override:
- Call `sglang generate` directly.
- Pass `--transformer-path <dir>`.
- Use `--prompt-path <file>` when also fixing `--output-file-name`.
- Use `--model-path` plus `HF_HUB_OFFLINE=1`.
- `--profile` changes latency substantially; use the non-profile perf dump for the real before/after benchmark claim.
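The flag combination above can be assembled like this. The model name is one concrete example from the gated FLUX family, and both paths are placeholders you must supply; the helper itself is illustrative, not part of the skill.

```python
import os
import shlex

def build_flux_cmd(transformer_dir: str, prompt_file: str, out_name: str) -> str:
    """Assemble a manual FLUX profiling invocation from the documented flags."""
    os.environ["HF_HUB_OFFLINE"] = "1"  # serve weights from the local HF cache
    argv = [
        "sglang", "generate",
        "--backend=sglang",
        "--model-path", "black-forest-labs/FLUX.1-dev",  # example gated model
        "--transformer-path", transformer_dir,
        "--prompt-path", prompt_file,
        "--output-file-name", out_name,
    ]
    return shlex.join(argv)
```

Run the same command twice, with and without `--profile`, and keep only the non-profile run's perf dump for the benchmark claim.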