Add a new model to the SGLang Cookbook, including documentation, sidebar, config generator component, and model YAML configuration.
Interactive, multi-step workflow. Collect inputs incrementally — don't ask for everything upfront.
Ask the user for:
- The model's HuggingFace page or ID (e.g., Qwen/Qwen3-Coder-Next). Fetch the page to extract description, capabilities, etc. If the model isn't public yet, ask the user to paste what they know (name, param count, architecture, capabilities, context length).
- Whether the model has multiple variants — see Qwen3CoderConfigGenerator and Qwen3NextConfigGenerator for multi-variant patterns.
- The `sglang serve --model-path` command with all flags (tp, dp, ep, etc.). Not `python -m sglang.launch_server` (deprecated, issue #33). If the model card provides one, use it as a starting point but verify the format.
- The target SGLang version (e.g., v0.5.8); the model YAML goes in `data/models/src/<version>/`.

Read ALL reference templates first, then create files.
Reference templates to read:

- Docs: `docs/autoregressive/` (e.g., Qwen3-Coder.md, DeepSeek-V3_2.md)
- Components: `src/components/autoregressive/` (e.g., Qwen3NextConfigGenerator/index.js)
- Model YAML: `data/models/src/<version>/<similar-model>.yaml`. List `data/models/src/` for available versions.
- `sidebars.js`
- `data/models/vendors.yaml`

Conventions:

- Config generators live at `src/components/autoregressive/<ModelName>ConfigGenerator/index.js` (not nested in vendor folders)
- Model YAML goes in `data/models/src/<version>/` (not directly in `data/models/src/`)
- Base ConfigGenerator component: `src/components/base/ConfigGenerator`
- Always `sglang serve` — never `python -m sglang.launch_server`
- Check for existing PRs (`gh pr list --search "<model name>"`) to avoid duplicate work
- For `commandRule` options, follow the `Object.entries(this.options).forEach(...)` pattern from existing generators
- Only include platforms the user has actually tested.
| Platform | Vendor | Memory | Docker Image |
|---|---|---|---|
| A100 | NVIDIA | 80GB | lmsysorg/sglang:<ver> |
| H100 | NVIDIA | 80GB | lmsysorg/sglang:<ver> |
| H200 | NVIDIA | 141GB | lmsysorg/sglang:<ver> |
| B200 | NVIDIA | 180GB | lmsysorg/sglang:<ver> |
| B300 | NVIDIA | 275GB | lmsysorg/sglang:<ver> |
| MI300X | AMD | 192GB | lmsysorg/sglang:<ver>-rocm720-mi30x |
| MI325X | AMD | 256GB | lmsysorg/sglang:<ver>-rocm720-mi30x |
| MI350X | AMD | 288GB | lmsysorg/sglang:<ver>-rocm720-mi35x |
| MI355X | AMD | 288GB | lmsysorg/sglang:<ver>-rocm720-mi35x |
TP calculation: model_weight_GB / gpu_mem_GB, round up to nearest power of 2. Leave 20-30% headroom.
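The rule above can be sketched roughly as follows (this is an illustration, not the generator's actual code; the example numbers are illustrative, not measured):

```javascript
// Rough sketch of the TP rule above — not the actual generator code.
// weightGB: model weights in GB (params × bytes/param: FP8 ≈ 1, BF16 ≈ 2).
function estimateTp(weightGB, gpuMemGB, headroom = 0.25) {
  const usable = gpuMemGB * (1 - headroom);        // leave 20-30% headroom
  const minGpus = Math.max(1, weightGB / usable);  // raw GPU count needed
  return 2 ** Math.ceil(Math.log2(minGpus));       // round up to power of 2
}

// A ~480 GB FP8 checkpoint on H200 (141 GB): estimateTp(480, 141) → 8
// The same model in BF16 (~960 GB): estimateTp(960, 141) → 16 (the ~2x rule)
```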
Platform-specific flags (only add if tested):
- NVIDIA Blackwell: `--attention-backend trtllm_mha`
- AMD MI355X: `--attention-backend triton`
- AMD env vars: `SGLANG_USE_AITER=1`, `SGLANG_ROCM_FUSED_DECODE_MLA=0`
- Verify attention heads divide evenly across GPUs (`heads_per_gpu % 16 == 0`)

Create `docs/autoregressive/<Vendor>/<ModelName>.md`:
- Use TODO output placeholders for command outputs
- Use TODO result placeholders for benchmark results

Benchmark commands:
- Accuracy (GSM8K): `python3 benchmark/gsm8k/bench_sglang.py --port <port>`
- Accuracy (MMLU): `python3 benchmark/mmlu/bench_sglang.py --port <port>`
- Accuracy (MMMU): `python3 benchmark/mmmu/bench_sglang.py --port <port>` — uses a universal answer regex that works across models. Don't use model-specific parsing (e.g., `<|begin_of_box|>`) as it breaks with standard answer formats.
- Latency: `python3 -m sglang.bench_serving --backend sglang --num-prompts 10 --max-concurrency 1 ...`
- Throughput: `python3 -m sglang.bench_serving --backend sglang --num-prompts 1000 --max-concurrency 100 ...`

Keep benchmarks concise. Order: accuracy first, then speed. Don't add multiple scenarios or concurrency levels unless asked.
Notes:
- Don't hardcode sampling parameters (`temperature`, `top_p`) — SGLang uses `generation_config.json` defaults
- Include thinking-disabled (`enable_thinking: False`) examples
- Format raw response objects (`ChatCompletionMessage(...)`) into readable structured output

Edit `sidebars.js` — add the new entry under the right vendor.
Update docs/intro.md (homepage):
- Mark entries `- [x]` if the doc has real content, `- [ ]` if it's a stub/placeholder
- Keep NEW tags to 3 or fewer total — if adding one, remove the oldest first (check git history)
- Order in `intro.md` should match `sidebars.js`

Create `src/components/autoregressive/<ModelName>ConfigGenerator/index.js`.
- Extend the base ConfigGenerator component
- Define `modelConfigs` with per-hardware `tp` and `mem` values: `h200: { fp8: { tp: 8, mem: 0.85 }, bf16: { tp: 16, mem: 0.85 } }`

In `generateCommand`:
```js
const isAMD = ['mi300x','mi325x','mi350x','mi355x'].includes(hardware);
const isBlackwell = ['b200','b300'].includes(hardware);
if (isAMD) { /* AMD-specific flags */ }
if (isBlackwell) { /* Blackwell-specific flags */ }
```
- Add a `commandRule` for optional features (tool calling, reasoning parser, etc.)

Reasoning parser: For hybrid models, use an Enabled/Disabled toggle (the model always thinks; the parser just separates the output). For separate Instruct/Thinking variants, the toggle changes the model name suffix.
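As a hedged illustration of the `commandRule` pattern (option names, labels, and the exact base-class API here are assumptions — copy the real shape from an existing generator, not from this sketch):

```javascript
// Hypothetical sketch of an options map with a commandRule, mirroring the
// Object.entries(this.options).forEach(...) pattern. Names are illustrative.
class ExampleConfigGenerator {
  constructor() {
    this.options = {
      toolcalling: {
        label: 'Tool Calling',
        items: ['disabled', 'enabled'],
        // Maps the selected value to extra CLI flags (empty string = no-op)
        commandRule: (value) =>
          value === 'enabled' ? ' \\\n  --tool-call-parser <parser>' : '',
      },
    };
  }

  applyOptionRules(values, cmd) {
    Object.entries(this.options).forEach(([key, opt]) => {
      if (opt.commandRule) cmd += opt.commandRule(values[key]);
    });
    return cmd;
  }
}
```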
DP Attention: Disabled (Low Latency) / Enabled (High Throughput). The --dp value commonly matches --tp but this isn't mandatory. Handle in generateCommand, not via static commandRule:
```js
if (values.dpattention === 'enabled') {
  cmd += ` \\\n --dp ${tpValue} \\\n --enable-dp-attention`;
}
```
In config tips, describe --dp matching --tp as a common pattern, not a requirement.
Large models (>400B): BF16 needs ~2x GPUs vs FP8. Reflect this in modelConfigs. Omit combos that don't fit.
Multiple variants: Add modelSize and/or quantization selectors. See GLM51ConfigGenerator, GLM5ConfigGenerator, Qwen3CoderConfigGenerator, Qwen3NextConfigGenerator for patterns.
Platform-required flags: If a platform requires certain flags to function at all (e.g., AMD MI355X needs --attention-backend triton), add them unconditionally for that platform — NOT gated behind optional checkboxes like "Performance Optimizations". Optional optimizations go inside checkbox guards; required-to-work flags go outside.
No dead code: Don't define commandRule on options if generateCommand handles them directly (the rules will never be called). Don't use getDynamicItems if the items don't depend on other option values — use static items instead. Don't leave unused helper functions.
No silent ignores: If a feature (e.g., DP attention) is unsupported on a platform, either disable the UI option or show an explicit message (like a "Work In Progress" note). Never silently drop user selections.
Scope discipline: If adding support for one platform, don't accidentally add global flags. Always check conditionals: if (quantization === 'fp8') without a hardware guard affects ALL platforms. Be explicit: if (hardware === 'h200' && quantization === 'fp8').
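A minimal sketch of that guard (the flag and values are illustrative only):

```javascript
// Scope discipline: the hardware guard keeps an FP8-only flag from leaking
// onto every platform. Flag names here are illustrative.
function buildQuantFlags(hardware, quantization) {
  let flags = '';
  // BAD — applies to ALL platforms:
  //   if (quantization === 'fp8') { flags += ' --quantization fp8'; }
  // GOOD — explicit hardware guard:
  if (hardware === 'h200' && quantization === 'fp8') {
    flags += ' --quantization fp8';
  }
  return flags;
}
```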
License accuracy: Always verify the actual HuggingFace model license before writing the license section. Don't copy from other model docs — licenses vary (Apache 2.0, MIT, community licenses, etc.).
Create data/models/src/<version>/<modelname>.yaml:
Recipes to include:

- `default` — balanced single-node
- `high-throughput-dp` — if DP attention is supported
- `speculative-mtp` or `speculative-eagle` — if speculative decoding is supported

Valid `thinking_capability` enum values: `non_thinking`, `thinking`, `hybrid`. Don't use `hybrid_thinking` or other variants — pre-commit validation rejects them.
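A loosely hedged skeleton of what such a file might contain (every field name other than `thinking_capability` and the recipe names is a guess — copy a real file from `data/models/src/<version>/` and check the schema in `data/schema`):

```yaml
# Hypothetical skeleton — field names other than thinking_capability and the
# recipe names are guesses; validate with data/scripts/compile_models.py.
thinking_capability: hybrid        # non_thinking | thinking | hybrid only
recipes:
  - name: default                  # balanced single-node
  - name: high-throughput-dp       # only if DP attention is supported
```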
Ensure venv exists:
```bash
python3 -m venv .venv
source .venv/bin/activate && pip install pre-commit pyyaml
```
Compile and validate:
```bash
source .venv/bin/activate && python data/scripts/compile_models.py
cd data/schema && npm install && npm test
```
Full build (catches import errors, broken links, component issues — more reliable than the dev server):

```bash
npm run build
```
Dev server for visual check:

```bash
npm start
```

Check the page renders at http://localhost:3000.
User deploys the model, runs test scripts, pastes results. Replace TODO placeholders with actual outputs:
Ask for:
- Tuned `mem-fraction-static` values

Add the results to the docs.
Can be triggered with /add-model review. Also consider running /review-pr on the PR for an automated checklist pass.
Review the complete documentation for:
- All code fences are properly closed
- Consistent `base_url` port on the same page
- TODO placeholders replaced with actual results
- `export default` matches the actual class name (common copy-paste bug)
- `sglang serve` everywhere — no deprecated `python -m sglang.launch_server` or `python3 -m sglang.launch_server`
- `modelConfigs` include both `tp` and `mem` values per hardware/quantization
- `--dp` value dynamically matches `--tp` in the generator
- Homepage (`docs/intro.md`) includes the new model entry and matches the sidebar order
- Raw response objects (`ChatCompletionMessage(...)`) are formatted into readable structured output (Reasoning/Content/Tool Calls sections)
- No dead code (unused `commandRule`, unused helper functions, `getDynamicItems` returning static arrays)
- Env var examples use `export VAR=<your-value>`, not `export VAR=${VAR}` (which is a bash no-op)

Always create a new branch — never commit to main directly.
```bash
git checkout -b add-<model-name>
# ... make changes ...
git add <specific files>
git commit -m "Add <Model Name> cookbook"
git push -u origin add-<model-name>
gh pr create --title "Add <Model Name> cookbook" --body "..."
```
When checking homepage entries, verify the doc has real content — not just a "Community contribution welcome" stub.