Develops production-ready Triton kernels for Liger Kernel. Creates new kernels from PyTorch operations (local files, URLs, code snippets, or natural language) with ops, module wrappers, functional APIs, unit tests, benchmarks, and plots. Also modifies existing Liger kernels. Use when adding a new Triton kernel, converting a PyTorch operation to Triton, or updating an existing Liger kernel.
Develops Triton kernels for Liger Kernel through a 3-stage pipeline with human review between stages. Supports creating new kernels and modifying existing ones. NVIDIA GPUs only.
Spawn an Analyzer agent (read analyzer.md).
Accepts any input: local file, URL, code snippet, natural language description, or model component reference. Produces a standalone PyTorch reference implementation and a kernel profile.
Human checkpoint: Present PyTorch reference + kernel profile. Confirm before proceeding.
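The checkpoint pattern that separates the stages can be sketched as a small gate: run a stage, show its output to a reviewer, and continue only on approval. This is a minimal illustration, not the skill's actual mechanism; `run_stage_with_checkpoint`, `analyze`, and the `confirm` callback are hypothetical names introduced here.

```python
from typing import Callable, Optional


def run_stage_with_checkpoint(
    stage: Callable[[], str],
    confirm: Callable[[str], bool],
) -> Optional[str]:
    """Run one stage, present its output, and return it only if approved."""
    result = stage()
    if confirm(result):
        return result
    return None  # reviewer declined: the caller should not start the next stage


# Hypothetical stand-in for the Analyzer stage output.
def analyze() -> str:
    return "PyTorch reference + kernel profile"


# In practice `confirm` would ask a human; lambdas stand in for that here.
approved = run_stage_with_checkpoint(analyze, confirm=lambda summary: True)
rejected = run_stage_with_checkpoint(analyze, confirm=lambda summary: False)
```

Returning `None` on rejection keeps the decision to proceed with the caller, which matches the "confirm before proceeding" wording above.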
Spawn a Generator agent (read generator.md).
Generates/modifies up to 8 files:
src/liger_kernel/ops/{kernel}.py: NEW Triton kernels + autograd Function
src/liger_kernel/transformers/{kernel}.py: NEW nn.Module wrapper
src/liger_kernel/transformers/functional.py: MODIFY add functional API
src/liger_kernel/ops/__init__.py: MODIFY export Function class
src/liger_kernel/transformers/__init__.py: MODIFY export Module + __all__
test/transformers/test_{kernel}.py: NEW unit tests
benchmark/scripts/benchmark_{kernel}.py: NEW benchmark script
benchmark/data/all_benchmark_data.csv: MODIFY (after benchmarks run)
Human checkpoint: Present changes for review.
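To make the `{kernel}` placeholder concrete, the file list above can be expanded for a given kernel name as in the sketch below. The path templates are copied from the list; the helper `plan_files` and the kernel name `my_op` are made up for illustration and are not part of Liger Kernel.

```python
from typing import List, Tuple

# Path templates copied from the file list; {kernel} is the kernel's file name.
FILE_PLAN = [
    ("src/liger_kernel/ops/{kernel}.py", "NEW"),
    ("src/liger_kernel/transformers/{kernel}.py", "NEW"),
    ("src/liger_kernel/transformers/functional.py", "MODIFY"),
    ("src/liger_kernel/ops/__init__.py", "MODIFY"),
    ("src/liger_kernel/transformers/__init__.py", "MODIFY"),
    ("test/transformers/test_{kernel}.py", "NEW"),
    ("benchmark/scripts/benchmark_{kernel}.py", "NEW"),
    ("benchmark/data/all_benchmark_data.csv", "MODIFY"),
]


def plan_files(kernel: str) -> List[Tuple[str, str]]:
    """Fill in the kernel name and return (path, action) pairs."""
    return [(template.format(kernel=kernel), action) for template, action in FILE_PLAN]


plan = plan_files("my_op")  # "my_op" is a hypothetical kernel name
new_files = [path for path, action in plan if action == "NEW"]
for path, action in plan:
    print(f"{action:<6} {path}")
```

Four of the eight entries are new files; the other four are edits to files that already exist in the repository.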
Spawn a Validator agent (read validator.md).
Runs checkstyle, unit tests (a hard gate: the stage stops on persistent failure), and benchmarks, then generates plots. Optionally runs ncu (NVIDIA Nsight Compute) profiling.
Human checkpoint: Report final results with benchmark numbers and plots.
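The ordering of the validation steps, with unit tests acting as a hard gate, can be sketched as follows. This is an illustration under stated assumptions: the `validate` function, the step names, and the retry budget of 3 are hypothetical, and "persistent failure" is interpreted here as failing every retry. The source does not say whether a checkstyle failure also stops the run, so the sketch does not gate on it.

```python
from typing import Callable, Dict, List


def validate(
    steps: Dict[str, Callable[[], bool]],
    max_test_attempts: int = 3,  # assumed retry budget, not from the source
    run_ncu: bool = False,
) -> List[str]:
    """Run the validation steps in order and return the names that completed."""
    done: List[str] = []
    steps["checkstyle"]()  # result not gated on in this sketch
    done.append("checkstyle")

    # Hard gate: retry the tests a few times, then stop the whole run.
    for _ in range(max_test_attempts):
        if steps["unit_tests"]():
            done.append("unit_tests")
            break
    else:
        return done  # persistent failure: skip benchmarks and plots

    for name in ("benchmarks", "plots"):
        steps[name]()
        done.append(name)
    if run_ncu:  # profiling is optional
        steps["ncu"]()
        done.append("ncu")
    return done


# Stand-in steps; real ones would shell out to the actual tools.
names = ("checkstyle", "unit_tests", "benchmarks", "plots", "ncu")
ok = {name: (lambda: True) for name in names}
failing = dict(ok, unit_tests=lambda: False)

passed_run = validate(ok)
failed_run = validate(failing)
```

Placing the gate before the benchmarks avoids spending GPU time measuring a kernel that has not yet been shown to be correct.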