Use when adding a new model or pipeline to diffusers, setting up file structure for a new model, converting a pipeline to modular format, or converting weights for a new version of an already-supported model.
Integrate a new model into diffusers end-to-end. The overall flow:
1. Implement the model and pipeline, restructuring to diffusers APIs as needed.
2. Wire up exports in the `__init__.py` files.
3. Use the parity-testing skill to verify component and end-to-end numerical correctness against the reference implementation.

Work one workflow at a time — get it to full parity before moving on.
<!-- TODO: Add concrete examples as we encounter them. Common patterns to watch for:
- Fused QKV weights that need splitting into separate Q, K, V
- Scale/shift ordering differences (reference stores [shift, scale], diffusers expects [scale, shift])
- Weight transpositions (linear stored as transposed conv, or vice versa)
- Interleaved head dimensions that need reshaping
- Bias terms absorbed into different layers
Add each with a before/after code snippet showing the conversion. -->

Before writing any code, gather info in this order:
Use AskUserQuestion with structured choices for step 3 when the options are known.
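The weight-conversion gotchas flagged in the TODO above can be sketched concretely. This is a hedged illustration, not a real checkpoint mapping — the key names (`attn.qkv.weight`, `norm.modulation.weight`, `attn.to_q.weight`, etc.) and shapes are hypothetical:

```python
# Hypothetical state-dict conversion covering two of the common patterns:
# fused QKV splitting and [shift, scale] -> [scale, shift] reordering.
import torch


def convert_state_dict(ref_sd: dict) -> dict:
    new_sd = {}
    # Pattern 1: fused QKV stored as [3 * dim, dim] -> separate Q, K, V projections.
    qkv = ref_sd["attn.qkv.weight"]
    q, k, v = torch.chunk(qkv, 3, dim=0)
    new_sd["attn.to_q.weight"] = q
    new_sd["attn.to_k.weight"] = k
    new_sd["attn.to_v.weight"] = v
    # Pattern 2: reference stores [shift, scale]; diffusers expects [scale, shift].
    shift, scale = torch.chunk(ref_sd["norm.modulation.weight"], 2, dim=0)
    new_sd["norm.linear.weight"] = torch.cat([scale, shift], dim=0)
    return new_sd
```

Each real conversion should get a before/after snippet like this once the actual key names and layouts are known.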
```
src/diffusers/
  models/transformers/transformer_<model>.py   # The core model
  schedulers/scheduling_<model>.py             # If model needs a custom scheduler
  pipelines/<model>/
    __init__.py
    pipeline_<model>.py                        # Main pipeline
    pipeline_<model>_<variant>.py              # Variant pipelines (e.g. pyramid, distilled)
    pipeline_output.py                         # Output dataclass
  loaders/lora_pipeline.py                     # LoRA mixin (add to existing file)
tests/
  models/transformers/test_models_transformer_<model>.py
  pipelines/<model>/test_<model>.py
  lora/test_lora_layers_<model>.py
docs/source/en/api/
  pipelines/<model>.md
  models/<model>_transformer3d.md              # or appropriate name
```
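The layout above can be scaffolded with a short script. A hedged sketch — the model name and exact file list are placeholders, so adjust for the variant pipelines and scheduler the actual model needs:

```python
from pathlib import Path

MODEL = "mymodel"  # placeholder model name

FILES = [
    f"src/diffusers/models/transformers/transformer_{MODEL}.py",
    f"src/diffusers/pipelines/{MODEL}/__init__.py",
    f"src/diffusers/pipelines/{MODEL}/pipeline_{MODEL}.py",
    f"src/diffusers/pipelines/{MODEL}/pipeline_output.py",
    f"tests/models/transformers/test_models_transformer_{MODEL}.py",
    f"tests/pipelines/{MODEL}/test_{MODEL}.py",
    f"tests/lora/test_lora_layers_{MODEL}.py",
    f"docs/source/en/api/pipelines/{MODEL}.md",
]


def scaffold(root: Path) -> list[Path]:
    """Create empty placeholder files for the new model's tree."""
    created = []
    for rel in FILES:
        path = root / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
        created.append(path)
    return created
```

Note that `loaders/lora_pipeline.py` is deliberately absent from the list: the LoRA mixin is added to the existing file, not created fresh.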
Checklist:
- `from_pretrained` support
- `__call__` method
- `__init__.py` files (lazy imports)
- `make style` and `make quality` pass
- parity verified (parity-testing skill)

See ../../models.md for the attention pattern, implementation rules, common conventions, dependencies, and gotchas. These apply to all model work.
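On the lazy-import point: diffusers `__init__.py` files map public names to submodules and defer the actual import until attribute access (via its `_LazyModule` helper). A simplified, stdlib-only illustration of the mechanism — not the real diffusers helper:

```python
import importlib
import types


class LazyModule(types.ModuleType):
    """Simplified stand-in for diffusers' _LazyModule: defer submodule
    imports until a public name is actually accessed."""

    def __init__(self, name: str, import_structure: dict):
        super().__init__(name)
        # Invert {module: [names]} into {name: module} for fast lookup.
        self._name_to_module = {
            attr: mod for mod, attrs in import_structure.items() for attr in attrs
        }

    def __getattr__(self, item):
        module_name = self._name_to_module.get(item)
        if module_name is None:
            raise AttributeError(f"module {self.__name__!r} has no attribute {item!r}")
        module = importlib.import_module(module_name)  # imported only now
        return getattr(module, item)
```

In the real pipeline `__init__.py`, the structure typically looks like `_import_structure = {"pipeline_<model>": ["<Model>Pipeline"]}` plus a `TYPE_CHECKING` branch so static analyzers still see the direct imports.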
Don't combine structural changes with behavioral changes. Restructuring code to fit diffusers APIs (ModelMixin, ConfigMixin, etc.) is unavoidable. But don't also "improve" the algorithm, refactor computation order, or rename internal variables for aesthetics. Keep numerical logic as close to the reference as possible, even if it looks unclean. For standard → modular, this is stricter: copy loop logic verbatim and only restructure into blocks. Clean up in a separate commit after parity is confirmed.
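One reason the "keep numerical logic as close to the reference as possible" rule is strict: floating-point arithmetic is not associative, so even reordering additions breaks bitwise parity. A tiny demonstration:

```python
# Reassociating a sum changes the result in float64.
a, b, c = 1e17, -1e17, 1.0

left_first = (a + b) + c   # cancellation happens first -> 1.0
right_first = a + (b + c)  # 1.0 is lost when added to -1e17 -> 0.0

assert left_first == 1.0
assert right_first == 0.0
```

Reductions and fused ops in the reference may depend on a specific evaluation order; replicate it even when a "cleaner" ordering looks equivalent on paper.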
Mark heavy tests with `@slow` and run them with `RUN_SLOW=1`. Use the `BaseModelTesterConfig`, `ModelTesterMixin`, `MemoryTesterMixin`, `AttentionTesterMixin`, `LoraTesterMixin`, and `TrainingTesterMixin` classes initially to write the tests. Any additional tests should be added after discussions with the maintainers. Use `tests/models/transformers/test_models_transformer_flux.py` as a reference.

See modular.md for the full guide on modular pipeline conventions, block types, build order, guider abstraction, gotchas, and conversion checklist.