Add a new image or video generative model to the Draw Things app / CLI with a compile-first, end-to-end workflow across SwiftDiffusion, tokenizer plumbing, text encoder, fixed encoder, UNet / DiT runtime, VAE, converter and quantizer tooling, and CLI validation.
Use this when adding a new image or video generative model, or a new model version, to the Draw Things app / CLI, especially when most of the surrounding components already exist and the task is mainly integration.
This skill is optimized for the current SwiftDiffusion / Draw Things layout:
architecture builder in Libraries/SwiftDiffusion/Sources/Models
weight-loading logic returned as ModelWeightMapper from model builders and helper blocks
text path in Libraries/SwiftDiffusion/Sources/TextEncoder.swift
fixed encoder path in Libraries/SwiftDiffusion/Sources/UNetFixedEncoder.swift
runtime UNet / DiT path in Libraries/SwiftDiffusion/Sources/Models/UNetProtocol.swift
VAE path in Libraries/SwiftDiffusion/Sources/FirstStage.swift
In this repo, UNet in names such as UNetProtocol, UNetFixedEncoder, and UNetExtractConditions is the legacy name for the main diffusion model integration boundary. The underlying architecture may be a DiT or another non-UNet model.
Related Skills
Default Approach
Get the new generative model compiling end-to-end before optimizing or slicing.
Prefer minimal, explicit integration over generic abstractions.
If a new generative model reuses an existing tokenizer, text encoder, or VAE, hook that up first instead of creating a new variant.
For compile sweeps, it is acceptable to add case .newModel: fatalError() placeholders first, then fill them in.
For bring-up, temporarily prefer strict loading such as read(model: "model", strict: true, ...) to surface missing or mismatched keys early. Do not ship that debugging behavior unintentionally.
If text-conditioning boundary placement is ambiguous, choose the boundary that avoids passing large intermediate tensors across module boundaries.
Prefer the repo's usual UNetFixedEncoder / UNetProtocol integration split as the long-term structure, but do not force that split into the first implementation if it makes bring-up materially harder.
Keep the known-good unsplit / unsliced path as the release baseline until any later fixed split or cache path proves output parity.
Do not introduce a model-specific config bag just to build one model graph; follow nearby builders and pass the needed parameters explicitly.
If a reference implementation uses a different tensor layout from the app runtime, adapt layout at the model boundary first before debugging higher-level plumbing.
Prefer the partner runtime path over converter or export harnesses when they disagree about runtime behavior.
Treat text-conditioning contract bugs as first-class integration bugs. Wrong padding, masking, unconditional handling, prompt templates, or adapter boundary placement can preserve tensor shapes while destroying prompt adherence.
Do not assume all runtime side inputs have the same rank. CFG splitting and extracted-condition logic must respect the actual tensor ranks.
If a model removes an external mask or padding input, only synthesize an internal zero mask when the architecture really allows it.
If a model checkpoint needs to flow through an existing asset slot to reach the right subsystem, prefer the existing pass-through pattern over inventing a new file-plumbing path.
If partner implementations exist, prefer the one that matches the shipped model behavior over a loosely related upstream base repo.
If a model supports multiple SDPA scaling modes, keep that as an enum-like runtime choice end to end instead of collapsing it to a boolean.
If a tiled path uses spatial rotary embeddings, generate the full-image rotary tensor once and slice tiles from it. Do not regenerate tile-local rotary tensors unless the partner runtime does that explicitly.
If a fixed split is introduced later, keep self-attention and cross-attention weight naming distinct and assert fixed-output count and ordering so silent misloads fail early.
A first successful CLI run is not enough validation. Run a real sample, inspect the image, and pin --seed when comparing semantic fixes.
Temporary debug prints, env toggles, and strict-load hooks are bring-up tools only. Remove them before handoff and validate the cleaned tree again.
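The compile-sweep placeholder pattern above can be sketched as follows. This is an illustrative Swift sketch only: the enum, its cases, and the function are hypothetical stand-ins, not the repo's actual types.

```swift
// Hypothetical sketch: ModelVersion and textEncoderName are illustrative,
// not the repo's real declarations.
enum ModelVersion { case v1, sdxl, newModel }

func textEncoderName(for version: ModelVersion) -> String {
  switch version {
  case .v1, .sdxl:
    return "clip_vit_l14"
  case .newModel:
    // Acceptable during a compile sweep so every switch stays exhaustive;
    // replace with a real implementation before the model can run.
    fatalError("newModel text encoder not wired up yet")
  }
}
```

Because the switch has no default clause, the compiler surfaces every site that still needs a real newModel arm, which is the point of the sweep.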
Workflow
1. Add the main model builder and weight mapping
Add the new diffusion builder in Libraries/SwiftDiffusion/Sources/Models/<Model>.swift.
Follow the structure used by nearby large-model integrations:
prefer the eventual UNetFixedEncoder / UNetProtocol integration split structurally
do not slice the model yet
do not extract AdaLN or KV-precompute paths yet
usually keep the fixed part unextracted and wire the full model through so end-to-end generation can run first
only pull work into UNetFixedEncoder early if it is already simple, obvious, and low-risk
Only add slicing later if profiling or architecture constraints require it.
make the unsliced path work first
if a later fixed / sliced split changes semantic output, keep the unsplit path as the release baseline until parity is proven
do not leave a speculative fixed split half-wired into the normal runtime path just because it compiles
confirm the runtime model input order matches the actual UNetProtocol call contract
if CFG splitting is enabled, do not assume every side input is rank-3
if UNetExtractConditions is used, only slice tensors that are actually timestep-major extracted conditions
do not slice or index batched text context by sampler step unless the data is explicitly laid out that way
if technical execution succeeds but samples remain noise, compare the attention and residual-dtype path against the partner implementation before changing higher-level app plumbing
if a partner implementation keeps the residual stream in higher precision than attention / FFN, mirror that split if the lower-precision path produces NaNs or semantic collapse
if a split fixed path is added, keep the unsplit path as the output-parity baseline until the split path is proven on real images
if the split path precomputes cross-attention KV or modulation terms, assert the expected number of returned tensors and keep their ordering explicit
if the model has repeated self-attention and cross-attention blocks, do not let their checkpoint keys collide by sharing a naming pattern that only differs by enumeration position
if tiled diffusion is supported, feed full-image rotary into the shared slice path instead of rebuilding tile-local rotary tensors
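The distinct self-attention / cross-attention naming rule can be sketched like this. Everything here is hypothetical: WeightMapper is a simplified stand-in for the repo's ModelWeightMapper, and all key names are invented for illustration.

```swift
// Hypothetical sketch: maps each graph parameter name to the checkpoint
// keys it loads from. Keeping self_attn / cross_attn prefixes distinct
// means repeated blocks can never collide on a key that differs only by
// enumeration position.
typealias WeightMapper = (Int) -> [String: [String]]

func blockMapper(prefix: String) -> WeightMapper {
  return { i in
    [
      // Distinct per attention kind; only the block index is shared.
      "\(prefix).blocks.\(i).self_attn.qkv.weight":
        ["blocks.\(i).attn1.to_qkv.weight"],
      "\(prefix).blocks.\(i).cross_attn.q.weight":
        ["blocks.\(i).attn2.to_q.weight"],
    ]
  }
}
```

With this shape, asserting the mapper's returned tensor count and ordering per block makes a silent misload fail at load time rather than as noise in samples.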
6. Sweep converter and quantizer tools
If the model is user-facing or importable, extend the non-runtime tools too:
Apps/ModelConverter/Converter.swift
Apps/LoRAConverter/Converter.swift
Apps/ModelQuantizer/Quantizer.swift
Common integration misses:
runtime code builds, but converter or quantizer switches are non-exhaustive
tool help text omits the new model version
quantization policy silently uses the wrong fallback because the new model family is unhandled
checkpoint key remap shims are kept after the converted checkpoint has been regenerated with correct names
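The quantization-policy miss above can be sketched as an exhaustiveness problem. The enum, cases, and policy strings here are illustrative assumptions, not the actual quantizer code.

```swift
// Hypothetical sketch: a non-exhaustive switch with a default clause would
// silently route a new model family to the wrong fallback. Omitting
// `default:` forces a compile error when a case is added.
enum ModelFamily { case sd, sdxl, flux, newModel }

func quantizationPolicy(for family: ModelFamily) -> String {
  switch family {  // no default: adding a case must update this switch
  case .sd, .sdxl: return "q6p"
  case .flux:      return "q8p"
  case .newModel:  return "q8p"  // explicit choice, not an inherited fallback
  }
}
```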
7. Hook VAE through the existing first-stage path
If the new model reuses an existing latent contract:
reuse the existing first-stage path in Libraries/SwiftDiffusion/Sources/FirstStage.swift
avoid creating a new VAE branch unless the latent contract actually proves incompatible
Goal:
encode/decode works by routing through the minimal existing first-stage behavior the model is compatible with
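The reuse decision can be thought of as a latent-contract comparison. The struct and the values below are illustrative only; the real routing lives in FirstStage.swift and keys off the model version, not a descriptor like this.

```swift
// Hypothetical latent-contract descriptor for deciding first-stage reuse.
struct LatentContract: Equatable {
  var channels: Int
  var scaleFactor: Float
  var shiftFactor: Float
}

let existingVAE = LatentContract(channels: 4, scaleFactor: 0.13025, shiftFactor: 0)
let newModelVAE = LatentContract(channels: 4, scaleFactor: 0.13025, shiftFactor: 0)

// Only branch to a new VAE path when the contract actually differs.
let reuseExistingFirstStage = (newModelVAE == existingVAE)
```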
8. Validate end-to-end before cleanup
Preferred validation order:
compile the affected diffusion library target
compile the app or CLI path that exercises the model
run an end-to-end generation through bazel run //Apps:DrawThingsCLI -- ...
end-to-end CLI execution may require permission to use the GPU
prefer bazel run //Apps:DrawThingsCLI -- ... directly over swift run in this repo
if the CLI does not expose a runtime knob directly, pass the value through --config-json
if the model is flow-matching or uses a non-default objective/discretization, verify those ModelZoo values explicitly instead of assuming the nearest existing model is correct
when GPU approval is needed for repeated generation comparisons, ask once for a fixed command shape with:
a fixed output image path
a fixed log path
after each run completes, move those generic files to a run-specific name yourself to preserve progress history
this keeps the approved command prefix stable across iterations and avoids re-asking for every output filename change
after the model is working, remove temporary model-specific env toggles and debug prints before handoff, then rerun the validation sequence above on the cleaned tree
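The fixed-command-shape pattern for GPU-approved runs can be sketched as a small shell routine. The paths, flag names, and the bazel invocation shown in comments are examples only, not a prescribed setup.

```shell
# Illustrative run-archiving pattern; paths and flags are examples only.
# In practice the approved fixed command shape would look something like:
#   bazel run //Apps:DrawThingsCLI -- --seed 42 ...
# writing to a fixed output image path and a fixed log path:
out=/tmp/dt-out.png
log=/tmp/dt-run.log
printf 'placeholder image\n' > "$out"   # stands in for the CLI's real output
printf 'placeholder log\n' > "$log"     # stands in for the CLI's real log
# After each run completes, move the generic files to run-specific names so
# the approved command prefix stays stable while history is preserved.
run_id="$(date +%Y%m%d-%H%M%S)"
mv "$out" "/tmp/dt-out-${run_id}.png"
mv "$log" "/tmp/dt-run-${run_id}.log"
echo "archived run ${run_id}"
```

Keeping the generic paths in the approved command and doing the renaming yourself avoids re-requesting approval for every output filename change.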