ComfyUI Model Types Explained: Checkpoint, LoRA, VAE, ControlNet & More
Understand the different model types used in ComfyUI — what each does, where to install it, and how they work together in a workflow.
Why So Many Model Types?
ComfyUI's power comes from its modular architecture — instead of one monolithic AI model, the generation pipeline is split into specialized components. Each model type handles a specific part of the process.
This guide explains what each model type does, where to install it, and when you need it.
Model Types at a Glance
| Type | Size | Purpose | Required? |
|---|---|---|---|
| Checkpoint | 2–7 GB | The main generation model — contains UNet, CLIP, and VAE bundled together | Yes (or use separate components) |
| LoRA | 10–300 MB | Small adapter that adds styles, characters, or concepts on top of a checkpoint | Optional |
| VAE | 100–500 MB | Converts between latent space and pixel space — affects color accuracy | Optional (checkpoints include a default VAE) |
| ControlNet | 700 MB–1.5 GB | Adds structural control (edges, depth, pose) to generation | Optional |
| Embedding | 1 KB–10 MB | Compressed prompt concept — used as a text token | Optional |
| CLIP | 250 MB–10 GB | Text encoder that converts prompts to vectors | Included in checkpoints, separate for Flux/SD3 |
| UNet / Diffusion Model | 5–25 GB | The core noise-prediction network | Included in checkpoints, separate for Flux/SD3/video |
| Upscaler | 20–200 MB | Enlarges images with AI-enhanced detail | Optional |
| CLIP Vision | 300 MB–1 GB | Encodes reference images (for I2V, Redux, IP-Adapter) | Only for specific workflows |
Detailed Breakdown
Checkpoints
The all-in-one model file. A checkpoint bundles three components:
- UNet — the neural network that predicts and removes noise
- CLIP — converts your text prompt into vectors
- VAE — translates between latent space and pixel space
Install location: ComfyUI/models/checkpoints/
Common models: Stable Diffusion 1.5, SDXL, Stable Diffusion 3.5, DreamShaper, Realistic Vision
Newer models like Flux and video models (Wan, HunyuanVideo) ship as separate components — you download the UNet, CLIP, and VAE individually. This is more flexible but requires more setup.
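Whether a file bundles all three components can be seen from its tensor names. Below is a minimal sketch assuming the key prefixes used by classic SD1.5-style checkpoints; other model families name their tensors differently, so treat the prefixes as illustrative, not universal:

```python
# Illustrative: classify tensor keys from an SD1.5-style checkpoint
# into the three bundled components. These prefixes are the ones used
# by classic Stable Diffusion checkpoints; other families differ.
COMPONENT_PREFIXES = {
    "model.diffusion_model.": "UNet",
    "cond_stage_model.": "CLIP",
    "first_stage_model.": "VAE",
}

def classify_key(key: str) -> str:
    """Map a tensor key to the checkpoint component it belongs to."""
    for prefix, component in COMPONENT_PREFIXES.items():
        if key.startswith(prefix):
            return component
    return "other"

# A few example keys of the kind found in an SD1.5 checkpoint:
keys = [
    "model.diffusion_model.input_blocks.0.0.weight",
    "cond_stage_model.transformer.text_model.embeddings.token_embedding.weight",
    "first_stage_model.decoder.conv_in.weight",
]
for k in keys:
    print(classify_key(k), "<-", k)
```

A file containing only `model.diffusion_model.*` keys is a bare UNet and belongs in `models/diffusion_models/`, not `models/checkpoints/`.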
LoRA (Low-Rank Adaptation)
Small adapter files that modify a checkpoint's behavior without replacing it. LoRAs are trained to add specific styles, characters, or concepts.
Install location: ComfyUI/models/loras/
How to use: Add a Load LoRA node between Load Checkpoint and the rest of your workflow.
Key point: LoRAs must match their base model version — an SD1.5 LoRA won't work with SDXL.
See the LoRA Guide for detailed usage.
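The "low-rank" in the name is also why the files are so small: instead of storing a full replacement weight matrix, a LoRA stores two thin matrices whose product is added onto the checkpoint's weights at load time. A toy NumPy sketch of that update (real loaders patch many layers at once, and the `strength` value corresponds to the slider on the Load LoRA node):

```python
import numpy as np

# Toy sketch of the low-rank update a LoRA applies at load time.
d, r = 768, 8          # layer width vs. LoRA rank (r << d keeps files small)
rng = np.random.default_rng(0)

W = rng.standard_normal((d, d))    # base weight from the checkpoint
A = rng.standard_normal((r, d))    # LoRA "down" matrix
B = rng.standard_normal((d, r))    # LoRA "up" matrix
strength = 0.8                     # LoRA strength slider

W_patched = W + strength * (B @ A)  # low-rank delta merged into the weight

# The adapter stores d*r*2 numbers instead of d*d, which is the size saving:
print(f"full layer: {d*d:,} params, LoRA delta: {2*d*r:,} params")
```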
VAE (Variational Autoencoder)
Handles the final conversion from latent space (where the AI works) to pixel space (what you see). A checkpoint includes a default VAE, but you can override it with a standalone VAE for better color accuracy.
Install location: ComfyUI/models/vae/
When to use a separate VAE: If your images have washed-out colors or a color cast, try an external VAE like vae-ft-mse-840000-ema-pruned.safetensors.
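To make "latent space" concrete: SD1.5 and SDXL VAEs compress each 8×8 pixel patch into a single latent position with 4 channels, which is why the sampler works on a much smaller tensor than the final image. A small sketch of that shape relationship (the defaults are SD-specific; newer families such as Flux use 16 latent channels):

```python
def latent_shape(width: int, height: int, channels: int = 4, factor: int = 8):
    """Latent-tensor shape for an SD-family VAE.

    SD1.5/SDXL VAEs compress each 8x8 pixel patch into one latent
    position with 4 channels. Other families differ (Flux uses 16
    latent channels), so treat the defaults as SD-specific.
    """
    assert width % factor == 0 and height % factor == 0, \
        "image dimensions should be divisible by the compression factor"
    return (channels, height // factor, width // factor)

print(latent_shape(512, 512))   # a 512x512 image is denoised as a 4x64x64 latent
```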
ControlNet
Adds structural control — edges, depth maps, human poses — to guide generation. ControlNet models must match the base model version.
Install location: ComfyUI/models/controlnet/
How to use: Load a reference image → preprocess it → feed through Apply ControlNet node.
See the ControlNet Guide for detailed usage.
Embeddings (Textual Inversion)
Tiny files that encode a concept into a prompt token. Used directly in the text field — no extra nodes needed.
Install location: ComfyUI/models/embeddings/
How to use: Type embedding:filename in your prompt.
See the Embeddings Guide for detailed usage.
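The `embedding:filename` syntax can be illustrated with a simple pattern match. This is only a sketch of the convention; ComfyUI's real prompt parser handles more (weights, file extensions), and `EasyNegative` is just a well-known example name:

```python
import re

# Sketch: find embedding references in a prompt string. ComfyUI's actual
# parser is more involved; this only illustrates the embedding:filename
# convention described above.
EMBEDDING_RE = re.compile(r"embedding:([\w.\-]+)")

def find_embeddings(prompt: str) -> list[str]:
    return EMBEDDING_RE.findall(prompt)

prompt = "a portrait, embedding:EasyNegative, soft light"
print(find_embeddings(prompt))  # -> ['EasyNegative']
```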
Upscale Models
AI models that enlarge images while adding detail. Used with the Upscale Image (using Model) node.
Install location: ComfyUI/models/upscale_models/
Common models: RealESRGAN x4, 4x-UltraSharp
See the Upscale Guide for detailed usage.
Separate Components (Flux, SD3, Video Models)
Modern models often ship as individual components instead of a bundled checkpoint:
| Component | Install Location | Used By |
|---|---|---|
| Diffusion Model / UNet | models/diffusion_models/ or models/unet/ | Flux, Wan, HunyuanVideo |
| Text Encoder / CLIP | models/clip/ or models/text_encoders/ | Flux, SD3, Wan |
| VAE | models/vae/ | All models |
| CLIP Vision | models/clip_vision/ | Redux, I2V, IP-Adapter |
| Style Models | models/style_models/ | Flux Redux |
Version Compatibility
This is the most common source of errors. Models are version-locked:
| Base Model | Compatible With |
|---|---|
| SD 1.5 | SD1.5 LoRAs, SD1.5 ControlNets, SD1.5 Embeddings |
| SDXL | SDXL LoRAs, SDXL ControlNets, SDXL Embeddings |
| SD 3.5 | SD3-specific components |
| Flux | Flux LoRAs, Flux ControlNets |
Never mix versions — an SD1.5 LoRA on an SDXL checkpoint will produce garbage or errors.
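If you script your workflows, the table above can be turned into a simple guard. This is a hypothetical helper, not a ComfyUI feature: the base-version tags have to come from your own bookkeeping (for example, the model's download page), since file names alone are not reliable:

```python
# Hypothetical guard against the version-mismatch errors described above.
# Base-version tags must come from your own records; they are not encoded
# reliably in the files themselves.
COMPATIBLE = {
    "SD1.5": {"SD1.5"},
    "SDXL": {"SDXL"},
    "SD3.5": {"SD3"},
    "Flux": {"Flux"},
}

def check_compatible(checkpoint_base: str, addon_base: str) -> bool:
    """True if an add-on (LoRA/ControlNet/embedding) matches the checkpoint."""
    return addon_base in COMPATIBLE.get(checkpoint_base, set())

print(check_compatible("SDXL", "SD1.5"))  # False: mixing these produces garbage
print(check_compatible("Flux", "Flux"))   # True
```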
Organizing Your Models
As your collection grows, use subfolders to keep things manageable:
ComfyUI/
├── models/
│ ├── checkpoints/
│ │ ├── SD1.5/
│ │ ├── SDXL/
│ │ └── Flux/
│ ├── loras/
│ │ ├── SD1.5/
│ │ └── SDXL/
│ ├── controlnet/
│ │ ├── SD1.5/
│ │ └── Flux/
│ └── embeddings/
│       └── SD1.5/

ComfyUI reads subfolders automatically — models inside them will appear in dropdown menus with their folder path.
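That dropdown behavior can be mimicked with a short script, which is also handy for auditing a large collection outside ComfyUI. A sketch (it imitates the display convention, not ComfyUI's actual code):

```python
import os
import tempfile

def list_models(root: str, exts=(".safetensors", ".ckpt")) -> list[str]:
    """Walk a model folder and return names as 'subfolder/file', the way
    ComfyUI's dropdowns display models stored in subfolders."""
    found = []
    for dirpath, _dirs, files in os.walk(root):
        for f in files:
            if f.lower().endswith(exts):
                rel = os.path.relpath(os.path.join(dirpath, f), root)
                found.append(rel.replace(os.sep, "/"))
    return sorted(found)

# Demo on a throwaway folder mirroring the layout above:
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "SDXL"))
    open(os.path.join(root, "SDXL", "dreamshaperXL.safetensors"), "w").close()
    open(os.path.join(root, "v1-5-pruned.ckpt"), "w").close()
    print(list_models(root))
```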
Related Guides
- Install Models — How to download and place model files
- LoRA Guide — Using LoRA adapters
- ControlNet Guide — Structural control
- Embeddings Guide — Textual Inversion usage