ComfyUI Model Types Explained: Checkpoint, LoRA, VAE, ControlNet & More

Understand the different model types used in ComfyUI — what each does, where to install it, and how they work together in a workflow.

Why So Many Model Types?

ComfyUI's power comes from its modular architecture — instead of one monolithic AI model, the generation pipeline is split into specialized components. Each model type handles a specific part of the process.

This guide explains what each model type does, where to install it, and when you need it.

Model Types at a Glance

Type	Size	Purpose	Required?
Checkpoint	2–7 GB	The main generation model — contains UNet, CLIP, and VAE bundled together	Yes (or use separate components)
LoRA	10–300 MB	Small adapter that adds styles, characters, or concepts on top of a checkpoint	Optional
VAE	100–500 MB	Converts between latent space and pixel space — affects color accuracy	Optional (checkpoints include a default VAE)
ControlNet	700 MB–1.5 GB	Adds structural control (edges, depth, pose) to generation	Optional
Embedding	1 KB–10 MB	Compressed prompt concept — used as a text token	Optional
CLIP	250 MB–10 GB	Text encoder that converts prompts to vectors	Included in checkpoints, separate for Flux/SD3
UNet / Diffusion Model	5–25 GB	The core noise-prediction network	Included in checkpoints, separate for Flux/SD3/video
Upscaler	20–200 MB	Enlarges images with AI-enhanced detail	Optional
CLIP Vision	300 MB–1 GB	Encodes reference images (for I2V, Redux, IP-Adapter)	Only for specific workflows

Detailed Breakdown

Checkpoints

The all-in-one model file. A checkpoint bundles three components:

UNet — the neural network that predicts and removes noise
CLIP — converts your text prompt into vectors
VAE — translates between latent space and pixel space
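One way to see the three bundled components is to read a checkpoint's tensor names without loading any weights. The sketch below parses the header of a `.safetensors` file (an 8-byte little-endian length followed by a JSON header) and groups tensors by name prefix. The prefixes in `COMPONENT_PREFIXES` are assumptions that match common SD 1.5 and SDXL checkpoints; other model families may name their tensors differently.

```python
import json
import struct
from collections import Counter

# Assumed tensor-name prefixes for SD-style checkpoints; other families may differ.
COMPONENT_PREFIXES = {
    "model.diffusion_model.": "UNet",
    "first_stage_model.": "VAE",
    "cond_stage_model.": "CLIP",  # SD 1.5-style text encoder
    "conditioner.": "CLIP",       # SDXL-style text encoders
}


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without reading tensor data."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # 8-byte little-endian length
        return json.loads(f.read(header_len))


def count_components(path):
    """Count tensors per bundled component, based on tensor-name prefixes."""
    counts = Counter()
    for name in read_safetensors_header(path):
        if name == "__metadata__":  # optional metadata entry, not a tensor
            continue
        label = next(
            (comp for prefix, comp in COMPONENT_PREFIXES.items() if name.startswith(prefix)),
            "other",
        )
        counts[label] += 1
    return dict(counts)
```

Calling `count_components` on a bundled checkpoint should report tensors under all three labels, while a standalone diffusion-model file would typically show none for VAE or CLIP.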

Install location: ComfyUI/models/checkpoints/

Common models: Stable Diffusion 1.5, SDXL, Stable Diffusion 3.5, DreamShaper, Realistic Vision

Newer models like Flux and video models (Wan, HunyuanVideo) usually ship as separate components: you download the diffusion model (the counterpart of the UNet), the text encoder(s), and the VAE individually. This is more flexible but requires more setup.

LoRA (Low-Rank Adaptation)

Small adapter files that modify a checkpoint's behavior without replacing it. LoRAs are trained to add specific styles, characters, or concepts.

Install location: ComfyUI/models/loras/

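How to use: Add a Load LoRA node between Load Checkpoint and the rest of your workflow, so that the checkpoint's MODEL and CLIP outputs pass through the LoRA before reaching the prompt and sampler nodes.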

Key point: LoRAs must match their base model version — an SD1.5 LoRA won't work with SDXL.
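The wiring described above can be sketched as a fragment of an API-format workflow, written here as a Python dict. The node class names (`CheckpointLoaderSimple`, `LoraLoader`, `CLIPTextEncode`) are ComfyUI built-ins as far as I know; the file names, strengths, and the `upstream` helper are hypothetical and only illustrate the connections.

```python
# API-format workflow fragment; file names are hypothetical placeholders.
workflow = {
    "1": {
        "class_type": "CheckpointLoaderSimple",
        "inputs": {"ckpt_name": "example_sd15_checkpoint.safetensors"},
    },
    "2": {
        "class_type": "LoraLoader",
        "inputs": {
            "model": ["1", 0],  # MODEL output of the checkpoint loader
            "clip": ["1", 1],   # CLIP output of the checkpoint loader
            "lora_name": "example_sd15_style.safetensors",
            "strength_model": 0.8,
            "strength_clip": 0.8,
        },
    },
    "3": {
        "class_type": "CLIPTextEncode",
        # The prompt encoder reads CLIP from the LoRA node, not from the checkpoint.
        "inputs": {"text": "a watercolor landscape", "clip": ["2", 1]},
    },
}


def upstream(nodes, node_id, input_name):
    """Return the class_type of the node that feeds the given input (illustrative helper)."""
    source_id, _output_index = nodes[node_id]["inputs"][input_name]
    return nodes[source_id]["class_type"]
```

Everything downstream (prompt encoding, sampling) should take its model and CLIP from node `2`; wiring it to node `1` would bypass the LoRA.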

See the LoRA Guide for detailed usage.

VAE (Variational Autoencoder)

Converts between latent space (where the diffusion model works) and pixel space (what you see): it decodes the finished latent into an image and encodes input images for image-to-image workflows. A checkpoint includes a default VAE, but you can override it with a standalone VAE for better color accuracy.

Install location: ComfyUI/models/vae/

When to use a separate VAE: If your images have washed-out colors or a color cast, try an external VAE that matches your base model, such as vae-ft-mse-840000-ema-pruned.safetensors for SD 1.5 checkpoints.
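Overriding the bundled VAE amounts to rewiring the decode step. The sketch below patches an API-format workflow dict so that every `VAEDecode` node reads from a standalone `VAELoader` instead of the checkpoint. The function name `override_vae` and the node id `vae_override` are hypothetical; the node class names are ComfyUI built-ins as far as I know.

```python
import copy


def override_vae(workflow, vae_name, loader_id="vae_override"):
    """Return a copy of the workflow whose VAEDecode nodes use a standalone VAE."""
    patched = copy.deepcopy(workflow)  # leave the caller's workflow untouched
    patched[loader_id] = {"class_type": "VAELoader", "inputs": {"vae_name": vae_name}}
    for node in patched.values():
        if node["class_type"] == "VAEDecode":
            node["inputs"]["vae"] = [loader_id, 0]  # VAELoader exposes a single VAE output
    return patched
```

The same rewiring would apply to `VAEEncode` nodes in image-to-image workflows; they are left alone here to keep the sketch short.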

ControlNet

Adds structural control — edges, depth maps, human poses — to guide generation. ControlNet models must match the base model version.

Install location: ComfyUI/models/controlnet/

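How to use: Load a reference image → preprocess it (for example, into an edge map or depth map) → load the ControlNet model with Load ControlNet Model → feed both into the Apply ControlNet node, which modifies the prompt conditioning.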
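As a rough illustration of that chain, here is an API-format fragment using a Canny edge preprocessor. The class names `LoadImage`, `Canny`, `ControlNetLoader`, and `ControlNetApply` are assumptions based on ComfyUI's built-in nodes and may differ between versions (newer builds favor an advanced variant of the apply node); the file names, thresholds, and the `image_chain` helper are hypothetical.

```python
# API-format fragment; node "6" stands for a positive-prompt CLIPTextEncode node
# defined elsewhere in the full workflow.
controlnet_nodes = {
    "10": {"class_type": "LoadImage", "inputs": {"image": "reference.png"}},
    "11": {
        "class_type": "Canny",  # preprocessor: turns the photo into an edge map
        "inputs": {"image": ["10", 0], "low_threshold": 0.4, "high_threshold": 0.8},
    },
    "12": {
        "class_type": "ControlNetLoader",
        "inputs": {"control_net_name": "example_sd15_canny.safetensors"},
    },
    "13": {
        "class_type": "ControlNetApply",
        "inputs": {
            "conditioning": ["6", 0],
            "control_net": ["12", 0],
            "image": ["11", 0],  # the preprocessed image, not the raw reference
            "strength": 1.0,
        },
    },
}


def image_chain(nodes, node_id):
    """Follow 'image' inputs backwards and list the node classes along the way."""
    chain = []
    while node_id in nodes:
        node = nodes[node_id]
        chain.append(node["class_type"])
        link = node["inputs"].get("image")
        if not isinstance(link, list):  # a file name or nothing: start of the chain
            break
        node_id = link[0]
    return chain
```

The conditioning output of node `13` would then replace the plain positive prompt as the sampler's input.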

See the ControlNet Guide for detailed usage.

Embeddings (Textual Inversion)

Tiny files that encode a concept into a prompt token. Used directly in the text field — no extra nodes needed.

Install location: ComfyUI/models/embeddings/

How to use: Type embedding:filename in your prompt.
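A small helper can build the prompt token from a file name. This is a sketch: `embedding_token` is a hypothetical function, and it assumes that the file extension may be omitted and that the usual `(text:weight)` emphasis syntax also applies to embeddings.

```python
from pathlib import Path


def embedding_token(filename, weight=None):
    """Build the prompt token for an embedding file, optionally with a weight."""
    name = Path(filename).stem  # drop the extension (assumed optional in prompts)
    token = f"embedding:{name}"
    if weight is None:
        return token
    return f"({token}:{weight})"  # assumed to follow the normal prompt-weight syntax


negative_prompt = ", ".join(["blurry", embedding_token("example_negative.pt")])
```

Here `negative_prompt` becomes `blurry, embedding:example_negative`, which goes straight into the text field of a prompt node.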

See the Embeddings Guide for detailed usage.

Upscale Models

AI models that enlarge images while adding detail. Used with the Upscale Image (using Model) node.

Install location: ComfyUI/models/upscale_models/

Common models: RealESRGAN x4, 4x-UltraSharp
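For orientation, a minimal API-format fragment for model-based upscaling might look like the following. The class names `UpscaleModelLoader` and `ImageUpscaleWithModel` are assumed to be the built-ins behind the nodes named above; the file name and the `upscaled_size` helper are hypothetical.

```python
# API-format fragment; node "8" stands for the node that produces the image
# (for example a VAEDecode node) elsewhere in the workflow.
upscale_nodes = {
    "20": {
        "class_type": "UpscaleModelLoader",
        "inputs": {"model_name": "example_4x_upscaler.pth"},
    },
    "21": {
        "class_type": "ImageUpscaleWithModel",
        "inputs": {"upscale_model": ["20", 0], "image": ["8", 0]},
    },
}


def upscaled_size(width, height, scale):
    """Output resolution for a fixed-factor upscale model (e.g. scale=4 for a 4x model)."""
    return width * scale, height * scale
```

A 4x model multiplies the pixel count by 16, so large inputs can become expensive to process and store.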

See the Upscale Guide for detailed usage.

Separate Components (Flux, SD3, Video Models)

Modern models often ship as individual components instead of a bundled checkpoint:

Component	Install Location	Used By
Diffusion Model / UNet	`models/diffusion_models/` or `models/unet/`	Flux, Wan, HunyuanVideo
Text Encoder / CLIP	`models/clip/` or `models/text_encoders/`	Flux, SD3, Wan
VAE	`models/vae/`	All models
CLIP Vision	`models/clip_vision/`	Redux, I2V, IP-Adapter
Style Models	`models/style_models/`	Flux Redux
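To make the table concrete, the sketch below loads a Flux-style model from three separate files and lists where each file is expected to live. The loader class names (`UNETLoader`, `DualCLIPLoader`, `VAELoader`) and their input names are assumptions based on ComfyUI's built-in nodes; every file name is a placeholder, and `LOADER_FOLDERS` and `required_files` are hypothetical helpers that pick one of the two folders listed per component.

```python
# Three loaders replace the single Load Checkpoint node; file names are placeholders.
flux_loaders = {
    "1": {
        "class_type": "UNETLoader",
        "inputs": {"unet_name": "example_flux_model.safetensors", "weight_dtype": "default"},
    },
    "2": {
        "class_type": "DualCLIPLoader",
        "inputs": {
            "clip_name1": "example_t5.safetensors",
            "clip_name2": "example_clip_l.safetensors",
            "type": "flux",
        },
    },
    "3": {"class_type": "VAELoader", "inputs": {"vae_name": "example_flux_vae.safetensors"}},
}

# Folder each loader reads from, following the table above.
LOADER_FOLDERS = {
    "UNETLoader": "models/diffusion_models/",
    "DualCLIPLoader": "models/text_encoders/",
    "VAELoader": "models/vae/",
}


def required_files(nodes):
    """List the relative paths of model files a set of loader nodes expects."""
    files = []
    for node in nodes.values():
        folder = LOADER_FOLDERS.get(node["class_type"])
        if folder is None:
            continue
        for key, value in node["inputs"].items():
            # File inputs are named like unet_name, vae_name, clip_name1, clip_name2.
            if key.endswith("_name") or key.startswith("clip_name"):
                files.append(folder + value)
    return files
```

Running `required_files(flux_loaders)` gives a checklist of four files to download before the workflow can load.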

Version Compatibility

This is the most common source of errors. Models are version-locked:

Base Model	Compatible With
SD 1.5	SD1.5 LoRAs, SD1.5 ControlNets, SD1.5 Embeddings
SDXL	SDXL LoRAs, SDXL ControlNets, SDXL Embeddings
SD 3.5	SD3-specific components
Flux	Flux LoRAs, Flux ControlNets

Never mix versions — an SD1.5 LoRA on an SDXL checkpoint will produce garbage or errors.
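If you label your files by base model (for example through the subfolder they live in), a mismatch can be caught before a workflow runs. The following is a minimal sketch; the `Addon` record, the family labels, and `find_mismatches` are all hypothetical and not part of ComfyUI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Addon:
    name: str    # file name of the add-on
    kind: str    # "lora", "controlnet", or "embedding"
    family: str  # base model it was trained for, e.g. "SD1.5", "SDXL", "Flux"


def find_mismatches(base_family, addons):
    """Return the names of add-ons whose base model differs from the checkpoint's."""
    return [addon.name for addon in addons if addon.family != base_family]


selected = [
    Addon("example_style.safetensors", "lora", "SD1.5"),
    Addon("example_canny.safetensors", "controlnet", "SDXL"),
]
problems = find_mismatches("SDXL", selected)
```

In this example `problems` contains only the SD1.5 LoRA, which is the file to swap out before queueing the workflow.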

Organizing Your Models

As your collection grows, use subfolders to keep things manageable:

ComfyUI/
├── models/
│   ├── checkpoints/
│   │   ├── SD1.5/
│   │   ├── SDXL/
│   │   └── Flux/
│   ├── loras/
│   │   ├── SD1.5/
│   │   └── SDXL/
│   ├── controlnet/
│   │   ├── SD1.5/
│   │   └── Flux/
│   └── embeddings/
│       └── SD1.5/

ComfyUI reads subfolders automatically — models inside them will appear in dropdown menus with their folder path.
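To preview how a folder layout will show up, you can list model files relative to their type folder. This is a sketch, not ComfyUI's own scanning code: `list_models` is a hypothetical helper and the extension set is an assumption about common model file types.

```python
from pathlib import Path

# Assumed set of model file extensions; adjust to match your collection.
MODEL_EXTENSIONS = {".safetensors", ".ckpt", ".pt", ".pth", ".bin"}


def list_models(models_dir, kind):
    """Return model files under models_dir/kind as paths relative to that folder."""
    root = Path(models_dir) / kind
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")  # rglob descends into subfolders
        if path.is_file() and path.suffix.lower() in MODEL_EXTENSIONS
    )
```

For the layout above, `list_models("ComfyUI/models", "loras")` would return entries such as `SD1.5/<file>` and `SDXL/<file>`, mirroring the folder-prefixed names in the dropdowns.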

Install Models — How to download and place model files
LoRA Guide — Using LoRA adapters
ControlNet Guide — Structural control
Embeddings Guide — Textual Inversion usage
