How to Fix CUDA Out of Memory Errors in ComfyUI

Fix ComfyUI CUDA out of memory (OOM) errors by adjusting generation settings, using memory-efficient options, and understanding VRAM limits.

Community Knowledge

This page is based on common ComfyUI troubleshooting patterns and has not been fully tested across all environments. Back up your environment before changing packages.

If ComfyUI crashes with torch.cuda.OutOfMemoryError: CUDA out of memory, your GPU does not have enough VRAM for the current operation.

This is not a broken installation — it means the generation settings, model size, or resolution exceed what your GPU can hold in memory at once. The good news is that most OOM errors can be fixed without buying a new GPU.

Fast answer

Try these in order:

Lower your image resolution (e.g., from 1024x1024 to 768x768)
Launch ComfyUI with --lowvram or --novram flags
Use fp8 or fp16 model checkpoints instead of fp32
Enable tiled VAE decoding in your workflow

What the error looks like

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 2.00 GiB.
GPU 0 has a total capacity of 8.00 GiB of which 512.00 MiB is free.
Including non-PyTorch memory, this process has 7.48 GiB memory in use.

You may also see variations like:

RuntimeError: CUDA error: out of memory

Error occurred when executing KSampler:
CUDA out of memory.

The error typically occurs during sampling, VAE decoding, or model loading.

Why it happens

Resolution too high: VRAM use grows with pixel count, not just side length. Doubling both dimensions quadruples the pixels, so a 2048x2048 image uses roughly 4x the VRAM of a 1024x1024 image
Model too large for your GPU: SDXL models need more VRAM than SD 1.5 models. FLUX models need even more
Multiple models loaded: Loading a checkpoint, ControlNet, IP-Adapter, and LoRA simultaneously adds up
Batch size too high: Generating multiple images at once multiplies VRAM usage
VAE decode at full resolution: The VAE decode step can spike VRAM even when sampling succeeded
Other programs using GPU memory: Chrome, Discord, or other GPU-accelerated apps consume VRAM
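To make the resolution point concrete, here is a minimal sketch that compares pixel counts between two image sizes. It assumes VRAM use tracks pixel count, which is only a rough guide, and the function name pixel_ratio is made up for this example.

```python
def pixel_ratio(new_w: int, new_h: int, old_w: int, old_h: int) -> float:
    """Return how many times more pixels the new size has than the old one."""
    return (new_w * new_h) / (old_w * old_h)


# Doubling both sides quadruples the pixel count.
print(pixel_ratio(2048, 2048, 1024, 1024))  # 4.0

# Dropping from 1024x1024 to 768x768 leaves about 56% of the pixels.
print(pixel_ratio(768, 768, 1024, 1024))  # 0.5625
```

Real VRAM use also depends on the model and sampler, so treat the ratio as a first estimate rather than a measurement.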

VRAM requirements by model type

Model	Minimum VRAM	Comfortable VRAM	Notes
SD 1.5	4 GB	6 GB	Most compatible
SDXL	6 GB	8 GB	Default 1024x1024
FLUX.1	8 GB (quantized)	12 GB	fp8 recommended for 8 GB cards
Video models	12 GB	24 GB	Highly variable
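The table can be turned into a quick lookup. The sketch below copies the figures above and labels each model family for a given card. The labels "comfortable", "tight", and "below minimum" and the function name classify are illustrative choices, not ComfyUI terms.

```python
# (minimum GB, comfortable GB) per model family, copied from the table above.
VRAM_TABLE = {
    "SD 1.5": (4, 6),
    "SDXL": (6, 8),
    "FLUX.1": (8, 12),
    "Video models": (12, 24),
}


def classify(vram_gb: float) -> dict[str, str]:
    """Label each model family for a GPU with vram_gb of VRAM."""
    result = {}
    for model, (minimum, comfortable) in VRAM_TABLE.items():
        if vram_gb >= comfortable:
            result[model] = "comfortable"
        elif vram_gb >= minimum:
            result[model] = "tight"
        else:
            result[model] = "below minimum"
    return result


for model, label in classify(8).items():
    print(f"{model}: {label}")
```

A "tight" result means the fixes below (lower resolution, memory flags, quantized checkpoints) are likely to be needed.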

Step-by-step fixes

1. Lower resolution and batch size

The fastest fix. Reduce your image dimensions:

SD 1.5: try 512x512 or 512x768
SDXL: try 768x768 (832x1216 has about as many pixels as 1024x1024, so it saves little VRAM)
FLUX: try 768x768 first

Set batch size to 1 if it is higher.
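If you want to keep your aspect ratio while shrinking, the following sketch scales a size down to a pixel budget. Rounding each side down to a multiple of 64 is an assumption made here because it is a common convention for these models, and fit_resolution is a hypothetical helper, not part of ComfyUI.

```python
import math


def fit_resolution(width: int, height: int, max_pixels: int, multiple: int = 64):
    """Scale (width, height) down to at most max_pixels, keeping aspect ratio.

    Both sides are rounded down to a multiple of `multiple` (assumed: 64).
    Sizes already within the budget are returned unchanged.
    """
    if width * height <= max_pixels:
        return width, height
    scale = math.sqrt(max_pixels / (width * height))
    new_w = max(multiple, int(width * scale) // multiple * multiple)
    new_h = max(multiple, int(height * scale) // multiple * multiple)
    return new_w, new_h


# Fit a portrait size into the pixel budget of a 768x768 image.
print(fit_resolution(832, 1216, 768 * 768))  # (576, 896)
```

Because both sides round down, the result can land somewhat under the budget, which errs on the side of using less VRAM.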

2. Use ComfyUI memory management flags

Launch ComfyUI with memory-saving arguments:

.\python_embeded\python.exe -s ComfyUI\main.py --windows-standalone-build --lowvram

Available flags:

Flag	Effect	VRAM saved
`--lowvram`	Moves model parts to CPU during generation	Significant
`--novram`	Keeps almost everything on CPU, moves to GPU only when needed	Maximum, but much slower
`--cpu`	Runs entirely on CPU	All GPU VRAM free, very slow
`--disable-smart-memory`	Offloads models to system RAM aggressively instead of keeping them in VRAM	Varies; try if models left in VRAM trigger OOM
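If you switch between flags often, a small wrapper can assemble the launch line for you. This is a sketch only: the paths mirror the portable Windows command shown above and will differ on other installs, and build_command is a made-up helper that accepts only the flags listed in the table.

```python
# Flags taken from the table above.
KNOWN_FLAGS = {"--lowvram", "--novram", "--cpu", "--disable-smart-memory"}


def build_command(*flags: str) -> list[str]:
    """Return the launch command as an argument list, rejecting unknown flags."""
    unknown = [flag for flag in flags if flag not in KNOWN_FLAGS]
    if unknown:
        raise ValueError(f"unknown flag(s): {unknown}")
    return [
        r".\python_embeded\python.exe",  # path assumed from the portable build
        "-s",
        r"ComfyUI\main.py",
        "--windows-standalone-build",
        *flags,
    ]


print(" ".join(build_command("--lowvram")))
```

The returned list can be passed to subprocess.run from the folder that contains python_embeded.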

3. Use quantized or fp16 models

fp32 weights take twice the memory of fp16 weights (4 bytes per parameter versus 2), and fp8 weights take half of fp16 again (1 byte per parameter):

Download fp16 checkpoints when available
For FLUX, use fp8 quantized checkpoints on 8 GB GPUs
Checkpoint sites often tag these as "fp16" or "pruned"; compare file sizes to confirm, since a "pruned" file is not always fp16
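The effect of precision on weight size is simple arithmetic: parameter count times bytes per parameter. The sketch below uses a hypothetical 1-billion-parameter model and counts weights only; activations and other buffers add to the real total.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "fp8": 1}


def weight_size_gib(num_params: int, precision: str) -> float:
    """Return the size of the model weights in GiB at the given precision."""
    return num_params * BYTES_PER_PARAM[precision] / 2**30


params = 1_000_000_000  # hypothetical 1-billion-parameter model
for precision in ("fp32", "fp16", "fp8"):
    print(f"{precision}: {weight_size_gib(params, precision):.2f} GiB")
# fp32: 3.73 GiB
# fp16: 1.86 GiB
# fp8: 0.93 GiB
```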

4. Enable tiled VAE decode

If the OOM happens during VAE decode (the final step), replace the regular VAE Decode node with a VAE Decode (Tiled) node. Tiled decoding processes the image in smaller chunks, so peak VRAM use stays lower.
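To show the idea behind tiling, here is a sketch that splits an image area into overlapping boxes that could be processed one at a time. It is not ComfyUI's implementation: the tile size of 512, the overlap of 64, and the function name tile_boxes are all assumptions chosen for illustration.

```python
def tile_boxes(width: int, height: int, tile: int = 512, overlap: int = 64):
    """Return (left, top, right, bottom) boxes that together cover the image.

    Neighbouring boxes share `overlap` pixels so seams can be blended.
    """
    if overlap >= tile:
        raise ValueError("overlap must be smaller than tile")
    step = tile - overlap
    boxes = []
    top = 0
    while True:
        bottom = min(top + tile, height)
        left = 0
        while True:
            right = min(left + tile, width)
            boxes.append((left, top, right, bottom))
            if right >= width:
                break
            left += step
        if bottom >= height:
            break
        top += step
    return boxes


# A 1024x1024 image becomes a 3x3 grid of tiles with these settings.
print(len(tile_boxes(1024, 1024)))  # 9
```

Only one tile needs to be in VRAM at a time, which is why peak usage drops, at the cost of extra work on the overlapping regions.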

5. Close other GPU applications

Check what else is using VRAM:

nvidia-smi

Close browsers, games, video editors, or other applications that consume GPU memory before generating.
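If you would rather check VRAM from a script, nvidia-smi can print used and total memory as CSV. The sketch below wraps that query; the helper names are made up, the sample string is invented for illustration, and it assumes nvidia-smi is on your PATH and reports plain numbers.

```python
import subprocess


def parse_memory_csv(text: str) -> list[tuple[int, int]]:
    """Parse 'used, total' lines (MiB) into (used, total) tuples, one per GPU."""
    rows = []
    for line in text.strip().splitlines():
        used, total = (int(part) for part in line.split(","))
        rows.append((used, total))
    return rows


def query_gpu_memory() -> list[tuple[int, int]]:
    """Ask nvidia-smi for used and total memory in MiB for every GPU."""
    output = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_memory_csv(output)


# Invented sample of what the query prints for a single 8 GB card.
sample = "7660, 8192\n"
for index, (used, total) in enumerate(parse_memory_csv(sample)):
    print(f"GPU {index}: {used} MiB used of {total} MiB")
```

Call query_gpu_memory() before launching ComfyUI to see how much VRAM other applications already hold.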

6. Use --force-fp16 for the entire pipeline

.\python_embeded\python.exe -s ComfyUI\main.py --windows-standalone-build --force-fp16

This forces fp16 precision throughout the pipeline. Compared with fp32 it roughly halves VRAM usage, at the cost of minor quality differences; if ComfyUI already runs your model in fp16, expect little change.

What not to do

Do not increase Windows virtual memory (pagefile) thinking it will fix GPU OOM — VRAM and system RAM are separate
Do not reinstall ComfyUI — this is a resource limit, not a broken install
Do not install random CUDA toolkits — the PyTorch wheel includes what it needs
Do not ignore the specific numbers in the error message — they tell you exactly how much VRAM you have and how much was requested
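The numbers in the error message can be pulled out with a few regular expressions. This sketch matches the wording of the example message shown earlier on this page; other PyTorch versions phrase the message differently, so treat the patterns as assumptions. The function name parse_oom is hypothetical.

```python
import re

# Conversion factors to GiB for the units that appear in the message.
_UNIT_TO_GIB = {"GiB": 1.0, "MiB": 1 / 1024}

_PATTERNS = {
    "requested": r"Tried to allocate ([\d.]+) (GiB|MiB)",
    "total": r"total capacity of ([\d.]+) (GiB|MiB)",
    "free": r"of which ([\d.]+) (GiB|MiB) is free",
}


def parse_oom(message: str) -> dict[str, float]:
    """Extract requested, total, and free memory (in GiB) from an OOM message."""
    result = {}
    for key, pattern in _PATTERNS.items():
        match = re.search(pattern, message)
        if match:
            result[key] = float(match.group(1)) * _UNIT_TO_GIB[match.group(2)]
    return result


SAMPLE = (
    "torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 2.00 GiB. "
    "GPU 0 has a total capacity of 8.00 GiB of which 512.00 MiB is free. "
    "Including non-PyTorch memory, this process has 7.48 GiB memory in use."
)

info = parse_oom(SAMPLE)
print(info)  # {'requested': 2.0, 'total': 8.0, 'free': 0.5}
print(f"Short by {info['requested'] - info['free']:.2f} GiB")  # Short by 1.50 GiB
```

The gap between the requested and free figures tells you roughly how much you need to save with the fixes above.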

How Wonderful Launcher can help

Wonderful Launcher can detect your GPU's VRAM capacity and suggest memory optimization settings for it. It also helps manage model formats so that they match what your hardware can hold.

Download Wonderful Launcher — it's free and can optimize your ComfyUI setup for your specific GPU.

