
How to Fix CUDA Out of Memory Errors in ComfyUI

Needs verification

Fix ComfyUI CUDA out of memory (OOM) errors by adjusting generation settings, using memory-efficient options, and understanding VRAM limits.

Community Knowledge

This page is based on common ComfyUI troubleshooting patterns and has not been fully tested across all environments. Back up your environment before changing packages.

If ComfyUI crashes with torch.cuda.OutOfMemoryError: CUDA out of memory, your GPU does not have enough VRAM for the current operation.

This is not a broken installation — it means the generation settings, model size, or resolution exceed what your GPU can hold in memory at once. The good news is that most OOM errors can be fixed without buying a new GPU.

Fast answer

Try these in order:

  1. Lower your image resolution (e.g., from 1024x1024 to 768x768)
  2. Launch ComfyUI with --lowvram or --novram flags
  3. Use fp8 or fp16 model checkpoints instead of fp32
  4. Enable tiled VAE decoding in your workflow

What the error looks like

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 2.00 GiB.
GPU 0 has a total capacity of 8.00 GiB of which 512.00 MiB is free.
Including non-PyTorch memory, this process has 7.48 GiB memory in use.

You may also see variations like:

RuntimeError: CUDA error: out of memory
Error occurred when executing KSampler:
CUDA out of memory.

The error typically occurs during sampling, VAE decoding, or model loading.
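
The figures in the first message above can be pulled out with a short script, which makes it easier to see how far short of the request the GPU was. This is a minimal sketch that assumes the wording of the sample message; real messages differ between PyTorch versions, and `parse_oom_message` is a hypothetical helper, not part of PyTorch or ComfyUI.

```python
import re

# Matches a number followed by MiB or GiB, as in the sample message above.
# Real messages vary between PyTorch versions, so treat this as illustrative.
_SIZE = r"([\d.]+)\s*(MiB|GiB)"
_UNIT_IN_GIB = {"MiB": 1 / 1024, "GiB": 1.0}


def _to_gib(value: str, unit: str) -> float:
    """Convert a value such as ('512.00', 'MiB') to GiB."""
    return float(value) * _UNIT_IN_GIB[unit]


def parse_oom_message(message: str) -> dict:
    """Return requested, total, and free VRAM (in GiB) found in the message."""
    patterns = {
        "requested": rf"Tried to allocate {_SIZE}",
        "total": rf"total capacity of {_SIZE}",
        "free": rf"of which {_SIZE} is free",
    }
    result = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, message)
        if match:
            result[key] = _to_gib(match.group(1), match.group(2))
    return result


sample = (
    "torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 2.00 GiB. "
    "GPU 0 has a total capacity of 8.00 GiB of which 512.00 MiB is free."
)
info = parse_oom_message(sample)
shortfall = info["requested"] - info["free"]
print(info)       # {'requested': 2.0, 'total': 8.0, 'free': 0.5}
print(shortfall)  # 1.5
```

Here the request exceeded free VRAM by 1.5 GiB. Treat that as a lower bound on what needs to be freed, since memory fragmentation can make the real gap larger.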

Why it happens

  • Resolution too high: VRAM use scales with pixel count, so doubling both width and height roughly quadruples it. A 2048x2048 image uses roughly 4x the VRAM of a 1024x1024 image
  • Model too large for your GPU: SDXL models need more VRAM than SD 1.5 models. FLUX models need even more
  • Multiple models loaded: Loading a checkpoint, ControlNet, IP-Adapter, and LoRA simultaneously adds up
  • Batch size too high: Generating multiple images at once multiplies VRAM usage
  • VAE decode at full resolution: The VAE decode step can spike VRAM even when sampling succeeded
  • Other programs using GPU memory: Chrome, Discord, or other GPU-accelerated apps consume VRAM
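
The resolution and batch-size points above come down to simple arithmetic on pixel counts. The sketch below uses pixel count times batch size as a rough proxy for the memory that scales with image size. It is an approximation: model weights take a fixed amount of VRAM regardless of resolution, and `relative_cost` is a hypothetical helper name.

```python
def relative_cost(new_size, old_size, new_batch=1, old_batch=1):
    """Ratio of (pixels x batch) between two settings.

    A rough proxy for how resolution-dependent VRAM changes; it ignores
    the fixed cost of model weights.
    """
    new_w, new_h = new_size
    old_w, old_h = old_size
    return (new_w * new_h * new_batch) / (old_w * old_h * old_batch)


# Doubling width and height quadruples the pixel count.
print(relative_cost((2048, 2048), (1024, 1024)))  # 4.0

# Dropping from 1024x1024 to 768x768 cuts the pixel count to about 56%.
print(relative_cost((768, 768), (1024, 1024)))  # 0.5625

# A batch of 4 at the same resolution costs about as much as 4 single images.
print(relative_cost((1024, 1024), (1024, 1024), new_batch=4))  # 4.0
```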

VRAM requirements by model type

| Model | Minimum VRAM | Comfortable VRAM | Notes |
|---|---|---|---|
| SD 1.5 | 4 GB | 6 GB | Most compatible |
| SDXL | 6 GB | 8 GB | Default 1024x1024 |
| FLUX.1 | 8 GB (quantized) | 12 GB | fp8 recommended for 8 GB cards |
| Video models | 12 GB | 24 GB | Highly variable |
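
To check a specific card against these figures, the table can be encoded as a small lookup. This is a sketch only: the numbers are copied from the table above and are rough community guidance, and the "tight" and "below minimum" labels are names chosen here for illustration.

```python
# (minimum GB, comfortable GB) per model family, copied from the table above.
VRAM_GUIDE_GB = {
    "SD 1.5": (4, 6),
    "SDXL": (6, 8),
    "FLUX.1": (8, 12),
    "Video models": (12, 24),
}


def classify(vram_gb):
    """Label each model family for a GPU with vram_gb of VRAM."""
    result = {}
    for model, (minimum, comfortable) in VRAM_GUIDE_GB.items():
        if vram_gb >= comfortable:
            result[model] = "comfortable"
        elif vram_gb >= minimum:
            result[model] = "tight"
        else:
            result[model] = "below minimum"
    return result


for model, label in classify(8).items():
    print(f"{model}: {label}")
# SD 1.5: comfortable
# SDXL: comfortable
# FLUX.1: tight
# Video models: below minimum
```

A "tight" result means the model can load, but the fixes in the following sections are likely to be needed.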

Step-by-step fixes

1. Lower resolution and batch size

The fastest fix. Reduce your image dimensions:

  • SD 1.5: try 512x512 or 512x768
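  • SDXL: try 768x768 (832x1216 is a common portrait size, but it has about the same pixel count as 1024x1024, so it saves little VRAM)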
  • FLUX: try 768x768 first

Set batch size to 1 if it is higher.
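
If you want to keep an image's aspect ratio while shrinking it, a pixel budget is a convenient way to pick the new size. The sketch below scales a resolution down to fit a budget and rounds each side down to a multiple of 64. The multiple of 64 is a conservative assumption made here, not a ComfyUI requirement stated in this article, and `fit_to_pixel_budget` is a hypothetical helper.

```python
import math


def fit_to_pixel_budget(width, height, max_pixels, multiple=64):
    """Shrink (width, height) to at most max_pixels, roughly keeping aspect ratio.

    Sides are rounded down to `multiple` (an assumed, conservative step),
    so the result never exceeds the budget unless a side hits the minimum.
    """
    scale = min(1.0, math.sqrt(max_pixels / (width * height)))
    new_w = max(multiple, int(width * scale) // multiple * multiple)
    new_h = max(multiple, int(height * scale) // multiple * multiple)
    return new_w, new_h


budget = 768 * 768  # the pixel count of the 768x768 suggestion above

print(fit_to_pixel_budget(1024, 1024, budget))  # (768, 768)
print(fit_to_pixel_budget(832, 1216, budget))   # (576, 896)
print(fit_to_pixel_budget(512, 512, budget))    # (512, 512), already within budget
```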

2. Use ComfyUI memory management flags

Launch ComfyUI with memory-saving arguments:

.\python_embeded\python.exe -s ComfyUI\main.py --windows-standalone-build --lowvram

Available flags:

| Flag | Effect | VRAM saved |
|---|---|---|
| --lowvram | Moves model parts to CPU during generation | Significant |
| --novram | Keeps almost everything on CPU, moves to GPU only when needed | Maximum, but much slower |
| --cpu | Runs entirely on CPU | All GPU VRAM free, very slow |
| --disable-smart-memory | Offloads models to system RAM aggressively instead of keeping them in VRAM | Varies; try if automatic memory management causes issues |
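
If you switch between flags often, a small wrapper can assemble the launch command for you. This sketch reuses the paths and flags from the command and table above and assumes it is run from the portable build's root folder. `build_launch_command` is a hypothetical helper, not part of ComfyUI.

```python
# Flags taken from the table above.
MEMORY_FLAGS = {"--lowvram", "--novram", "--cpu", "--disable-smart-memory"}


def build_launch_command(
    memory_flag,
    python_exe=r".\python_embeded\python.exe",
    main_script=r"ComfyUI\main.py",
):
    """Return the argument list for launching ComfyUI with one memory flag."""
    if memory_flag not in MEMORY_FLAGS:
        raise ValueError(f"unknown memory flag: {memory_flag}")
    return [python_exe, "-s", main_script, "--windows-standalone-build", memory_flag]


command = build_launch_command("--lowvram")
print(" ".join(command))
# .\python_embeded\python.exe -s ComfyUI\main.py --windows-standalone-build --lowvram

# To start ComfyUI from Python instead of printing the command:
#   import subprocess
#   subprocess.run(command, check=True)
```

Start with --lowvram and move to --novram only if the error persists, since each step down trades more speed for memory.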

3. Use quantized or fp16 models

fp32 weights take roughly twice the VRAM of fp16 weights, and fp8 weights take roughly half of fp16 again:

  • Download fp16 checkpoints when available
  • For FLUX, use fp8 quantized checkpoints on 8 GB GPUs
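  • Checkpoint sites usually state the precision in the file name or description (for example "fp16" or "fp8"). A "pruned" label typically means training-only weights were removed, which shrinks the file but does not by itself change the precision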
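
The savings follow from bytes per parameter: 4 for fp32, 2 for fp16, 1 for fp8. The sketch below estimates the size of the weights alone for a hypothetical 2-billion-parameter model. The parameter count is an illustrative assumption, not the size of any specific checkpoint, and actual VRAM use is higher because activations and other loaded models add to it.

```python
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "fp8": 1}


def weights_gib(num_params, precision):
    """Approximate size of the model weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[precision] / 2**30


params = 2_000_000_000  # hypothetical model size, for illustration only

for precision in ("fp32", "fp16", "fp8"):
    print(f"{precision}: {weights_gib(params, precision):.2f} GiB")
# fp32: 7.45 GiB
# fp16: 3.73 GiB
# fp8: 1.86 GiB
```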

4. Enable tiled VAE decode

If the OOM happens during VAE decode (the final step), replace the regular VAE Decode node with a VAE Decode (Tiled) node. Tiled decoding processes the image in smaller chunks, so peak VRAM depends on the tile size rather than the full image size.
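
To see why tiling lowers the peak, it helps to look at how an image divides into overlapping tiles. The sketch below only illustrates the general idea; it is not ComfyUI's implementation, and the tile size of 512 and overlap of 64 are assumed values chosen for the example.

```python
def tile_boxes(width, height, tile=512, overlap=64):
    """Return (left, top, right, bottom) boxes covering the image with overlapping tiles."""
    if overlap >= tile:
        raise ValueError("overlap must be smaller than tile")
    step = tile - overlap
    boxes = []
    for top in range(0, max(height - overlap, 1), step):
        for left in range(0, max(width - overlap, 1), step):
            boxes.append((left, top, min(left + tile, width), min(top + tile, height)))
    return boxes


boxes = tile_boxes(1024, 1024)
largest = max((right - left) * (bottom - top) for left, top, right, bottom in boxes)

print(len(boxes))               # 9
print(boxes[0])                 # (0, 0, 512, 512)
print(boxes[-1])                # (896, 896, 1024, 1024)
print(largest / (1024 * 1024))  # 0.25
```

Each tile is decoded on its own, so the largest single decode covers a quarter of the pixels of the full 1024x1024 image. The cost is extra work on the overlapping strips and a slower decode overall.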

5. Close other GPU applications

Check what else is using VRAM:

nvidia-smi

Close browsers, games, video editors, or other applications that consume GPU memory before generating.
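
For a quick before-and-after comparison, nvidia-smi can print just the memory figures in CSV form, which is easy to read from a script. The sketch below parses that output. The sample line is made up for illustration, and `parse_gpu_memory` and `query_gpu_memory` are hypothetical helper names.

```python
import subprocess

# nvidia-smi reports memory in MiB when "nounits" is used.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Parse 'name, used, total' lines into dictionaries (values in MiB)."""
    gpus = []
    for line in csv_text.strip().splitlines():
        # Split from the right so a comma inside the GPU name does not break parsing.
        name, used, total = [field.strip() for field in line.rsplit(",", 2)]
        gpus.append(
            {
                "name": name,
                "used_mib": int(used),
                "total_mib": int(total),
                "free_mib": int(total) - int(used),
            }
        )
    return gpus


def query_gpu_memory():
    """Run nvidia-smi and return parsed memory usage (requires an NVIDIA driver)."""
    output = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_gpu_memory(output)


# Made-up sample output, so the parser can be tried without a GPU.
sample = "Example GPU, 7660, 8192\n"
print(parse_gpu_memory(sample))
# [{'name': 'Example GPU', 'used_mib': 7660, 'total_mib': 8192, 'free_mib': 532}]
```

Run `query_gpu_memory()` once before and once after closing other applications to see how much VRAM they were holding.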

6. Use --force-fp16 for the entire pipeline

.\python_embeded\python.exe -s ComfyUI\main.py --windows-standalone-build --force-fp16

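This forces fp16 precision throughout, which roughly halves VRAM usage compared with fp32 at the cost of minor quality differences. If ComfyUI is already running your models in fp16, expect little or no further saving.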

What not to do

  • Do not increase Windows virtual memory (pagefile) thinking it will fix GPU OOM — VRAM and system RAM are separate
  • Do not reinstall ComfyUI — this is a resource limit, not a broken install
  • Do not install random CUDA toolkits — the PyTorch wheel includes what it needs
  • Do not ignore the specific numbers in the error message — they tell you exactly how much VRAM you have and how much was requested

How Wonderful Launcher can help

Wonderful Launcher can detect your GPU's VRAM capacity and suggest memory optimization settings to match. It also helps manage model formats for your hardware.

Download Wonderful Launcher — it's free and can optimize your ComfyUI setup for your specific GPU.

Related errors

  • ComfyUI GPU Compatibility
  • Python Out of Memory in ComfyUI
  • ComfyUI System Requirements
  • Torch Not Compiled With CUDA Enabled
  • ComfyUI Common Issues

Source References

  • PyTorch CUDA Semantics
  • ComfyUI CLI Arguments
  • NVIDIA CUDA Documentation

