How to Fix CUDA Out of Memory Errors in ComfyUI
Fix ComfyUI CUDA out of memory (OOM) errors by adjusting generation settings, using memory-efficient options, and understanding VRAM limits.
Community Knowledge
This page is based on common ComfyUI troubleshooting patterns and has not been fully tested across all environments. Back up your environment before changing packages.
If ComfyUI crashes with torch.cuda.OutOfMemoryError: CUDA out of memory, your GPU does not have enough VRAM for the current operation.
This is not a broken installation — it means the generation settings, model size, or resolution exceed what your GPU can hold in memory at once. The good news is that most OOM errors can be fixed without buying a new GPU.
Fast answer
Try these in order:
- Lower your image resolution (e.g., from 1024x1024 to 768x768)
- Launch ComfyUI with the --lowvram or --novram flag
- Use fp8 or fp16 model checkpoints instead of fp32
- Enable tiled VAE decoding in your workflow
What the error looks like
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 2.00 GiB.
GPU 0 has a total capacity of 8.00 GiB of which 512.00 MiB is free.
Including non-PyTorch memory, this process has 7.48 GiB memory in use.

You may also see variations like:

RuntimeError: CUDA error: out of memory

Error occurred when executing KSampler:
CUDA out of memory.

The error typically occurs during sampling, VAE decoding, or model loading.
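The numbers in the message can be pulled out with a few regular expressions. This is a minimal sketch that assumes the message wording shown above; older PyTorch versions phrase the message differently, in which case the missing fields come back as None. The function names are illustrative and are not part of ComfyUI or PyTorch.

```python
import re

# Multipliers to convert the units used in the message into GiB.
_UNIT_TO_GIB = {"GiB": 1.0, "MiB": 1.0 / 1024, "KiB": 1.0 / (1024 * 1024)}
_SIZE = r"([\d.]+) (GiB|MiB|KiB)"


def _first_size_gib(pattern, text):
    """Return the first size matched by `pattern`, converted to GiB, or None."""
    match = re.search(pattern, text)
    if match is None:
        return None
    return float(match.group(1)) * _UNIT_TO_GIB[match.group(2)]


def parse_oom_message(text):
    """Extract requested, total, and free memory (in GiB) from an OOM message."""
    return {
        "requested_gib": _first_size_gib(r"Tried to allocate " + _SIZE, text),
        "capacity_gib": _first_size_gib(r"total capacity of " + _SIZE, text),
        "free_gib": _first_size_gib(r"of which " + _SIZE + r" is free", text),
    }


sample = (
    "torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 2.00 GiB. "
    "GPU 0 has a total capacity of 8.00 GiB of which 512.00 MiB is free. "
    "Including non-PyTorch memory, this process has 7.48 GiB memory in use."
)
info = parse_oom_message(sample)
print(info)
```

In the sample, the request (2 GiB) is four times the free memory (0.5 GiB), so a small setting change is unlikely to be enough on its own.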
Why it happens
- Resolution too high: VRAM use grows with pixel count, so doubling both width and height roughly quadruples it. A 2048x2048 image uses roughly 4x the VRAM of a 1024x1024 image
- Model too large for your GPU: SDXL models need more VRAM than SD 1.5 models. FLUX models need even more
- Multiple models loaded: Loading a checkpoint, ControlNet, IP-Adapter, and LoRA simultaneously adds up
- Batch size too high: Generating multiple images at once multiplies VRAM usage
- VAE decode at full resolution: The VAE decode step can spike VRAM even when sampling succeeded
- Other programs using GPU memory: Chrome, Discord, or other GPU-accelerated apps consume VRAM
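The resolution and batch size points above come down to simple arithmetic. The sketch below assumes memory for image data scales linearly with pixel count and batch size, which is a rough approximation and ignores the fixed cost of model weights. The function name and the 1024x1024 baseline are illustrative choices.

```python
def relative_load(width, height, batch_size=1, base=(1024, 1024)):
    """Return image-data load relative to a single image at the base resolution.

    Assumes load is proportional to pixel count times batch size.
    """
    base_width, base_height = base
    return batch_size * (width * height) / (base_width * base_height)


print(relative_load(2048, 2048))               # doubling both sides
print(relative_load(768, 768))                 # dropping to 768x768
print(relative_load(1024, 1024, batch_size=4))  # batch of four
```

Under this assumption, a batch of four 1024x1024 images costs about as much as one 2048x2048 image, and dropping to 768x768 cuts the image-data load to a little over half.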
VRAM requirements by model type
| Model | Minimum VRAM | Comfortable VRAM | Notes |
|---|---|---|---|
| SD 1.5 | 4 GB | 6 GB | Most compatible |
| SDXL | 6 GB | 8 GB | Default 1024x1024 |
| FLUX.1 | 8 GB (quantized) | 12 GB | fp8 recommended for 8GB cards |
| Video models | 12 GB | 24 GB | Highly variable |
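As a quick way to read the table, the sketch below encodes its minimum and comfortable values and classifies a given VRAM size against each row. The labels "comfortable", "tight", and "below minimum" are illustrative names; the numbers are the table's own approximate guidance, not hard limits.

```python
# (minimum GB, comfortable GB) copied from the table above.
VRAM_TABLE_GB = {
    "SD 1.5": (4, 6),
    "SDXL": (6, 8),
    "FLUX.1": (8, 12),
    "Video models": (12, 24),
}


def classify(vram_gb):
    """Label each model type for a GPU with `vram_gb` of VRAM."""
    result = {}
    for model, (minimum, comfortable) in VRAM_TABLE_GB.items():
        if vram_gb >= comfortable:
            result[model] = "comfortable"
        elif vram_gb >= minimum:
            result[model] = "tight"
        else:
            result[model] = "below minimum"
    return result


for model, label in classify(8).items():
    print(f"{model}: {label}")
```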
Step-by-step fixes
1. Lower resolution and batch size
The fastest fix. Reduce your image dimensions:
- SD 1.5: try 512x512 or 512x768
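- SDXL: try 768x768 (832x1216 has about the same pixel count as 1024x1024, so it only helps if you were rendering larger than that)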
- FLUX: try 768x768 first
Set batch size to 1 if it is higher.
2. Use ComfyUI memory management flags
Launch ComfyUI with memory-saving arguments:
.\python_embeded\python.exe -s ComfyUI\main.py --windows-standalone-build --lowvram

Available flags:
| Flag | Effect | VRAM saved |
|---|---|---|
| --lowvram | Moves model parts to CPU during generation | Significant |
| --novram | Keeps almost everything on CPU, moves to GPU only when needed | Maximum, but much slower |
| --cpu | Runs entirely on CPU | All GPU VRAM free, very slow |
| --disable-smart-memory | Offloads models to system RAM aggressively instead of keeping them in VRAM | Varies; try if models stay resident in VRAM between runs |
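If you switch between flags often, a small helper can assemble the launch command so that only flags from the table are passed. This is a sketch that assumes the Windows standalone layout used in the command above; the helper name is hypothetical and the paths should be adjusted for other installs.

```python
# Memory-related flags listed in the table above.
MEMORY_FLAGS = ("--lowvram", "--novram", "--cpu", "--disable-smart-memory")

# Base command from this page (Windows standalone build layout).
BASE_COMMAND = [
    r".\python_embeded\python.exe",
    "-s",
    r"ComfyUI\main.py",
    "--windows-standalone-build",
]


def build_launch_args(*flags):
    """Return the launch command as an argument list, rejecting unknown flags."""
    unknown = [flag for flag in flags if flag not in MEMORY_FLAGS]
    if unknown:
        raise ValueError(f"Unrecognized memory flag(s): {unknown}")
    return BASE_COMMAND + list(flags)


print(" ".join(build_launch_args("--lowvram")))

# To actually start ComfyUI from the standalone folder:
#   import subprocess
#   subprocess.run(build_launch_args("--lowvram"), check=True)
```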
3. Use quantized or fp16 models
Full fp32 models use twice the VRAM of fp16 models. fp8 models use even less:
- Download fp16 checkpoints when available
- For FLUX, use fp8 quantized checkpoints on 8 GB GPUs
- Some checkpoint sites label these as "fp16-fix" or "pruned"
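The precision comparison follows from bytes per parameter: 4 for fp32, 2 for fp16, 1 for fp8. The sketch below estimates weight size only; activations, the VAE, and text encoders add more on top. The 2 billion parameter count is a hypothetical figure chosen for illustration, not the size of any specific model.

```python
# Storage cost of one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "fp8": 1}


def weights_gib(num_params, precision):
    """Return the approximate size of the model weights in GiB."""
    return num_params * BYTES_PER_PARAM[precision] / 2**30


hypothetical_params = 2_000_000_000  # illustrative value only
for precision in ("fp32", "fp16", "fp8"):
    print(f"{precision}: {weights_gib(hypothetical_params, precision):.2f} GiB")
```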
4. Enable tiled VAE decode
If the OOM happens during VAE decode (the final step), add a VAE Decode (Tiled) node instead of the regular VAE Decode. Tiled decoding processes the image in smaller chunks.
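To illustrate the idea of decoding in chunks, the sketch below splits an image area into overlapping tiles. It is a generic tiling example and not ComfyUI's actual implementation; the 512 pixel tile size and 64 pixel overlap are assumed defaults chosen for illustration.

```python
def tile_boxes(width, height, tile=512, overlap=64):
    """Return (left, top, right, bottom) boxes that cover the whole area.

    Neighbouring tiles overlap so seams can be blended after decoding.
    """
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be >= 0 and smaller than tile")
    step = tile - overlap

    def starts(length):
        # A single tile is enough when the side fits inside one tile.
        if length <= tile:
            return [0]
        positions = list(range(0, length - tile, step))
        positions.append(length - tile)  # last tile sits flush with the edge
        return positions

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]


boxes = tile_boxes(1024, 1024)
print(len(boxes), "tiles; first:", boxes[0], "last:", boxes[-1])
```

Each tile is at most 512x512, so peak memory during the decode depends on the tile size rather than on the full image size.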
5. Close other GPU applications
Check what else is using VRAM:
nvidia-smi

Close browsers, games, video editors, or other applications that consume GPU memory before generating.
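To check free VRAM from a script, nvidia-smi can print memory figures as CSV. The sketch below assumes an NVIDIA driver with nvidia-smi on the PATH and numeric output in MiB; some devices report "[N/A]" for these fields, which this parser does not handle. The function names are illustrative.

```python
import subprocess

# Ask nvidia-smi for per-GPU memory in MiB, one CSV line per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.total,memory.used,memory.free",
    "--format=csv,noheader,nounits",
]


def parse_memory_csv(output):
    """Parse 'total, used, free' lines (MiB) into one dict per GPU."""
    gpus = []
    for line in output.strip().splitlines():
        total, used, free = (int(field) for field in line.split(","))
        gpus.append({"total_mib": total, "used_mib": used, "free_mib": free})
    return gpus


def query_gpu_memory():
    """Run nvidia-smi and return parsed memory figures for every GPU."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_memory_csv(result.stdout)


# Example with captured output, so it runs without a GPU:
print(parse_memory_csv("8192, 7660, 532\n"))

# On a machine with an NVIDIA GPU:
#   print(query_gpu_memory())
```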
6. Use --force-fp16 for the entire pipeline
.\python_embeded\python.exe -s ComfyUI\main.py --windows-standalone-build --force-fp16

This forces operations to use fp16 precision, which can roughly halve VRAM usage compared with fp32 at the cost of minor quality differences.
What not to do
- Do not increase Windows virtual memory (pagefile) thinking it will fix GPU OOM — VRAM and system RAM are separate
- Do not reinstall ComfyUI — this is a resource limit, not a broken install
- Do not install random CUDA toolkits — the PyTorch wheel includes what it needs
- Do not ignore the specific numbers in the error message — they tell you exactly how much VRAM you have and how much was requested
How Wonderful Launcher can help
Wonderful Launcher can detect your GPU's VRAM capacity and suggest memory optimization settings. It also helps manage model formats and suggests compatible settings for your hardware.
Download Wonderful Launcher — it's free and can optimize your ComfyUI setup for your specific GPU.
Related errors
- ComfyUI GPU Compatibility
- Python Out of Memory in ComfyUI
- ComfyUI System Requirements
- Torch Not Compiled With CUDA Enabled
- ComfyUI Common Issues