ComfyUI FramePack Guide: Generate Videos with Just 6GB VRAM
How to use FramePack in ComfyUI for low-VRAM video generation — setup, first-last frame workflows, and comparison of available custom nodes.
What is FramePack?
FramePack is a video generation technique from Lvmin Zhang (the creator of ControlNet) and collaborators at Stanford University. Its breakthrough is lowering the VRAM requirement for video generation from 12+ GB to just 6 GB, making it accessible on consumer GPUs like the RTX 3060.
Key innovations:
| Feature | Description |
|---|---|
| Dynamic context compression | Key frames keep a full context budget (1536 tokens); transitional frames are heavily compressed (192 tokens) |
| Drift-resistant sampling | Bidirectional memory prevents image drift and maintains motion continuity |
| Low VRAM | Generates 60-second videos on 6 GB VRAM |
| First + last frame control | Define start and end images, FramePack generates the motion between them |
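The bounded-context idea behind dynamic compression can be shown with simple arithmetic: if each step back in time shrinks a frame's token budget geometrically, the total context converges no matter how long the video gets. A toy sketch in Python (the 1536 and 192 figures come from the table above; the fixed 1/8 schedule and the drop-off are illustrative assumptions, not FramePack's exact rule):

```python
def framepack_context(num_frames: int, full_tokens: int = 1536, ratio: int = 8) -> int:
    """Toy model of dynamic context compression.

    The most recent frame keeps the full token budget (1536); each frame
    further back gets 1/ratio of the previous budget (1536 -> 192 -> 24 ...).
    Frames whose budget rounds to zero contribute nothing in this toy
    (the real method merges old frames rather than dropping them).
    """
    total, budget = 0, full_tokens
    for _ in range(num_frames):
        if budget == 0:
            break
        total += budget
        budget //= ratio
    return total

# Context stays bounded: 100 frames cost barely more than 2.
print(framepack_context(2))    # 1536 + 192 = 1728
print(framepack_context(100))  # converges to 1536 + 192 + 24 + 3 = 1755
```

This is why generation cost stays roughly constant per frame instead of growing with video length.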
Available ComfyUI Implementations
Three community plugins implement FramePack in ComfyUI:
| Plugin | Author | First-Last Frame | Recommended |
|---|---|---|---|
| ComfyUI-FramePackWrapper | Kijai | Yes | Yes — repackaged models, best compatibility |
| ComfyUI_RH_FramePack | HM-RunningHub | Yes | No — uses original repo structure, larger disk usage |
| TTP_Comfyui_FramePack_SE | TTPlanetPig | Yes | No — fork of above, same limitations |
We recommend the Kijai version. It uses repackaged model files compatible with other ComfyUI workflows, and has the most reliable compatibility.
Setup: Kijai FramePackWrapper
1. Install Required Plugins
Install these four custom nodes via ComfyUI Manager:
- ComfyUI-FramePackWrapper — may require Git install via Manager
- ComfyUI-KJNodes
- ComfyUI-VideoHelperSuite
- ComfyUI_essentials
2. Download Models
Diffusion Model (choose one):
| File | Precision | Size | VRAM | Download |
|---|---|---|---|---|
| FramePackI2V_HY_fp8_e4m3fn.safetensors | FP8 | 16.3 GB | Lower | HuggingFace |
| FramePackI2V_HY_bf16.safetensors | BF16 | 25.7 GB | Higher | HuggingFace |
Other required models:
| File | Location | Download |
|---|---|---|
| clip_l.safetensors | models/text_encoders/ | HuggingFace |
| llava_llama3_fp16.safetensors | models/text_encoders/ | HuggingFace |
| sigclip_vision_patch14_384.safetensors | models/clip_vision/ | HuggingFace |
| hunyuan_video_vae_bf16.safetensors | models/vae/ | HuggingFace |
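Once everything is downloaded, a quick script can confirm each file sits in the folder the table expects. A minimal sketch, assuming your ComfyUI root is `./ComfyUI` and you chose the FP8 diffusion model (adjust both to your setup):

```python
from pathlib import Path

# Expected ComfyUI subfolder for each required model file.
# The ComfyUI root below is an assumption; point it at your install.
COMFY_ROOT = Path("ComfyUI")

REQUIRED = {
    "diffusion_models": ["FramePackI2V_HY_fp8_e4m3fn.safetensors"],
    "text_encoders": ["clip_l.safetensors", "llava_llama3_fp16.safetensors"],
    "clip_vision": ["sigclip_vision_patch14_384.safetensors"],
    "vae": ["hunyuan_video_vae_bf16.safetensors"],
}

def missing_models(root: Path = COMFY_ROOT) -> list[str]:
    """Return relative paths of required model files that are absent."""
    missing = []
    for subdir, files in REQUIRED.items():
        for name in files:
            if not (root / "models" / subdir / name).is_file():
                missing.append(str(Path("models") / subdir / name))
    return missing

if __name__ == "__main__":
    gone = missing_models()
    print("All models present." if not gone else "Missing:\n" + "\n".join(gone))
```

Run it from the directory that contains your ComfyUI folder before loading the workflow.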
3. File Structure
ComfyUI/
├── models/
│ ├── diffusion_models/
│ │ └── FramePackI2V_HY_fp8_e4m3fn.safetensors
│ ├── text_encoders/
│ │ ├── clip_l.safetensors
│ │ └── llava_llama3_fp16.safetensors
│ ├── clip_vision/
│ │ └── sigclip_vision_patch14_384.safetensors
│ └── vae/
│       └── hunyuan_video_vae_bf16.safetensors

Running the Workflow
First-Last Frame Video Generation
- Load FramePackModel → select your diffusion model
- DualCLIPLoader → load clip_l.safetensors and llava_llama3_fp16.safetensors
- Load CLIP Vision → load sigclip_vision_patch14_384.safetensors
- Load VAE → load hunyuan_video_vae_bf16.safetensors
- CLIP Text Encoder → describe the motion you want
- Load Image (first frame) → your starting image
- Load Image (last frame) → your ending image (optional — bypass if not needed)
- FramePackSampler → set total_second_length (e.g., 5 seconds)
- Click Run (Ctrl+Enter)
If you only want image-to-video without a last frame, bypass (Ctrl+B) the last frame input node and its connected processing nodes.
Writing Motion Prompts
FramePack works best with motion-focused prompts. The FramePack creator provides a useful pattern:
Describe subject first, then motion, then environment.
Good examples:
- The girl dances gracefully, with clear movements, full of charm
- A cat jumps from the table to the floor, landing softly
- Camera slowly pans across a mountain landscape at sunset
Tip: Prefer dynamic motions (dancing, jumping, running) over static ones (standing, sitting) for more impressive results.
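The subject-then-motion-then-environment pattern is easy to template if you generate many prompts. A throwaway helper (the function name and structure are purely illustrative, not part of any FramePack API):

```python
def motion_prompt(subject: str, motion: str, environment: str = "") -> str:
    """Build a motion-focused prompt: subject first, then motion, then setting."""
    parts = [subject.strip(), motion.strip()]
    if environment:
        parts.append(environment.strip())
    return ", ".join(parts)

print(motion_prompt("A cat", "jumps from the table to the floor",
                    "landing softly on a wooden kitchen floor"))
```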
Common Issues and Fixes
Widget disappears after loading workflow
- Update your ComfyUI frontend to version 1.16.9 or later
- This is a known frontend bug that affects FramePack workflows
Out of memory
- Use the FP8 model variant (16.3 GB vs 25.7 GB)
- Reduce total_second_length to 2–3 seconds
- Close other applications
Video has inconsistent motion or visual drift
- FramePack's bidirectional sampling should minimize drift, but very long videos (30+ seconds) may still show some
- Use first-last frame mode to anchor the start and end states
- Break long videos into shorter segments
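Chaining segments is straightforward to plan: reuse each segment's final frame as the next segment's first frame, so every boundary is anchored. A small planning sketch (the 5-second maximum is an arbitrary choice, not a FramePack limit):

```python
def plan_segments(total_seconds: float, max_segment: float = 5.0) -> list[tuple[float, float]]:
    """Split a target duration into (start, end) segments of at most max_segment
    seconds. The last frame rendered for one segment becomes the first-frame
    input of the next, which keeps motion anchored across the seams."""
    segments = []
    t = 0.0
    while t < total_seconds:
        end = min(t + max_segment, total_seconds)
        segments.append((t, end))
        t = end
    return segments

print(plan_segments(12.0))  # [(0.0, 5.0), (5.0, 10.0), (10.0, 12.0)]
```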
Plugins not found in ComfyUI Manager
- ComfyUI-FramePackWrapper may not be registered in the Manager — install it via Git URL in the Manager's "Install via Git" option
Related Guides
- HunyuanVideo Guide — Full HunyuanVideo T2V setup
- Wan Video Guide — Alternative video generation models
- Install Custom Nodes — How to install plugins