ComfyUI Flux Kontext Guide: AI Image Editing with Text Instructions
How to use FLUX.1 Kontext in ComfyUI for context-aware image editing — change objects, transfer styles, edit text in images, and maintain character consistency.
What is Flux Kontext?
FLUX.1 Kontext is a generative model designed for text-driven image editing. Unlike text-to-image models, Kontext understands both the input image and your text instructions, allowing precise edits like changing object colors, swapping backgrounds, transferring styles, and editing text in images — all while preserving the parts you don't mention.
Model Versions
| Version | Availability | Quality |
|---|---|---|
| Kontext Pro/Max | API only | Best results, handles simple prompts well |
| Kontext Dev | Open-weight, downloadable | Good results, needs more specific prompts |
The Dev version is non-commercial by default. Commercial licenses are available from Black Forest Labs.
Hardware Requirements
| Model Version | VRAM |
|---|---|
| Original (bf16) | 24 GB+ |
| FP8 Scaled | 12–16 GB |
| GGUF (Q4) | 8 GB+ |
| Nunchaku (4-bit) | 8 GB+ |
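As a quick sanity check, the table above can be encoded in a small helper (a toy sketch, not part of ComfyUI) that suggests a variant for a given VRAM budget:

```python
def suggest_kontext_variant(vram_gb: float) -> str:
    """Suggest a Kontext Dev variant for a given VRAM budget (rough guide only)."""
    if vram_gb >= 24:
        return "Original (bf16)"
    if vram_gb >= 12:
        return "FP8 Scaled"
    if vram_gb >= 8:
        return "GGUF (Q4) or Nunchaku (4-bit)"
    return "Below 8 GB: consider CPU offloading or a hosted API"

print(suggest_kontext_variant(16))  # FP8 Scaled
```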
Setup
Kontext Dev shares CLIP and VAE files with the Flux model family. If you already have those, you only need the Kontext diffusion model.
Model Downloads
Diffusion Model (choose one):
| Version | File | Location | Download |
|---|---|---|---|
| FP8 (recommended) | flux1-dev-kontext_fp8_scaled.safetensors | models/diffusion_models/ | HuggingFace |
| Original | flux1-kontext-dev.safetensors | models/diffusion_models/ | HuggingFace |
| GGUF | flux1-kontext-dev-Q4_K_M.gguf | models/unet/ | HuggingFace |
Shared Flux components:
| File | Location | Download |
|---|---|---|
| clip_l.safetensors | models/text_encoders/ | HuggingFace |
| t5xxl_fp8_e4m3fn_scaled.safetensors | models/text_encoders/ | HuggingFace |
| ae.safetensors | models/vae/ | HuggingFace |
Basic Workflow
- Load Diffusion Model → Kontext model
- DualCLIPLoader → clip_l + t5xxl
- Load VAE → ae.safetensors
- Load Image → the image you want to edit
- CLIP Text Encode → your editing instruction (English only)
- KSampler → generates the edited image
- VAE Decode → Save Image
Kontext has a 512 token prompt limit. Keep instructions focused and concise.
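The workflow above can also be driven programmatically through ComfyUI's HTTP API. This is a minimal sketch: the `/prompt` endpoint and the "Save (API Format)" export exist in stock ComfyUI, but the filename and the node-targeting logic here are assumptions — real graphs may contain several CLIPTextEncode nodes (e.g. a negative prompt), so you may prefer to target a specific node id.

```python
import json
import urllib.request

COMFY_URL = "http://127.0.0.1:8188/prompt"  # default local ComfyUI address

def set_instruction(workflow: dict, text: str) -> dict:
    """Write the edit instruction into every CLIPTextEncode node's text input.
    Caveat: if the exported graph has a separate negative prompt node, this
    overwrites it too -- target a known node id in that case."""
    for node in workflow.values():
        if node.get("class_type") == "CLIPTextEncode":
            node["inputs"]["text"] = text
    return workflow

def submit(workflow: dict) -> None:
    """POST an API-format workflow to a running ComfyUI instance."""
    data = json.dumps({"prompt": workflow}).encode("utf-8")
    req = urllib.request.Request(
        COMFY_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        print(resp.read().decode())

# Usage (requires a running ComfyUI server and a workflow exported
# via "Save (API Format)" -- "kontext_edit_api.json" is a placeholder name):
#     with open("kontext_edit_api.json") as f:
#         wf = json.load(f)
#     submit(set_instruction(wf, "Change the car color to red"))
```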
What Can Kontext Edit?
Object Modification
Simple, direct changes to specific elements:
- Change the car color to red
- Replace the flowers with sunflowers
- Add a small cat on the table

Style Transfer
Convert the image to a different artistic style:
Convert to oil painting with visible brushstrokes, thick paint texture, and rich color depth while maintaining the original composition

Background Changes
Swap the environment while preserving the subject:
Change the background to a beach while keeping the person in the exact same position, scale, and pose

Character Consistency Editing
Modify character attributes while preserving identity:
Change her clothing to a red dress while preserving her exact facial features, hairstyle, and expression

Text Editing in Images
Modify text that appears in the image:
Replace 'OPEN' with 'CLOSED' while maintaining the same font style and color

Prompt Writing Tips
Be Specific, Not Vague
| Bad | Good |
|---|---|
| "Make it better" | "Increase lighting brightness and add warm tones" |
| "Make it artistic" | "Convert to watercolor painting style" |
| "Change her" | "Change the woman with black hair's dress to blue" |
Explicitly Preserve What Matters
Always state what should stay the same:
Change the background to a forest while maintaining the same camera angle, subject position, and lighting

Name Subjects Directly
| Bad | Good |
|---|---|
| "Change her outfit" | "Change the woman with short black hair's outfit" |
| "Remove it" | "Remove the red car from the background" |
Use Quotes for Text Edits
Replace 'joy' with 'BFL'

Choose Verbs Carefully
| Verb | Strength | Use When |
|---|---|---|
| Change | Moderate | Modifying specific elements |
| Replace | Targeted | Swapping one thing for another |
| Convert/Transform | Strong | Full style changes |
| Add | Additive | Inserting new elements |
| Remove | Subtractive | Deleting unwanted elements |
Multi-Round Editing
Kontext supports iterative editing — feed the output of one generation back in as the input for the next. Two approaches:
- Load Image (from output) — load the previous result and re-run with a new prompt
- Group Nodes — ComfyUI added a quick "Edit" button for chaining Kontext edits in one workflow
For complex transformations, break them into steps:
- First edit: change the environment
- Second edit: modify clothing
- Third edit: adjust lighting
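The step-by-step approach above can be sketched as a simple loop. Here `run_edit` is a hypothetical callable standing in for one full Kontext pass (for example, a wrapper around the ComfyUI API); the demo substitutes a fake editor that just records the steps:

```python
from typing import Callable

def chain_edits(image: str, instructions: list[str],
                run_edit: Callable[[str, str], str]) -> str:
    """Apply each instruction in order, feeding each output back in
    as the next input."""
    for prompt in instructions:
        image = run_edit(image, prompt)
    return image

# Demo with a fake editor -- no model is invoked here.
fake = lambda img, p: f"{img} -> [{p}]"
result = chain_edits("photo.png", [
    "Change the background to a forest",
    "Change her clothing to a red dress",
    "Add warm golden-hour lighting",
], fake)
print(result)
```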
Multiple Image Input
You can reference multiple images by:
- Image Stitch — combine images into one and input as a single reference (better results)
- ReferenceLatent chaining — encode images separately and chain conditions (may mix features)
For stitched input, make the main reference image proportionally larger than the secondary images.
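The sizing advice can be made concrete with a little arithmetic. This is illustrative only — the numbers are assumptions, and the actual resize and stitch should be done with ComfyUI nodes or an image library:

```python
def stitch_sizes(main: tuple[int, int], ref: tuple[int, int],
                 ref_scale: float = 0.6) -> tuple[tuple[int, int], tuple[int, int]]:
    """Compute resize targets for a horizontal stitch where the main
    reference keeps its full size and the secondary image is scaled down
    (here to 60% of the main image's height, aspect ratio preserved),
    so the main subject dominates the combined input."""
    mw, mh = main
    rw, rh = ref
    target_h = round(mh * ref_scale)      # secondary height relative to main
    target_w = round(rw * target_h / rh)  # keep the secondary's aspect ratio
    return (mw, mh), (target_w, target_h)

print(stitch_sizes((1024, 1024), (512, 768)))  # ((1024, 1024), (409, 614))
```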
Accelerated Inference
TeaCache
Install ComfyUI-TeaCache — add the TeaCache node to your existing Kontext workflow for faster generation with minimal quality loss.
Nunchaku (4-bit)
Install ComfyUI-nunchaku for 4-bit quantized inference. Note that it requires a dedicated Nunchaku-quantized Kontext model rather than the standard weights.
Common Issues and Fixes
Image doesn't change at all
- The Dev version needs more specific prompts than Pro/Max
- Be more explicit about what to change: "Change the dress color from blue to red" instead of "Change the color"
- Try a different seed
Identity changes too much during character edits
- Add preservation instructions: "while preserving exact facial features, eye color, and expression"
- Use "Change the clothes to..." instead of "Transform the person into..."
Composition shifts unexpectedly
- Add: "keeping the person in the exact same position, scale, and pose"
- Specify: "Maintain identical camera angle, framing, and perspective"
Style transfer loses important details
- Be more specific about style characteristics
- Add: "while preserving all scene details and object positions"
Related Guides
- Flux Guide — Flux text-to-image generation
- Inpainting Guide — Alternative approach for targeted edits
- Image to Image — Transform entire images