ComfyUI ControlNet Guide: Precise Control Over AI Image Generation
Learn what ControlNet is, how it works in ComfyUI, which control types exist, and how to set up your first ControlNet workflow step by step.
What is ControlNet?
ControlNet is a neural network that adds structural control to AI image generation. Instead of relying on text prompts alone, you provide a reference image — an edge map, a depth map, a skeleton pose — and ControlNet ensures the output follows that structure.
Think of it as giving the AI a blueprint. The text prompt says what to draw, and ControlNet says how to arrange it.
Why Use ControlNet?
| Without ControlNet | With ControlNet |
|---|---|
| AI interprets your prompt freely — poses, compositions, and layouts are unpredictable | The output follows the exact structure of your reference image |
| Generating a specific pose may take dozens of attempts | One reference image produces the correct pose on the first try |
| No way to preserve spatial layout from an existing image | Depth maps, edges, and poses can all be extracted and reused |
Control Types at a Glance
ControlNet supports many control methods. Each extracts different information from a reference image:
Line and Edge Controls
| Type | Best For | Description |
|---|---|---|
| Canny | Precise structure | Detects detailed edges — great for architecture, mechanical designs, and any scene with clear outlines |
| Lineart | Illustrations | Cleaner line extraction than Canny, supports anime-style line art |
| SoftEdge | Soft compositions | Captures large contours only, giving the AI more creative freedom |
| MLSD | Architecture | Detects straight lines only — ideal for buildings and interior design |
| Scribble | Sketches | Works with rough hand-drawn doodles as input |
Depth and Spatial Controls
| Type | Best For | Description |
|---|---|---|
| Depth | Scene layout | Creates a depth map (white = close, black = far) to preserve spatial relationships |
| NormalMap | Surface detail | Controls surface texture and lighting direction |
| OpenPose | Human poses | Detects body skeleton keypoints to replicate poses |
Style and Other Controls
| Type | Best For | Description |
|---|---|---|
| Shuffle | Style transfer | Rearranges visual elements from the reference image |
| Tile | Upscaling | Enhances detail in blurry or low-resolution images |
| IP-Adapter | Face/style consistency | Maintains visual identity across generations |
| Inpaint | Partial editing | Modifies specific regions while keeping the rest unchanged |
How ControlNet Works in ComfyUI
The workflow has three stages:
- Preprocessing — A preprocessor node extracts control information from your reference image (e.g., the Canny node extracts edges, a Depth preprocessor generates a depth map)
- Condition injection — The Apply ControlNet node combines the extracted features with your text prompt conditions
- Sampling — The KSampler generates the image, respecting both the text description and the structural constraints
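To make the preprocessing stage concrete, here is a minimal, illustrative sketch of what an edge preprocessor produces: a black image with white edge lines, three channels wide. This is not ComfyUI's actual implementation (the real Canny node uses a proper Canny detector); it approximates edge detection with a simple gradient-magnitude threshold so the idea is visible in a few lines.

```python
import numpy as np

# Illustrative sketch only (not ComfyUI's actual Canny code): turn a
# grayscale reference into a black-and-white edge map, the kind of
# control image that gets fed into Apply ControlNet.
def edge_control_image(gray: np.ndarray, threshold: float = 30.0) -> np.ndarray:
    g = gray.astype(float)
    # Horizontal and vertical intensity differences (a crude gradient).
    gx = np.abs(np.diff(g, axis=1, prepend=g[:, :1]))
    gy = np.abs(np.diff(g, axis=0, prepend=g[:1, :]))
    # Pixels with a strong gradient become white edge pixels.
    edges = (np.hypot(gx, gy) > threshold).astype(np.uint8) * 255
    # ControlNet expects a 3-channel image, so replicate the edge map.
    return np.stack([edges] * 3, axis=-1)
```

A sharp boundary in the input shows up as a white line in the output, while flat regions stay black, which is exactly the structure the ControlNet conditions on.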
Setting Up Your First ControlNet Workflow
What You Need
| Item | Where to Get It |
|---|---|
| A checkpoint model (e.g., DreamShaper 8) | Civitai |
| A ControlNet model matching your checkpoint version | HuggingFace — ControlNet v1.1 |
| ComfyUI ControlNet Auxiliary Preprocessors plugin | GitHub |
ControlNet models are version-specific. SD1.5 ControlNet models only work with SD1.5 checkpoints. SDXL ControlNet models only work with SDXL checkpoints. They are not interchangeable.
File Placement
ComfyUI/
├── models/
│ ├── checkpoints/
│ │ └── dreamshaper_8.safetensors
│ └── controlnet/
│ └── control_v11p_sd15_canny.pth
└── custom_nodes/
    └── comfyui_controlnet_aux/  (installed via Manager)
Building the Workflow
- Load Checkpoint — loads your base model
- Load Image — loads your reference photo
- Preprocessor node (e.g., Canny) — extracts control features from the reference
- Load ControlNet Model — loads the matching ControlNet model file
- Apply ControlNet — merges extracted features into your prompt conditioning
- CLIP Text Encode (x2) — positive and negative prompts
- KSampler — generates the image
- VAE Decode → Save Image — decodes and saves the result
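The node list above can be sketched in ComfyUI's API (JSON) workflow format, here written as a Python dict. The class names match stock ComfyUI nodes; "CannyEdgePreprocessor" comes from the Auxiliary Preprocessors plugin, and exact input names can vary between versions, so treat this as an assumption-laden sketch rather than a copy-paste workflow. Each connection is a `["node_id", output_index]` pair.

```python
# Hedged sketch of the eight-step graph in ComfyUI API format.
# Filenames and prompt text are examples, not requirements.
workflow = {
    "1":  {"class_type": "CheckpointLoaderSimple",
           "inputs": {"ckpt_name": "dreamshaper_8.safetensors"}},
    "2":  {"class_type": "LoadImage", "inputs": {"image": "reference.png"}},
    "3":  {"class_type": "CannyEdgePreprocessor",  # from the aux plugin
           "inputs": {"image": ["2", 0], "low_threshold": 100,
                      "high_threshold": 200}},
    "4":  {"class_type": "ControlNetLoader",
           "inputs": {"control_net_name": "control_v11p_sd15_canny.pth"}},
    "5":  {"class_type": "CLIPTextEncode",  # positive prompt
           "inputs": {"clip": ["1", 1], "text": "a modern house at sunset"}},
    "6":  {"class_type": "CLIPTextEncode",  # negative prompt
           "inputs": {"clip": ["1", 1], "text": "blurry, low quality"}},
    "7":  {"class_type": "ControlNetApplyAdvanced",
           "inputs": {"positive": ["5", 0], "negative": ["6", 0],
                      "control_net": ["4", 0], "image": ["3", 0],
                      "strength": 0.8, "start_percent": 0.0,
                      "end_percent": 1.0}},
    "8":  {"class_type": "EmptyLatentImage",
           "inputs": {"width": 512, "height": 512, "batch_size": 1}},
    "9":  {"class_type": "KSampler",
           "inputs": {"model": ["1", 0], "positive": ["7", 0],
                      "negative": ["7", 1], "latent_image": ["8", 0],
                      "seed": 42, "steps": 20, "cfg": 7.0,
                      "sampler_name": "euler", "scheduler": "normal",
                      "denoise": 1.0}},
    "10": {"class_type": "VAEDecode",
           "inputs": {"samples": ["9", 0], "vae": ["1", 2]}},
    "11": {"class_type": "SaveImage",
           "inputs": {"images": ["10", 0], "filename_prefix": "controlnet"}},
}

# Sanity check: every connection points at a node that exists.
for node in workflow.values():
    for value in node["inputs"].values():
        if isinstance(value, list):
            assert value[0] in workflow
```

Note how the KSampler takes its conditioning from node "7" (Apply ControlNet), not directly from the text encoders: that is the condition-injection step described earlier.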
Key Parameters
| Parameter | Node | What It Does |
|---|---|---|
| strength | Apply ControlNet | How strictly the output follows the reference (0.0–1.0). Start at 0.8 |
| start_percent | Apply ControlNet | When ControlNet influence begins during sampling (0.0 = from the start) |
| end_percent | Apply ControlNet | When ControlNet influence ends (1.0 = until the end) |
Lowering end_percent to 0.7–0.8 often produces more natural results — the AI gets structural guidance early, then adds its own detail in later steps.
Using Multiple ControlNets Together
You can chain multiple ControlNet models in a single workflow. For example, combine Depth (for spatial layout) with OpenPose (for pose) to control both scene structure and character pose simultaneously.
Connect them in series: the output of the first Apply ControlNet node feeds into the conditioning input of the second Apply ControlNet node.
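In the same API-format notation as above, a hedged sketch of that series connection looks like this. The upstream ids ("pos", "neg", "depth_model", and so on) are placeholders for whatever nodes produce your prompts, ControlNet models, and control images; only the "20" → "21" conditioning link is the point.

```python
# Hedged sketch: chain Depth then OpenPose. The first Apply ControlNet
# node's conditioning outputs become the second node's conditioning inputs.
depth_then_pose = {
    "20": {"class_type": "ControlNetApplyAdvanced",
           "inputs": {"positive": ["pos", 0], "negative": ["neg", 0],
                      "control_net": ["depth_model", 0],
                      "image": ["depth_map", 0],
                      "strength": 0.6, "start_percent": 0.0,
                      "end_percent": 0.8}},
    "21": {"class_type": "ControlNetApplyAdvanced",
           "inputs": {"positive": ["20", 0],  # chained from node "20"
                      "negative": ["20", 1],
                      "control_net": ["pose_model", 0],
                      "image": ["pose_map", 0],
                      "strength": 0.8, "start_percent": 0.0,
                      "end_percent": 0.8}},
}
```

The KSampler then reads its conditioning from the last node in the chain, so both structural constraints apply at once.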
Common Issues and Fixes
ControlNet has no visible effect
- Verify the ControlNet model matches your checkpoint version (SD1.5 with SD1.5, SDXL with SDXL)
- Increase the strength value
- Make sure the preprocessor output actually contains meaningful features (preview it)
Output follows structure too rigidly
- Lower strength to 0.5–0.7
- Set end_percent to 0.7 so the AI has freedom in later sampling steps
Deformed or broken anatomy
- Use OpenPose ControlNet alongside your other ControlNet to enforce correct body structure
- Add anatomy-related negative prompts: deformed, extra limbs, bad anatomy
"ControlNet model not found" error
- Check that the .pth or .safetensors file is in ComfyUI/models/controlnet/
- Restart ComfyUI after adding new model files
Preprocessor node missing
- Install the ComfyUI ControlNet Auxiliary Preprocessors plugin
- Restart ComfyUI after installation
Related Guides
Ready to try specific ControlNet types? Check out:
- Canny ControlNet — Edge-based structure control
- Depth ControlNet — Spatial depth and perspective
- OpenPose ControlNet — Human pose control