ComfyUI Depth ControlNet: Control Spatial Layout and Perspective
How to use Depth ControlNet in ComfyUI to preserve spatial relationships, perspective, and scene layout when generating AI images.
What is Depth ControlNet?
Depth ControlNet analyzes the distance between objects and the camera in your reference image, creating a depth map — a grayscale image where white areas are close and black areas are far. The AI uses this depth map to maintain the same spatial layout when generating a new image.
This is especially powerful for:
- Interior design — restyle a room while keeping the exact furniture layout
- Architecture — transform a building's style while preserving its structure
- Scene composition — maintain foreground/background relationships across style changes
- Product photography — keep depth-of-field relationships consistent
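The near-white/far-black convention is easy to see in code. The sketch below is a hypothetical helper (not part of ComfyUI or the preprocessor plugin) that converts raw depth values, where larger means farther from the camera, into a ControlNet-style grayscale depth map:

```python
import numpy as np

def depth_to_controlnet_map(depth_m: np.ndarray) -> np.ndarray:
    """Convert raw depth (larger = farther) to the ControlNet depth-map
    convention: white (255) = near, black (0) = far.
    Illustrative helper only, not a ComfyUI API."""
    d = depth_m.astype(np.float32)
    # Normalize to 0..1, then invert so near objects become bright.
    norm = (d - d.min()) / max(float(d.max() - d.min()), 1e-8)
    return ((1.0 - norm) * 255.0).round().astype(np.uint8)

# Toy 2x2 "scene" with depths in meters.
depth = np.array([[1.0, 2.0],
                  [4.0, 8.0]])
print(depth_to_controlnet_map(depth))  # nearest pixel -> 255, farthest -> 0
```

Preprocessors like Zoe do this normalization for you; the point is only that brightness encodes proximity, which is what you should check in the depth-map preview.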
Prerequisites
Plugin Installation
ComfyUI doesn't include a depth preprocessor by default. You need the ComfyUI ControlNet Auxiliary Preprocessors plugin.
Install it via ComfyUI Manager (search for "ControlNet Auxiliary Preprocessors") or manually:
```bash
cd ComfyUI/custom_nodes
git clone https://github.com/Fannovel16/comfyui_controlnet_aux
cd comfyui_controlnet_aux
pip install -r requirements.txt
```

Restart ComfyUI after installation.
Models
| Model | File | Download |
|---|---|---|
| SD1.5 checkpoint | dreamshaper_8.safetensors | Civitai |
| Depth ControlNet | control_v11f1p_sd15_depth.pth | HuggingFace |
File Placement
```
ComfyUI/
├── models/
│   ├── checkpoints/
│   │   └── dreamshaper_8.safetensors
│   └── controlnet/
│       └── control_v11f1p_sd15_depth.pth
```

Building the Workflow
- Load Image — load your reference photo
- Zoe-DepthMapPreprocessor — generates a depth map from the image (from the Auxiliary Preprocessors plugin)
- Preview Image — preview the depth map to verify quality
- Load Checkpoint — loads the SD1.5 model
- Load ControlNet Model — loads `control_v11f1p_sd15_depth.pth`
- Apply ControlNet (Advanced) — injects depth information into the conditioning
- CLIP Text Encode (x2) — positive and negative prompts
- KSampler → VAE Decode → Save Image
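The node chain above can also be sketched in the style of ComfyUI's API (JSON) workflow export. The core `class_type` names below are the stock ComfyUI ones; the `Zoe_DepthMapPreprocessor` name comes from the Auxiliary Preprocessors plugin, and exact input fields may differ by version, so treat this as an illustrative skeleton rather than a drop-in workflow (the KSampler → VAE Decode → Save Image tail is omitted and wires up as in any text-to-image graph):

```python
# Each entry is node_id -> {class_type, inputs}; ["2", 0] means
# "output slot 0 of node 2", which is how the API format encodes wires.
workflow = {
    "1": {"class_type": "LoadImage",
          "inputs": {"image": "room.jpg"}},
    "2": {"class_type": "Zoe_DepthMapPreprocessor",   # from the aux plugin
          "inputs": {"image": ["1", 0], "resolution": 512}},
    "3": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "dreamshaper_8.safetensors"}},
    "4": {"class_type": "ControlNetLoader",
          "inputs": {"control_net_name": "control_v11f1p_sd15_depth.pth"}},
    "5": {"class_type": "CLIPTextEncode",              # positive prompt
          "inputs": {"clip": ["3", 1], "text": "modern interior, high quality"}},
    "6": {"class_type": "CLIPTextEncode",              # negative prompt
          "inputs": {"clip": ["3", 1], "text": "blurry, low quality"}},
    "7": {"class_type": "ControlNetApplyAdvanced",
          "inputs": {"positive": ["5", 0], "negative": ["6", 0],
                     "control_net": ["4", 0], "image": ["2", 0],
                     "strength": 0.9, "start_percent": 0.0,
                     "end_percent": 0.9}},
}
```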
Key Parameters
Depth Preprocessor
| Parameter | Recommended | Effect |
|---|---|---|
| resolution | 512 (general), 768+ (high detail) | Higher = more accurate depth map but slower processing |
The Zoe-DepthMapPreprocessor produces the best results for architecture and interior scenes. For outdoor landscapes, MiDaS or LeReS preprocessors can also work well. All are available in the Auxiliary Preprocessors plugin.
Apply ControlNet (Advanced)
| Parameter | Recommended | Effect |
|---|---|---|
| strength | 0.8–1.0 | How strictly spatial layout is preserved. Higher = more faithful to reference |
| start_percent | 0.0 | Start influence from the beginning |
| end_percent | 0.9–1.0 | Keep influence through most of sampling for consistent spatial structure |
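How these three parameters interact can be modeled with a small function. This is an illustrative simplification, not ComfyUI's source: the Advanced node applies the ControlNet at full `strength` while sampling progress is inside the `[start_percent, end_percent]` window and not at all outside it.

```python
def depth_weight(progress: float, strength: float = 0.9,
                 start_percent: float = 0.0, end_percent: float = 0.9) -> float:
    """Simplified model of Apply ControlNet (Advanced): depth guidance is
    active at `strength` only within the sampling-progress window."""
    return strength if start_percent <= progress <= end_percent else 0.0

assert depth_weight(0.5) == 0.9   # mid-sampling: depth guidance active
assert depth_weight(0.95) == 0.0  # final 10%: model refines detail freely
```

This is why `end_percent` of 0.9 is a good default: the spatial structure is locked in early, while the last steps are free to polish textures without fighting the depth map.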
Prompt Tips for Depth Control
Include spatial and quality keywords for best results:
Spatial terms: depth of field, perspective, spatial layout, foreground, background
Quality terms: professional, high quality, detailed, realistic
Style terms: describe your target style clearly — modern minimalist interior, cyberpunk cityscape, watercolor landscape
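A simple way to keep prompts consistent across runs is to assemble them from the three keyword groups above. This is just a convenience sketch; the keyword lists are examples, not required values:

```python
# Example keyword groups, following the categories described above.
style = ["modern minimalist interior"]
spatial = ["depth of field", "perspective", "clear foreground and background"]
quality = ["professional", "high quality", "detailed", "realistic"]

positive = ", ".join(style + spatial + quality)
negative = "blurry, low quality, distorted perspective, flat lighting"
print(positive)
```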
Use Case: Interior Restyling
One of the most popular Depth ControlNet applications is restyling a room:
- Take a photo of any room
- Extract its depth map with the Zoe preprocessor
- Write a prompt describing the new style: `modern scandinavian living room, white walls, wooden furniture, natural light, professional interior photography`
- Generate — the new image keeps the exact room layout while applying the target style
Common Issues and Fixes
Generated image has weak spatial sense
- Verify the depth map preview — it should clearly show near (white) and far (black) areas
- Increase `strength` to 0.9–1.0
- Increase sampling `steps` to 25–30
Depth map looks flat or inaccurate
- Use a higher `resolution` setting on the preprocessor (768 or 1024)
- Ensure the input image has clear depth — flat product shots won't produce good depth maps
- Try a different preprocessor (MiDaS instead of Zoe, or vice versa)
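Whether a depth map is "flat" can be estimated numerically before you burn a generation on it. The heuristic below is hypothetical (not a ComfyUI feature): it measures how much of the 0–255 grayscale range the map actually uses, since a good map should span from near-white to near-black.

```python
import numpy as np

def depth_map_contrast(depth_map: np.ndarray) -> float:
    """Fraction of the 0-255 range used by a grayscale depth map.
    Hypothetical heuristic: near 1.0 means strong near/far separation;
    well under ~0.5 suggests a flat map that will guide the AI weakly."""
    lo, hi = int(depth_map.min()), int(depth_map.max())
    return (hi - lo) / 255.0

flat = np.full((64, 64), 128, dtype=np.uint8)           # e.g. flat product shot
good = np.tile(np.arange(256, dtype=np.uint8), (4, 1))  # clear near-to-far ramp
print(depth_map_contrast(flat), depth_map_contrast(good))  # 0.0 1.0
```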
Details are getting lost
- Lower the `cfg` value to 5–6 (too high a CFG crushes subtle detail)
- Add more specific detail keywords to your prompt
Wrong style but correct layout
- The depth map is working — refine your text prompts
- Check that negative prompts aren't conflicting with your desired style
Related Guides
- ControlNet Overview — All ControlNet types explained
- Canny ControlNet — Edge-based structure control
- OpenPose ControlNet — Human pose control