In this article, we will look at what ControlNet is, how it works, and how to run it on E2E Cloud.
ControlNet is a neural network structure that controls diffusion models by adding extra conditions. This approach lets creative designers communicate with diffusion models more efficiently, using more intuitive input forms, like hand-drawn features, rather than text prompts alone.
It copies the weights of the neural network blocks into a ‘locked’ copy and a ‘trainable’ copy. The ‘trainable’ copy learns your condition, while the ‘locked’ copy preserves your original model. Thanks to this, training with a small dataset of image pairs will not disturb a production-ready diffusion model. The ‘zero convolution’ is a 1×1 convolution with both the weights and bias initialized to zero. Before training, every zero convolution outputs zero, so ControlNet causes no distortion. No layer is trained from scratch; you are still fine-tuning, and your original model is safe. This allows training on small-scale or even personal devices, and it is well suited to merging, replacing, or offsetting models, weights, blocks, and layers.
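The ‘zero convolution’ idea can be sketched in a few lines. The following is a minimal, illustrative NumPy stand-in for a 1×1 convolution (the function name and tensor shapes here are our own assumptions, not the repository's actual code):

```python
import numpy as np

# A 'zero convolution' is a 1x1 convolution whose weight and bias start at zero.
def zero_conv_1x1(x, weight, bias):
    # x: (C_in, H, W); weight: (C_out, C_in); bias: (C_out,)
    c_out = weight.shape[0]
    return np.tensordot(weight, x, axes=([1], [0])) + bias.reshape(c_out, 1, 1)

x = np.random.randn(4, 8, 8)   # any input feature map
w0 = np.zeros((4, 4))          # zero-initialized weight
b0 = np.zeros(4)               # zero-initialized bias
y = zero_conv_1x1(x, w0, b0)
print(np.allclose(y, 0))  # True: before training, the branch contributes nothing
```

Because the output is exactly zero until the weights move away from their initialization, attaching this branch to a pretrained model changes nothing at the start of training.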
By repeating this simple structure 14 times, we can control Stable Diffusion.
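As a rough sketch of how the locked and trainable copies fit together (purely illustrative Python, with a hypothetical one-layer stand-in for a network block, not the actual model code): the trainable path's contribution passes through the zero convolution, so at initialization the combined block behaves exactly like the locked original.

```python
import numpy as np

rng = np.random.default_rng(0)

def block(x, w):
    # Hypothetical stand-in for one network block (a single nonlinear layer).
    return np.tanh(x @ w)

w_locked = rng.normal(size=(8, 8))   # frozen production weights
w_trainable = w_locked.copy()        # trainable copy, initialized from the locked one
zero_conv = np.zeros((8, 8))         # zero-initialized 1x1 projection

def controlnet_block(x, condition):
    y = block(x, w_locked)                  # locked path stays untouched
    c = block(x + condition, w_trainable)   # trainable path sees the condition
    return y + c @ zero_conv                # zero conv gates the new contribution

x = rng.normal(size=(2, 8))
cond = rng.normal(size=(2, 8))
# At initialization the combined block equals the original locked block:
print(np.allclose(controlnet_block(x, cond), block(x, w_locked)))  # True
```

During training, only `w_trainable` and `zero_conv` would be updated, so the locked model is never disturbed.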
How to use ControlNet on E2E Cloud?
This tutorial is written for Ubuntu, but you can adjust the commands appropriately for other operating systems.
Launch an A100 GPU node on E2E Cloud. If you need any help launching the node, please follow this manual - https://docs.e2enetworks.com/computes/nodes/launchnode.html#how-to-launch-nodes-from-myaccount-portal
Run the following commands in your terminal:
wget https://repo.anaconda.com/archive/Anaconda3-2022.10-Linux-x86_64.sh
bash Anaconda3-2022.10-Linux-x86_64.sh
bash
git clone https://github.com/lllyasviel/ControlNet.git
cd ControlNet
conda env create -f environment.yaml
conda activate control
cd models
wget https://huggingface.co/lllyasviel/ControlNet/resolve/main/models/control_sd15_canny.pth
wget https://huggingface.co/lllyasviel/ControlNet/resolve/main/models/control_sd15_normal.pth
wget https://huggingface.co/lllyasviel/ControlNet/resolve/main/models/control_sd15_depth.pth
wget https://huggingface.co/lllyasviel/ControlNet/resolve/main/models/control_sd15_hed.pth
wget https://huggingface.co/lllyasviel/ControlNet/resolve/main/models/control_sd15_openpose.pth
wget https://huggingface.co/lllyasviel/ControlNet/resolve/main/models/control_sd15_mlsd.pth
wget https://huggingface.co/lllyasviel/ControlNet/resolve/main/models/control_sd15_scribble.pth
wget https://huggingface.co/lllyasviel/ControlNet/resolve/main/models/control_sd15_seg.pth
cd ../annotator/ckpts
wget https://huggingface.co/lllyasviel/ControlNet/resolve/main/annotator/ckpts/body_pose_model.pth
wget https://huggingface.co/lllyasviel/ControlNet/resolve/main/annotator/ckpts/dpt_hybrid-midas-501f0c75.pt
wget https://huggingface.co/lllyasviel/ControlNet/resolve/main/annotator/ckpts/hand_pose_model.pth
wget https://huggingface.co/lllyasviel/ControlNet/resolve/main/annotator/ckpts/mlsd_large_512_fp32.pth
wget https://huggingface.co/lllyasviel/ControlNet/resolve/main/annotator/ckpts/network-bsds500.pth
wget https://huggingface.co/lllyasviel/ControlNet/resolve/main/annotator/ckpts/mlsd_tiny_512_fp32.pth
wget https://huggingface.co/lllyasviel/ControlNet/resolve/main/annotator/ckpts/upernet_global_small.pth
cd ../../
Congratulations! You are done installing the requirements and downloading the models. Now you can start using the various Gradio apps.
ControlNet with Canny Edge
Stable Diffusion 1.5 + ControlNet (using simple Canny edge detection). Launch the app with:
python gradio_canny2image.py
Create an SSH tunnel and access the app in your browser. Gradio serves on port 7860 by default, so from your local machine run (replace the user and IP with your node's details):
ssh -L 7860:localhost:7860 <user>@<node-ip>
Then open http://localhost:7860 in your browser.
The Gradio app allows you to change the Canny edge thresholds. Try adjusting them to see the effect on the output.
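To see what the two thresholds do, here is a simplified sketch of Canny's double-threshold step (no hysteresis edge linking; the gradient values and threshold numbers are made up for illustration):

```python
import numpy as np

def double_threshold(grad_mag, low, high):
    # Simplified version of Canny's two-threshold step: pixels at or above
    # `high` are strong edges; pixels between `low` and `high` are weak edges
    # (kept only if connected to a strong edge in the full algorithm).
    strong = grad_mag >= high
    weak = (grad_mag >= low) & ~strong
    return strong, weak

g = np.array([[0.1, 0.5, 0.9],
              [0.3, 0.7, 0.2],
              [0.8, 0.4, 0.6]])
strong, weak = double_threshold(g, low=0.4, high=0.7)
print(strong.sum(), weak.sum())  # 3 3
```

Raising the high threshold keeps only the most confident edges; lowering the low threshold lets more faint detail through, which is what you see when you move the sliders in the app.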
ControlNet with HED Boundary
Stable Diffusion 1.5 + ControlNet (using soft HED boundary). Launch the app with:
python gradio_hed2image.py
Create an SSH tunnel as before and access the app in your browser.
The soft HED boundary preserves many details in the input images, making this app well suited for recoloring and stylizing.
ControlNet with User Scribbles
Stable Diffusion 1.5 + ControlNet (using scribbles). Launch the app with:
python gradio_scribble2image.py
Note that the UI is based on Gradio, which is somewhat difficult to customize. For now, you need to draw scribbles outside the UI (using your favorite drawing software, for example MS Paint) and then import the scribble image into Gradio.
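If you prefer to generate a scribble programmatically instead of using a drawing app, a minimal sketch follows (assuming the app accepts a white-on-black image; the canvas size and stroke positions are arbitrary choices for illustration):

```python
import numpy as np

# Build a black canvas with white "scribble" strokes: roughly the format the
# scribble app expects (white lines on a black background).
canvas = np.zeros((512, 512, 3), dtype=np.uint8)
canvas[100:105, 50:450] = 255   # a horizontal stroke
canvas[100:400, 250:255] = 255  # a vertical stroke

# Save it with any image library, e.g. Pillow, then upload the PNG in the UI:
# from PIL import Image; Image.fromarray(canvas).save("scribble.png")
print(canvas.shape, canvas.max())
```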
ControlNet with Human Pose
Stable Diffusion 1.5 + ControlNet (using human pose). Launch the app with:
python gradio_pose2image.py
Arguably, this model deserves a better UI that lets you directly manipulate pose skeletons. However, again, Gradio is somewhat difficult to customize. For now, you input an image and OpenPose detects the pose for you.
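To make "pose skeleton" concrete, here is a hypothetical sketch of rasterizing a stick figure from 2D keypoints, similar in spirit to the pose maps the OpenPose detector produces (the keypoint names, limb list, and canvas size are all made up for illustration):

```python
import numpy as np

def draw_skeleton(keypoints, limbs, size=64):
    # Draw white line segments between connected keypoints on a black canvas.
    canvas = np.zeros((size, size), dtype=np.uint8)
    for a, b in limbs:
        (x0, y0), (x1, y1) = keypoints[a], keypoints[b]
        n = max(abs(x1 - x0), abs(y1 - y0)) + 1
        for t in np.linspace(0.0, 1.0, n):
            x = round(x0 + t * (x1 - x0))
            y = round(y0 + t * (y1 - y0))
            canvas[y, x] = 255
    return canvas

kps = {"head": (32, 8), "neck": (32, 16), "hip": (32, 40)}
limbs = [("head", "neck"), ("neck", "hip")]
img = draw_skeleton(kps, limbs)
print(img.max(), (img > 0).sum())  # 255 33
```

A UI that let you drag these keypoints around would regenerate the skeleton map directly, which is the kind of direct manipulation the current Gradio app lacks.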
There are other Gradio apps as well that you can try by following the same steps.
Closing Thoughts: In this article, we looked at the fundamental concepts behind ControlNet and its implementation on E2E Cloud. We encourage readers to try these models with a free GPU trial on E2E Cloud. To get your free credits, please contact sales@e2enetworks.com
References:
https://arxiv.org/pdf/2302.05543.pdf
https://github.com/lllyasviel/ControlNet