Stable Diffusion is a milestone among generative models: it brings high-quality image generation to a wide audience thanks to its speed and its relatively low compute and memory requirements. This post gives an overview of Stable Diffusion and walks through the steps required to set up its web UI on E2E Cloud.
Scope of the Content:
- Overview of key components
- Launching a GPU on E2E Cloud
- Installation and Running of Stable Diffusion
- Generating a Sample image from text prompt
- Bonus Tip
The two major ways to use Stable Diffusion are:
- Text-to-image
- Text+image to image
Overview of key components:-
Let's consider the text2img case and look at the components and what each one does. A text prompt is given as input and passes through a Text Encoder (based on ClipText). The Text Encoder produces token embeddings, numerical vectors that represent the features of the text.
These token embeddings and a block of random noise are passed to the Image Information Creator (based on UNet + Scheduler). This is the component where the diffusion process takes place. The Image Information Creator produces a processed image tensor in latent space, which is fed to the Image Decoder (based on an Autoencoder Decoder) to produce a high-resolution image.
The Illustrated Stable Diffusion – Jay Alammar
A comprehensive overview from the original research paper, “High-Resolution Image Synthesis with Latent Diffusion Models”
- ClipText for text encoding.
  Input: text.
  Output: 77 token embedding vectors, each with 768 dimensions.
- UNet + Scheduler to gradually process/diffuse information in the information (latent) space.
  Input: text embeddings and noise.
  Output: a processed information array.
- Autoencoder Decoder that paints the final image using the processed information array.
  Input: the processed information array (dimensions: (4, 64, 64)).
  Output: the resulting image (dimensions: (3, 512, 512)).
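The shapes quoted above can be tied together with a little arithmetic. The sketch below is plain Python and runs no model; the constant and function names are illustrative, and the only inputs are the dimensions listed above.

```python
from math import prod

# Shapes quoted in the component list above.
TEXT_EMBEDDING_SHAPE = (77, 768)  # 77 token vectors, 768 dimensions each
LATENT_SHAPE = (4, 64, 64)        # processed information array (channels, height, width)
IMAGE_SHAPE = (3, 512, 512)       # final RGB image (channels, height, width)


def upscale_factor(latent_shape, image_shape):
    """Return the spatial scale between the latent array and the image."""
    _, latent_h, latent_w = latent_shape
    _, image_h, image_w = image_shape
    if image_h % latent_h or image_w % latent_w:
        raise ValueError("image size is not a whole multiple of latent size")
    factor_h, factor_w = image_h // latent_h, image_w // latent_w
    if factor_h != factor_w:
        raise ValueError("height and width scale differently")
    return factor_h


def compression_ratio(latent_shape, image_shape):
    """Return how many image values correspond to one latent value."""
    return prod(image_shape) / prod(latent_shape)


if __name__ == "__main__":
    print("text embedding values:", prod(TEXT_EMBEDDING_SHAPE))
    print("upscale factor:", upscale_factor(LATENT_SHAPE, IMAGE_SHAPE))
    print("compression ratio:", compression_ratio(LATENT_SHAPE, IMAGE_SHAPE))
```

With these figures the decoder enlarges each spatial side by a factor of 8, and the diffusion step works on an array with 48 times fewer values than the final image. Working in this smaller latent space is a large part of why the compute and memory requirements stay comparatively low.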
Launching a GPU on E2E Cloud:-
- Go to myaccount.e2enetworks.com and select GPU under the compute section.
- Under create compute node, go to GPU, select an NVIDIA A100 (40 GB), and choose the operating system you want to use.
- Click create, select the appropriate plan, and click create again.
- On the following screen you can choose the number of nodes, enable a security tool such as BitNinja (recommended), add an SSH key (recommended; see "SSH Keys Management" in the E2E Networks documentation), and set network options such as Use VPC and Reserve IPv4/IPv6.
- Click on create my node.
Congratulations! You have successfully created a GPU node on E2E Cloud. The public and private IP addresses, along with the credentials, will be sent to your email. If you need help or have any questions, please visit https://docs.e2enetworks.com/
Installation and Running of Stable Diffusion:-
Access the node you created over SSH:-
ssh root@your_public_ip
This will prompt for a password (if you haven't disabled password login). Type your password and press Enter.
Required Dependencies:-
1. Python 3.10.6 and Git:
- Windows: download and run the installers for Python 3.10.6 and Git
- Linux (Debian-based): sudo apt install wget git python3 python3-venv
2. The stable-diffusion-webui code, cloned using git: git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
3. The Stable Diffusion model checkpoint, a file with .ckpt extension, needs to be downloaded and placed in the models/Stable-diffusion directory.
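Use:-
wget "https://drive.yerf.org/wl/?id=EBfTrmcCCUAGaQBXVIj5lJmEhjoP1tgl&mode=grid&download=1"
If the file is saved under a different name or in a different location, rename it so that it ends in .ckpt and move it into the models/Stable-diffusion directory.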
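Before launching the web UI, it can help to confirm that the checkpoint landed where step 3 says it should. The following is a minimal sketch, assuming the repository was cloned into a directory named stable-diffusion-webui under the current directory; the helper name find_checkpoints is made up for this post.

```python
from pathlib import Path


def find_checkpoints(webui_root):
    """List .ckpt files in <webui_root>/models/Stable-diffusion, sorted by name."""
    model_dir = Path(webui_root) / "models" / "Stable-diffusion"
    if not model_dir.is_dir():
        return []
    return sorted(
        path for path in model_dir.iterdir()
        if path.is_file() and path.suffix == ".ckpt"
    )


if __name__ == "__main__":
    # Assumption: the repository was cloned into the current directory.
    found = find_checkpoints("stable-diffusion-webui")
    if found:
        for path in found:
            print(f"{path.name}: {path.stat().st_size / 1e9:.2f} GB")
    else:
        print("No .ckpt file found in models/Stable-diffusion")
```

If nothing is listed, the download was most likely saved somewhere else or under a name without the .ckpt extension.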
Installation on Windows:-
Run webui-user.bat from Windows Explorer as a normal, non-administrator user.
Installation on Linux:-
To install in /home/$(whoami)/stable-diffusion-webui/, use:-
bash <(wget -qO- https://raw.githubusercontent.com/AUTOMATIC1111/stable-diffusion-webui/master/webui.sh)
This installs and launches stable-diffusion-webui, which then runs on the node at http://127.0.0.1:7860
Create an SSH tunnel to access the web UI from your local machine:-
ssh -L 7860:localhost:7860 username@your_public_ip
Open a browser on your local machine and visit http://127.0.0.1:7860
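If the page does not load, a quick way to tell whether the tunnel is up is to test the local port directly. This is a small sketch using only the Python standard library; the function name port_is_open is illustrative, and it only checks that something accepts TCP connections on the port, not that it is the web UI.

```python
import socket


def port_is_open(host="127.0.0.1", port=7860, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if port_is_open():
        print("Port 7860 is reachable; open http://127.0.0.1:7860 in a browser.")
    else:
        print("Nothing is listening on 127.0.0.1:7860; check the SSH tunnel and the web UI process.")
```

Run it on your local machine while the ssh -L session is still open.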
Generating a Sample image from text prompt:-
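Besides typing a prompt into the browser page, a sample image can also be requested from a script. The sketch below assumes the web UI was started with its --api flag, which this post does not otherwise cover, and that the SSH tunnel on port 7860 is open; the prompt text, the output file name and the helper names are made up for illustration.

```python
import base64
import json
import urllib.request

# Assumption: the web UI was started with the --api flag and is reachable
# through the SSH tunnel on this address.
API_URL = "http://127.0.0.1:7860/sdapi/v1/txt2img"


def build_payload(prompt, steps=20, width=512, height=512, seed=-1):
    """Build the JSON body for a text-to-image request."""
    return {
        "prompt": prompt,
        "steps": steps,
        "width": width,
        "height": height,
        "seed": seed,  # -1 asks the web UI to pick a random seed
    }


def decode_first_image(response_body):
    """Return the bytes of the first base64-encoded image in the response."""
    images = response_body.get("images") or []
    if not images:
        raise ValueError("response contains no images")
    # Tolerate an optional "data:image/png;base64," prefix.
    return base64.b64decode(images[0].split(",", 1)[-1])


def generate(prompt, out_path="sample.png", url=API_URL):
    """Send the prompt to the web UI and save the first returned image."""
    request = urllib.request.Request(
        url,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=300) as response:
        body = json.load(response)
    with open(out_path, "wb") as handle:
        handle.write(decode_first_image(body))
    return out_path


if __name__ == "__main__":
    try:
        print("saved", generate("a watercolor painting of a lighthouse at sunrise"))
    except (OSError, ValueError) as error:
        print("request failed:", error)
```

The 512 x 512 default matches the image dimensions mentioned in the component overview; steps and seed are the kind of parameters worth varying between runs.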
Bonus Tip:-
Visit https://lexica.art/ and search for an image that interests you. Among the search results, pick the one that best matches what you are after and click on it. It will show the prompt and the parameters used to create that image.
Use the same prompt in your stable-diffusion-webui. The results may differ, but after a few iterations you will learn how to write effective prompts. It's an easy and fun process!
E2E Networks is a leading accelerated cloud provider offering GPUs at affordable prices. You can contact us for a free trial: sales@e2enetworks.com