Step-by-Step Guide on How to Synthesize text/image to video Using Stable Video Diffusion on E2E Cloud

November 27, 2023

Surya Remanan

Getting Everything Set Up and Running!

On E2E Networks, when you log in to “My Account”, your dashboard will look like this:

Click “Compute” in the left sidebar, and a drop-down menu will appear:

Click “GPU”. This will take you to the list of available GPU cards, shown below:

I went with the NVIDIA A100 on Ubuntu 22.04. You can pick a different OS from the dropdown.

There will be several plans for you to choose from. I selected NVIDIA - 4xA100-80GB Machine and hit Create.

On hitting Create, you will get several options.

I selected the Hourly Billed option and hit Create.

Then you will be directed to this page:

You can choose to enable Backup (totally up to you) and then create your node. After a while, your node will look something like this:

Your node will be assigned its own pair of IP addresses. To add your SSH keys to this machine, first generate an SSH key pair if you haven’t already.

You can generate public and private SSH keys using the command:


ssh-keygen

Run this command in PowerShell if you are using Windows, or in your terminal if you are using macOS or Linux.
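As a minimal sketch, the key generation can be made explicit about the key type and file location. The path `~/.ssh/e2e_node_ed25519` and the comment `e2e-gpu-node` are hypothetical names chosen for illustration, not something the E2E dashboard requires; `-t`, `-f`, and `-C` are standard `ssh-keygen` options. The syntax below is for a POSIX shell (macOS or Linux).

```shell
# Hypothetical key location (an assumption for this sketch); override it by
# exporting KEY_PATH before running.
KEY_PATH="${KEY_PATH:-$HOME/.ssh/e2e_node_ed25519}"

# Create the key pair only if it does not exist yet, then print the public
# half, which is the part you paste into the node's security settings.
ensure_ssh_key() {
    mkdir -p "$(dirname "$KEY_PATH")"
    if [ ! -f "$KEY_PATH" ]; then
        # ssh-keygen asks for an optional passphrase at this point.
        ssh-keygen -t ed25519 -f "$KEY_PATH" -C "e2e-gpu-node"
    fi
    cat "$KEY_PATH.pub"
}
```

Calling `ensure_ssh_key` prints the public key. The private key file (the one without `.pub`) stays on your local machine.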

You can refer to this video for more details.

Then, you can add the public SSH key by going to the Node Security tab, like this.

Now you are good to go: you can log in to the machine from your local system.

In order to access the remote machine using VS Code installed on your local machine, you will need to install an extension called Remote-SSH. It will only take a few seconds to install.

To connect to the remote machine, click the blue area in the bottom-left corner of the VS Code window.

After clicking that button, click “Connect to Host”.

Next, choose “Add New SSH Host…”.

Now, log in to the remote machine using the ssh command:

ssh root@IP_address

Hit Enter and you will be logged into the remote machine via VS Code.
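Remote-SSH reads hosts from your SSH config file, so a named entry saves retyping the address. The helper below is a sketch under stated assumptions: the function name `add_ssh_host` and the alias `e2e-gpu` are hypothetical, and the default config location `~/.ssh/config` is the usual OpenSSH one on macOS and Linux.

```shell
# Append a Host entry to an SSH config file so that `ssh <alias>` (and the
# Remote-SSH host picker) can reach the node by name.
# Arguments: alias, node IP address, private key path, config file (optional).
add_ssh_host() {
    alias_name="$1"
    ip="$2"
    key="$3"
    config="${4:-$HOME/.ssh/config}"
    mkdir -p "$(dirname "$config")"
    # Do nothing if an entry with this alias is already present.
    if [ -f "$config" ] && grep -q "^Host ${alias_name}\$" "$config"; then
        echo "Host ${alias_name} is already defined in ${config}"
    else
        cat >> "$config" <<EOF

Host ${alias_name}
    HostName ${ip}
    User root
    IdentityFile ${key}
EOF
    fi
}
```

For example, `add_ssh_host e2e-gpu IP_address /path/to/private_key` writes the entry, after which `ssh e2e-gpu` is equivalent to the longer command above.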

On the welcome page of VS Code, click ‘Open Folder’.

The default folder will be ‘/root/’; change it to ‘/mnt/’.

Then open a new terminal like this:

Model 1: Generate Image from text prompt

We will use Stable Diffusion as our base model to generate a high-quality image. Have a descriptive prompt ready while we go through installing the dependencies.

Steps to be followed

The first step is to clone the Fooocus repository (an image generator built on Stable Diffusion XL), cd into it, and create a conda virtual environment in which we will install all the dependencies.


git clone https://github.com/lllyasviel/Fooocus.git
cd Fooocus
conda env create -f environment.yaml
conda activate fooocus
pip install -r requirements_versions.txt

Then we activate the conda environment (if it is not already active) and run the Gradio app.


conda activate fooocus
python entry_with_update.py --listen
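If the Gradio port is not reachable from your browser through the node's public IP, SSH local port forwarding is one alternative. The sketch below only prints the command so you can inspect it first. The function name `tunnel_command` is hypothetical, and port 7865 in the usage note is an assumption: use whichever port the Fooocus console prints on startup.

```shell
# Build an SSH local-forwarding command for a web UI running on the node.
# Arguments: remote port, node IP address, local port (defaults to remote port).
tunnel_command() {
    remote_port="$1"
    host="$2"
    local_port="${3:-$1}"
    # -N: open the tunnel without running a remote command.
    # -L: forward local_port on this machine to remote_port on the node.
    echo "ssh -N -L ${local_port}:localhost:${remote_port} root@${host}"
}
```

Running `tunnel_command 7865 IP_address` on your local machine prints the command to execute. While the tunnel is open, the interface is available at `http://localhost:7865`.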

After all the processes are done, a Gradio interface will open like this:

Since it’s the Christmas season, I decided to use the prompt: “santa claus with gifts coming on a sleigh”

On clicking Generate, I got this image:

Model 2: Generate Video from image

Now it's time to generate a video from the above image. We will be using Stability AI’s generative-models repo.

Steps to be followed

Clone the repository and change into its directory.


git clone https://github.com/Stability-AI/generative-models.git
cd generative-models

Install the required dependencies:


python3 -m venv .pt2
source .pt2/bin/activate
pip3 install -r requirements/pt2.txt
pip3 install .

Install sdata for training:


pip3 install -e git+https://github.com/Stability-AI/datapipelines.git@main#egg=sdata

Set up packaging:


pip install hatch
hatch build -t wheel

Make a directory called “checkpoints” and download model weights into it:


mkdir checkpoints

Download model weights:

wget -O checkpoints/svd_xt.safetensors "https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt/resolve/main/svd_xt.safetensors?download=true"
wget -O checkpoints/svd_xt_image_decoder.safetensors "https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt/resolve/main/svd_xt_image_decoder.safetensors?download=true"
wget -O checkpoints/svd.safetensors "https://huggingface.co/stabilityai/stable-video-diffusion-img2vid/resolve/main/svd.safetensors?download=true"
wget -O checkpoints/svd_image_decoder.safetensors "https://huggingface.co/stabilityai/stable-video-diffusion-img2vid/resolve/main/svd_image_decoder.safetensors?download=true"
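The four downloads follow one URL pattern, so they can also be expressed as a loop. This is a sketch rather than the only way to do it: the names `HF_BASE`, `weight_url`, and `download_weights` are hypothetical, and the URLs omit the `?download=true` query string on the assumption that the `resolve/main` path serves the file directly.

```shell
# Base URL for Stability AI repositories on Hugging Face (taken from the
# wget commands in this guide).
HF_BASE="https://huggingface.co/stabilityai"

# Print the direct-download URL for one checkpoint.
# Arguments: repository name, file name.
weight_url() {
    echo "${HF_BASE}/$1/resolve/main/$2"
}

# Download every checkpoint used in this guide into ./checkpoints.
download_weights() {
    mkdir -p checkpoints
    # Each line of the here-document is "<repository> <file>".
    while read -r repo file; do
        wget -O "checkpoints/${file}" "$(weight_url "$repo" "$file")"
    done <<EOF
stable-video-diffusion-img2vid-xt svd_xt.safetensors
stable-video-diffusion-img2vid-xt svd_xt_image_decoder.safetensors
stable-video-diffusion-img2vid svd.safetensors
stable-video-diffusion-img2vid svd_image_decoder.safetensors
EOF
}
```

Run `download_weights` from inside the `generative-models` directory. The files are several gigabytes each, so the download can take a while.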

Copy video_sampling.py to the main directory.


cp scripts/demo/video_sampling.py video_sampling.py
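Before launching the app, it can help to confirm that the checkpoints actually landed in `checkpoints/`, since an interrupted download can leave an empty file behind. The check below is a sketch; the function name `missing_checkpoints` is hypothetical, and the file list mirrors the four downloads in this guide. As far as I can tell the demo loads only the model version you select, so a missing file should matter only for that version.

```shell
# Print the name of every expected checkpoint that is absent or empty in the
# given directory (default: ./checkpoints). Prints nothing when all are there.
missing_checkpoints() {
    dir="${1:-checkpoints}"
    for name in svd svd_image_decoder svd_xt svd_xt_image_decoder; do
        # -s is true only for files that exist and are larger than zero bytes.
        [ -s "${dir}/${name}.safetensors" ] || echo "${name}.safetensors"
    done
}
```

Running `missing_checkpoints` from the `generative-models` directory with no output means all four files are present and non-empty. It does not verify that a file is complete.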

Run the following command to launch the web app:


streamlit run video_sampling.py --server.port=80

The following UI will then appear:

After we insert the image, we will get the following output:

Here is another example:
