Synthesize Novel Views of Complex Scenes Using NeRF and E2E Cloud GPUs

March 16, 2023


Advancements in 2D image-recognition tasks such as classification, detection, and instance segmentation ushered in the deep learning era. Deep-learning-based computer vision research has since shifted towards 3D computer vision problems.

As these techniques mature, one of the most notable problems is synthesizing new views of an object from images and reconstructing its 3D shape.

In recent years, a completely new direction has emerged, namely Neural Radiance Fields (NeRF). An overview of the original NeRF concept is presented in this article.

View Synthesis with Neural Radiance Fields (NeRF):

NeRF synthesizes novel views of complex scenes by optimizing an underlying continuous volumetric scene function using a sparse set of input views. A static scene is represented as a continuous 5D function that outputs the radiance emitted in each direction (θ, φ) at each point (x, y, z) in space, and a density at each point. The method optimizes a deep fully-connected neural network (a multilayer perceptron, or MLP) to represent this function by regressing from a single 5D coordinate (x, y, z, θ, φ) to a single volume density and view-dependent RGB color.


Fig. 1: NeRF optimizes a continuous 5D neural radiance field representation (volume density and view-dependent color at any continuous location) of a scene from a set of input images. Techniques from volume rendering accumulate samples of this scene representation along rays to render the scene from new viewpoints. The figure shows the set of 100 input views of the synthetic Drums scene, randomly captured on a surrounding hemisphere, and two novel views rendered from the optimized NeRF representation.

The key concepts involved in NeRF are:

Neural Radiance Field Scene Representation: A continuous scene is represented as a 5D vector-valued function whose input is a 3D location x = (x, y, z) and a 2D viewing direction (θ, φ), and whose output is an emitted color c = (r, g, b) and a volume density σ. In practice, the direction is expressed as a 3D Cartesian unit vector d. This continuous 5D scene representation is approximated with an MLP network FΘ : (x, d) → (c, σ), whose weights Θ are optimized to map each input 5D coordinate to its corresponding volume density and directional emitted color.
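As a minimal sketch of how a viewing direction given as two angles can be expressed as a 3D unit vector d, consider the Python below. The function name `direction_to_unit_vector` and the angle convention (θ as the polar angle from the +z axis, φ as the azimuth measured from the +x axis) are assumptions made for illustration; the article does not fix a convention.

```python
import math
from typing import Tuple


def direction_to_unit_vector(theta: float, phi: float) -> Tuple[float, float, float]:
    """Convert spherical angles (in radians) to a Cartesian unit vector.

    Assumed convention: theta is the polar angle from +z,
    phi is the azimuth measured from +x in the x-y plane.
    """
    sin_theta = math.sin(theta)
    return (
        sin_theta * math.cos(phi),
        sin_theta * math.sin(phi),
        math.cos(theta),
    )


# A direction lying in the x-y plane, pointing along +x.
d = direction_to_unit_vector(math.pi / 2, 0.0)
length = math.sqrt(sum(component * component for component in d))
print(d, length)
```

Whatever convention is chosen, the resulting vector has unit length, so the network input (x, d) carries the same two degrees of freedom for direction as (θ, φ).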


Fig. 2: An overview of the neural radiance field scene representation and differentiable rendering procedure. Images are synthesized by sampling 5D coordinates (location and viewing direction) along camera rays (a), feeding those locations into an MLP to produce a color and volume density (b), and using volume rendering techniques to composite these values into an image (c). This rendering function is differentiable, so the scene representation can be optimized by minimizing the residual between synthesized and ground truth observed images (d).
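The compositing step described in the caption above can be sketched in plain Python. This is a minimal illustration of the discrete volume rendering sum for a single ray, in which each sample contributes its color weighted by its own opacity and by the light that survives the samples in front of it. It is not the repository's implementation; the function name `composite_ray` and the toy inputs are assumptions made for this example.

```python
import math
from typing import List, Sequence, Tuple

Color = Tuple[float, float, float]


def composite_ray(
    densities: Sequence[float],
    colors: Sequence[Color],
    deltas: Sequence[float],
) -> Tuple[Color, List[float]]:
    """Alpha-composite samples along one ray, front to back.

    densities: volume density sigma_i at each sample
    colors:    RGB color c_i at each sample
    deltas:    length delta_i of the ray segment covered by sample i
    Returns the rendered RGB color and the per-sample weights.
    """
    rgb = [0.0, 0.0, 0.0]
    weights: List[float] = []
    transmittance = 1.0  # fraction of light that reaches the current sample
    for sigma, color, delta in zip(densities, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weight = transmittance * alpha
        weights.append(weight)
        for channel in range(3):
            rgb[channel] += weight * color[channel]
        transmittance *= 1.0 - alpha
    return (rgb[0], rgb[1], rgb[2]), weights


# Empty space, then a nearly opaque red surface, then a green sample hidden behind it.
rgb, weights = composite_ray(
    densities=[0.0, 50.0, 50.0],
    colors=[(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    deltas=[0.5, 0.5, 0.5],
)
print(rgb, weights)
```

In this toy ray the red surface receives almost all of the weight, so the hidden green sample barely affects the pixel. Because every operation is differentiable, the same sum can be used to backpropagate an image-space loss to the densities and colors.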

Optimization:

There are two techniques to improve NeRF in complex scene synthesis:

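Positional Encoding: Each input coordinate is mapped to a higher-dimensional space using sine and cosine functions of increasing frequency before it is passed to the network. This helps the MLP represent high-frequency variation in color and geometry.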
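A minimal sketch of such an encoding, following the sinusoidal form given in the NeRF paper, is shown below. The function names `positional_encoding` and `encode_point` are hypothetical, and the value of 10 frequencies is the setting the paper reports for spatial coordinates; actual implementations may differ in details such as the π factor or whether the raw input is appended.

```python
import math
from typing import List, Sequence


def positional_encoding(p: float, num_freqs: int) -> List[float]:
    """Encode a scalar p as sin/cos pairs at frequencies 2^0 * pi, ..., 2^(L-1) * pi."""
    encoded: List[float] = []
    for k in range(num_freqs):
        angle = (2.0 ** k) * math.pi * p
        encoded.append(math.sin(angle))
        encoded.append(math.cos(angle))
    return encoded


def encode_point(point: Sequence[float], num_freqs: int) -> List[float]:
    """Apply the encoding to each coordinate and concatenate the results."""
    features: List[float] = []
    for coordinate in point:
        features.extend(positional_encoding(coordinate, num_freqs))
    return features


features = encode_point([0.5, -0.25, 0.0], num_freqs=10)
print(len(features))  # 3 coordinates * 2 functions * 10 frequencies = 60
```

The 3D location thus becomes a 60-dimensional feature vector, and two nearby points that differ only slightly in position produce clearly different high-frequency components.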

Hierarchical Volume Sampling: Sampling is done in two steps. First, a coarse network is evaluated at standard sample locations along each ray. Its outputs are then used to choose a second set of samples for a fine network, concentrated in the parts of the volume that contribute most to the rendered image, which improves training efficiency.
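The second step can be sketched as inverse transform sampling: the coarse pass yields one weight per depth bin, the weights are normalized into a piecewise-constant probability distribution, and new depths are drawn from it. The sketch below is an illustration under these assumptions rather than the repository's code; `sample_fine_depths`, the bin edges, and the weights are all made up for the example.

```python
import bisect
import random
from typing import List, Sequence


def sample_fine_depths(
    bin_edges: Sequence[float],
    coarse_weights: Sequence[float],
    num_samples: int,
    rng: random.Random,
) -> List[float]:
    """Draw depths from the piecewise-constant PDF given by the coarse weights.

    bin_edges has len(coarse_weights) + 1 entries; bin i spans
    bin_edges[i]..bin_edges[i + 1] and is chosen with probability
    proportional to coarse_weights[i].
    """
    eps = 1e-5  # keeps every bin reachable and avoids division by zero
    padded = [w + eps for w in coarse_weights]
    total = sum(padded)
    cdf = [0.0]
    for w in padded:
        cdf.append(cdf[-1] + w / total)

    depths: List[float] = []
    for _ in range(num_samples):
        u = rng.random()  # uniform in [0, 1)
        # Find the bin whose CDF interval contains u.
        i = min(bisect.bisect_right(cdf, u) - 1, len(padded) - 1)
        # Place the sample inside the bin by linear interpolation.
        frac = min((u - cdf[i]) / (cdf[i + 1] - cdf[i]), 1.0)
        depths.append(bin_edges[i] + frac * (bin_edges[i + 1] - bin_edges[i]))
    return sorted(depths)


rng = random.Random(0)
edges = [2.0, 3.0, 4.0, 5.0, 6.0]
weights = [0.0, 0.05, 0.9, 0.05]  # coarse pass found most density between depths 4 and 5
fine = sample_fine_depths(edges, weights, num_samples=1000, rng=rng)
in_peak = sum(1 for t in fine if 4.0 <= t < 5.0)
print(in_peak / len(fine))
```

Most of the new samples land in the bin where the coarse pass found density, so the fine network spends its evaluations where they affect the image rather than in empty space.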

Steps to synthesize the 3D view on E2E Cloud:

Launch an A100 GPU node on E2E Cloud. If you need help launching the node, please follow this guide: https://docs.e2enetworks.com/computes/nodes/launchnode.html#how-to-launch-nodes-from-myaccount-portal

Note: Training NeRF is computationally demanding, and the NVIDIA A100 is a recommended hardware choice for it.

Run the following commands in your terminal:

ssh root@your_public_ip

apt update

apt upgrade

#Install Anaconda

wget https://repo.anaconda.com/archive/Anaconda3-2022.10-Linux-x86_64.sh

bash Anaconda3-2022.10-Linux-x86_64.sh

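#Start a new shell to pick up the conda setup added by the installer

bash

git clone https://github.com/bmild/nerf.git

#Move into the cloned repository before creating the environment

cd nerf

conda env create -f environment.yml

conda activate nerf

bash download_example_data.sh

#Start training the model on the fern example

python run_nerf.py --config config_fern.txt

#Start TensorBoard from a second SSH session, since training keeps running in the first

tensorboard --logdir=logs/summaries --port=6006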

Open port 6006 in iptables:

#List the current rules

sudo iptables -L

#Back up the current rules

iptables-save > IPtablesbackup.txt

#Open the port only for a specific IP (-s matches the source address, so use the IP of the machine you will browse from)

sudo iptables -A INPUT -p tcp -s your_local_ip --dport 6006 -j ACCEPT

#Save the iptables rule

sudo /etc/init.d/iptables-persistent save

sudo /etc/init.d/iptables-persistent reload

#If training is not already running, start it for the fern example

python run_nerf.py --config config_fern.txt

#In a browser on your local machine, open

http://your_public_ip:6006

This displays TensorBoard, where you can monitor the training of your model.

After 200k iterations (about 15 hours), you should find a rendered video at logs/fern_test/fern_test_spiral_200000_rgb.mp4.

Closing Thought: In this article, we looked at the fundamental concepts behind NeRF and its implementation on E2E Cloud. We encourage readers to try this model with a free GPU trial on E2E Cloud. To claim your free credits, please contact sales@e2enetworks.com.

References:

https://arxiv.org/pdf/2003.08934.pdf

https://github.com/bmild/nerf
