Tensor Cores accelerate typical deep learning workloads such as feed-forward and convolutional layers. However, your workloads must use mixed precision to leverage Tensor Cores. Click here to read more about Automatic Mixed Precision for deep learning.
If you're using TensorFlow or PyTorch, you can leverage the automatic mixed precision features available in those frameworks to activate Tensor Cores. Depending on your scenario, activating Tensor Cores can be as simple as adding just two lines of code, as sketched below.
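As an illustration, here is a minimal sketch of a training step using PyTorch's native automatic mixed precision (torch.cuda.amp). The model, optimizer, and data are hypothetical placeholders; the key additions are the autocast context and the gradient scaler.

```python
import torch
from torch import nn
from torch.cuda.amp import autocast, GradScaler

# Hypothetical model, optimizer, and data used purely for illustration.
model = nn.Linear(1024, 1024).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
scaler = GradScaler()  # scales the loss to avoid FP16 gradient underflow

inputs = torch.randn(64, 1024, device="cuda")
targets = torch.randn(64, 1024, device="cuda")

optimizer.zero_grad()
with autocast():                      # eligible ops run in FP16 and can use Tensor Cores
    loss = nn.functional.mse_loss(model(inputs), targets)
scaler.scale(loss).backward()         # backward pass on the scaled loss
scaler.step(optimizer)                # unscales gradients, then steps the optimizer
scaler.update()
```

TensorFlow offers a comparable switch through its mixed precision policy, so in either framework the change stays small relative to the rest of the training script.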
Deep learning workloads consist of two types of routines: math-limited and bandwidth-limited.
The performance of math-limited routines is constrained by the GPU's arithmetic throughput, while the performance of bandwidth-limited routines is constrained by the available memory bandwidth. Examples of math-limited routines include recurrent layers, fully connected layers, and convolutional layers. These math-limited routines can benefit from Tensor Cores, so we recommend choosing layer parameters that enable Tensor Core operations.
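One commonly cited guideline (an assumption here, not something spelled out in this article) is that, when running in FP16, layer dimensions such as batch size, channel counts, and feature sizes should be multiples of 8 so the work maps cleanly onto Tensor Core kernels. A rough sketch:

```python
from torch import nn

# Illustrative layer sizes only: dimensions that are multiples of 8 are more
# likely to dispatch to Tensor Core kernels when the layers run in FP16/AMP.
fc_layer   = nn.Linear(in_features=1024, out_features=4096)              # both multiples of 8
conv_layer = nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3)  # channels are multiples of 8

# A layer such as nn.Linear(1000, 513) may fall back to slower,
# non-Tensor-Core kernels because its dimensions are not multiples of 8.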
For more in-depth details on how you can enable Tensor Cores and optimize your deep learning layers for maximum execution efficiency, read here.
You can use NVIDIA tools such as Nsight Compute and nvprof to verify that your deep learning models are actually using mixed precision and Tensor Cores. You can read more about that here.
Tensor Cores are specialized processing units that perform a matrix multiply and accumulate in a single operation. This fused processing is beneficial for both deep learning training and inference. Tensor Cores are available in NVIDIA Volta and Turing GPUs; E2E GPU Cloud features Volta architecture-based NVIDIA V100 and Turing architecture-based NVIDIA T4 GPU instances.
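To make the idea concrete, here is a conceptual PyTorch sketch of the fused multiply-accumulate D = A × B + C that a Tensor Core performs. The shapes, values, and the split into two separate operations are purely illustrative; on the hardware the multiply and accumulate happen in one fused step.

```python
import torch

# Conceptual illustration of a Tensor Core operation: D = A x B + C,
# with half-precision matrix inputs and accumulation into an FP32 buffer.
A = torch.randn(16, 16, dtype=torch.float16, device="cuda")
B = torch.randn(16, 16, dtype=torch.float16, device="cuda")
C = torch.zeros(16, 16, dtype=torch.float32, device="cuda")

# The FP16 matmul is the part eligible for Tensor Cores on Volta/Turing GPUs;
# adding into the FP32 tensor mimics the higher-precision accumulation.
D = A @ B + C
```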
NVIDIA Tensor Cores can improve deep learning throughput by up to 8x. NVIDIA provides resources such as NGC containers so that you can optimize your deep learning workloads and leverage the execution speed and precision that Tensor Cores offer.
In general, the NVIDIA V100 is well suited for model training and the NVIDIA T4 for inference.
Click here to learn more about E2E GPU Cloud powered by NVIDIA T4 and NVIDIA V100.