At its GPU Technology Conference (GTC), NVIDIA unveiled its latest AI accelerator, the H100 Tensor Core GPU. It succeeds the A100 GPU, launched two years ago to resounding success. With up to 9x faster AI training and 30x faster inference, the H100 is more than just an incremental upgrade to the A100.
Here are ten facts about NVIDIA’s next-gen GPU:
1) The GPU is based on the Hopper architecture, a successor to the Ampere architecture that powers the A100 and A30 GPUs. A100 GPUs are available through NVIDIA’s DGX A100 and EGX A100 platforms.
2) Compared to the A100's 6,912 CUDA cores, the H100 has 16,896. CUDA cores are NVIDIA's parallel processing units; each is simpler than a CPU core, but there are thousands of them, letting the GPU run many calculations simultaneously, which is essential for modern AI and graphics workloads.
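As a rough illustration of the data-parallel style those thousands of cores are built for, here is a minimal sketch in plain Python (not actual GPU code) of an element-wise operation where every output can, in principle, be computed at the same time:

```python
# Data-parallel sketch: the same operation applied independently to
# every element, the pattern CUDA cores execute in hardware.
# Plain Python for illustration only, not GPU code.

def saxpy(a, x, y):
    # a*x + y per element, a classic element-wise GPU kernel: each
    # output depends only on the inputs at its own index, so all
    # elements could run in parallel across many cores.
    return [a * xi + yi for xi, yi in zip(x, y)]

print(saxpy(2.0, [1.0, 2.0, 3.0], [10.0, 20.0, 30.0]))
# [12.0, 24.0, 36.0]
```

On a GPU, each index would be handled by its own hardware thread rather than a loop.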
3) The H100 GPU contains 80 billion transistors, while the previous-generation A100 had 54.2 billion. The additional transistors translate into more compute resources and faster processing on a single chip.
4) The H100 GPU features second-generation secure Multi-Instance GPU (MIG) technology, which partitions the GPU into fully isolated instances and extends the previous generation's capabilities by up to 7x. NVIDIA claims the new architecture provides about 3x more compute capacity and nearly 2x more memory bandwidth per GPU instance than the A100.
5) Confidential computing support in the H100 protects user data, defends against hardware and software attacks, and better isolates virtual machines (VMs) from one another in virtualized and MIG environments.
6) H100 GPUs come with fourth-generation NVLink, which provides a 3x bandwidth increase on all-reduce operations and an overall 50% bandwidth increase over the previous generation of NVLink. NVLink is NVIDIA's direct GPU-to-GPU interconnect for scaling multi-GPU input/output (IO) within a server.
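To see why all-reduce bandwidth matters, here is a minimal sketch of what the all-reduce collective computes: every worker (GPU) ends up with the sum of all workers' values, as in gradient averaging during multi-GPU training. This models only the semantics in plain Python; NVLink accelerates the underlying data movement.

```python
# All-reduce (sum) semantics: after the collective, every worker holds
# the element-wise sum of all workers' vectors. In multi-GPU training,
# the vectors are gradients and this traffic rides over NVLink.

def all_reduce_sum(worker_values):
    total = [sum(col) for col in zip(*worker_values)]
    # Every worker receives the same reduced result.
    return [list(total) for _ in worker_values]

grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # 3 workers, 2 values each
print(all_reduce_sum(grads))
# [[9.0, 12.0], [9.0, 12.0], [9.0, 12.0]]
```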
7) The H100 GPU comes with built-in support for DPX instructions, which speed up dynamic programming algorithms by up to 7x compared to the A100 GPU. Dynamic programming, developed in the 1950s, solves complex problems by breaking them into overlapping subproblems using two key techniques: recursion and memoization. Applications that rely on complex SQL queries, quantum simulation, or route optimization can take advantage of the DPX instruction set available in the H100.
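The kind of dynamic programming DPX targets can be sketched with Floyd-Warshall all-pairs shortest paths, a standard route-optimization algorithm in which previously computed subproblem results are stored and reused:

```python
# Dynamic programming example: Floyd-Warshall all-pairs shortest paths.
# The table `d` stores subproblem results (shortest paths using only
# intermediate nodes 0..k) and reuses them -- the memoization pattern
# that DPX-style instructions accelerate in hardware.

INF = float("inf")

def floyd_warshall(dist):
    n = len(dist)
    d = [row[:] for row in dist]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # Reuse already-computed shortest paths through node k.
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

graph = [
    [0, 3, INF],
    [INF, 0, 1],
    [2, INF, 0],
]
print(floyd_warshall(graph))
# [[0, 3, 4], [3, 0, 1], [2, 5, 0]]
```

The inner update is a min-plus operation over stored results, exactly the pattern dynamic programming repeats millions of times on large inputs.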
8) The H100 GPU is optimized for transformer models. Its built-in Transformer Engine combines software with custom NVIDIA Hopper Tensor Core technology explicitly designed to accelerate transformer training and inference. Transformers are the state of the art in neural network architectures for computer vision and conversational AI, and they underpin large language models such as Google's BERT and OpenAI's GPT-3.
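The core operation inside every transformer layer is scaled dot-product attention, which is what the Hopper Tensor Cores spend most of their time accelerating. A minimal pure-Python sketch of that operation (for illustration only; real workloads use tensor libraries and mixed precision):

```python
import math

# Scaled dot-product attention, the central computation of transformer
# models. Each query is compared against all keys, the scores are
# normalized with softmax, and the output is a weighted average of the
# value vectors. Pure Python for clarity, not performance.

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    d = len(K[0])  # key dimension, used to scale the dot products
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in K]
        weights = softmax(scores)
        # Output row: attention-weighted average of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

Q = [[1.0, 0.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
print(attention(Q, K, V))
```

Since the query matches the first key more closely, the output lands closer to the first value vector than the second.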
9) NVIDIA is updating the HGX AI supercomputing platform with H100 GPUs. The HGX platform enables hardware vendors to design servers optimized for NVIDIA GPUs. The availability schedule for the HGX platform based on the H100 GPUs has yet to be announced.
10) Systems based on H100 GPUs will be available in Q3 2022. These include NVIDIA's own DGX and DGX SuperPod systems, as well as servers from OEM partners using HGX baseboards and PCIe cards.