NVIDIA H100 NVL
Supercharging large language model inference.
NVIDIA® H100 NVL supercharges large language model inference in mainstream PCIe-based server systems. With increased raw compute, larger and faster HBM3 memory, and NVIDIA NVLink™ bridge connectivity, mainstream systems with H100 NVL outperform NVIDIA A100 Tensor Core systems by up to 5X on Llama 2 70B.
Performance Highlights
| Specification | H100 NVL |
| --- | --- |
| FP64 | 30 TFLOPS |
| FP64 Tensor Core | 60 TFLOPS |
| FP32 | 60 TFLOPS |
| TF32 Tensor Core | 835 TFLOPS* |
| BFLOAT16 Tensor Core | 1,671 TFLOPS* |
| FP16 Tensor Core | 1,671 TFLOPS* |
| FP8 Tensor Core | 3,341 TFLOPS* |
| INT8 Tensor Core | 3,341 TOPS |
| GPU Memory | 94GB HBM3 |
| GPU Memory Bandwidth | 3.9TB/s |
| Maximum Thermal Design Power (TDP) | 350–400W (configurable) |
| NVIDIA AI Enterprise | Included |

\* With sparsity
Warranty
- Free dedicated phone and email technical support (1-800-230-0130)
- Dedicated NVIDIA professional-products Field Application Engineers
Resources
Contact gopny@pny.com for additional information.