NVIDIA A100 Liquid Cooled

NVIDIA^® A100 Liquid Cooled

SKU: NVA100TCGPU80L-KIT

Where to Buy

Description

Certification Request

EOL Notification Form EOL Notification Form

NVIDIA A100 LIQUID COOLED PCIe

Meeting Customer Demands for High-Performance, Green Data Centers

The NVIDIA A100 Liquid Cooled Tensor Core GPU for PCIe delivers the performance required while using far less power—at every scale—to propel the world's highest performing elastic data centers for AI, data analytics, and high performance computing (HPC) applications. As the engine of the NVIDIA data center platform, A100 provides up to 20x higher performance over the prior NVIDIA Volta generation. A100 can efficiently scale up or be partitioned into seven isolated GPU instances, with Multi-Instance GPU (MIG), providing a unified platform that enables elastic data centers to dynamically adjust to shifting workload demands, while offering a road to sustainability as well.

Switching all the CPU-only servers running AI and HPC worldwide to GPU-accelerated systems would save a massive 11 trillion watt-hours of energy a year, the energy consumed by more than 1.5 million homes in a year. Liquid cooling saves water and power by eliminating chillers that evaporate millions of gallons of water a year to cool the air inside data centers, by delivering systems that recycle small amounts of fluids in closed systems focused on key hot spots. NVIDIA estimates that liquid cooled data centers could hit 1.15 PUE (Power Usage Effectiveness), far below the 1.6 for air cooled counterparts. Liquid cooled data centers can pack twice as much computing performance into the same space, too. That's because the NVIDIA A100 Liquid Cooled GPU only uses one PCIe slot, while the passively air cooled A100 GPU board requires two—resulting in up to 66% fewer racks required, and 28% lower power consumption for each rack.

Performance Highlights
CUDA Cores	6912
Streaming Multiprocessors	108
GPU Memory	80GB HBM2e \| ECC on by default
Memory Interface	5120-bit
Memory Bandwidth	1555 GB/s
NVLink	2-way \| Standard or Wide Slot Spacing
MIG (Multi-Instance GPU Support)	Yes \| Up to 7 GPU Instances
FP64	9.7 TFLOPS
FP64 Tensor Core	156 TFLOPS \| 312 TFLOPS Sparsity
FP32	19.5 TFLOPS
TF32 Tensor Core	156 TFLOPS \| 312 TFLOPS Sparsity
FP16 Tensor Core	312 TFLOPS \| 624 TFLOPS Sparsity
INT8 Tensor Core	624 TOPS \| 1248 TOPS Sparsity
INT4 Tensor Core	1248 TOPS \| 2496 TOPS Sparsity
Thermal Solution	Liquid Cooled
vGPU Support	NVIDIA AI Enterprise
System Interface	PCI Express 4.0 x16
Total Board Power	300 W

NVIDIA Ampere-Based Architecture

A100 Liquid Cooled accelerates workloads big and small. Whether using MIG to partition an A100 GPU into smaller instances, or NVLink to connect multiple GPUs to accelerate large-scale workloads, the A100 easily handles different-sized application's needs, from the smallest job to the biggest multi-node workload.

TF32 for AI: 20x higher performance, Zero Code Change

As AI networks and datasets continue to expand exponentially, their computing appetite is similarly growing. Lower precision math has brought huge performance speedups, but they've historically required some code changes. A100 Liquid Cooled brings a new precision, TF32, which works just like FP32 while providing 20x higher FLOPS for AI without requiring any code change. NVIDIA's automatic mixed precision feature enables a further 2x boost to performance with just one additional line of code using FP16 precision. A100 Tensor Cores also include support for BFLOAT16, INT8, and INT4 precision, making A100 an incredibly versatile accelerator for both AI training and inference.

HBM2e

With 80 gigabytes (GB) of high-bandwidth memory (HBM2e), A100 Liquid Cooled delivers improved raw bandwidth of 1.5 TB/s, as well as higher dynamic random access memory (DRAM) utilization efficiency at 95 percent. A100 delivers 1.7x higher memory bandwidth than the previous generation.

Structural Sparsity

AI networks are big, having millions to billions of parameters. Not all of these parameters are needed for accurate predictions, and some can be converted to zeros to make the models "sparse" without compromising accuracy. Tensor Cores in A100 Liquid Cooled can provide up to 2x higher performance for sparse models. While the sparsity feature more readily benefits AI inference, it can also improve the performance of model training.

Every Deep Learning Framework, 700+ GPU-Accelerated Applications

The NVIDIA A100 Liquid Cooled Tensor Core GPU is a key component of the NVIDIA data center platform for deep learning, HPC, and data analytics. It accelerates every major deep learning framework and accelerates over 700 HPC applications. It's available everywhere, from desktops to servers to cloud services, delivering both dramatic performance gains and cost-saving opportunities.

Third-Generation Tensor Cores

First introduced in the NVIDIA Volta architecture, NVIDIA Tensor Core technology has brought dramatic speedups to AI training and inference operations, bringing down training times from weeks to hours and providing massive acceleration to inference. The NVIDIA Ampere architecture builds upon these innovations by providing up to 20x higher FLOPS for AI. It does so by improving the performance of existing precisions and bringing new precisions—TF32, INT8, and FP64—that accelerate and simplify AI adoption and extend the power of NVIDIA Tensor Cores to HPC.

Multi-Instance GPU (MIG)

Every AI and HPC application can benefit from acceleration, but not every application needs the performance of a full A100 Liquid Cooled. With Multi-Instance GPU (MIG), each A100 can be partitioned into as many as seven GPU instances, fully isolated at the hardware level with their own high-bandwidth memory, cache, and compute cores. Now, developers can access breakthrough acceleration for all their applications, big and small, and get guaranteed quality of service (QoS). IT administrators can offer right-sized GPU acceleration for optimal utilization and expand access to every user and application.

MIG is available across bare metal and virtualized environments and is supported by NVIDIA's Container Runtime which supports all major runtimes such as LXC, Docker, CRI-O, Containered, Podman, and Singularity. Each MIG instance is a new GPU type in Kubernetes and will be available across all Kubernetes distributions such as Red hat OpenShift, VMware Project Pacific, and others on-premises and on public clouds via NVIDIA Devise Plugin for Kubernetes. Administrators can also benefit from hypervisor-based virtualization, including KVM based hypervisors such as Red hat RHEL/RHV, and VMware ESXi, on MIG instances with NVIDIA AI Enterprise.

Next Generation NVLink

The NVIDIA A100 Liquid Cooled NVLink implementation delivers 2x higher throughput compared to the previous generation, at up to 600 GB/s to unleash the highest application performance possible on a single server, while promoting energy efficiency. Two NVIDIA A100 Liquid Cooled boards can be bridged via NVLink, and multiple pairs of NVLink connected boards can reside in a single server (number varies based on server enclosure, thermals, and power supply capacity).

Virtualization Capabilities

Virtualized compute workloads such as AI, Deep learning, and high-performance computing (HPC) with NVIDIA AI Enterprise. The NVIDIA A100 Liquid Cooled is an ideal upgrade path for existing V100/V100s Tensor Core GPU infrastructure.

Warranty

3-Year Limited Warranty

Free dedicated phone and email technical support
(1-800-230-0130)

Dedicated NVIDIA professional products Field Application Engineers

Resources

NVIDIA Power Guidelines

Contact gopny@pny.com for additional information.

Features

NVIDIA A100 Liquid Cooled

PERFORMANCE AND USEABILITY FEATURES

Data Center Class Reliability

Designed for 24 x 7 data center operations and driven by power-efficient hardware and components selected for optimum performance, durability, and longevity. Every NVIDIA A100 Liquid Cooled board is designed, built and tested by NVIDIA to the most rigorous quality and performance standards, ensuring that leading OEMs and systems integrators can meet or exceed the most demanding real-world conditions.

NVIDIA Ampere Architecture

NVIDIA A100 Liquid Cooled is one of the world's most powerful data center GPU for AI, data analytics, and high-performance computing (HPC) applications. Building upon the major SM enhancements from the Turing GPU, the NVIDIA Ampere architecture enhances tensor matrix operations and concurrent executions of FP32 and INT32 operations.

More Efficient CUDA Cores

The NVIDIA Ampere architecture’s CUDA cores bring up to 2.5x the single-precision floating point (FP32) throughput compared to the previous generation, providing significant performance improvements for any class or algorithm or application that can benefit from embarrassingly parallel acceleration techniques.

Third Generation Tensor Cores

Purpose-built for deep learning matrix arithmetic at the heart of neural network training and inferencing functions, the NVIDIA A100 Liquid Cooled includes enhanced Tensor Cores that accelerate more datatypes (TF32 and BF16) and includes a new Fine-Grained Structured Sparsity feature that delivers up to 2x throughput for tensor matrix operations compared to the previous generation.

PCIe Gen 4

The NVIDIA A100 Liquid Cooled supports PCI Express Gen 4, which provides double the bandwidth of PCIe Gen 3, improving data-transfer speeds from CPU memory for data-intensive tasks like AI and data science.

High Speed HBM2e Memory

With 80 gigabytes (GB) of high-bandwidth memory (HBM2e), the NVIDIA A100 Liquid Cooled delivers improved raw bandwidth of 1.55TB/sec, as well as higher dynamic random access memory (DRAM) utilization efficiency at 95 percent. A100 PCIe delivers 1.7x higher memory bandwidth over the previous generation.

Error Correction Without a Performance or Capacity Hit

HBM2e memory implements error correction without any performance (bandwidth) or capacity hit, unlike competing technologies like GDDR6 or GDDR6X.

Compute Preemption

Preemption at the instruction-level provides finer grain control over compute and tasks to prevent longer-running applications from either monopolizing system resources or timing out.

MULTI-GPU TECHNOLOGY SUPPORT

Third Generation NVLink

Connect two NVIDIA A100 Liquid Cooled cards with NVLink to double the effective memory footprint and scale application performance by enabling GPU-to-GPU data transfers at rates up to 600 GB/s of bidirectional bandwidth. NVLink bridges are available for motherboards with standard or wide slot spacing.

SOFTWARE SUPPORT

Virtual GPU Software for Virtualization

Support for NVIDIA AI Enterprise accelerates virtualized compute workloads such as high-performance computing, AI, data science, big-data analytics, and HPC applications.

Software Optimized for AI

Deep learning frameworks such as Caffe2, MXNet, CNTK, TensorFlow, and others deliver dramatically faster training times and higher multi-node training performance. GPU accelerated libraries such as cuDNN, cuBLAS, and TensorRT delivers higher performance for both deep learning inference and High-Performance Computing (HPC) applications.

NVIDIA CUDA Parallel Computing Platform

Natively execute standard programming languages like C/C++ and Fortran, and APIs such as OpenCL, OpenACC and Direct Compute to accelerate techniques such as ray tracing, video and image processing, and computation fluid dynamics.

Unified Memory

A single, seamless 49-bit virtual address space allows for the transparent migration of data between the full allocation of CPU and GPU memory.

Specifications

NVIDIA A100 Liquid Cooled

SPECIFICATIONS

Product	NVIDIA A100 Liquid Cooled PCIe
Architecture	Ampere
Process Size	7nm \| TSMC
Transistors	54 Billion
Die Size	826 mm²
CUDA Cores	6912
Streaming Multiprocessors	108
Tensor Cores \| Gen 3	432
Multi-Instance GPU (MIG) Support	Yes, up to seven instances per GPU
FP64	9.7 TFLOPS
Peak FP64 Tensor Core	156 TFLOPS \| 312 TFLOPS Sparsity
Peak FP32	19.5 TFLOPS
TF32 Tensor Core	156 TFLOPS \| 312 TFLOPS Sparsity
Peak FP16 Tensor Core	312 TFLOPS \| 624 TFLOPS Sparsity
Peak INT8 Tensor Core	624 TOPS \| 1248 TOPS Sparsity
INT4 Tensor Core	1248 TOPS \| 2496 TOPS Sparsity
NVLink	2-way \| Standard or Wide Slot Spacing
NVLink Interconnect	600 GB/s Bidirectional
GPU Memory	80 GB HBM2e
Memory Interface	5120-bit
Memory Bandwidth	1555 GB/s
System Interface	PCIe 4.0 x16
Thermal Solution	Liquid Cooled
vGPU Support	NVIDIA AI Enterprise
Power Connector	PCIe 16-pin
Total Board Power	300 W

AVAILABLE ACCESSORIES

RTXA6000NVLINK-KIT provides an NVLink connector for the NVIDIA A100 Liquid Cooled suitable for standard PCIe slot spacing motherboards, effectively fusing two physical boards into one logical entity with 13824 CUDA Cores, 864 Tensor Cores, and 160 GB of HBM2e memory, with a bandwidth of 600 GB/s. Application support is required and each pair of A100 PCIe boards requires three (3) NVLink kits for correct operation.

RTXA6000NVLINK-3S-KIT provides an NVLink connector for the NVIDIA A100 Liquid Cooled for motherboards implementing wider PCIe slot spacing. All other features, benefits, application support, and three (3) NVLink kits per pair of A100 boards is identical to the standard slot spacing version.

SUPPORTED OPERATING SYSTEMS

Windows Server 2012 R2
Windows Server 2016 1607, 1709
Windows Server 2019
RedHat CoreOS 4.7
Red Hat Enterprise Linux 8.1-8.3

Red Hat Enterprise Linux 7.7-7.9
Red Hat Linux 6.6+
SUSE Linux Enterprise Server 15 SP2
SUSE Linux Enterprise Server 12 SP 3+
Ubuntu 14.04 LTS/16.04/18.04 LTS/20.04 LTS

WARRANTY

3-Year Limited Warranty
Free dedicated phone and email technical support (1.800.230.0130)

Dedicated NVIDIA professional products Field Application Engineers

PACKAGE CONTAINS

NVIDIA A100 Liquid Cooled PCIe Board

Auxiliary power cable

NVIDIA A100 Liquid Cooled

NVIDIA® A100 Liquid Cooled

NVIDIA A100 LIQUID COOLED PCIe

Meeting Customer Demands for High-Performance, Green Data Centers

Performance Highlights

CUDA Cores

Streaming Multiprocessors

GPU Memory

Memory Interface

Memory Bandwidth

NVLink

MIG (Multi-Instance GPU Support)

FP64

FP64 Tensor Core

FP32

TF32 Tensor Core

FP16 Tensor Core

INT8 Tensor Core

INT4 Tensor Core

Thermal Solution

vGPU Support

System Interface

Total Board Power

NVIDIA Ampere-Based Architecture

TF32 for AI: 20x higher performance, Zero Code Change

HBM2e

Structural Sparsity

Every Deep Learning Framework, 700+ GPU-Accelerated Applications

Third-Generation Tensor Cores

Multi-Instance GPU (MIG)

Next Generation NVLink

Virtualization Capabilities

Warranty

Resources

NVIDIA A100 Liquid Cooled

PERFORMANCE AND USEABILITY FEATURES

Data Center Class Reliability

NVIDIA Ampere Architecture

More Efficient CUDA Cores

Third Generation Tensor Cores

PCIe Gen 4

High Speed HBM2e Memory

Error Correction Without a Performance or Capacity Hit

Compute Preemption

MULTI-GPU TECHNOLOGY SUPPORT

Third Generation NVLink

SOFTWARE SUPPORT

Virtual GPU Software for Virtualization

Software Optimized for AI

NVIDIA CUDA Parallel Computing Platform

Unified Memory

NVIDIA A100 Liquid Cooled

SPECIFICATIONS

Product

Architecture

Process Size

Transistors

Die Size

CUDA Cores

Streaming Multiprocessors

Tensor Cores | Gen 3

Multi-Instance GPU (MIG) Support

FP64

Peak FP64 Tensor Core

Peak FP32

TF32 Tensor Core

Peak FP16 Tensor Core

Peak INT8 Tensor Core

INT4 Tensor Core

NVLink

NVLink Interconnect

GPU Memory

Memory Interface

Memory Bandwidth

System Interface

Thermal Solution

vGPU Support

Power Connector

Total Board Power

AVAILABLE ACCESSORIES

NVIDIA^® A100 Liquid Cooled