NVIDIA® A30

SKU: TCSA30M-PB

Description

NVIDIA A30

Versatile Compute Acceleration for Mainstream Enterprise Servers

Bring accelerated performance to every enterprise workload with NVIDIA A30 Tensor Core GPUs. With NVIDIA Ampere architecture Tensor Cores and Multi-Instance GPU (MIG), it delivers speedups securely across diverse workloads, including AI inference at scale and high-performance computing (HPC) applications. By combining fast memory bandwidth and low power consumption in a PCIe form factor optimized for mainstream servers, A30 enables an elastic data center and delivers maximum value for enterprises.

The NVIDIA A30 Tensor Core GPU delivers a versatile platform for mainstream enterprise workloads such as AI inference, training, and HPC. With TF32 and FP64 Tensor Core support, as well as an end-to-end software and hardware solution stack, A30 ensures that mainstream AI training and HPC applications can be rapidly addressed. MIG ensures quality of service (QoS) with secure, hardware-partitioned, right-sized GPUs across all of these workloads for diverse users, optimally utilizing GPU compute resources.

Highlights

CUDA Cores	3584
Tensor Cores	224
Peak FP64	5.2 TFLOPS
Peak FP64 Tensor Core	10.3 TFLOPS
Peak FP32	10.3 TFLOPS
Peak TF32 Tensor Core	82 TFLOPS | 165 TFLOPS*
Peak BFLOAT16 Tensor Core	165 TFLOPS | 330 TFLOPS*
Peak FP16 Tensor Core	165 TFLOPS | 330 TFLOPS*
Peak INT8 Tensor Core	330 TOPS | 661 TOPS*
GPU Memory	24 GB HBM2
Memory Bandwidth	933 GB/s
Thermal Solutions	Passive
Maximum Power Consumption	165 W
System Interface	PCIe Gen 4.0 | 64 GB/s
Multi-Instance GPU Support	Yes
vGPU Support	Yes

*With sparsity
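
The asterisked figures are roughly double their dense counterparts. A minimal Python sketch, using only the values listed in the table above, makes the ratio explicit. The dictionary and function names are illustrative and not part of any NVIDIA tool.

```python
# Dense and sparse peak throughput copied from the Highlights table
# (TFLOPS for the floating-point formats, TOPS for INT8).
PEAK_THROUGHPUT = {
    "TF32 Tensor Core": (82, 165),
    "BFLOAT16 Tensor Core": (165, 330),
    "FP16 Tensor Core": (165, 330),
    "INT8 Tensor Core": (330, 661),
}


def sparsity_speedup(dense, sparse):
    """Return the ratio of sparse to dense peak throughput."""
    return sparse / dense


if __name__ == "__main__":
    for name, (dense, sparse) in PEAK_THROUGHPUT.items():
        ratio = sparsity_speedup(dense, sparse)
        print(f"{name}: {ratio:.2f}x with sparsity")
```

These are peak figures; whether a given model gets close to the 2x ratio depends on how well its weights fit the structured sparsity pattern.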

Deep Learning Inference

A30 leverages groundbreaking features to optimize inference workloads. It accelerates a full range of precisions, from FP64 to TF32 and INT4. Supporting up to four MIG instances per GPU, A30 lets multiple networks operate simultaneously in secure hardware partitions with guaranteed quality of service (QoS). Structural sparsity support delivers up to 2x more performance on top of A30’s other inference performance gains. NVIDIA’s market-leading AI performance was demonstrated in MLPerf Inference. Combined with the NVIDIA Triton Inference Server, which easily deploys AI at scale, A30 brings this groundbreaking performance to every enterprise.

High Performance Computing

NVIDIA A30 features FP64 NVIDIA Ampere architecture Tensor Cores that deliver the biggest leap in HPC performance since the introduction of GPUs. Combined with 24 gigabytes (GB) of GPU memory with a bandwidth of 933 gigabytes per second (GB/s), researchers can rapidly solve double-precision calculations. HPC applications can also leverage TF32 to achieve higher throughput for single-precision, dense matrix multiply operations. The combination of FP64 Tensor Cores and MIG empowers research institutions to securely partition the GPU to allow multiple researchers access to compute resources with guaranteed QoS and maximum GPU utilization. Enterprises deploying AI can use A30’s inference capabilities during peak demand periods and then repurpose the same compute servers for HPC and AI training workloads during off-peak periods.
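
To relate the peak compute figures to the 933 GB/s memory bandwidth, the following sketch applies the standard roofline estimate: a kernel needs at least peak FLOPS divided by bandwidth in FLOPs per byte of memory traffic before compute, rather than memory, becomes the limit. This is a simplified model chosen for illustration, not a figure published by NVIDIA, and the function and constant names are hypothetical.

```python
# Peak figures taken from the Highlights table.
A30_BANDWIDTH_GB_S = 933
A30_PEAK_TFLOPS = {
    "FP64": 5.2,
    "FP64 Tensor Core": 10.3,
    "TF32 Tensor Core": 82,
}


def ridge_point(peak_tflops, bandwidth_gb_s):
    """FLOPs per byte above which a kernel is compute-bound (roofline model)."""
    return (peak_tflops * 1e12) / (bandwidth_gb_s * 1e9)


if __name__ == "__main__":
    for precision, tflops in A30_PEAK_TFLOPS.items():
        intensity = ridge_point(tflops, A30_BANDWIDTH_GB_S)
        print(f"{precision}: compute-bound above {intensity:.1f} FLOPs/byte")
```

Under this model, an FP64 Tensor Core kernel needs about 11 FLOPs per byte moved to reach peak, so bandwidth-heavy codes will be limited by the 933 GB/s figure instead.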

High Performance Data Analytics

Data scientists need to be able to analyze, visualize, and turn massive datasets into insights. But scale-out solutions are often bogged down by datasets scattered across multiple servers. Accelerated servers with A30 provide the needed compute power – along with large HBM2 memory, 933 GB/s of memory bandwidth, and scalability with NVLink – to tackle these workloads. Combined with NVIDIA InfiniBand, NVIDIA Magnum IO, and the RAPIDS suite of open-source libraries, including the RAPIDS Accelerator for Apache Spark, the NVIDIA data center platform accelerates these huge workloads at unprecedented levels of performance and efficiency.

Enterprise-Ready Utilization

A30 with MIG maximizes the utilization of GPU-accelerated infrastructure. With MIG, an A30 GPU can be partitioned into as many as four independent instances, giving multiple users access to GPU acceleration. MIG works with Kubernetes, containers, and hypervisor-based server virtualization. MIG lets infrastructure managers offer a right-sized GPU with guaranteed QoS for every job, extending the reach of accelerated computing resources to every user.
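
As a rough illustration of right-sizing, the sketch below picks the smallest partition that fits a job's memory footprint and reports how many such jobs could share one A30. It assumes the 24 GB of memory is divided into four equal slices and that an instance spans 1, 2, or 4 slices. The actual MIG profiles are defined by the NVIDIA driver and should be checked there, and all names in the code are hypothetical.

```python
TOTAL_MEMORY_GB = 24  # from the Highlights table
TOTAL_SLICES = 4      # "as many as four independent instances"
ALLOWED_SLICE_COUNTS = (1, 2, 4)  # assumption: instance sizes in slices


def instance_memory_gb(slices):
    """Memory of an instance spanning the given number of equal slices."""
    return TOTAL_MEMORY_GB * slices // TOTAL_SLICES


def smallest_instance_for(job_memory_gb):
    """Return the smallest slice count whose memory fits the job."""
    for slices in ALLOWED_SLICE_COUNTS:
        if instance_memory_gb(slices) >= job_memory_gb:
            return slices
    raise ValueError("job does not fit in a single A30")


def max_concurrent_jobs(job_memory_gb):
    """How many identical jobs can run side by side on one GPU."""
    return TOTAL_SLICES // smallest_instance_for(job_memory_gb)


if __name__ == "__main__":
    for job_gb in (5, 10, 20):
        slices = smallest_instance_for(job_gb)
        print(f"{job_gb} GB job: {instance_memory_gb(slices)} GB instance, "
              f"{max_concurrent_jobs(job_gb)} concurrent")
```

The sketch only considers memory. A real placement decision would also weigh the compute share each instance receives.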

vGPU Software Support

NVIDIA Virtual PC (vPC)
NVIDIA Virtual Applications (vApps)
NVIDIA RTX Virtual Workstation (vWS)
NVIDIA Virtual Compute Server (vCS)
vGPU Profiles from 1 GB to 24 GB

Warranty

3-Year Limited Warranty

Dedicated NVIDIA professional products Field Application Engineers

Resources

Product Brochure
NVIDIA Power Guidelines

Links

Resource Center
NVIDIA GPU Accelerated Applications Catalog

Contact pnypro@pny.eu for additional information.