Turing Tensor Cores: The Heart of Universal Inference Acceleration
AI is evolving rapidly. In the past few years alone, a Cambrian explosion of neural network designs has produced convolutional neural networks (CNNs), recurrent neural networks (RNNs), generative adversarial networks (GANs), deep reinforcement learning (RL) models, and hybrid network architectures. Accelerating these diverse models requires both high performance and programmability.
NVIDIA T4 introduces the revolutionary Turing Tensor Core technology with multi-precision computing for AI inference. Powering breakthrough performance from FP32 to FP16 to INT8, as well as INT4 and binary precisions, T4 delivers dramatically higher performance than CPUs.
Developers can unleash the power of Turing Tensor Cores directly through NVIDIA TensorRT, software libraries, and integrations with all major AI frameworks. These tools let developers target the optimal precision for each AI application, achieving dramatic performance gains without compromising accuracy.
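To make the multi-precision idea concrete, the sketch below shows symmetric per-tensor INT8 quantization, the kind of dynamic-range calibration that TensorRT automates when it lowers an FP32 model to INT8. This is an illustrative, framework-free sketch; the function names are invented for this example and are not TensorRT APIs.

```python
# Minimal sketch of symmetric INT8 quantization: the arithmetic that
# trades FP32 range for INT8 Tensor Core throughput. Illustrative only;
# TensorRT performs this calibration automatically.

def quantize_int8(values):
    """Map floats to int8 codes using a symmetric per-tensor scale."""
    amax = max(abs(v) for v in values)      # calibration: observed dynamic range
    scale = amax / 127.0 if amax else 1.0   # largest magnitude maps to 127
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from int8 codes."""
    return [x * scale for x in q]

weights = [0.5, -1.2, 3.4, -3.4]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
# The largest-magnitude weight uses the full int8 range (+/-127);
# the others scale proportionally, with small rounding error.
```

The per-tensor scale is the simplest calibration scheme; production toolchains typically refine it (for example with entropy-based calibration) to preserve accuracy at INT8.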
State-of-the-art Inference in Real-Time
Responsiveness is key to user engagement for services such as conversational AI, recommender systems, and visual search. As models increase in accuracy and complexity, delivering the right answer right now requires exponentially larger compute capability.
NVIDIA T4 features Multi-Process Service (MPS) with hardware-accelerated work distribution. MPS reduces request-processing latency and allows multiple independent requests to be processed simultaneously, resulting in higher throughput and more efficient GPU utilization.
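Operationally, MPS is enabled on Linux with the nvidia-cuda-mps-control tool that ships with the CUDA driver. The commands below are a minimal sketch using the conventional default directories; treat them as an illustration rather than a complete deployment guide.

```shell
# Start the CUDA Multi-Process Service daemon so that multiple
# inference processes share one GPU through a single scheduling context.
export CUDA_MPS_PIPE_DIRECTORY=/tmp/nvidia-mps   # IPC pipes used by MPS clients
export CUDA_MPS_LOG_DIRECTORY=/tmp/nvidia-log    # daemon log location
nvidia-cuda-mps-control -d                       # launch the control daemon

# Any CUDA process started with the same pipe directory now submits
# work through MPS. Stop the daemon when finished:
echo quit | nvidia-cuda-mps-control
```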
Twice the Video Decode Performance
Video continues on its explosive growth trajectory, comprising over two-thirds of all Internet traffic. Accurate video interpretation through AI is driving the most relevant content recommendations, finding the impact of brand placements in sports events, and delivering perception capabilities to autonomous vehicles, among other usages.
NVIDIA T4 delivers breakthrough performance for AI video applications, with dedicated hardware transcoding engines that bring twice the decoding performance of prior-generation GPUs. T4 can decode up to 38 full-HD video streams, making it easy to integrate scalable deep learning into the video pipeline to deliver innovative, smart video services. It features performance and efficiency modes to enable either fast encoding or the lowest bit-rate encoding without losing video quality.
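One common way to exercise the dedicated decode engines from existing tooling is an FFmpeg build with NVIDIA hardware acceleration enabled. The flags below are standard FFmpeg NVDEC/NVENC options; the file names are placeholders, and this assumes a CUDA-enabled FFmpeg build.

```shell
# Decode H.264 on the GPU's dedicated NVDEC engine, downscale on the GPU,
# and re-encode with NVENC (input.mp4 / output.mp4 are placeholders).
ffmpeg -hwaccel cuda -hwaccel_output_format cuda \
       -i input.mp4 \
       -vf scale_cuda=640:360 \
       -c:v h264_nvenc output.mp4
```

Keeping frames in GPU memory between decode, scaling, and inference avoids round trips over PCIe, which is what makes dozens of concurrent full-HD streams practical on a single card.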
Industry’s Most Comprehensive AI Inference Platform
AI has crossed the chasm and is rapidly moving from early adoption by pioneers to broader use across industries and large-scale production deployments. Powered by the flexible NVIDIA CUDA development environment and a mature ecosystem of over one million developers, the NVIDIA AI platform has evolved for over a decade to offer comprehensive tooling and integrations that simplify the development and deployment of AI.
NVIDIA TensorRT enables optimization of trained models to run inference efficiently on GPUs. NVIDIA TensorRT Inference Server and Kubernetes on NVIDIA GPUs streamline the deployment and scaling of AI-powered applications on GPU-accelerated inference infrastructure. Libraries like cuDNN, cuSPARSE, CUTLASS, and DeepStream accelerate key neural network functions and use cases such as video transcoding. And workflow integrations with all major AI frameworks, freely available as NVIDIA GPU Cloud containers, enable developers to transparently harness the innovations in GPU computing for end-to-end AI workflows, from training neural networks to running inference in production applications.