NVIDIA Data Centre GPUs
NVIDIA® Data Centre GPUs bring the latest parallel GPU processing to a range of applications – from data science and research to artificial intelligence, machine learning and more. XENON can design a server with the appropriate power, cooling and memory to drive single or multiple GPUs. XENON also builds workstation solutions with these GPUs – unleashing the power of GPU computing in a desktop form factor, at home in ambient room temperatures and with standard power supplies. Contact the XENON solutions team to discover which NVIDIA GPU is right for your requirements.
For a limited time, a four-hour, self-paced course – AI in the Data Centre – is available for up to 3 team members with each NVIDIA Data Centre GPU purchased. Spaces are limited. Contact us to learn more.
The NVIDIA H100 Tensor Core GPU, powered by the NVIDIA Hopper GPU architecture, delivers the next massive leap in accelerated computing performance for NVIDIA’s data centre platforms.
- A new Transformer Engine enables the H100 to deliver up to 9x faster AI training and up to 30x faster AI inference on large language models compared to the prior-generation A100
- Fourth-generation NVLink provides 900 gigabytes per second (GB/s) of GPU-to-GPU interconnect bandwidth
- Delivers 60 teraFLOPS of FP64 computing for HPC
- Delivers 3 terabytes per second (TB/s) of memory bandwidth per GPU and scalability with NVLink and NVSwitch
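Taken together, the 60 teraFLOPS of FP64 and 3 TB/s of memory bandwidth imply a simple roofline estimate: a kernel stays memory-bound until its arithmetic intensity reaches about 20 FLOP per byte moved. A minimal sketch using only the headline figures quoted above (the function name is illustrative, not an NVIDIA API):

```python
def attainable_fp64_tflops(arithmetic_intensity, peak_tflops=60.0, bandwidth_tbs=3.0):
    """Roofline model: attainable throughput is the lesser of peak compute
    and memory bandwidth times arithmetic intensity (FLOP per byte)."""
    return min(peak_tflops, bandwidth_tbs * arithmetic_intensity)

# Ridge point: the arithmetic intensity at which compute becomes the limit.
ridge = 60.0 / 3.0  # 20 FLOP/byte

print(attainable_fp64_tflops(1.0))    # memory-bound: 3.0 TFLOPS
print(attainable_fp64_tflops(ridge))  # at the ridge point: 60.0 TFLOPS
```

Kernels below the ridge point are limited by the 3 TB/s of bandwidth; above it, by the 60 teraFLOPS of FP64 compute.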
NVIDIA H100 CNX combines the power of the NVIDIA H100 with the advanced networking capabilities of the NVIDIA ConnectX®-7 smart network interface card (SmartNIC) in a single, unique platform.
- GPU Memory: 80GB HBM2e
- Memory Bandwidth: > 2.0TB/s
- PCIe Gen5 128GB/s
- NVIDIA H100 and ConnectX-7 are connected via an integrated PCIe Gen5 switch
The new NVIDIA A100 Tensor Core GPU in a PCIe form factor. Third-generation Tensor Cores deliver up to 20x the performance of the prior generation. Multi-Instance GPU (MIG) capable, the A100 can be partitioned into up to seven GPU instances, individually or in combinations, to provide maximum flexibility in data analytics, training and inference.
- GPU Memory: 80GB
- GPU Memory Bandwidth: 1,935 GB/s
- Up to seven Multi-Instance GPU (MIG) instances, with 10GB of memory each
- PCIe form factor
- 300W total power draw
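Once an A100 is partitioned, each MIG instance appears as its own device, and a process can be pinned to one instance through the standard `CUDA_VISIBLE_DEVICES` environment variable, which accepts MIG UUIDs. A minimal sketch (the UUID below is a made-up placeholder; real UUIDs are listed by `nvidia-smi -L`):

```python
import os

def pin_to_mig_instance(env, mig_uuid):
    """Return a copy of the environment restricted to a single MIG instance.
    CUDA applications launched with this environment see only that GPU slice."""
    new_env = dict(env)
    new_env["CUDA_VISIBLE_DEVICES"] = mig_uuid
    return new_env

# Placeholder UUID for illustration only; obtain real ones from `nvidia-smi -L`.
env = pin_to_mig_instance(os.environ, "MIG-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee")
print(env["CUDA_VISIBLE_DEVICES"])
```

Launching seven inference services this way, one per instance, is how a single A100 serves several isolated workloads at once.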
The new NVIDIA A40 GPU in a PCIe form factor. Virtualisation-ready to deliver flexibility and agility, along with the lightning-fast performance of the NVIDIA Ampere architecture. An all-new design optimises Tensor Cores, memory and PCIe Gen4. The world's most powerful data centre GPU for visual computing, VR, AI and HPC workloads.
- GPU Memory: 48GB
- GPU Memory Bandwidth: 696 GB/s
- vGPU capable with multiple config options
- PCIe Generation 4 form factor
- 300W power draw, passive cooling
NVIDIA A30 – versatile compute acceleration for mainstream enterprise servers.
- Features FP64 NVIDIA Ampere architecture Tensor Cores
- Up to 3X higher throughput than V100 and 6X higher than T4
- 24 gigabytes (GB) of GPU memory
- GPU memory bandwidth of 933 gigabytes per second (GB/s)
NVIDIA A16 – unlock an unprecedented VDI user experience.
- 4x 16GB GDDR6 with error-correcting code (ECC)
- GPU memory bandwidth of 4x 232GB/s
- More than 2x the encoder throughput
- Supports multiple, high-resolution monitors (up to two 4K or a single 5K)
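Because the A16 is a four-GPU board, its headline figures are per-GPU numbers multiplied by four. A quick sanity check of the aggregate capacity implied by the specs above:

```python
gpus_per_board = 4
memory_per_gpu_gb = 16        # GDDR6 with ECC, per GPU
bandwidth_per_gpu_gbs = 232   # memory bandwidth, per GPU

total_memory_gb = gpus_per_board * memory_per_gpu_gb          # 64 GB per board
total_bandwidth_gbs = gpus_per_board * bandwidth_per_gpu_gbs  # 928 GB/s aggregate

print(total_memory_gb, total_bandwidth_gbs)  # 64 928
```

That 64GB of framebuffer spread across four GPUs is what lets one board host a high density of VDI sessions.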
NVIDIA A10 – accelerated graphics and video with AI for mainstream enterprise servers.
- Ultra-fast GDDR6 memory, delivering 600 GB/s of bandwidth
- 24GB GDDR6 GPU memory
- Compact, single-slot, 150W GPU
- Tensor Float 32 (TF32) precision provides up to 5X the training throughput of the previous generation
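TF32 reaches that throughput by keeping FP32's 8-bit exponent (so the same dynamic range) while truncating the mantissa to 10 explicit bits, matching FP16's precision. A small sketch of what that trade-off means for rounding error, for comparison with full FP32:

```python
def unit_roundoff(mantissa_bits):
    """Largest relative rounding error for a format with the given number
    of explicit mantissa bits (assuming an implicit leading 1)."""
    return 2.0 ** -(mantissa_bits + 1)

fp32_eps = unit_roundoff(23)  # FP32: 23 explicit mantissa bits
tf32_eps = unit_roundoff(10)  # TF32: 10 explicit mantissa bits

print(tf32_eps / fp32_eps)  # TF32 rounds ~8192x more coarsely, same exponent range
```

In practice frameworks accumulate TF32 matrix products in FP32, which is why training accuracy typically matches FP32 despite the coarser inputs.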
The NVIDIA A2’s versatility, compact size and low power exceed the demands of edge deployments at scale, instantly upgrading existing entry-level CPU servers to handle inference.
- Features a low-profile PCIe Gen4 card and a low 40-60W configurable thermal design power (TDP) capability
- Features 16 GB of GDDR6 memory
- Supports x8 PCIe Gen4 connectivity
- Powered by the NVIDIA Ampere Architecture
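For sizing inference workloads against host-to-GPU transfer, the x8 Gen4 link works out to roughly 16 GB/s per direction: PCIe Gen4 signals at 16 GT/s per lane with 128b/130b line encoding. A quick estimate from those raw figures (real-world throughput runs a little lower due to protocol overhead):

```python
gts_per_lane = 16     # PCIe Gen4 signalling rate: 16 GT/s per lane
encoding = 128 / 130  # 128b/130b line encoding efficiency
lanes = 8             # the A2's x8 connector

# Usable bandwidth per direction in GB/s (each transfer carries 1 bit per lane).
gbs = gts_per_lane * encoding * lanes / 8
print(round(gbs, 2))  # ~15.75 GB/s per direction
```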
NVIDIA T4 – powered by the NVIDIA Turing™ architecture and purpose-built to boost efficiency for scale-out servers running deep learning workloads.
- Small form factor, 50/75-watt design fits any scale-out server.
- INT8 operations slash latency by 15X.
- Hardware-decode engine capable of transcoding and inferencing 35 HD video streams in real time.