Nvidia Reveals Vera Rubin Architecture to Power the Next Wave of AI Data Centers

By: Admin | January 13, 2026

At CES 2026, Nvidia lifted the curtain on a major evolution of its data center strategy with the introduction of the Vera Rubin NVL72, a rack-scale computing platform built specifically for large-scale AI workloads. Rather than a single chip launch, the announcement focused on a tightly integrated system designed to redefine performance, security, and scalability in modern AI infrastructure.

A Platform Built Around Six Chips, Not One

During his CES keynote, Nvidia CEO Jensen Huang emphasized that Vera Rubin is not just a CPU–GPU pairing. Instead, the platform is powered by six distinct silicon components, combining compute and networking into a unified architecture.

At the core are:

  • Vera, an Arm-based high-performance CPU
  • Rubin, Nvidia’s next-generation GPU platform

Supporting these are four advanced networking processors:

  • NVLink 6 Switch
  • ConnectX-9 SuperNIC
  • BlueField-4 DPU
  • Spectrum-6 Ethernet Switch

Together, these components form the backbone of Nvidia’s next-generation AI racks.

Vera CPU: Designed to Keep GPUs Fully Utilized

The Vera CPU introduces a custom design with 88 cores and 176 threads, leveraging spatial multithreading to push throughput. Each processor supports 1.5 TB of LPDDR5x memory and delivers 1.2 TB/s of memory bandwidth, paired with a 1.8 TB/s NVLink chip-to-chip connection.

According to Nvidia, this interconnect provides roughly seven times the bandwidth of PCIe Gen 6, enabling faster coordination between CPUs and GPUs. The CPU also plays a key role in managing data routing, scheduling, and key-value cache orchestration, all aimed at maximizing accelerator efficiency during training and inference.
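
That "roughly seven times" figure checks out with quick arithmetic. The sketch below assumes Nvidia is comparing against a 16-lane PCIe 6.0 link at about 256 GB/s of aggregate bandwidth, a common baseline the keynote did not spell out:

    # Back-of-envelope check of the "roughly seven times PCIe Gen 6" claim.
    # Assumption: a PCIe 6.0 x16 link at ~256 GB/s aggregate bandwidth.
    nvlink_c2c_tb_s = 1.8     # Vera's NVLink chip-to-chip link, TB/s
    pcie6_x16_tb_s = 0.256    # assumed PCIe 6.0 x16 baseline, TB/s

    print(f"{nvlink_c2c_tb_s / pcie6_x16_tb_s:.1f}x")  # -> 7.0x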

Confidential Computing Becomes a First-Class Feature

Vera is Nvidia’s first CPU to fully support confidential computing across the CPU, GPU, and NVLink domains. The resulting trusted execution environment (TEE) is designed to safeguard sensitive AI assets, including proprietary models, training datasets, and inference pipelines, a requirement that is increasingly critical for enterprise and government users.

Rubin GPUs Push Performance Well Beyond Blackwell

On the accelerator side, Rubin GPUs bring significant performance gains over the previous Blackwell generation. Using Nvidia’s NVFP4 data format, Rubin delivers:

  • Up to 50 petaflops for inference (around 5× Blackwell)
  • 35 petaflops for training (about 3.5× improvement)

Memory bandwidth also sees a major jump. With HBM4, Rubin reaches 22 TB/s, nearly three times the bandwidth of Blackwell, while NVLink bandwidth per GPU doubles to 3.6 TB/s.
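
Those multipliers imply Blackwell baselines of roughly 10 petaflops per GPU, 8 TB/s of HBM3e bandwidth, and 1.8 TB/s of NVLink. The baselines below are our inference from the stated ratios, not figures from the article, but with them the quoted multipliers fall straight out:

    # Per-GPU Rubin-vs-Blackwell ratios implied by the figures above.
    # The Blackwell baselines are assumptions inferred from the stated
    # multipliers, not numbers given in this article.
    rubin     = {"inference_pf": 50, "training_pf": 35, "hbm_tb_s": 22, "nvlink_tb_s": 3.6}
    blackwell = {"inference_pf": 10, "training_pf": 10, "hbm_tb_s": 8,  "nvlink_tb_s": 1.8}

    for metric, value in rubin.items():
        print(f"{metric}: {value / blackwell[metric]:.2f}x")
    # inference_pf: 5.00x, training_pf: 3.50x, hbm_tb_s: 2.75x, nvlink_tb_s: 2.00x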

Networking Scales to Match Compute Growth

The Vera Rubin platform places heavy emphasis on networking, highlighted by the liquid-cooled NVLink 6 Switch. It supports:

  • 400G SerDes
  • 28.8 TB/s total switching bandwidth
  • 3.6 TB/s per-GPU connectivity
  • 14.4 TFLOPS of FP8 in-network compute

This networking fabric is designed to eliminate bottlenecks as AI models continue to grow in size and complexity.
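
These per-GPU and per-switch figures also roll up cleanly to the rack-level numbers quoted below. A minimal sketch, assuming 72 GPUs per rack (per the NVL72 name) and that scale-up bandwidth is simply the per-GPU NVLink 6 connectivity summed across the rack; the implied switch count is a back-of-envelope figure, not a confirmed topology:

    # Rolling per-GPU NVLink 6 connectivity up to the rack fabric.
    # Assumptions: 72 GPUs per NVL72 rack; scale-up bandwidth is the sum
    # of per-GPU connectivity.
    gpus_per_rack   = 72
    per_gpu_tb_s    = 3.6    # NVLink 6 connectivity per GPU
    per_switch_tb_s = 28.8   # NVLink 6 Switch total bandwidth

    scale_up_tb_s = gpus_per_rack * per_gpu_tb_s
    print(f"{scale_up_tb_s:.0f} TB/s scale-up")                     # ~260 TB/s
    print(f"{scale_up_tb_s / per_switch_tb_s:.1f} switches implied")  # 9.0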

NVL72: Rack-Scale AI at Exascale Levels

When fully configured, the Vera Rubin NVL72 rack delivers extraordinary system-level performance:

  • 3.6 exaflops of NVFP4 inference (5× previous generation)
  • 2.5 exaflops of NVFP4 training (3.5× increase)

Memory capacity and bandwidth scale accordingly, with:

  • 54 TB of LPDDR5x system memory
  • 20.7 TB of HBM4
  • 1.6 PB/s of HBM4 bandwidth
  • 260 TB/s of scale-up bandwidth, doubling what Blackwell NVL72 offered

Nvidia described this internal bandwidth as exceeding the capacity of the entire global Internet.
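
All of these rack totals follow from the per-chip specifications earlier in the article. A quick reconstruction, assuming 72 Rubin GPUs and 36 Vera CPUs per rack (the usual two-GPUs-per-CPU pairing, which the article does not state explicitly) and roughly 288 GB of HBM4 per GPU (implied by 20.7 TB / 72):

    # Deriving the NVL72 rack totals from the per-chip figures above.
    # Assumptions: 72 GPUs and 36 CPUs per rack; ~288 GB of HBM4 per GPU.
    gpus, cpus = 72, 36

    print(f"NVFP4 inference: {gpus * 50 / 1000:.1f} EF")    # 3.6 exaflops
    print(f"NVFP4 training:  {gpus * 35 / 1000:.2f} EF")    # 2.52 -> ~2.5 exaflops
    print(f"LPDDR5x memory:  {cpus * 1.5:.0f} TB")          # 54 TB
    print(f"HBM4 capacity:   {gpus * 0.288:.1f} TB")        # ~20.7 TB
    print(f"HBM4 bandwidth:  {gpus * 22 / 1000:.2f} PB/s")  # 1.58 -> ~1.6 PB/s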

A Foundation for Future AI Data Centers

By combining compute, memory, networking, and security into a tightly coupled rack-scale system, Nvidia positions Vera Rubin as more than a generational upgrade. It represents a shift toward fully integrated AI infrastructure, purpose-built for trillion-parameter models and continuous, always-on operation in hyperscale data centers.
