NVIDIA Reveals Vera Rubin POD: Five Rack-Scale Systems with 60 Exaflops Performance

NVIDIA unveils the Vera Rubin POD, an AI supercomputer that delivers 60 exaflops and up to four times better training performance for agentic AI systems

Staff

Published

20 March, 2026

NVIDIA has unveiled the Vera Rubin POD, an AI supercomputer architecture designed to handle the rapidly evolving demands of agentic AI systems. As more AI interactions are driven by other AI systems rather than by humans, infrastructure that can support this shift is becoming critical. The Vera Rubin POD comprises five specialized rack-scale systems built on NVIDIA’s third-generation MGX architecture and is capable of processing over 10 quadrillion tokens per year.

The system spans 40 racks housing 1.2 quadrillion transistors and nearly 20,000 NVIDIA dies, including 1,152 NVIDIA Rubin GPUs. It is designed to deliver 60 exaflops of processing power and 10 petabytes per second of total scale-up bandwidth. These capabilities matter for modern AI workloads, which require high throughput, low-latency inference, and massive context memory storage.

At the core of this architecture is the NVIDIA Vera Rubin NVL72 rack, the compute engine featuring 72 NVIDIA Rubin GPUs and 36 NVIDIA Vera CPUs interconnected through a massive NVLink spine, allowing the rack to function as a single unified GPU. The NVL72 is designed to optimize four scaling laws of AI (pretraining, post-training, test-time scaling, and agentic scaling), resulting in up to four times better training performance and ten times better inference performance per watt compared with NVIDIA’s previous generation.
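The figures quoted so far can be cross-checked with simple arithmetic. The sketch below uses only numbers from the article and makes three assumptions that the article does not state: that all 1,152 Rubin GPUs sit in NVL72 racks, that "over 10 quadrillion tokens per year" is treated as exactly 10 quadrillion (a lower bound), and that a year has 365 days.

```python
# Back-of-the-envelope checks using only figures quoted in the article.
TOTAL_RUBIN_GPUS = 1_152        # Rubin GPUs in the POD
GPUS_PER_NVL72 = 72             # Rubin GPUs per NVL72 rack
CPUS_PER_NVL72 = 36             # Vera CPUs per NVL72 rack
TOKENS_PER_YEAR = 10 * 10**15   # "over 10 quadrillion", used as a lower bound
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: non-leap year

# Assumption: every Rubin GPU in the POD belongs to an NVL72 rack.
nvl72_racks, remainder = divmod(TOTAL_RUBIN_GPUS, GPUS_PER_NVL72)
vera_cpus_in_nvl72 = nvl72_racks * CPUS_PER_NVL72

# Sustained rate implied by the yearly token figure.
tokens_per_second = TOKENS_PER_YEAR / SECONDS_PER_YEAR

print(f"NVL72 racks implied: {nvl72_racks} (leftover GPUs: {remainder})")
print(f"Vera CPUs inside those racks: {vera_cpus_in_nvl72}")
print(f"Implied sustained rate: {tokens_per_second:,.0f} tokens/s")
```

Under these assumptions, the GPU count divides evenly into 16 NVL72 racks out of the 40 in the POD, and the yearly token figure corresponds to roughly 317 million tokens per second sustained.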

Complementing the NVL72 rack are dedicated inference accelerator racks known as NVIDIA Groq 3 LPX, each featuring 256 language processing units (LPUs). These LPUs work in conjunction with the NVL72 to eliminate the trade-off between interactive speed and throughput, achieving up to 35 times more tokens processed and ten times greater revenue opportunities for trillion-parameter models compared with earlier architectures.

To facilitate extensive reinforcement learning environments, the NVIDIA Vera CPU rack integrates up to 256 NVIDIA Vera CPUs, capable of sustaining over 22,500 concurrent environments. This setup maximizes the efficiency of testing and validating results generated from the NVL72 and LPX racks, paving the way for large-scale agentic AI applications.

The NVIDIA BlueField-4 STX rack serves as the AI-native storage solution, extending GPU context capacity across the POD by offloading key-value cache data into a dedicated high-bandwidth storage layer. This innovation allows for a fivefold increase in tokens-per-second and superior power efficiency compared to conventional storage methods.
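To make the idea of offloading key-value cache data concrete, here is a toy two-tier cache in Python. It is a sketch of the general technique only, not NVIDIA's implementation: the class name `TieredKVCache`, the least-recently-used eviction policy, and the use of plain dictionaries to stand in for GPU memory and the storage layer are all assumptions made for illustration.

```python
from collections import OrderedDict


class TieredKVCache:
    """Toy model: a small 'hot' tier (stand-in for GPU memory) that spills
    least-recently-used entries to a 'cold' tier (stand-in for storage)."""

    def __init__(self, hot_capacity):
        self.hot_capacity = hot_capacity
        self.hot = OrderedDict()  # session_id -> KV data, oldest first
        self.cold = {}            # offloaded entries

    def put(self, session_id, kv_blocks):
        # A session lives in exactly one tier at a time.
        self.cold.pop(session_id, None)
        self.hot[session_id] = kv_blocks
        self.hot.move_to_end(session_id)
        # Offload the least recently used sessions instead of discarding them.
        while len(self.hot) > self.hot_capacity:
            evicted_id, evicted_kv = self.hot.popitem(last=False)
            self.cold[evicted_id] = evicted_kv

    def get(self, session_id):
        """Return (kv_blocks, tier) where tier is 'hot', 'cold', or 'miss'."""
        if session_id in self.hot:
            self.hot.move_to_end(session_id)
            return self.hot[session_id], "hot"
        if session_id in self.cold:
            # Restore from the storage tier rather than recomputing the context.
            kv_blocks = self.cold.pop(session_id)
            self.put(session_id, kv_blocks)
            return kv_blocks, "cold"
        return None, "miss"


cache = TieredKVCache(hot_capacity=2)
for sid in ("a", "b", "c"):
    cache.put(sid, f"kv-{sid}")
first_lookup = cache.get("a")   # "a" was offloaded when "c" arrived
second_lookup = cache.get("a")  # now back in the hot tier
unknown_lookup = cache.get("z")
print(first_lookup, second_lookup, unknown_lookup)
```

The point of the design is that a "cold" hit returns previously computed context from the storage tier, so only a true "miss" forces the model to rebuild the cache from scratch.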

For networking, the NVIDIA Spectrum-6 SPX racks connect the entire POD and are engineered to carry both east-west and north-south traffic. A new 102.4 Tb/s switch with integrated silicon photonics improves power efficiency and reduces latency, keeping AI workloads synchronized across compute and storage environments.

NVIDIA’s MGX architecture aims to streamline the deployment of these complex rack systems, focusing on modular designs that simplify maintenance and enhance reliability. Dynamic power management features ensure that power is efficiently distributed among CPUs and GPUs, maximizing energy efficiency and performance.

The advancements embodied in the Vera Rubin POD are not merely technical specifications; they represent a significant leap forward in the potential applications of AI, particularly as the landscape shifts toward more autonomous decision-making processes. NVIDIA’s ongoing commitment to open standards and partnerships is set to accelerate the development and deployment of AI infrastructures, enabling organizations to harness the power of AI efficiently.

With NVIDIA’s GTC 2026 approaching, the company is poised to showcase these innovations further, emphasizing their importance in shaping the future of AI technology and its applications across various sectors.
