NVIDIA’s GB200 NVL72 Achieves 10x Speed Boost for Mixture-of-Experts AI Models

NVIDIA’s GB200 NVL72 accelerates mixture-of-experts AI models, achieving a 10x speed boost and 1.4 exaflops of performance for major players like DeepSeek AI and Mistral AI.

Staff

Published

3 December, 2025

AI model development has shifted decisively toward the **mixture-of-experts** (MoE) architecture, a design that has quickly become the standard among leading open-source models. According to the Artificial Analysis leaderboard, the top 10 most intelligent open-source models, including **DeepSeek AI’s DeepSeek-R1**, **Moonshot AI’s Kimi K2 Thinking**, and **OpenAI’s gpt-oss-120B**, all use this architecture.

MoE models activate only a small subset of specialized “experts” for each token they process, mimicking the brain’s efficiency. This selective activation allows faster, more efficient token generation without a proportional increase in compute. The benefits are amplified on **NVIDIA’s GB200 NVL72** systems: Kimi K2 Thinking, for instance, runs 10 times faster on the GB200 NVL72 than on the previous-generation **NVIDIA HGX H200**.
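The selective activation described above can be illustrated with a toy top-k router. This is a minimal sketch, not the routing used by any model named in this article: the expert count, the `top_k` value, the random router scores, and the scalar "experts" are all hypothetical stand-ins for what would be learned neural network layers in a real model.

```python
import math
import random

def softmax(scores):
    """Convert raw router scores into probabilities."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_token(router_scores, top_k):
    """Pick the top_k experts for one token and renormalize their weights."""
    probs = softmax(router_scores)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:top_k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]

def moe_layer(token, experts, router_scores, top_k=2):
    """Run only the selected experts and combine their outputs by weight."""
    selected = route_token(router_scores, top_k)
    output = sum(weight * experts[i](token) for i, weight in selected)
    return output, [i for i, _ in selected]

# Hypothetical toy setup: 8 experts, each just scales its input.
NUM_EXPERTS = 8
experts = [lambda x, scale=i + 1: scale * x for i in range(NUM_EXPERTS)]

random.seed(0)
scores = [random.uniform(-1, 1) for _ in range(NUM_EXPERTS)]  # stand-in for a learned router
output, used = moe_layer(1.0, experts, scores, top_k=2)
print(f"experts used: {used} of {NUM_EXPERTS}")
```

Only two of the eight experts run for this token, so the compute per token stays roughly constant even as the total number of experts grows.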

Despite these advantages, scaling MoE models in production has proven difficult. The GB200 NVL72 addresses this through extreme codesign, in which hardware and software are optimized together, making MoE scaling practical. With **72 NVIDIA Blackwell GPUs** working in unison, the system delivers **1.4 exaflops of AI performance** and **30 TB of fast shared memory**, addressing common bottlenecks in MoE deployment.

Distributing experts across a larger number of GPUs lowers the memory pressure on each GPU, because each one holds fewer experts. The NVLink interconnect fabric provides the fast GPU-to-GPU communication that MoE depends on, since tokens must be exchanged quickly between GPUs hosting different experts. As a result, the GB200 NVL72 makes it practical to use more experts, improving both performance and efficiency.
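A rough calculation shows why spreading experts over more GPUs eases memory pressure. This is an illustrative sketch under stated assumptions: the expert count and per-expert size below are hypothetical, the 8-GPU figure assumes a single 8-GPU node as the comparison point, and only the 72-GPU figure comes from the article. It ignores shared layers, activations, and caches.

```python
import math

def experts_per_gpu(num_experts, num_gpus):
    """Experts each GPU must hold when experts are spread evenly."""
    return math.ceil(num_experts / num_gpus)

def expert_memory_per_gpu_gb(num_experts, num_gpus, gb_per_expert):
    """Memory for expert weights on the most heavily loaded GPU."""
    return experts_per_gpu(num_experts, num_gpus) * gb_per_expert

NUM_EXPERTS = 288      # hypothetical expert count
GB_PER_EXPERT = 2.0    # hypothetical weight size per expert

for gpus in (8, 72):   # 8 is an assumed baseline; 72 is the NVL72 GPU count
    held = experts_per_gpu(NUM_EXPERTS, gpus)
    mem = expert_memory_per_gpu_gb(NUM_EXPERTS, gpus, GB_PER_EXPERT)
    print(f"{gpus} GPUs: {held} experts per GPU, {mem} GB of expert weights each")
```

With these made-up numbers, moving from 8 to 72 GPUs cuts the experts held per GPU by a factor of nine, which frees memory on each GPU for other uses.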

Industry leaders have underscored the significance of this advance. **Guillaume Lample**, cofounder and chief scientist at **Mistral AI**, stated, “Our pioneering work with OSS mixture-of-experts architecture… ensures advanced intelligence is both accessible and sustainable for a broad range of applications.” **Mistral Large 3** has also realized a 10x performance gain on the GB200 NVL72, reinforcing the trend toward MoE models.

Beyond raw performance, the GB200 NVL72 is set to change the economics of AI. NVIDIA’s latest advances deliver a 10x improvement in performance per watt, which allows a corresponding increase in token revenue from the same power budget. That matters for data centers constrained by power and cost, and it makes MoE models a strategic business advantage as well as a technical innovation. Companies such as **DeepL** and **Fireworks AI** are already using this architecture to push the boundaries of AI capabilities.
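The link between performance per watt and token revenue is simple arithmetic for a data center with a fixed power budget. The sketch below uses hypothetical values for the power budget, baseline efficiency, and token price; only the 10x ratio comes from the article.

```python
def tokens_per_second(power_budget_watts, tokens_per_joule):
    """Throughput at a fixed power budget: watts times tokens per joule."""
    return power_budget_watts * tokens_per_joule

def revenue_per_second(tokens_per_sec, price_per_million_tokens):
    """Revenue rate for a given throughput and token price."""
    return tokens_per_sec / 1_000_000 * price_per_million_tokens

POWER_BUDGET_WATTS = 1_000_000     # hypothetical 1 MW facility
BASELINE_TOKENS_PER_JOULE = 0.5    # hypothetical baseline efficiency
PRICE_PER_MILLION_TOKENS = 2.0     # hypothetical price in dollars
EFFICIENCY_GAIN = 10               # the 10x performance-per-watt figure

baseline = tokens_per_second(POWER_BUDGET_WATTS, BASELINE_TOKENS_PER_JOULE)
improved = tokens_per_second(
    POWER_BUDGET_WATTS, BASELINE_TOKENS_PER_JOULE * EFFICIENCY_GAIN
)

print(f"baseline revenue rate: {revenue_per_second(baseline, PRICE_PER_MILLION_TOKENS)} $/s")
print(f"improved revenue rate: {revenue_per_second(improved, PRICE_PER_MILLION_TOKENS)} $/s")
```

Because the power budget is held constant, the revenue rate scales directly with tokens per joule, assuming demand and token price stay the same.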

As the AI landscape evolves, it appears increasingly reliant on architectures like MoE. Multimodal AI models, which combine specialized components in a way that resembles MoE, point toward shared pools of experts that can serve diverse applications efficiently. The trend reflects a growing recognition of MoE as a foundational building block for scalable, intelligent AI systems.

The NVIDIA GB200 NVL72 is more than a platform for MoE: it marks a pivotal step toward more efficient, more powerful AI architectures. Companies that integrate these systems into their operations stand to reset expectations for AI performance and efficiency.
