Advanced AI development has taken a significant leap forward with the adoption of the **mixture-of-experts** (MoE) architecture, a design that has quickly become the standard among leading open-source models. A recent analysis of the Artificial Analysis leaderboard shows that the top 10 most intelligent open models, including **DeepSeek AI’s DeepSeek-R1**, **Moonshot AI’s Kimi K2 Thinking**, and **OpenAI’s gpt-oss-120B**, all use this architecture, marking a notable trend in AI development.
MoE models optimize performance by activating only a subset of specialized “experts” for each token, a selectivity loosely analogous to the brain’s efficiency. This selective activation allows faster, more efficient token generation without a proportional increase in computational resources. The benefits are amplified when these models run on **NVIDIA’s GB200 NVL72** systems: Kimi K2 Thinking, for instance, runs 10 times faster on the GB200 NVL72 than on the previous-generation **NVIDIA HGX H200**.
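To make the selective-activation idea concrete, here is a minimal sketch of top-k expert routing in PyTorch. The layer sizes, expert count, and class name are illustrative assumptions, not details of DeepSeek-R1, Kimi K2 Thinking, or gpt-oss-120B.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoELayer(nn.Module):
    """Toy MoE feed-forward layer: each token is processed by only top_k of n_experts."""
    def __init__(self, d_model=512, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)  # scores every expert for every token
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(d_model, 4 * d_model),
                           nn.GELU(),
                           nn.Linear(4 * d_model, d_model))
             for _ in range(n_experts)]
        )

    def forward(self, x):                               # x: (num_tokens, d_model)
        scores = self.router(x)                         # (num_tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)  # keep only the top_k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):       # dispatch tokens to their chosen experts
            for k in range(self.top_k):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(16, 512)
print(TinyMoELayer()(tokens).shape)                     # torch.Size([16, 512])
```

Only two of the eight expert networks run for any given token, which is why compute per token stays low even as total parameter count grows.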
Despite these advantages, scaling MoE models in production environments has proven difficult. The GB200 NVL72 addresses this through extreme hardware-software co-design, making MoE models practical to scale. With **72 NVIDIA Blackwell GPUs** working in unison, the system delivers **1.4 exaflops of AI performance** and **30TB of fast shared memory**, resolving bottlenecks commonly associated with MoE deployment.
This architecture lowers memory pressure per GPU by distributing the model’s experts across a larger number of GPUs, allowing each expert to run more effectively. The NVLink interconnect fabric provides the low-latency, high-bandwidth GPU-to-GPU communication that rapid expert-to-expert data exchange requires. As a result, the GB200 NVL72 can host more experts, enhancing both performance and efficiency.
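To illustrate the memory argument, the back-of-envelope sketch below assumes a hypothetical trillion-parameter MoE model with FP8 weights; none of the figures are published numbers for any specific model or for the GB200 NVL72 itself.

```python
# Back-of-envelope sketch: sharding experts across more GPUs shrinks the weight
# memory each GPU must hold, while compute per token tracks only the active experts.
# All parameter counts and byte sizes are illustrative assumptions.
TOTAL_PARAMS_B = 1000     # hypothetical 1T-parameter MoE model (in billions)
ACTIVE_PARAMS_B = 32      # parameters actually activated per token (top-k experts)
BYTES_PER_PARAM = 1       # assume FP8 weights

def weights_per_gpu_gb(num_gpus: int) -> float:
    """Resident weight memory per GPU when experts are sharded evenly."""
    return TOTAL_PARAMS_B * BYTES_PER_PARAM / num_gpus

for gpus in (8, 32, 72):
    print(f"{gpus:>2} GPUs -> ~{weights_per_gpu_gb(gpus):.0f} GB of weights per GPU")

# Compute per token, by contrast, scales with ACTIVE_PARAMS_B (32B here), not with
# the full 1000B -- the core efficiency argument for MoE.
```

Spreading the same hypothetical model across 72 GPUs instead of 8 cuts resident weight memory per GPU by roughly 9x, which is why the interconnect between GPUs becomes the critical resource.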
Comments from industry leaders underscore the significance of this technological advancement. **Guillaume Lample**, cofounder and chief scientist at **Mistral AI**, stated, “Our pioneering work with OSS mixture-of-experts architecture… ensures advanced intelligence is both accessible and sustainable for a broad range of applications.” Notably, **Mistral Large 3** has also realized a 10x performance gain on the GB200 NVL72, reinforcing the growing trend toward MoE models.
Beyond raw performance gains, the GB200 NVL72 is set to change the economics of AI. NVIDIA’s latest advancements deliver a 10x improvement in performance per watt, which translates into a corresponding increase in token revenue for a fixed power budget. That shift matters for data centers constrained by power and cost, making MoE models not just a technical innovation but a strategic business advantage. Companies such as **DeepL** and **Fireworks AI** are already leveraging this architecture to push the boundaries of AI capabilities.
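As a rough illustration of that economics argument, the sketch below holds facility power fixed and scales throughput per watt; every figure in it is a placeholder assumption rather than an NVIDIA, DeepL, or Fireworks AI number.

```python
# Loose illustration: with facility power held fixed, tokens served (and hence
# token revenue) scale directly with performance per watt. All numbers are
# placeholder assumptions.
FACILITY_POWER_W = 10e6            # 10 MW data center
TOKENS_PER_JOULE_BASELINE = 1.0    # assumed previous-generation efficiency
PERF_PER_WATT_GAIN = 10            # the 10x improvement cited above
PRICE_PER_M_TOKENS_USD = 0.50      # illustrative serving price per million tokens

def daily_revenue(tokens_per_joule: float) -> float:
    joules_per_day = FACILITY_POWER_W * 86_400
    tokens_per_day = joules_per_day * tokens_per_joule
    return tokens_per_day / 1e6 * PRICE_PER_M_TOKENS_USD

print(f"baseline:   ${daily_revenue(TOKENS_PER_JOULE_BASELINE):,.0f}/day")
print(f"10x perf/W: ${daily_revenue(TOKENS_PER_JOULE_BASELINE * PERF_PER_WATT_GAIN):,.0f}/day")
```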
As the AI landscape continues to evolve, the future appears increasingly reliant on advanced architectures like MoE. The advent of multimodal AI models, which incorporate various specialized components akin to MoE, signifies a shift towards shared pools of experts that can address diverse applications efficiently. This trend points to a growing recognition of MoE as a foundational building block for scalable, intelligent AI systems.
The NVIDIA GB200 NVL72 is not just a platform for MoE but a pivotal development in the ongoing journey towards more efficient, powerful AI architectures. As companies integrate these systems into their operational frameworks, they stand to redefine the parameters of AI performance and efficiency, paving the way for a new era in intelligent computing.