
NVIDIA’s GB200 NVL72 Boosts Kimi K2 Thinking by 10x, Revolutionizing AI Efficiency

NVIDIA’s GB200 NVL72 accelerates Moonshot AI’s Kimi K2 Thinking by 10x, transforming the efficiency of mixture-of-experts AI models with 1.4 exaflops of compute and 30TB of shared memory.

Recent advancements in artificial intelligence (AI) have been highlighted by the introduction of models built on a mixture-of-experts (MoE) architecture, notably Kimi K2 Thinking from Moonshot AI, DeepSeek-R1 from DeepSeek AI, and Mistral Large 3 from Mistral AI. These models rank among the top 10 most intelligent open-source options available and achieve a 10x performance increase when deployed on NVIDIA’s GB200 NVL72 rack-scale systems. The MoE approach improves efficiency by engaging only the relevant “experts” for each task, enabling faster and more effective token generation without a commensurate increase in computational demand.

The MoE architecture, which mirrors the brain’s division of labor by routing tasks to specialized “experts,” represents a paradigm shift in AI design. Traditional dense models activate all parameters for every token; MoE models instead engage only a fraction of their full parameter set, often tens of billions of parameters per token out of a much larger total. This strategy has contributed to a nearly 70x increase in model intelligence since early 2023, and over 60% of open-source AI models released this year have adopted the MoE framework. Selective activation not only boosts intelligence but also improves adaptability, yielding a greater return on energy and capital investment.
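The selective-activation idea can be sketched in a few lines: a learned router scores every expert for each token, and only the top-k highest-scoring experts actually run. The sketch below is a toy illustration with hypothetical dimensions and tiny linear “experts”; it is not the routing used by any of the models named above.

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Route one token through the top-k of n experts.

    x: (d,) token activation; gate_w: (d, n_experts) router weights;
    experts: list of callables, each a small feed-forward "expert".
    Only k experts execute, so per-token compute stays roughly flat
    even as the total expert (and parameter) count grows.
    """
    logits = x @ gate_w                    # one router score per expert
    top = np.argsort(logits)[-k:]          # indices of the k best experts
    weights = np.exp(logits[top])
    weights /= weights.sum()               # softmax over selected experts only
    # Weighted sum of the chosen experts' outputs; the rest stay idle.
    return sum(w * experts[i](x) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 16                       # illustrative sizes
gate_w = rng.normal(size=(d, n_experts))
# Each "expert" here is just a tiny linear layer for demonstration.
expert_mats = [rng.normal(size=(d, d)) for _ in range(n_experts)]
experts = [lambda x, m=m: m @ x for m in expert_mats]

y = moe_forward(rng.normal(size=d), gate_w, experts, k=2)
print(y.shape)  # (8,)
```

Only 2 of the 16 experts run per token here; a real model adds load-balancing losses and batched routing, but the compute-saving principle is the same.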

However, scaling MoE models has historically encountered challenges, particularly concerning memory limitations and latency in expert communication. The NVIDIA GB200 NVL72 system addresses these issues through its design, which integrates up to 72 interconnected Blackwell GPUs via NVLink, thereby creating a high-performance interconnect fabric that facilitates rapid data exchange. This setup minimizes the parameter-loading pressure on individual GPUs and allows for enhanced expert parallelism, significantly improving inference times for demanding AI applications.

With a performance capability of 1.4 exaflops and 30TB of shared memory, the GB200 NVL72 is engineered for high efficiency. A crucial feature of this system is the NVLink Switch, which provides 130 TB/s of connectivity, allowing for near-instantaneous information exchange between GPUs. This architecture enables organizations to handle more concurrent users and longer input lengths, thereby enhancing overall performance. Companies like Amazon Web Services, Google Cloud, and Microsoft Azure are already deploying the GB200 NVL72, enabling their clients to leverage these advancements in operational settings.
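A quick back-of-envelope check, using only the figures quoted above and assuming idealized peak rates with no protocol overhead or contention, shows why the pooled memory and fabric bandwidth matter. The 1-trillion-parameter figure below is purely illustrative.

```python
# Numbers from the quoted specs (assumptions: idealized peak rates).
NVLINK_AGG_TBPS = 130        # aggregate NVLink Switch bandwidth, TB/s
SHARED_MEM_TB = 30           # rack-level pooled GPU memory, TB

# Time to stream the entire 30 TB memory pool across the fabric once:
full_sweep_s = SHARED_MEM_TB / NVLINK_AGG_TBPS
print(f"{full_sweep_s * 1e3:.0f} ms")   # ~231 ms at peak

# An illustrative 1-trillion-parameter MoE stored at 4 bits (0.5 bytes)
# per parameter occupies only a small slice of the pooled memory:
params_tb = 1e12 * 0.5 / 1e12            # bytes -> TB
print(f"{params_tb:.2f} TB of {SHARED_MEM_TB} TB")
```

Even a trillion-parameter model at 4-bit precision fits many times over in the 30TB pool, leaving room for KV caches that grow with concurrent users and context length.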

As noted by Guillaume Lample, cofounder and chief scientist at Mistral AI, “Our pioneering work with OSS mixture-of-experts architecture, starting with Mixtral 8x7B two years ago, ensures advanced intelligence is both accessible and sustainable for a broad range of applications.” This sentiment reflects the growing recognition of MoE models as a viable solution for enhancing AI capabilities while maintaining cost efficiency.

Despite the significant advancements presented by the GB200 NVL72, scaling MoE models remains a complex endeavor. Prior to this system, efforts to distribute experts beyond eight GPUs often faced limitations due to slower networking communication, impeding the advantages of expert parallelism. The latest NVIDIA design, however, alleviates these bottlenecks by decreasing the number of experts each GPU manages, thereby reducing memory load and accelerating communication.
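The memory-pressure argument is simple arithmetic: spreading a fixed pool of experts across more GPUs leaves each GPU holding fewer expert weights. A minimal sketch, using an illustrative 256-expert layer (not the actual expert count of any model named here):

```python
def experts_per_gpu(n_experts, n_gpus):
    """Evenly shard experts across GPUs (expert parallelism)."""
    q, r = divmod(n_experts, n_gpus)
    # The first r GPUs take one extra expert when counts don't divide evenly.
    return [q + (1 if i < r else 0) for i in range(n_gpus)]

# Illustrative: one 256-expert MoE layer on 8 GPUs vs. a 72-GPU rack.
print(max(experts_per_gpu(256, 8)))    # 32 experts' weights per GPU
print(max(experts_per_gpu(256, 72)))   # 4 experts' weights per GPU
```

With 72 GPUs, each device stores an eighth of the expert weights it would hold on an 8-GPU node, which is exactly the reduced parameter-loading pressure described above; the catch is that tokens must now hop between GPUs, which is why the fast NVLink fabric is the enabling piece.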

The integration of software optimizations, including the NVIDIA Dynamo framework and NVFP4 format, further enhances the performance of MoE models. Open-source inference frameworks such as TensorRT-LLM, SGLang, and vLLM support these optimizations, promoting the adoption and effective deployment of large-scale MoE architectures. As Vipul Ved Prakash, cofounder and CEO of Together AI, stated, “With GB200 NVL72 and Together AI’s custom optimizations, we are exceeding customer expectations for large-scale inference workloads for MoE models like DeepSeek-V3.”
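The idea behind 4-bit formats such as NVFP4 can be illustrated with a generic blockwise quantizer: small groups of weights share one scale factor and are stored as 4-bit values. This sketch is not the actual NVFP4 specification (which uses an FP4 encoding with its own scaling scheme); it only shows how block scaling keeps reconstruction error bounded while quartering memory versus 16-bit weights.

```python
import numpy as np

def quant4_block(w, block=16):
    """Blockwise 4-bit quantization sketch (illustrative, NOT real NVFP4):
    each block of `block` weights shares one float scale, and values are
    stored as signed integers in [-7, 7]."""
    w = w.reshape(-1, block)
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0
    scale[scale == 0] = 1.0                      # avoid divide-by-zero
    q = np.clip(np.round(w / scale), -7, 7).astype(np.int8)
    return q, scale

def dequant4_block(q, scale):
    """Recover approximate weights from 4-bit codes and per-block scales."""
    return (q * scale).reshape(-1)

rng = np.random.default_rng(1)
w = rng.normal(size=64).astype(np.float32)
q, s = quant4_block(w)
w_hat = dequant4_block(q, s)
print(q.shape, float(np.abs(w - w_hat).max()))   # 4 blocks; small error
```

Per-block scaling bounds the rounding error at half a quantization step of the block's own range, which is why blockwise schemes survive outlier weights far better than a single global scale would.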

In conclusion, the deployment of the GB200 NVL72 marks a significant milestone in the evolution of AI infrastructure, particularly for models leveraging the MoE architecture. The ongoing advancements in this area not only promise to enhance AI intelligence but also improve efficiency in handling increasingly complex workloads. As the adoption of MoE models continues to rise, the industry may witness a substantial transformation in how AI applications are developed and scaled, paving the way for future innovations.

For further details on these advancements, visit NVIDIA, Amazon Web Services, and Microsoft.

Written By: AiPressa Staff

The AiPressa Staff team brings you comprehensive coverage of the artificial intelligence industry, including breaking news, research developments, business trends, and policy updates. Our mission is to keep you informed about the rapidly evolving world of AI technology.

© 2025 AIPressa · Part of Buzzora Media · All rights reserved.