
NVIDIA Unveils Nemotron Elastic LLMs, Reducing Training Costs by 360x with Nested Models

NVIDIA’s Nemotron Elastic framework slashes LLM training costs by 360x, enabling efficient model creation with minimal resources and enhanced accuracy.

In a significant advancement for the field of artificial intelligence, a team of researchers at NVIDIA has introduced an innovative framework known as Nemotron Elastic. This framework addresses the challenges associated with training large language models (LLMs), specifically the high computational costs and resource demands that typically accompany the process of developing multiple models for various applications. The researchers, including Ali Taghibakhshi, Sharath Turuvekere Sreenivas, and Saurav Muralidharan, have demonstrated that their method can efficiently derive reasoning-oriented language models from a single parent model.

Revolutionizing LLM Training

The primary breakthrough of Nemotron Elastic lies in its ability to create nested sub-models within a single parent structure, allowing smaller, optimized models to be generated with minimal additional training. The approach cuts the computational resources required by more than 360-fold compared with training each model from scratch, and by roughly seven-fold compared with traditional compression methods. Specifically, the research focuses on the Nemotron Nano V2 12B model, from which 9B and 6B versions can be produced using only 110 billion training tokens while preserving or even improving accuracy.

The team emphasizes that their method confronts a crucial challenge in the AI landscape: the financial burden of maintaining multiple LLMs tailored to different tasks. Built on a hybrid Mamba-Transformer architecture, Nemotron Elastic offers a versatile solution adaptable to different scales and applications. A key innovation is its extended-context training, which uses sequences of roughly 49,000 tokens; the researchers describe this as vital for preserving reasoning performance in the smaller models derived from the parent.

Efficiency Through Nested Sub-Networks

Another notable aspect of Nemotron Elastic is its ability to extract smaller, nested models from the larger parent model on the fly, with no further training or fine-tuning. This enables organizations with limited resources to deploy powerful reasoning models tailored to their specific needs. Importantly, the framework maintains a constant deployment memory footprint regardless of the number of models produced, in sharp contrast to conventional methods, where memory requirements grow in proportion to the number of models.
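
To give a sense of the general idea behind nested (elastic) sub-networks, the following minimal PyTorch sketch shows how a smaller model can be read out of a larger one's weights by slicing a single shared parameter store, so serving several sizes does not multiply the memory footprint. This is an illustration of the concept only, not NVIDIA's implementation; the class and parameter names are hypothetical.

```python
import torch
import torch.nn as nn

class ElasticLinear(nn.Module):
    """Linear layer whose output width can be shrunk at inference time.

    The full weight matrix is stored once; smaller sub-models simply read a
    leading slice of it, so no extra parameters are materialized.
    (Hypothetical illustration, not the Nemotron Elastic code.)
    """

    def __init__(self, in_features: int, max_out_features: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(max_out_features, in_features) * 0.02)
        self.bias = nn.Parameter(torch.zeros(max_out_features))

    def forward(self, x: torch.Tensor, out_features: int) -> torch.Tensor:
        # Slice the shared parameters down to the requested width.
        w = self.weight[:out_features, :]
        b = self.bias[:out_features]
        return torch.nn.functional.linear(x, w, b)


# One shared parameter store, three deployable widths (loosely analogous to 12B/9B/6B).
layer = ElasticLinear(in_features=512, max_out_features=2048)
x = torch.randn(4, 512)

for width in (2048, 1536, 1024):   # full model and two nested sub-models
    y = layer(x, out_features=width)
    print(width, y.shape)          # smaller widths need no retraining or extra copies
```

In a full elastic model, similar selection would apply to attention heads, hidden dimensions, and layer depth, but the weight-sharing principle that keeps deployment memory constant is the same.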

The framework’s efficiency stems from several innovations in its training process. Key components include importance-based ranking of architectural components, frozen-teacher knowledge distillation for jointly optimizing the nested sub-networks, and an end-to-end trained router that adjusts architectural decisions based on task difficulty. Together, these advances allow Nemotron Elastic to train reasoning models efficiently across different computational budgets.
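
As a rough sketch of what frozen-teacher distillation over nested sub-networks can look like in training code (illustrative only; the budget values, loss weighting, and the assumed `student(batch, budget=...)` and `teacher(batch)` interfaces are hypothetical, not details from the paper):

```python
import torch
import torch.nn.functional as F

def distillation_step(student, teacher, optimizer, batch, budgets=(1.0, 0.75, 0.5)):
    """One training step in which a frozen teacher guides every sampled sub-network.

    `student(batch, budget=...)` is assumed to run the elastic model at a given
    fraction of its full size; `teacher` is the full parent model with frozen
    weights. Both are assumed to return logits. Hypothetical sketch.
    """
    with torch.no_grad():                      # the teacher stays frozen
        teacher_logits = teacher(batch)

    optimizer.zero_grad()
    total_loss = 0.0
    for budget in budgets:                     # jointly optimize the nested sub-networks
        student_logits = student(batch, budget=budget)
        # KL divergence between the student and frozen-teacher distributions.
        loss = F.kl_div(
            F.log_softmax(student_logits, dim=-1),
            F.softmax(teacher_logits, dim=-1),
            reduction="batchmean",
        )
        total_loss = total_loss + loss
    total_loss.backward()
    optimizer.step()
    return total_loss.item()
```

The learned router described by the researchers, which decides which components each budget keeps, would be trained alongside these weights; it is omitted here to keep the sketch short.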

Implications for the AI Community

The introduction of Nemotron Elastic represents a pivotal moment in the AI field, particularly regarding the democratization of access to advanced reasoning models. This breakthrough has the potential to empower organizations with modest computational budgets to leverage sophisticated LLMs tailored to their requirements. Moving forward, researchers are exploring ways to scale this framework to even larger model families and to investigate dynamic routing mechanisms for inference that could further enhance the flexibility and efficiency of LLM deployment.

As the AI landscape evolves, the innovations presented by the NVIDIA team could redefine how organizations approach the development and deployment of large language models, addressing both the economic and technical challenges that have historically limited access to advanced AI capabilities. The implications of this research are vast, potentially enabling a broader range of applications and fostering further advancements in AI reasoning capabilities.


