AI Generative

NVIDIA Unveils Nemotron Elastic LLMs, Reducing Training Costs by 360x with Nested Models

NVIDIA’s Nemotron Elastic framework slashes LLM training costs by 360x, enabling efficient model creation with minimal resources and enhanced accuracy.

Staff

Published

23 November, 2025

In a significant advancement for the field of artificial intelligence, a team of researchers at NVIDIA has introduced an innovative framework known as Nemotron Elastic. This framework addresses the challenges associated with training large language models (LLMs), specifically the high computational costs and resource demands that typically accompany the process of developing multiple models for various applications. The researchers, including Ali Taghibakhshi, Sharath Turuvekere Sreenivas, and Saurav Muralidharan, have demonstrated that their method can efficiently derive reasoning-oriented language models from a single parent model.

Revolutionizing LLM Training

The primary breakthrough of Nemotron Elastic lies in its ability to create nested sub-models within a single parent structure, allowing for the generation of smaller, optimized models with minimal additional training. This approach results in a remarkable reduction in the computational resources needed—over a 360-fold reduction compared to training models from scratch and a seven-fold improvement over traditional compression methods. Specifically, the research focuses on the Nemotron Nano V2 12B model, from which 9B and 6B versions can be produced using only 110 billion training tokens while preserving or even enhancing accuracy.

The team emphasizes that their method confronts a crucial challenge in the AI landscape: the financial burden associated with maintaining multiple LLMs tailored for various tasks. By adopting a hybrid Mamba-Transformer architecture and emphasizing extended-context training, Nemotron Elastic offers a versatile solution adaptable to different scales and applications. One of the key innovations of this framework is the incorporation of extended-context training, which utilizes sequences of 49,000 tokens. This technique is vital for maintaining reasoning performance in the smaller models derived from the parent model.

Efficiency Through Nested Sub-Networks

Another notable aspect of Nemotron Elastic is its capability to extract smaller, nested models from the larger parent model on-the-fly, requiring no further training or fine-tuning. This enables organizations with limited resources to deploy powerful reasoning models tailored to their specific needs. Importantly, the framework maintains a constant deployment memory footprint regardless of the number of models produced, contrasting sharply with conventional methods where memory requirements increase proportionately with the number of models.

The framework’s efficiency stems from several innovations in its training process. Key components include importance-based component ranking for architectural priorities, frozen teacher knowledge distillation for optimizing joint sub-networks, and an end-to-end trained router that adjusts architectural decisions based on task difficulty. These advancements collectively contribute to the ability of Nemotron Elastic to facilitate efficient training of reasoning models across different computational budgets.

Implications for the AI Community

The introduction of Nemotron Elastic represents a pivotal moment in the AI field, particularly regarding the democratization of access to advanced reasoning models. This breakthrough has the potential to empower organizations with modest computational budgets to leverage sophisticated LLMs tailored to their requirements. Moving forward, researchers are exploring ways to scale this framework to even larger model families and to investigate dynamic routing mechanisms for inference that could further enhance the flexibility and efficiency of LLM deployment.

As the AI landscape evolves, the innovations presented by the NVIDIA team could redefine how organizations approach the development and deployment of large language models, addressing both the economic and technical challenges that have historically limited access to advanced AI capabilities. The implications of this research are vast, potentially enabling a broader range of applications and fostering further advancements in AI reasoning capabilities.

1 Revolutionizing LLM Training
2 Efficiency Through Nested Sub-Networks
3 Implications for the AI Community

AI Government

US Defense Partners with Anthropic, OpenAI, and Tech Giants for AI-First Military Initiative

US Department of Defense partners with tech giants including SpaceX and OpenAI to launch an "AI-first" initiative aimed at enhancing military decision-making efficiency.

Staff3 May, 2026

AI Technology

AMD Launches Ryzen AI Halo Mini-PC with 128GB RAM and NPU for Local AI Development

AMD unveils the Ryzen AI Halo Mini-PC, boasting a 16-core Ryzen AI Max+ 395 APU and the capability to process models with up to...

Staff3 May, 2026

AI Generative

Nvidia Expands Partnerships with Asian Firms, Boosting AI Chip Demand by 90%

Nvidia's partnerships with Asian firms like LG and Nanya surge AI chip demand to 90% of production costs, reshaping the tech landscape in Asia.

Staff3 May, 2026

AI Business

Jensen Huang Critiques AI Doom Predictions, Calls for Fact-Based Discussions

Nvidia CEO Jensen Huang urges industry leaders to avoid alarmist claims about AI's future, citing concerns over inaccurate predictions like a 50% job displacement...

Marcus Chen2 May, 2026

AIPRESSA.COM

AI Generative

NVIDIA Unveils Nemotron Elastic LLMs, Reducing Training Costs by 360x with Nested Models

Revolutionizing LLM Training

Efficiency Through Nested Sub-Networks

Implications for the AI Community

Trending

Top Stories

Albania Appoints AI Bot Minister Diella Amid Corruption Concerns and EU Membership Goals

AI Government

BigBear.ai Launches Biometric Platform at O’Hare, Acquires Generative AI Ask Sage for $250M

AI Cybersecurity

Endpoint Security Market to Reach $23.9B by 2030 with 7.2% CAGR Amid Rising Cyber Threats

AI Business

Enterprise Architecture Shifts to Strategic Enabler in AI-Driven Business Models

AI Research

Amazon Awards 63 Research Grants to 41 Universities Across 8 Countries for AI Innovation

You May Also Like

AI Government

US Defense Partners with Anthropic, OpenAI, and Tech Giants for AI-First Military Initiative

AI Technology

AMD Launches Ryzen AI Halo Mini-PC with 128GB RAM and NPU for Local AI Development

AI Generative

Nvidia Expands Partnerships with Asian Firms, Boosting AI Chip Demand by 90%

AI Business

Jensen Huang Critiques AI Doom Predictions, Calls for Fact-Based Discussions

AI Technology

Apple Faces Mac Mini and Studio Shortage as OpenClaw Drives AI Demand Surge

Top Stories

Apple, Google, and Amazon Shine Post-Earnings as AI Demand Reshapes Tech Landscape

Top Stories

Cambricon Reports $423M Q1 Revenue, Surpassing Nvidia’s Market Share in China

Top Stories

Nvidia Launches 7 Million Korean Personas, Enters South Korea’s AI Market with Lock-In Strategy