
NVIDIA Unveils Nemotron Elastic LLMs, Reducing Training Costs by 360x with Nested Models

NVIDIA’s Nemotron Elastic framework slashes LLM training costs by 360x, enabling efficient model creation with minimal resources and enhanced accuracy.

In a significant advancement for artificial intelligence, researchers at NVIDIA have introduced a framework called Nemotron Elastic. It addresses a central challenge in training large language models (LLMs): the high computational cost and resource demand of developing multiple models for different applications. The researchers, including Ali Taghibakhshi, Sharath Turuvekere Sreenivas, and Saurav Muralidharan, demonstrate that their method can efficiently derive a family of reasoning-oriented language models from a single parent model.

Revolutionizing LLM Training

The primary breakthrough of Nemotron Elastic lies in its ability to create nested sub-models within a single parent structure, allowing smaller, optimized models to be generated with minimal additional training. This approach reduces the computational resources needed by more than 360-fold compared to training models from scratch, and by roughly seven-fold compared to traditional compression methods. Specifically, the research focuses on the Nemotron Nano V2 12B model, from which 9B and 6B versions can be produced using only 110 billion training tokens while preserving or even enhancing accuracy.
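The nested-sub-model idea can be illustrated with a toy sketch: a smaller model reuses a prefix slice of the parent's weight matrix rather than owning a separate copy. This is a hypothetical simplification of the elastic-weight concept, not NVIDIA's implementation; the class name `ElasticLinear` and the widths are made up for illustration.

```python
import numpy as np

class ElasticLinear:
    """Toy elastic layer: narrower sub-models reuse a prefix of the parent's
    weight matrix, so every width shares one set of parameters.
    (Illustrative sketch only, not the Nemotron Elastic code.)"""

    def __init__(self, in_features: int, out_features: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.weight = rng.standard_normal((out_features, in_features)) * 0.02

    def forward(self, x: np.ndarray, out_width: int) -> np.ndarray:
        # Slice the first `out_width` rows: the sub-model lives inside the
        # parent's parameters rather than as a separate copy.
        return x @ self.weight[:out_width].T

layer = ElasticLinear(in_features=64, out_features=128)
x = np.random.default_rng(1).standard_normal((2, 64))
full = layer.forward(x, out_width=128)   # full-width "parent" path
small = layer.forward(x, out_width=96)   # nested, narrower sub-model
# The sub-model's output matches the first 96 columns of the parent's
# output (up to floating point), because it computes the same dot products.
print(np.allclose(small, full[:, :96]))  # True
```

Training then optimizes several widths jointly, so the shared weights serve every budget at once.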

The team emphasizes that their method confronts a crucial challenge in the AI landscape: the financial burden of maintaining multiple LLMs tailored to different tasks. Built on a hybrid Mamba-Transformer architecture, Nemotron Elastic offers a versatile solution adaptable to different scales and applications. A key innovation is extended-context training on sequences of 49,000 tokens, a technique that proves vital for maintaining reasoning performance in the smaller models derived from the parent.

Efficiency Through Nested Sub-Networks

Another notable aspect of Nemotron Elastic is its capability to extract smaller, nested models from the larger parent model on-the-fly, requiring no further training or fine-tuning. This enables organizations with limited resources to deploy powerful reasoning models tailored to their specific needs. Importantly, the framework maintains a constant deployment memory footprint regardless of the number of models produced, contrasting sharply with conventional methods where memory requirements increase proportionately with the number of models.
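The constant-memory claim follows from the nesting: serving a smaller model means reading a sub-region of the parent's weights rather than materializing a second network. The snippet below sketches this property using NumPy views under assumed, hypothetical model widths; `extract_submodel` is an invented name for illustration.

```python
import numpy as np

# Hypothetical illustration of the constant-memory property: a nested
# sub-model is a *view* into the parent's weights, not a copy.
parent_weight = np.zeros((128, 64))  # stand-in for the parent's parameters

def extract_submodel(weight: np.ndarray, width: int) -> np.ndarray:
    # Basic slicing in NumPy returns a view that shares the parent's buffer,
    # so no additional weight memory is allocated.
    return weight[:width]

sub_9b_style = extract_submodel(parent_weight, 96)  # stand-in "9B" slice
sub_6b_style = extract_submodel(parent_weight, 64)  # stand-in "6B" slice

# Both sub-models alias the parent's storage: deploying all three sizes
# costs the same weight memory as deploying the parent alone.
print(sub_9b_style.base is parent_weight)  # True
print(sub_6b_style.base is parent_weight)  # True
```

By contrast, conventional pipelines ship each size as an independent checkpoint, so serving memory grows with the number of models.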


The framework’s efficiency stems from several innovations in its training process. Key components include importance-based component ranking to prioritize architectural decisions, knowledge distillation from a frozen teacher to jointly optimize the nested sub-networks, and an end-to-end trained router that adjusts architectural choices based on task difficulty. Together, these advances let Nemotron Elastic train reasoning models efficiently across different computational budgets.
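Of these components, frozen-teacher distillation is the most standard: the sub-networks are trained to match the softened output distribution of the fixed parent. The function below is a generic sketch of that objective (temperature-scaled KL divergence), not NVIDIA's code; the example logits are invented.

```python
import numpy as np

def softmax(logits: np.ndarray, temperature: float = 1.0) -> np.ndarray:
    # Numerically stable temperature-scaled softmax.
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits: np.ndarray,
                      teacher_logits: np.ndarray,
                      temperature: float = 2.0) -> float:
    """KL(teacher || student) over softened distributions: the standard
    frozen-teacher distillation objective (generic sketch)."""
    p_t = softmax(teacher_logits, temperature)  # teacher is frozen: no gradients flow here
    p_s = softmax(student_logits, temperature)
    return float(np.sum(p_t * (np.log(p_t) - np.log(p_s)), axis=-1).mean())

teacher = np.array([[2.0, 1.0, 0.1]])
perfect_student = teacher.copy()
poor_student = np.array([[0.1, 1.0, 2.0]])
print(distillation_loss(perfect_student, teacher))       # 0.0: matches teacher
print(distillation_loss(poor_student, teacher) > 0.0)    # True: penalized mismatch
```

Minimizing this loss pulls each nested sub-network toward the parent's behavior without requiring the full from-scratch training signal.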

Implications for the AI Community

The introduction of Nemotron Elastic represents a pivotal moment in the AI field, particularly regarding the democratization of access to advanced reasoning models. This breakthrough has the potential to empower organizations with modest computational budgets to leverage sophisticated LLMs tailored to their requirements. Moving forward, researchers are exploring ways to scale this framework to even larger model families and to investigate dynamic routing mechanisms for inference that could further enhance the flexibility and efficiency of LLM deployment.

As the AI landscape evolves, the innovations presented by the NVIDIA team could redefine how organizations approach the development and deployment of large language models, addressing both the economic and technical challenges that have historically limited access to advanced AI capabilities. The implications of this research are vast, potentially enabling a broader range of applications and fostering further advancements in AI reasoning capabilities.

Written By: AiPressa Staff

The AiPressa Staff team brings you comprehensive coverage of the artificial intelligence industry, including breaking news, research developments, business trends, and policy updates. Our mission is to keep you informed about the rapidly evolving world of AI technology.


© 2025 AIPressa · Part of Buzzora Media · All rights reserved.