
Google Reveals Nested Learning to Combat Catastrophic Forgetting in LLMs

Google introduces nested learning to enhance LLMs’ adaptability, achieving superior performance with its HOPE architecture, surpassing competitors like Transformer++ and RetNet.

Google Research has introduced a novel paradigm known as nested learning, designed to address the persistent challenge of catastrophic forgetting in large language models (LLMs) and facilitate continuous learning. In their paper presented at NeurIPS 2025, the researchers elucidate a critical limitation of current LLMs: their inability to form new long-term memories post-training. Typically, these models can only retain information available within their context window or revert to knowledge acquired during pretraining. This limitation is akin to managing amnesia with an expanded notepad—while it may provide temporary relief, it does not tackle the underlying issue.

Once pretrained, most models remain static in their knowledge: they can execute the tasks they were trained on but cannot acquire new skills beyond their established context. Attempts to update them by training on new data trigger catastrophic forgetting, in which newly learned information overwrites previously acquired knowledge and degrades performance on earlier tasks. Each new update exacerbates this issue, limiting the model's ability to adapt.

Technical Approach

Nested learning draws inspiration from neuroscience, particularly the brain’s mechanisms for memory processing. The human brain operates with varying speeds: rapid circuits address immediate tasks, while slower circuits consolidate significant patterns into long-term storage. The dynamic interplay of these systems showcases the brain’s capacity for neuroplasticity, allowing it to reconfigure itself and retain critical information over time. In contrast, LLMs are shackled to a static representation of knowledge, confined to either their context window or the static pretraining phase.

In nested learning, every component of an AI model, including the optimizer and the training algorithm, is conceptualized as a form of memory. Backpropagation is treated as an associative memory that maps data points to their error signals, while optimizer state such as momentum acts as a memory of past gradients. The Continuum Memory System (CMS) organizes memory into modules that update at different frequencies, endowing the model with a temporal depth that mirrors the brain's memory architecture.
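
The paper's implementation is not reproduced here, so the following is only a minimal, illustrative sketch of what a multi-frequency memory system in the spirit of CMS could look like. The class name, the associative key-to-value update rule, and the specific frequencies and learning rates are assumptions for illustration, not Google's code.

```python
import numpy as np

class MemoryModule:
    """One memory level: an associative map that is only written at its own frequency."""

    def __init__(self, dim: int, update_every: int, lr: float):
        self.weights = np.zeros((dim, dim))  # key -> value associative store
        self.update_every = update_every     # write frequency (1 = every step)
        self.lr = lr                         # slower modules take smaller steps
        self.buffer = []                     # signals accumulated between writes

    def accumulate(self, key: np.ndarray, value: np.ndarray) -> None:
        # Outer-product error signal: how far the stored association is from the target.
        self.buffer.append(np.outer(value - self.weights @ key, key))

    def maybe_update(self, step: int) -> None:
        # Fast modules write every step; slow modules consolidate their buffer rarely.
        if step % self.update_every == 0 and self.buffer:
            self.weights += self.lr * np.mean(self.buffer, axis=0)
            self.buffer.clear()

# A spectrum of update frequencies: fast, medium, and slow memory.
dim = 16
modules = [
    MemoryModule(dim, update_every=1, lr=0.5),
    MemoryModule(dim, update_every=8, lr=0.1),
    MemoryModule(dim, update_every=64, lr=0.01),
]

rng = np.random.default_rng(0)
for step in range(1, 129):
    key, value = rng.normal(size=dim), rng.normal(size=dim)
    for module in modules:
        module.accumulate(key, value)
        module.maybe_update(step)
```

The point of the sketch is the scheduling: every module sees the same stream of inputs, but each consolidates it at its own rate, which is the "temporal depth" described above.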

This innovative framework allows the model to assimilate new information without overwriting existing knowledge. The learning process is decomposed into layers, each equipped with its own gradient flow and objectives. For instance, the model may be structured into three distinct layers, each contributing to the overall functionality while maintaining localized memory for step-by-step parameter updates.
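
As a companion sketch, here is one hedged way to picture that layered decomposition: each level owns its parameters, a local objective, and its own update step, with slower levels updating less often. The reconstruction loss used below is an assumption chosen for simplicity, not the paper's actual objective.

```python
import numpy as np

class Level:
    """One level of the decomposition: its own parameters, local objective, and gradient step."""

    def __init__(self, dim: int, lr: float, update_every: int, seed: int):
        rng = np.random.default_rng(seed)
        self.W = np.eye(dim) + 0.01 * rng.normal(size=(dim, dim))
        self.lr = lr
        self.update_every = update_every

    def forward(self, x: np.ndarray) -> np.ndarray:
        return self.W @ x

    def local_step(self, x: np.ndarray, step: int) -> None:
        # Local objective (illustrative): reconstruct the incoming representation.
        # Gradient of 0.5 * ||W x - x||^2 with respect to W is (W x - x) x^T.
        if step % self.update_every == 0:
            grad = np.outer(self.W @ x - x, x)
            self.W -= self.lr * grad

levels = [
    Level(8, lr=0.10, update_every=1, seed=0),   # fast level: updates every step
    Level(8, lr=0.03, update_every=4, seed=1),   # intermediate level
    Level(8, lr=0.01, update_every=16, seed=2),  # slow level: consolidates rarely
]

rng = np.random.default_rng(3)
for step in range(1, 65):
    h = rng.normal(size=8)           # representation entering the stack
    for level in levels:
        level.local_step(h, step)    # each level learns from its own local signal
        h = level.forward(h)         # then passes its output to the next level
```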

Benchmark Performance and Evaluation

Central to this research is the implementation of the HOPE architecture, which operationalizes nested learning principles. HOPE integrates long-term memory modules termed Titans, which store information based on its novelty to the model. This architecture stratifies various types of memory and utilizes CMS blocks to facilitate larger context windows. In practice, fast layers handle real-time inputs, while slower layers distill essential information for long-term retention, enabling the model to adaptively modify its update protocols as it learns. This approach significantly deviates from traditional “pretrain and freeze” models.
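
To make the idea of storing information "based on its novelty" concrete, here is a small, hedged sketch of a surprise-gated memory write in the spirit of the Titans modules described above. The threshold-based gate, learning rate, and class name are illustrative assumptions rather than the HOPE implementation.

```python
import numpy as np

class SurpriseGatedMemory:
    """Long-term associative store that only writes inputs the model finds novel."""

    def __init__(self, dim: int, lr: float = 0.5, threshold: float = 0.5):
        self.M = np.zeros((dim, dim))   # long-term key -> value store
        self.lr = lr
        self.threshold = threshold      # minimum "surprise" required to write

    def read(self, key: np.ndarray) -> np.ndarray:
        return self.M @ key

    def write_if_novel(self, key: np.ndarray, value: np.ndarray) -> bool:
        # Surprise = how badly the memory currently predicts this association.
        error = value - self.read(key)
        if np.linalg.norm(error) > self.threshold:
            self.M += self.lr * np.outer(error, key)  # consolidate the novel association
            return True
        return False                                   # familiar input: no write needed

dim = 8
memory = SurpriseGatedMemory(dim)
rng = np.random.default_rng(0)
key = rng.normal(size=dim)
key /= np.linalg.norm(key)            # unit-norm key keeps this toy update stable
value = rng.normal(size=dim)

print(memory.write_if_novel(key, value))   # True: the association is new, so it is stored
for _ in range(10):                        # repeated exposure makes it familiar
    memory.write_if_novel(key, value)
print(memory.write_if_novel(key, value))   # False once the memory predicts it well
```

The design intent mirrors the article's description: familiar inputs pass through without disturbing the store, so new information can be absorbed without overwriting what is already there.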

The team rigorously evaluated HOPE on tasks encompassing language modeling and reasoning, employing models with 1.3 billion parameters trained on a dataset comprising 100 billion tokens. The results indicated that HOPE not only surpassed Transformer++, but also outperformed contemporary architectures such as RetNet and DeltaNet in various performance metrics. The evaluation demonstrated that HOPE achieved the lowest loss and highest benchmark scores, although the margins were modest.

Moreover, HOPE excelled in long-context scenarios and targeted retrieval tasks that require the model to sift through expansive text to identify particular items. The tests spanned parameter counts from 340 million to 1.3 billion, and HOPE displayed consistent performance gains. Notably, the authors assert that HOPE can outperform both conventional transformers and modern recurrent networks, with independently reproducible results available on GitHub.

In summary, nested learning represents a significant stride in the evolution of AI models, addressing the limitations of current architectures in continuous learning environments. By mimicking the brain’s layering of memory processes, this approach offers a promising pathway for developing more adaptable and robust AI systems. The implications of this research extend beyond theoretical advancements, presenting opportunities for practical applications across various domains of artificial intelligence.
