Apple researchers have made a significant breakthrough in the training efficiency of recurrent neural networks (RNNs), making large-scale training of nonlinear RNNs practical for the first time. Their framework, detailed in a paper titled “ParaRNN: Unlocking Parallel Training of Nonlinear RNNs for Large Language Models,” has been accepted for presentation at ICLR 2026. The advance lets practitioners consider a broader range of architectures when designing large language models (LLMs), particularly where computational resources are limited.
The ParaRNN framework achieves a remarkable 665× speedup over conventional sequential training, enabling the training of nonlinear RNNs with up to 7 billion parameters. That puts these classical models back in competition with the transformer architectures that have dominated natural language processing in recent years. The researchers have released their codebase as an open-source framework, so researchers and practitioners alike can build on it for efficient sequence modeling.
Traditionally, the sequential nature of RNNs has limited their scalability: training cannot be parallelized along the sequence length. While RNNs offer efficient, constant-time token generation at inference, training has been a bottleneck because each step must wait for the previous one. Transformers, by contrast, use attention to process all input tokens simultaneously, but at the cost of compute that grows quadratically with sequence length.
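To make the bottleneck concrete, here is a minimal sketch of a vanilla RNN forward pass in NumPy (a generic textbook cell, not code from the paper): each hidden state depends on the previous one, so the loop over time steps cannot be parallelized.

```python
import numpy as np

def rnn_forward(x, W, U, h0):
    """Vanilla RNN forward pass: h_t = tanh(W h_{t-1} + U x_t).
    Each step needs the previous hidden state, so the loop over the
    T time steps is inherently sequential."""
    h, h_prev = np.zeros((x.shape[0], h0.shape[0])), h0
    for t in range(x.shape[0]):      # cannot be parallelized over t
        h_prev = np.tanh(W @ h_prev + U @ x[t])
        h[t] = h_prev
    return h
```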
To address this, Apple’s researchers reframed how the recurrence is solved rather than discarding its nonlinearity: each training step reduces the nonlinear recurrence to linear ones of the kind that selective state space models (SSMs) already exploit for parallel training. On top of this, they introduce adaptations of the classical GRU and LSTM, dubbed ParaGRU and ParaLSTM, whose structured Jacobians keep the computation cheap while preserving expressivity.
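The following is a purely illustrative guess at the flavor of such a cell, not the paper’s actual ParaGRU equations: if the recurrent weights act elementwise rather than as full matrices, every output coordinate depends only on the same coordinate of the previous state, and the Jacobian of the step with respect to the previous state comes out diagonal.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def elementwise_gru_step(h_prev, x, wz, wh, Uz, Uh):
    """One step of a GRU-flavored cell whose recurrent weights wz, wh are
    vectors applied elementwise rather than full matrices. Every output
    coordinate then depends only on the same coordinate of h_prev, so the
    Jacobian d h_t / d h_{t-1} is diagonal. Illustrative only: the actual
    ParaGRU/ParaLSTM equations in the paper may differ."""
    z = sigmoid(wz * h_prev + Uz @ x)         # update gate
    h_tilde = np.tanh(wh * h_prev + Uh @ x)   # candidate state
    return (1.0 - z) * h_prev + z * h_tilde   # elementwise combination
```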
The pivotal technique in ParaRNN is Newton’s method, a classical numerical technique for solving nonlinear equations. The entire sequence of hidden states is framed as a single system of equations to be solved simultaneously; each Newton iteration linearizes that system around the current guess, yielding a linear recurrence that can be evaluated in parallel, while the iterations themselves preserve the nonlinear behavior of the original RNN.
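In outline (our own toy simplification, using a scalar-per-channel recurrence rather than the paper’s models): stack the residuals r_t = h_t − f(h_{t−1}, x_t) into one system, and each Newton iteration reduces to a linear recurrence in the correction, which is exactly the form a parallel scan can evaluate. The inner loop below is written sequentially for clarity.

```python
import numpy as np

def f(h_prev, x, w=0.9, u=0.5):
    """Toy elementwise recurrence: h_t = tanh(w * h_{t-1} + u * x_t)."""
    return np.tanh(w * h_prev + u * x)

def df_dh(h_prev, x, w=0.9, u=0.5):
    """Derivative of f with respect to h_{t-1} (diagonal in this toy)."""
    return w * (1.0 - np.tanh(w * h_prev + u * x) ** 2)

def newton_rnn(x, h0=0.0, iters=3):
    """Solve h_t = f(h_{t-1}, x_t) for all t simultaneously with Newton's
    method. Each iteration linearizes f around the current guess, giving
    the linear recurrence delta_t = J_t * delta_{t-1} - r_t; ParaRNN
    evaluates that recurrence with a parallel scan, while this sketch
    uses a plain loop for clarity."""
    T = x.shape[0]
    h = np.zeros(T)                              # initial guess for all states
    for _ in range(iters):
        h_prev = np.concatenate(([h0], h[:-1]))  # shifted states
        r = h - f(h_prev, x)                     # residuals of the system
        J = df_dh(h_prev, x)                     # Jacobians w.r.t. h_{t-1}
        delta, d_prev = np.zeros(T), 0.0         # h0 is fixed, so its delta is 0
        for t in range(T):                       # linear recurrence (scan-friendly)
            d_prev = J[t] * d_prev - r[t]
            delta[t] = d_prev
        h = h + delta
    return h

# a few iterations recover the states that sequential evaluation would produce
x = np.linspace(-1.0, 1.0, 8)
print(newton_rnn(x, iters=3))
```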
Empirically, just three Newton iterations suffice for the adapted RNNs to reproduce essentially the same hidden-state evolution as exact sequential training, at a fraction of the wall-clock time. In experiments spanning models from 400 million to 7 billion parameters, the researchers confirmed that classical RNNs can perform competitively when trained at scale: ParaGRU and ParaLSTM reach perplexity and performance metrics on par with both transformers and state-of-the-art SSMs.
While the framework is designed for large-scale training, it still demands careful engineering to be practical. The parallel reduction at its core must efficiently store and multiply the Jacobian matrices that the linearization produces. Because dense Jacobians would make each combine step prohibitively expensive for generic RNNs, the researchers prioritized structured Jacobians, which sharply reduce the cost of the reduction.
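A rough sketch of why the structure matters (our illustration; the paper’s implementation is presumably a fused GPU kernel rather than NumPy): composing two linearized steps during the reduction is a d×d matrix product when the Jacobians are dense, but collapses to elementwise operations when they are diagonal.

```python
import numpy as np

# Composing two linearized steps h -> A1 h + b1 and h -> A2 h + b2
# during the parallel reduction yields h -> (A2 A1) h + (A2 b1 + b2).

def combine_dense(A1, b1, A2, b2):
    """Dense Jacobians: each combine is a d x d matrix product, O(d^3) work."""
    return A2 @ A1, A2 @ b1 + b2

def combine_diag(a1, b1, a2, b2):
    """Diagonal (structured) Jacobians stored as vectors: the same combine
    collapses to elementwise multiplies and adds, O(d) work."""
    return a2 * a1, a2 * b1 + b2
```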
In deployment, the real benefits of RNNs show up at inference. An RNN sustains the same throughput regardless of context length, which makes it attractive for applications that prioritize fast generation. Whereas a transformer’s per-token cost grows with the length of the sequence it attends over, an RNN generates each token in constant time and memory.
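Schematically (with a hypothetical step_fn standing in for the trained cell): generation carries only a fixed-size state vector, so each token costs the same whether the context holds ten tokens or ten thousand.

```python
def generate(step_fn, h0, first_token, n_tokens):
    """Autoregressive decoding with a recurrent model. step_fn is a
    hypothetical trained cell mapping (state, token) -> (state, token).
    The state is a fixed-size vector, so every token costs the same
    amount of compute and memory, no matter how long the context is."""
    h, tok, out = h0, first_token, []
    for _ in range(n_tokens):
        h, tok = step_fn(h, tok)   # O(1) in context length
        out.append(tok)
    return out
```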
Moreover, keeping nonlinearities in the recurrence pays off on tasks that require state tracking and retrieval, highlighting the advantage of nonlinear RNNs over purely linear models and underscoring the role of expressivity in modern sequence modeling. Classical RNNs, once constrained by computational limits, can now scale effectively and potentially rival advanced transformer models.
As the landscape of artificial intelligence continues to evolve, the ParaRNN framework presents an opportunity to revisit nonlinear recurrence in modern sequence modeling, paving the way for novel architectures and enhanced modeling capabilities. With this development, Apple has not only advanced the field of RNN training but has also laid the groundwork for future exploration in recurrent models at scale.