
DeepSeek’s Engram Breakthrough Enhances AI Performance by 3.4-5 Points, Reduces HBM Dependency

DeepSeek’s Engram boosts AI performance by 3.4-5 points while reducing reliance on high-bandwidth memory, revolutionizing efficiency in long-context tasks.

DeepSeek has unveiled a new technical methodology named Engram, which offers a novel approach for artificial intelligence models to utilize a queryable database of information stored in system memory. Released on the company’s GitHub page, the paper outlines how Engram improves performance on long-context queries by enabling AI models to commit data sequences to static memory. This reduces the computational load on graphics processing units (GPUs), freeing them for more complex tasks and decreasing reliance on high-bandwidth memory (HBM), which is increasingly under supply pressure.

The research describes how N-grams, statistical sequences of words, are integrated into the models’ neural networks, forming a queryable memory bank. Engram allows AI models to access facts directly instead of reasoning them out, which is computationally expensive. By alleviating the need for GPUs to handle basic memory tasks, DeepSeek aims to address the ongoing demand for HBM, particularly as the supply remains constrained.
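At a conceptual level, an n-gram memory bank can be sketched as a hash-addressed embedding table: a short token sequence is hashed to a slot, and the stored vector is fetched in constant time instead of being recomputed. The sketch below is illustrative only; the slot count, embedding width, and function names are assumptions, not DeepSeek's actual implementation.

```python
import numpy as np

# Illustrative n-gram memory bank (NOT DeepSeek's code): table size,
# embedding width, and hashing scheme are all assumptions for the sketch.
rng = np.random.default_rng(0)

NUM_SLOTS = 2**16   # size of the static memory table (assumption)
DIM = 64            # embedding width (assumption)
memory_bank = rng.standard_normal((NUM_SLOTS, DIM)).astype(np.float32)

def ngram_key(tokens, n=2):
    """Hash the trailing n-gram of token IDs to a slot index."""
    return hash(tuple(tokens[-n:])) % NUM_SLOTS

def lookup(tokens):
    """O(1) retrieval: fetch the stored vector instead of recomputing it."""
    return memory_bank[ngram_key(tokens)]

vec = lookup([101, 2054, 2003])  # the last two token IDs form the 2-gram key
```

Because the key depends only on the trailing n-gram, repeated sequences hit the same slot, which is what lets the model retrieve a memorized pattern rather than reconstruct it through computation.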

According to the paper, an Engram-based model scaled to nearly 27 billion parameters demonstrated superior performance in long-context training compared to standard Mixture of Experts (MoE) architectures. Traditional MoE models often require extensive reasoning to reconstruct data with each query reference, leading to computational waste. Engram’s architecture permits the storage of facts externally, enhancing efficiency.

The Engram model allows AI systems to simply check, “Do I already have this data?” instead of engaging in extensive reasoning processes for each query. The paper emphasizes that this method minimizes unnecessary computations, freeing up resources for higher-level reasoning tasks.
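The "check memory first, compute only on a miss" pattern the paper describes can be illustrated with a plain lookup-before-compute gate. Everything below is a generic caching sketch, not Engram's mechanism; the cost function is a stand-in for an expensive forward pass.

```python
# Illustrative check-before-compute gate (not DeepSeek's code).
cache = {}

def expensive_reasoning(query):
    # Stand-in for a costly pass through expert layers.
    return sum(ord(c) for c in query)

def answer(query):
    if query in cache:                   # "Do I already have this data?"
        return cache[query]              # cheap memory fetch
    result = expensive_reasoning(query)  # fall back to computation
    cache[query] = result                # commit to static memory for next time
    return result
```

The first call pays the compute cost; every repeat of the same query is a memory fetch, which is the resource-freeing behavior the paper attributes to Engram.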

In a comparative analysis, DeepSeek found that reallocating around 20%–25% of the sparse parameter budget to Engram optimized performance, achieving results comparable to pure MoE models. This suggests that balancing memory and computational resources could be key in designing efficient AI systems moving forward.
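As a back-of-the-envelope illustration of that reallocation, splitting a sparse parameter budget at the upper end of the reported 20%–25% range looks like this (the figures are derived from the article's numbers, not from the paper's actual configuration):

```python
# Illustrative budget split per the article's reported 20-25% reallocation.
total_sparse_params = 27e9   # ~27B-parameter scale cited in the article
engram_fraction = 0.25       # upper end of the reported range

engram_params = total_sparse_params * engram_fraction
moe_params = total_sparse_params - engram_params
print(f"Engram: {engram_params/1e9:.2f}B, MoE experts: {moe_params/1e9:.2f}B")
```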

DeepSeek’s exploration extended to what they termed the “Infinite Memory Regime,” where they maintained a fixed computational budget while attaching a near-infinite number of conditional memory parameters. This led to a linear performance increase with memory size, indicating that as memory expands, performance can improve without necessitating higher computational expenses.

These findings could have substantial implications for the AI industry, as reliance on HBM may lessen if AI models can efficiently leverage system memory through methodologies like Engram. A 27B-parameter Engram model outperformed a standard 27B MoE model on knowledge-intensive tasks by 3.4 to 4 points, with a 3.7 to 5 point improvement on reasoning tasks. Notably, on long-context benchmarks, the Engram model’s accuracy reached 97%, a significant leap from the MoE model’s 84.2%.

As DeepSeek prepares to announce a new AI model in the coming weeks, the implementation of Engram may redefine efficiency standards in AI applications. However, the broader market implications could also raise concerns about the existing DRAM supply crisis, as the shift toward system DRAM might exacerbate ongoing shortages. With DeepSeek suggesting that conditional memory functions will be essential for next-generation models, the future direction of AI development could hinge on the successful deployment of these methodologies.

In summary, if Engram delivers as intended in real-world applications, it could signify a pivotal moment for AI technology, moving away from traditional memory constraints and paving the way for more robust and efficient models.

Written By: AiPressa Staff

The AiPressa Staff team brings you comprehensive coverage of the artificial intelligence industry, including breaking news, research developments, business trends, and policy updates. Our mission is to keep you informed about the rapidly evolving world of AI technology.


© 2025 AIPressa · Part of Buzzora Media · All rights reserved.