
DeepSeek Unveils mHC Architecture for Enhanced Large-Model Training Efficiency

DeepSeek launches its mHC architecture, enhancing large-model training efficiency while reducing computational costs, with consistent performance across 3-27 billion parameter models.

DeepSeek has unveiled its new AI training methodology, Manifold-Constrained Hyper-Connections (mHC), aiming to enhance the scalability and efficiency of large-model training. This approach, detailed in a paper uploaded to arXiv by CEO Liang Wenfeng, targets improvements in training capabilities while minimizing computational costs. The technique was evaluated across models with 3 billion, 9 billion, and 27 billion parameters, demonstrating consistent performance and training efficiency.

The mHC architecture builds on hyper-connections (HC), a design ByteDance introduced in 2024 as a generalization of the residual connections popularized by ResNet. Residual connections let very deep networks train by carrying the signal across layers along an identity path, but learning still becomes less efficient as models grow. ByteDance’s hyper-connections improved signal flow by widening the residual stream, yet did not fully contain the memory overhead this adds in large models.
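The residual idea, and the hyper-connection generalization built on it, can be pictured in a few lines. This is an illustrative reconstruction, not code from either paper: the block, the number of streams n, and the wiring of the mixing matrix H are assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def block(x):
    # Stand-in for a network sublayer (a random linear map for the sketch).
    W = rng.normal(scale=0.1, size=(x.shape[-1], x.shape[-1]))
    return x @ W

def residual_layer(x):
    # Plain residual connection: the identity path carries the signal
    # forward unchanged, which is what lets very deep stacks train.
    return x + block(x)

def hyper_connection_layer(xs, H):
    # Hyper-connection sketch (the exact wiring here is an assumption):
    # keep n parallel residual streams, mix them with matrix H, feed one
    # mixed stream through the block, and add the update back.
    mixed = H @ xs                      # (n, d): mix the n streams
    out = mixed.copy()
    out[0] = mixed[0] + block(mixed[0])
    return out

d, n = 8, 4
x = rng.normal(size=d)
print(residual_layer(x).shape)                       # (8,)
xs = np.tile(x, (n, 1))                              # widen to n streams
print(hyper_connection_layer(xs, np.eye(n)).shape)   # (4, 8)
```

The n-fold residual stream is also where the memory overhead mentioned above comes from: activations are stored n times per layer.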

DeepSeek’s manifold constraint aims to cap the memory and computational cost of hyper-connections during training while preserving their benefits. The authors report that mHC maintains performance without adding computational overhead in large-scale training. According to the paper’s researchers, Zhenda Xie, Yixuan Wei, and Huanqi Cao, mHC enables stable deep-network training without risking training collapse, and adapts across model sizes.
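One way to picture a manifold constraint on the mixing step: restrict the mixing matrix to a set whose members can neither amplify nor shrink the signal, so repeated mixing stays stable across many layers. The orthogonal-matrix projection below is only an illustration of that general idea, not the specific construction used in the mHC paper.

```python
import numpy as np

def project_to_orthogonal(H):
    # Project an arbitrary matrix onto the orthogonal group via QR.
    # (Chosen here purely to illustrate a norm-preserving constraint set;
    # mHC's actual manifold may differ.)
    Q, R = np.linalg.qr(H)
    # Flip column signs to make the factorization stable.
    return Q * np.sign(np.diag(R))

rng = np.random.default_rng(1)
H = rng.normal(size=(4, 4))
Hc = project_to_orthogonal(H)

# An orthogonal mixing matrix preserves the norm of the residual streams,
# so stacking many mixing steps cannot blow up or kill the signal.
x = rng.normal(size=4)
print(np.allclose(np.linalg.norm(Hc @ x), np.linalg.norm(x)))  # True
```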

Liang Wenfeng’s direct involvement in the development of mHC reaffirms his commitment to DeepSeek’s technical progress. As the final author of the paper, he has a history of linking research outputs to the company’s key models, such as R1 and V3, also shared on arXiv. Other researchers typically contribute supporting studies without direct ties to product development, highlighting Liang’s active role in steering the company’s core AI advancements. His consistent engagement has drawn interest from analysts tracking DeepSeek’s research and product release patterns.

Florian Brand, a PhD researcher at Trier University, noted that DeepSeek’s publication trends often forecast future model launches. The company’s previous model, R1, followed a similar release strategy, with a publication preceding its public availability. Although DeepSeek has yet to announce a specific release date for its upcoming model, the predictable nature of its publication approach suggests new systems are already in the pipeline, likely set to emerge ahead of the Spring Festival in February 2026.

The anticipation surrounding DeepSeek’s next model release underscores the broader trend in AI development, where foundational research and practical applications increasingly intersect. With mHC poised to enhance training efficiency without escalating costs, the implications of this advancement extend beyond DeepSeek, potentially influencing industry practices on a larger scale.

Written By

The AiPressa Staff team brings you comprehensive coverage of the artificial intelligence industry, including breaking news, research developments, business trends, and policy updates. Our mission is to keep you informed about the rapidly evolving world of AI technology.


© 2025 AIPressa · Part of Buzzora Media · All rights reserved.