
DeepSeek Introduces mHC Architecture to Improve Large Model Training

TLDR

  • DeepSeek introduced Manifold-Constrained Hyper-Connections (mHC) to improve large-model training scalability and efficiency.
  • The mHC method was tested on 3B, 9B, and 27B parameter models, showing stable performance without added computational cost.
  • mHC builds on ByteDance’s 2024 hyper-connection architecture by adding a manifold constraint to reduce memory overhead.
  • CEO Liang Wenfeng co-authored and uploaded the paper, reaffirming his direct involvement in DeepSeek’s technical development.
  • Industry observers expect a new DeepSeek model release ahead of Spring Festival 2026, based on the company’s publication patterns.

DeepSeek has released a new AI training method, Manifold-Constrained Hyper-Connections (mHC), in a paper uploaded to arXiv by CEO Liang Wenfeng. The architecture aims to improve training scalability for large models while keeping computational costs low. Researchers tested the method on models with 3, 9, and 27 billion parameters, showing consistent training efficiency. This comes as the company is expected to launch a new model before the Spring Festival in February 2026.

DeepSeek Builds on ResNet and Hyper-Connection Foundations

According to a report by SCMP, the mHC method refines the hyper-connection (HC) design first proposed by ByteDance in 2024 as an extension of ResNet-style residual connections. Residual connections let very deep networks train by preserving signal strength across layers, but they struggle to keep learning efficient at very large scale. ByteDance's HC improved signal flow, yet did not fully address the memory overhead it introduced in larger models.
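
Neither paper's exact equations appear in this article, so the following PyTorch sketch is illustrative only; the class and parameter names (HyperConnection, n_streams, alpha, beta, mix) are ours, not ByteDance's or DeepSeek's. The core idea is that ResNet's single residual update, h = h + f(h), is widened to several parallel streams that are mixed by small learnable matrices around each layer:

```python
import torch
import torch.nn as nn

class HyperConnection(nn.Module):
    """Illustrative hyper-connection wrapper: widens ResNet's single
    residual stream to n parallel streams with learnable mixing.
    A minimal sketch, not the exact formulation from ByteDance's 2024 paper."""

    def __init__(self, layer: nn.Module, n_streams: int = 4):
        super().__init__()
        self.layer = layer
        # read weights: how the n streams are combined into the layer input
        self.alpha = nn.Parameter(torch.full((n_streams,), 1.0 / n_streams))
        # write weights: how the layer output is added back to each stream
        self.beta = nn.Parameter(torch.ones(n_streams))
        # stream-to-stream mixing matrix; identity recovers plain ResNet
        self.mix = nn.Parameter(torch.eye(n_streams))

    def forward(self, streams: torch.Tensor) -> torch.Tensor:
        # streams: (n_streams, batch, dim)
        x = torch.einsum("n,nbd->bd", self.alpha, streams)      # combine streams
        y = self.layer(x)                                       # ordinary sublayer
        mixed = torch.einsum("nm,mbd->nbd", self.mix, streams)  # residual mixing
        return mixed + self.beta[:, None, None] * y             # scatter output back

# usage: wrap an ordinary feed-forward block
block = nn.Sequential(nn.Linear(64, 256), nn.GELU(), nn.Linear(256, 64))
hc = HyperConnection(block, n_streams=4)
out = hc(torch.randn(4, 2, 64))  # 4 streams, batch 2, width 64
```

The extra streams are where HC's memory overhead comes from: every layer must keep n copies of the residual state rather than one.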

DeepSeek introduced a manifold constraint that limits how the widened residual streams can mix and expand, keeping memory and compute costs under control during training. The adjustment preserves HC's benefits while making the network practical for larger training runs. The researchers report that mHC maintained performance during large-scale training without added computational overhead.
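
The article does not spell out the constraint mathematically. One plausible reading, offered here purely as an assumption, is that the stream-mixing matrix is kept on a manifold of norm-preserving matrices, for example approximately doubly stochastic matrices obtained by a few Sinkhorn normalization steps, so that mixing can neither amplify nor attenuate the residual signal as depth grows. A sketch under that assumption:

```python
import torch

def sinkhorn_project(logits: torch.Tensor, n_iters: int = 5) -> torch.Tensor:
    """Hypothetical manifold constraint: map a free parameter matrix to an
    approximately doubly stochastic one (rows and columns each sum to ~1)
    by alternating row/column normalization. Whether mHC uses exactly this
    construction is our assumption, not confirmed by the article."""
    m = logits.exp()                        # positive entries
    for _ in range(n_iters):
        m = m / m.sum(dim=1, keepdim=True)  # rows sum to 1
        m = m / m.sum(dim=0, keepdim=True)  # columns sum to ~1
    return m

# In the earlier HyperConnection sketch, one would then compute
#   mix = sinkhorn_project(self.mix_logits)
# instead of learning self.mix freely. Note the projection adds no
# parameters and negligible compute, which is consistent with the
# reported absence of added computational cost.
```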

Lead authors Zhenda Xie, Yixuan Wei, and Huanqi Cao explained that the system enables stable training of very deep networks without signal collapse. They confirmed that mHC requires only minimal infrastructure adjustments, making it practical for broader deployment. The architecture was tested across multiple model sizes, confirming the technique's adaptability and reliability, and DeepSeek reported that it handled signal preservation and scalability better than previous HC-based frameworks.

Liang Wenfeng Directly Leads Technical Advancement

CEO Liang Wenfeng was listed as the final author and uploaded the paper himself, continuing his hands-on role in major DeepSeek research. He has consistently shared on arXiv the technical papers behind the company's flagship models, such as R1 and V3, while other researchers typically upload supporting studies not directly tied to product development.

His involvement in this paper signals continued leadership in the company’s core AI work. The release underscores DeepSeek’s approach of linking internal research closely with future product direction. Florian Brand, a PhD researcher at Trier University, said DeepSeek papers often indicate what models are coming next.

He noted that the R1 model followed a similar publish-then-launch pattern. Liang's involvement has again drawn attention from analysts watching DeepSeek's release schedule. The company has not announced a date and has remained quiet on details, but its publication strategy has become predictable, and the latest research upload suggests new systems are under development.

