Machine Learning Algorithms & Natural Language Processing
Mar 3, 2026 · Artificial Intelligence

Identity Constraint Beats DeepSeek mHC After 150B Tokens: A Surprising Reversal

Extensive experiments on DeepSeek's 1.7B and 8B models show that replacing the manifold-constrained hyper-connection (mHC) constraint with a simple identity matrix yields a model that consistently outperforms the original mHC, improves signal-flow stability, and avoids the collapse caused by repeated Sinkhorn-Knopp projections.

DeepSeek · Hyper-Connection · Identity
12 min read
Design Hub
Jan 2, 2026 · Artificial Intelligence

DeepSeek’s “Mathematical Tight‑Fit” Tames AI: Constraints Drive Performance Gains

DeepSeek’s new mHC architecture replaces unconstrained hyper‑connections with manifold‑constrained doubly‑stochastic matrices, stabilizing large‑scale training, reducing signal explosion from 3000× to 1.6×, and delivering consistent accuracy improvements across BBH, DROP, GSM8K, and MMLU benchmarks while adding only 6.7% training overhead.

AI training stability · DeepSeek · hyper-connections
10 min read