Machine Learning Algorithms & Natural Language Processing
Mar 3, 2026 · Artificial Intelligence
Identity Constraint Beats DeepSeek mHC After 150B Tokens: A Surprising Reversal
Experiments on DeepSeek's 1.7B and 8B models show that replacing the constraint in manifold-constrained hyper-connections (mHC) with a simple identity matrix consistently outperforms the original mHC: it improves signal-flow stability and avoids the collapse caused by repeated Sinkhorn-Knopp projections.
DeepSeek · Hyper-Connection · Identity
