Data Party THU
Data Party THU
Jan 21, 2026 · Artificial Intelligence

What DeepSeek’s Secret “Model1” Reveals About the Upcoming V4 LLM

Analyzing recent DeepSeek flashmla repository commits, the article uncovers that the mysterious Model1 likely corresponds to DeepSeek‑V4, detailing architectural shifts to a 512‑dimensional head, full support for NVIDIA Blackwell GPUs, token‑level sparse MLA, and new mechanisms such as Value Vector Position Awareness and Engram.

DeepSeekDeepSeek V4GPU Optimization
0 likes · 6 min read
What DeepSeek’s Secret “Model1” Reveals About the Upcoming V4 LLM