Tagged articles
2 articles
Network Intelligence Research Center (NIRC)
Jun 4, 2026 · Artificial Intelligence

How DeepSeek‑V4 Achieves Million‑Token Context via Aggressive KV‑Cache Compression

DeepSeek‑V4 reaches a million‑token context window by aggressively compressing its KV cache and using a hybrid attention scheme: Compressed Sparse Attention (CSA) performs selective top‑k retrieval, while Heavily Compressed Attention (HCA) runs full attention over heavily merged entries. Mixed‑precision storage and other engineering optimizations complement the two mechanisms.

Compressed Sparse Attention · DeepSeek V4 · Heavily Compressed Attention
7 min read
AI2ML AI to Machine Learning
Apr 25, 2026 · Artificial Intelligence

How DeepSeek V4 Advances Structured Optimization in the Large‑Model Era

The article analyses DeepSeek V4’s architectural innovations, including Compressed Sparse Attention, Heavily Compressed Attention, a cross‑layer MoE design, and an Agent‑RL framework with Generative Reward Models and multi‑teacher distillation, and compares its long‑context capabilities and efficiency with those of rival LLMs such as GLM, Kimi, Claude, GPT, and Gemini.

Agent Reinforcement Learning · Compressed Sparse Attention · DeepSeek V4
7 min read