PaperAgent
Jan 21, 2026 · Artificial Intelligence

Inside DeepSeek’s FlashMLA Update: What’s New in the MODEL1 Architecture

DeepSeek’s recent FlashMLA update introduces MODEL1, featuring a tighter KV-cache layout, an additional two-stage cache, and a fixed 512×512 head dimension; the four code changes are detailed in a public GitHub commit and illustrated with comparative diagrams.

AI Architecture · DeepSeek · FlashMLA
3 min read
Baobao Algorithm Notes
May 9, 2024 · Artificial Intelligence

Inside Deepseek‑V2: How Multi‑Head Latent Attention Cuts KV‑Cache and Boosts Performance

This article provides an in‑depth technical analysis of Deepseek‑V2, covering its 236B parameter size, the Multi‑Head Latent Attention optimization that reduces KV‑cache memory, architectural details, training pipelines, infrastructure choices, and performance results on benchmarks such as MMLU as well as on instruction following.
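The KV-cache saving mentioned above comes from caching a single compressed latent vector per token instead of full per-head keys and values. A minimal back-of-the-envelope sketch, with illustrative head counts and dimensions (assumptions for this example, not a statement of Deepseek-V2's exact configuration):

```python
# Hedged sketch: per-token KV-cache footprint of standard multi-head
# attention (MHA) vs. Multi-Head Latent Attention (MLA).
# All numbers are illustrative assumptions for this example.

def mha_kv_cache_per_token(num_heads: int, head_dim: int) -> int:
    # Standard MHA caches a full key AND value vector for every head.
    return 2 * num_heads * head_dim  # elements per token

def mla_kv_cache_per_token(latent_dim: int, rope_dim: int = 0) -> int:
    # MLA caches one shared compressed latent vector per token,
    # plus a small decoupled positional (RoPE) key component.
    return latent_dim + rope_dim  # elements per token

mha = mha_kv_cache_per_token(num_heads=128, head_dim=128)
mla = mla_kv_cache_per_token(latent_dim=512, rope_dim=64)
print(f"MHA: {mha} elems/token, MLA: {mla} elems/token, "
      f"reduction ~{mha / mla:.1f}x")
```

With these assumed sizes, the latent cache is dozens of times smaller per token, which is the mechanism behind the memory reduction the article analyzes.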

AI Architecture · DeepSeek · Large Language Model
17 min read