DeepSeek Introduces Mega MoE and FP4 Indexer – Inside the New GPU Fusion Kernel
DeepSeek's latest DeepGEMM update adds Mega MoE, a fused GPU kernel that collapses the entire Mixture‑of‑Experts pipeline and overlaps computation with NVLink communication, while also unveiling an FP4 indexer and FP8×FP4 precision experiments, signaling a push toward highly efficient large‑scale AI training.
