DeepSeek-V4 Raises the Bar: 1.6T‑Parameter Open‑Source Model Challenges Closed‑Source Giants
DeepSeek-V4 introduces two open-source LLMs: V4-Pro with 1.6 trillion total parameters and V4-Flash with 284 billion. Both offer a 1 million-token context window, hybrid attention, multi-head compression, and the new Muon optimizer, and both ship under an MIT license, putting them in contention with top closed-source models.
DeepSeek has released the V4 series preview, comprising two models, DeepSeek-V4-Pro and DeepSeek-V4-Flash, both open-source under the MIT license and free for commercial use.
Core Parameters
DeepSeek-V4-Pro: 1.6 T total parameters, 49 B active parameters, 1 million-token context window.
DeepSeek-V4-Flash: 284 B total parameters, 13 B active parameters, 1 million-token context window.
Architectural Innovations
Hybrid Attention: combines full attention with sliding-window layers to handle short- and long-range tasks without sacrificing efficiency (masking sketched after this list).
Multi-head Compression (mHC): compresses multiple attention heads into fewer representations, reducing KV cache size while preserving key information and making the 1 million-token context practical (sketched below).
Muon Optimizer: a second-order optimizer that converges faster and more stably than mainstream AdamW variants in large-scale MoE training (update rule sketched below).
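DeepSeek has not published the exact layer pattern or window size, but the idea behind hybrid attention can be shown with masks. A minimal PyTorch sketch, assuming an illustrative one-in-four full-attention interleave and a 4,096-token window (both hypothetical values, not DeepSeek's configuration):

```python
import torch

def causal_mask(seq_len: int) -> torch.Tensor:
    # True marks key positions a query may attend to.
    return torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))

def sliding_window_mask(seq_len: int, window: int) -> torch.Tensor:
    # Causal attention restricted to the most recent `window` tokens,
    # keeping per-layer attention cost roughly linear in sequence length.
    i = torch.arange(seq_len).unsqueeze(1)  # query positions
    j = torch.arange(seq_len).unsqueeze(0)  # key positions
    return (j <= i) & (i - j < window)

def layer_mask(layer_idx: int, seq_len: int,
               window: int = 4096, full_every: int = 4) -> torch.Tensor:
    # Hypothetical interleave: one full-attention layer in every
    # `full_every` layers; the rest use the cheaper sliding window.
    if layer_idx % full_every == 0:
        return causal_mask(seq_len)
    return sliding_window_mask(seq_len, window)
```

Either mask can be passed as the boolean `attn_mask` to `torch.nn.functional.scaled_dot_product_attention`, where `True` marks permitted positions.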
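The mHC internals are likewise unpublished; a plausible reading, in the spirit of DeepSeek's earlier latent-attention work, is that per-head keys and values are reconstructed from one small latent per token, and only that latent is cached. A hedged sketch with illustrative dimensions:

```python
import torch
import torch.nn as nn

class CompressedKV(nn.Module):
    # Hypothetical mHC-style module: all dimensions are illustrative.
    def __init__(self, d_model=4096, n_heads=32, d_head=128, d_latent=512):
        super().__init__()
        self.down = nn.Linear(d_model, d_latent, bias=False)          # compress
        self.up_k = nn.Linear(d_latent, n_heads * d_head, bias=False)  # expand K
        self.up_v = nn.Linear(d_latent, n_heads * d_head, bias=False)  # expand V
        self.n_heads, self.d_head = n_heads, d_head

    def forward(self, x):  # x: (batch, seq, d_model)
        # Only the latent is cached: 512 floats per token instead of
        # 2 * 32 * 128 = 8192, a ~16x KV-cache reduction at these sizes.
        latent = self.down(x)
        k = self.up_k(latent).unflatten(-1, (self.n_heads, self.d_head))
        v = self.up_v(latent).unflatten(-1, (self.n_heads, self.d_head))
        return latent, k, v
```

The real compression ratio depends on DeepSeek's actual head count and latent width; the point is that cache growth over a 1 million-token context scales with the latent size, not with heads times head dimension.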
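Muon itself is documented in the open literature: it keeps heavy-ball momentum, then approximately orthogonalizes each 2D weight update with a few Newton-Schulz iterations. A minimal single-parameter sketch using the commonly published iteration coefficients (the learning rate and momentum below are illustrative, not DeepSeek's settings):

```python
import torch

@torch.no_grad()
def newton_schulz(G: torch.Tensor, steps: int = 5) -> torch.Tensor:
    # Approximately orthogonalize a 2D update via Newton-Schulz iteration.
    a, b, c = 3.4445, -4.7750, 2.0315
    X = G / (G.norm() + 1e-7)  # normalize so the iteration converges
    transposed = X.size(0) > X.size(1)
    if transposed:
        X = X.T
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * (A @ A)) @ X
    return X.T if transposed else X

@torch.no_grad()
def muon_step(param, grad, momentum_buf, lr=0.02, beta=0.95):
    momentum_buf.mul_(beta).add_(grad)      # heavy-ball momentum
    update = newton_schulz(momentum_buf)    # orthogonalized direction
    param.add_(update, alpha=-lr)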
Training Pipeline
DeepSeek employs a two‑stage post‑training process:
Stage 1: General capability alignment (SFT).
Stage 2: Inference capability reinforcement (RLHF + inference mode), offering two modes, Think Max for deep reasoning and Think Fast for rapid response (a hypothetical invocation is sketched below).
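DeepSeek has not published V4 API details, but its existing API is OpenAI-compatible, so a mode toggle would plausibly surface as a model identifier. The names `deepseek-v4-think-max` and `deepseek-v4-think-fast` below are hypothetical placeholders, not confirmed endpoints:

```python
from openai import OpenAI

# DeepSeek's current API follows the OpenAI wire format at this base URL.
client = OpenAI(base_url="https://api.deepseek.com", api_key="YOUR_KEY")

resp = client.chat.completions.create(
    model="deepseek-v4-think-max",  # hypothetical: deep-reasoning mode;
                                    # "deepseek-v4-think-fast" would trade
                                    # reasoning depth for latency
    messages=[{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
)
print(resp.choices[0].message.content)
```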
Performance
Benchmark results released by DeepSeek show V4-Pro matching or approaching the performance of leading closed-source models across multiple tasks. The Flash variant delivers results comparable to the Pro model on simple agent tasks while being faster and cheaper to run.
The community response highlights that V4 provides a locally deployable, high‑performance alternative to models like GPT‑5.
Implications
DeepSeek’s strategy aims to deliver world-class models at roughly one-tenth the cost: reported training expenses are under $6 million, far below the tens of millions required for comparable closed-source efforts.
All model weights are available on Hugging Face and ModelScope (FP8 / FP4+FP8 mixed precision). Users can run the models locally via Ollama on macOS or PC, and the codebase is open on GitHub (github.com/deepseek-ai).
Deployment & Pricing
Online demo: chat.deepseek.com (iOS/Android apps also released).
Local deployment: weights on Hugging Face/ModelScope, Ollama support, GitHub source (a minimal client sketch follows this list).
API pricing: Flash version is extremely low‑cost for large‑scale integration; Pro version pricing is pending.
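For the local route, a minimal sketch using the official Ollama Python client; the model tag `deepseek-v4-flash` is a hypothetical placeholder, so check the Ollama library for the published name:

```python
import ollama

# Assumes the model has already been pulled into the local Ollama instance.
response = ollama.chat(
    model="deepseek-v4-flash",  # hypothetical tag, not a confirmed name
    messages=[{"role": "user", "content": "Summarize this repo's README."}],
)
print(response["message"]["content"])
```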
In summary, DeepSeek‑V4 pushes open‑source AI to a 1 million‑token, 1.6 T‑parameter frontier under a permissive MIT license, redefining what constitutes a top‑tier model.