DeepSeek-V4 Launches with 1M Token Context and Leading Open-Source Agent – A Chinese AI Milestone

DeepSeek has unveiled the V4 preview, offering two open‑source large language models—Pro (1.6 T parameters) and Flash (284 B)—both supporting 1 million‑token context, sparse‑attention efficiency gains, top‑ranked Agent capabilities, and competitive reasoning performance, marking a major milestone for Chinese AI.

AI Era Action Guide
AI Era Action Guide
AI Era Action Guide
DeepSeek-V4 Launches with 1M Token Context and Leading Open-Source Agent – A Chinese AI Milestone

DeepSeek‑V4 preview released

Two variants are provided: V4‑Pro (1.6 T total parameters, 49 B active) and V4‑Flash (284 B total, 13 B active). Both variants natively support a 1 M‑token context window.

V4‑Pro (flagship)

Total parameters: 1.6 T (active 49 B)

Target: Open‑source performance ceiling against top closed‑source models

Key strengths: Agent capability, world knowledge, inference performance

V4‑Flash (lightweight)

Total parameters: 284 B (active 13 B)

Target: High efficiency, low cost, fast response

Key strengths: Inference quality close to Pro, cheaper API pricing, suitable for high‑frequency calls

Core technical breakthroughs

1. 1 M‑token context as default

Innovation: DSA sparse attention combined with token compression enables 1 M‑token context as a standard service.

Application examples: processing an entire technical manual, full project source code, or a collection of million‑word documents without segmentation.

Retrieval accuracy: 97 % on single‑pass processing of large texts.

Efficiency: V4‑Pro inference compute is 27 % of V3.2 and memory usage is 10 % of V3.2; V4‑Flash reduces compute to 10 % and memory to 7 % of V3.2.

2. Agent capability

Benchmark: Top position on the Agentic Coding leaderboard among open‑source models.

Internal testing: Outperforms Claude Sonnet 4.5 and approaches Opus 4.6 in non‑thinking mode.

Framework support: Compatible with Claude Code, OpenClaw, OpenCode, etc., delivering strong code generation and document‑processing.

Internal use: Adopted as DeepSeek’s primary agent programming model, improving development efficiency.

3. World knowledge and reasoning

World knowledge: Significantly ahead of peer open‑source models; gap to Gemini‑Pro‑3.1 described as minimal.

Math / STEM reasoning: Best open‑source performance on competition‑level coding and complex mathematical tasks, surpassing some closed‑source competitors.

MMLU baseline: Score > 84 %, placing the model in the first tier of industry performance.

Availability

Website: chat.deepseek.com – direct chat with 1 M‑token context.

Official app: Mobile client updated for V4.

API: Set model_name to deepseek-v4-pro or deepseek-v4-flash to invoke the respective model.

Open‑source weights: Published on Hugging Face with accompanying technical report for deployment and fine‑tuning.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

AgentDeepSeekLarge Language Modelopen-source AIV4Sparse AttentionHuawei Ascend1M token context
AI Era Action Guide
Written by

AI Era Action Guide

Sharing AI action guides

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.