DeepSeek V4 Slashes Prices by 75% – Real‑World Claude Code Test with 4M Tokens
DeepSeek V4’s pricing fell 75% overnight, making the V4‑Pro and V4‑Flash models dramatically cheaper than competing AI services; the article details the new rates, compares them with other providers, shows two Claude Code case studies consuming nearly 4 million tokens, and explains how domestic Ascend 950 hardware enables the discount.
Price Shock: V4‑Pro Drops 75% in One Day
When DeepSeek released V4 on April 24, the V4‑Pro model was priced at 24 CNY per million output tokens, sparking complaints about its high cost. Within 24 hours the official site announced a 2.5‑fold discount, reducing the output price to 6 CNY – a 75% cut.
Special‑offer period ends at 23:59 on May 5, 2026.
After the discount, the cached‑input price is only 0.25 CNY per million tokens, essentially free, while the output price of 6 CNY is double the V3.2 rate (3 CNY) but still offers strong value because V4‑Pro achieved 80.6% on the SWE‑bench Verified benchmark, comparable to Claude Opus 4.6 (80.8%).
V4‑Flash: The Everyday Workhorse
V4‑Flash is positioned as a high‑cost‑performance model for daily use. Its pricing is unchanged from V3.2 for cached input (0.2 CNY) and drops 33% for output (2 CNY per million tokens), making it less than 1/50 of Claude Sonnet 4.6’s $15 per million output price.
Practical Case Study 1 – Code Audit with V4‑Pro
Using a multi‑agent stock‑analysis project, the author ran V4‑Pro in Claude Code to scan for security, functionality, and code‑quality issues. The audit identified five critical problems, including plaintext API keys, missing permission checks, a Redis deserialization vulnerability (via activateDefaultTyping), hard‑coded third‑party keys, and a UI bug.
These findings were then handed to GPT‑5.5 for verification and automatic fixing, illustrating the workflow of “cheap model finds bugs, expensive model validates and repairs.” The V4‑Pro audit cost was negligible thanks to the promotional pricing.
Practical Case Study 2 – Full Project Scan
The author ran V4‑Pro to analyze the entire project, producing a comprehensive report with high‑quality documentation. The two case studies together consumed 3,957,098 tokens (≈4 million), demonstrating the cost efficiency of the discounted rates.
Industry Price Landscape
While major Chinese cloud providers (Baidu Cloud, Tencent Cloud, Zhipu) raised AI‑related prices by 5‑30% in early 2026, DeepSeek uniquely reduced its rates. A side‑by‑side comparison shows:
DeepSeek V4‑Pro (promo): $0.44 input, $0.87 output per million tokens.
DeepSeek V4‑Flash: $0.14 input, $0.28 output.
Claude Sonnet 4.6/4.7: $3.00 input, $15.00 output.
Claude Opus 4.6/4.7: $5.00 input, $25.00 output.
GPT‑5.5: $5.00 input, $30.00 output (Pro version ≈ $30/$180).
Compared with GPT‑5.5 Pro, V4‑Pro’s input price is about 1/70 and output price about 1/200; against Claude Sonnet 4.6, output is roughly 1/17.
Why the Discount? Domestic Compute Power
DeepSeek attributes the price cut to the adoption of domestic Ascend 950 NPU hardware. V4 series fully supports Huawei Ascend NPU, and the company’s technical paper confirms a fine‑grained expert‑parallel (EP) scheme that runs on both NVIDIA GPUs and Ascend NPUs.
The official API notes that “due to limited high‑end compute, V4‑Pro’s throughput is constrained; after the bulk release of Ascend 950 supernodes later this year, Pro prices are expected to drop significantly.” The current 2.5‑fold discount is therefore seen as a market‑validation test pending broader domestic compute availability.
Practical Recommendations
For everyday dialogue, content generation, or simple Q&A, choose V4‑Flash for its ultra‑low cost and sufficient performance.
For agent‑driven coding or complex code‑refactoring, use V4‑Pro during the promotional period to leverage its high SWE‑bench score (80.6%) at minimal cost.
Note that the older model names deepseek-chat and deepseek-reasoner will be retired on July 24; V4‑Flash replaces them under the new name deepseek-v4-flash, requiring only a model‑name change.
Conclusion
DeepSeek’s aggressive price reduction stands out in a market where most providers are raising fees. Short‑term, the 2.5‑fold discount is limited to May 5, after which prices may rise, but the company signals further reductions once Ascend 950 hardware scales. In the meantime, V4‑Flash offers unmatched cost‑performance for routine tasks, while V4‑Pro (promo) is the optimal choice for high‑precision agent coding workloads.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
JavaGuide
Backend tech guide and AI engineering practice covering fundamentals, databases, distributed systems, high concurrency, system design, plus AI agents and large-model engineering.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
