DeepSeek V4 Slashes Prices by 75% – Real‑World Claude Code Demo with 4 Million Tokens

DeepSeek dramatically cut V4‑Pro and V4‑Flash pricing by 75%, offering sub‑dollar token rates that outperform competing models, and the article walks through detailed cost tables, industry price trends, hardware‑driven pricing rationale, and two hands‑on Claude Code case studies demonstrating code audit and full‑project scanning.

Java Tech Enthusiast
Java Tech Enthusiast
Java Tech Enthusiast
DeepSeek V4 Slashes Prices by 75% – Real‑World Claude Code Demo with 4 Million Tokens

Price Drop Details

Within 24 hours of the V4 launch, DeepSeek reduced the V4‑Pro per‑million‑token price from 24 CNY to 6 CNY (75% discount). The special‑offer ends on 2026‑05‑05 23:59.

Pricing (per million tokens)
----------------------------
Input (cache hit)   : 1 CNY → 0.25 CNY (75% off)
Input (cache miss)  : 12 CNY → 3 CNY (75% off)
Output               : 24 CNY → 6 CNY (75% off)

V4‑Flash Pricing

V4‑Flash is positioned as the low‑cost daily workhorse. Compared with the older V3.2 model, its input price stays at 0.2 CNY (no change) while the output price drops from 3 CNY to 2 CNY (‑50%).

V4‑Flash vs V3.2 (per million tokens)
------------------------------------
Input (cache hit)   : 0.2 CNY → 0.2 CNY (no change)
Input (cache miss)  : 2 CNY → 1 CNY (‑50%)
Output               : 3 CNY → 2 CNY (‑33%)

Industry Price Landscape

While most domestic cloud providers and foreign LLM vendors raised prices (5‑30% increases due to soaring AI‑compute demand), DeepSeek’s V4‑Pro and V4‑Flash remain far cheaper.

Model                Input ($/M)   Output ($/M)
------------------------------------------------
DeepSeek V4‑Pro*      0.44          0.87
DeepSeek V4‑Flash     0.14          0.28
Claude Sonnet 4.6/4.7  3.00         15.00
Claude Opus 4.6/4.7   5.00         25.00
GPT‑5.5               5.00         30.00
GPT‑5.5 Pro (avg)    ~30.00       ~180.00
*Special 2.5× discount price

Compared with Claude Sonnet, V4‑Flash’s output cost is less than 1/50, and V4‑Pro’s cost is roughly 1/17 of Claude Sonnet’s output price.

Performance Context

V4‑Pro achieved 80.6% on the SWE‑bench Verified benchmark (Claude Opus 4.6 scored 80.8%). Its Codeforces rating of 3206 placed it at the top of the leaderboard, giving the discounted price a strong cost‑performance justification.

Practical Experiments

Experiment 1 – Code Audit with V4‑Pro + GPT‑5.5 Review

A multi‑agent stock‑analysis project (≈4 M tokens) was scanned by V4‑Pro for security, functionality, and code‑quality issues. The audit surfaced five critical problems:

API Key stored in plaintext – encryption not applied.

Unrestricted system‑admin API – regular users could modify LLM settings.

Redis deserialization vulnerability – activateDefaultTyping allows arbitrary class instantiation.

Hard‑coded third‑party API keys – real keys committed in code.

Feature bug – “Re‑analyze” button on History page fails due to missing route parameters.

These findings were passed to GPT‑5.5 for verification and automatic remediation, demonstrating the workflow of “cheap model audits, expensive model fixes”. The total token consumption for the two runs was about 3,957,098 tokens (≈4 M).

Experiment 2 – Full Project Scan with V4‑Pro

The same stock‑analysis repository was scanned end‑to‑end. The final report, shown in the article’s screenshots, confirmed high‑quality, comprehensive analysis.

Why the Discount?

DeepSeek attributes the price cut to the rollout of domestic Huawei Ascend 950 NPU clusters. The model now fully supports both NVIDIA GPUs and Ascend NPUs, with a fine‑grained expert‑parallel (EP) scheme validated on both hardware types. The official API note states that once Ascend 950 nodes are mass‑produced later in the year, the price is expected to drop further.

Recommendations

Short‑term : Use the 2.5× discount before 2026‑05‑05; expect prices to rise afterward.

Long‑term : Anticipate continued price reductions as domestic compute capacity scales.

Model selection : Choose V4‑Flash for everyday dialogue and content generation; opt for V4‑Pro (discounted) for agent‑coding and complex tasks where SWE‑bench performance matters.

Note: deepseek-chat and deepseek-reasoner will be retired on 2026‑07‑24; migrate to deepseek-v4-flash by simply changing the model name.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

DeepSeekClaude CodeChinese AI industryV4-FlashAI Model PricingV4-Pro
Java Tech Enthusiast
Written by

Java Tech Enthusiast

Sharing computer programming language knowledge, focusing on Java fundamentals, data structures, related tools, Spring Cloud, IntelliJ IDEA... Book giveaways, red‑packet rewards and other perks await!

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.