May 4, 2026 · Artificial Intelligence

DeepSeek‑V4 Inference Cost Showdown: NVIDIA H100 vs Ascend 950PR vs 910C

DeepSeek‑V4, a 1.6‑trillion‑parameter MoE model with mixed‑precision attention, is benchmarked on three accelerators—NVIDIA H100, Huawei Ascend 910C, and Ascend 950PR—showing that the 950PR delivers the lowest per‑token cost in both Prefill and Decode phases, while the H100 offers the highest raw performance at a far greater price.

DeepSeek V4FP8Huawei Ascend 950PR

0 likes · 8 min read

DeepSeek‑V4 Inference Cost Showdown: NVIDIA H100 vs Ascend 950PR vs 910C

Lao Guo's Learning Space

Apr 27, 2026 · Artificial Intelligence

DeepSeek V4 & Huawei Ascend 950PR: Is Domestic Compute Ready for Enterprise AI?

DeepSeek V4, paired with Huawei’s Ascend 950PR chip, delivers inference speed up to 2.87× that of Nvidia H20 and introduces a CSA+HCA attention compression that cuts KV cache usage to under 10%, but its 94‑96% hallucination rate and high token consumption raise concerns for production use.

AI inferenceCSA+HCADeepSeek V4

0 likes · 13 min read

DeepSeek V4 & Huawei Ascend 950PR: Is Domestic Compute Ready for Enterprise AI?