Inside Huawei’s Ascend 910C AI Chip: Architecture, Performance Gaps & Strategy

This article translates and expands on analyst Lennart Heim’s X‑platform report, dissecting Huawei’s newly mass‑produced Ascend 910C AI accelerator, its dual‑chip packaging, performance estimates versus NVIDIA’s H100 and upcoming B200, supply‑chain origins, potential domestic production, and the broader strategic impact on China’s AI competitiveness.



The mass‑production announcement of Huawei’s Ascend 910C AI accelerator, dubbed China’s strongest AI chip, has sparked intense interest as a symbol of China’s resilience in high‑tech development.

Technical composition of Ascend 910C: clever dual‑chip combination, intriguing architecture

Not a brand‑new architecture, but a clever reuse?

The “C” in Ascend 910C does not denote a fundamentally new design; rather, it reflects a clever combination of two existing Ascend 910B dies integrated through advanced packaging, effectively a “fusion” of two chips into one accelerator.

This approach leverages a mature process node, sidestepping costly breakthroughs on unproven nodes while extracting performance gains through packaging and system‑level integration.

Heim suggests that the chips may have been sourced from TSMC before tighter export controls, implying that Huawei stocked the dies in advance.

Packaging trade‑off: balancing performance and cost

Packaging decisions and their impact

Packaging is a critical factor influencing chip performance, power consumption, and cost, especially for AI accelerators where advanced packaging can provide a competitive edge, as demonstrated by NVIDIA.

Ascend 910C opts for a more mature, lower‑complexity solution: two separate 910B dies placed on individual silicon interposers and connected via an organic substrate.

This packaging results in inter‑die bandwidth estimated to be 10–20 times lower than full silicon‑interposer solutions such as TSMC’s CoWoS (used by NVIDIA) or Intel’s Foveros, creating a notable performance bottleneck.

Performance and specifications: gap with H100 and chase towards B200

Objective assessment: 80% performance claim

Heim estimates the 910C can deliver roughly 800 TFLOPS of FP16 compute and about 3.2 TB/s memory bandwidth, approximately 80% of NVIDIA’s H100, launched in 2022.

The chip’s logic die area is about 60% larger than the H100’s, indicating lower architectural efficiency.
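Heim’s “~80% of H100” figure can be sanity‑checked with a quick calculation. The H100 reference numbers below (dense FP16 Tensor Core throughput of roughly 989 TFLOPS and HBM3 bandwidth of roughly 3.35 TB/s for the SXM variant) are approximate published specs, not values from the article; the 910C figures are Heim’s estimates:

```python
# Rough sanity check of the "~80% of H100" estimate.
# H100 numbers are approximate published SXM specs (assumed here);
# 910C numbers are Heim's estimates quoted in the article.

H100_FP16_TFLOPS = 989.0   # H100 SXM, dense FP16 Tensor Core (approx.)
H100_BW_TBPS = 3.35        # H100 SXM HBM3 bandwidth (approx.)

ASCEND_910C_FP16_TFLOPS = 800.0  # Heim's estimate
ASCEND_910C_BW_TBPS = 3.2        # Heim's estimate

compute_ratio = ASCEND_910C_FP16_TFLOPS / H100_FP16_TFLOPS
bandwidth_ratio = ASCEND_910C_BW_TBPS / H100_BW_TBPS

print(f"Compute:   {compute_ratio:.0%} of H100")    # ~81%
print(f"Bandwidth: {bandwidth_ratio:.0%} of H100")  # ~96%
```

Under these assumptions, the ~80% figure tracks compute throughput; estimated memory bandwidth would actually be much closer to parity with the H100.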

Generational gap: facing B200, challenge escalates

Compared with NVIDIA’s upcoming B200 series, the 910C lags significantly on key metrics.

Compute performance: roughly one‑third of the B200’s.

Memory bandwidth: about 2.5 times less, even assuming an upgrade to HBM2E.

Energy efficiency: noticeably behind the B200.
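These gap figures can be cross‑checked the same way. The B200 numbers below (dense FP16 around 2,250 TFLOPS and roughly 8 TB/s of HBM3e bandwidth) are assumed reference values taken from NVIDIA’s announced specs, not from the article:

```python
# Cross-check of the B200 gap figures. B200 specs are assumed
# reference values; 910C figures are Heim's estimates from the article.

B200_FP16_TFLOPS = 2250.0  # announced dense FP16 (approx.)
B200_BW_TBPS = 8.0         # announced HBM3e bandwidth (approx.)

ASCEND_910C_FP16_TFLOPS = 800.0
ASCEND_910C_BW_TBPS = 3.2

print(f"Compute gap:   {B200_FP16_TFLOPS / ASCEND_910C_FP16_TFLOPS:.1f}x")  # ~2.8x
print(f"Bandwidth gap: {B200_BW_TBPS / ASCEND_910C_BW_TBPS:.1f}x")          # 2.5x
```

With these assumptions, the ratios line up with the article’s “roughly three times” compute gap and “about 2.5 times” bandwidth gap.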

By 2025, Western AI chip production is projected to be at least five times the volume of China’s, with overall compute capacity 10‑20 times greater.

Supply chain and production: mysterious origins and domestic potential

“TSMC stash?” – shocking supply‑chain speculation

Heim hypothesizes that Huawei may have stockpiled up to three million 7 nm Ascend dies from TSMC before export controls tightened.

He also suggests Huawei could have secured a large amount of HBM2E memory, potentially enabling the production of around 1.4 million 910C accelerators, equivalent to the AI compute of one million NVIDIA H100‑class chips.
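The stockpile arithmetic implied by these figures is easy to reproduce. Each 910C packages two 910B dies, so three million dies bound output at 1.5 million units; the ~1.4 million figure implies some yield or attrition loss, and the H100‑equivalence step follows from the article’s ~80% per‑chip performance ratio:

```python
# Back-of-the-envelope version of the stockpile math in the article.
# All inputs are Heim's reported estimates, not confirmed figures.

DIES_STOCKPILED = 3_000_000   # reported 7 nm dies from TSMC
DIES_PER_910C = 2             # one 910C = two 910B dies
REPORTED_UNITS = 1_400_000    # reported producible 910C count
PER_CHIP_VS_H100 = 0.8        # article's per-chip performance ratio

max_units = DIES_STOCKPILED // DIES_PER_910C
implied_yield = REPORTED_UNITS / max_units
h100_equivalents = REPORTED_UNITS * PER_CHIP_VS_H100

print(f"Upper bound: {max_units:,} units")                     # 1,500,000
print(f"Implied yield/attrition: {implied_yield:.0%}")          # ~93%
print(f"H100-equivalent compute: ~{h100_equivalents:,.0f}")     # ~1,120,000
```

The ~1.12 million H100‑equivalents is consistent with the article’s rounded claim of “one million NVIDIA H100‑class chips.”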

Domestic 7 nm production?

Heim believes Huawei likely has the capability to produce 910B and 910C dies at the 7 nm node, but large‑scale mass production faces challenges in yield, cost, and stability.

Strategic significance and global AI competition

Performance gap exists, strategic importance not to be underestimated

Despite the performance gap, the 910C’s launch carries symbolic weight, demonstrating China’s determination to close the AI chip gap under export‑control pressure.

China’s ability to concentrate resources could offset raw compute disadvantages, allowing focused advances in AI inference for specific industries such as smart cities, transportation, manufacturing, and security.

Inference first, application breakthrough?

Prioritizing AI inference over massive pre‑training may enable China to achieve commercial leadership in targeted sectors, even if overall compute capacity lags.

However, next‑generation pre‑training will still require massive clusters of tens of thousands of chips, underscoring the continued importance of total compute volume.

Conclusion and outlook

While the Ascend 910C may only reach about 80% of H100 performance and its supply‑chain origins remain opaque, its strategic significance is substantial, marking a milestone in China’s AI‑chip autonomy and hinting at a “differentiated‑competition” path.

Future coverage will track the chip’s real‑world deployments, its impact on China’s AI ecosystem, and evolving strategies to build resilient, domestically‑controlled AI infrastructure.

Written by

Open Source Linux

Focused on sharing Linux/Unix content, covering fundamentals, system development, network programming, automation/operations, cloud computing, and related professional knowledge.
