Overview of Huawei Kunpeng 920 Processor Architecture and Subsystems
The article provides a detailed technical overview of Huawei's Kunpeng 920 processor, describing its ARM‑based RISC architecture, chip organization, core and cluster layout, security features, IMU management, and the various subsystems such as IO, interrupt, network, SAS, and PCIe.
1. Organization of Kunpeng Processor
Chip: a packaged large-scale integrated circuit built from one or more silicon dies; the usual physical form of a CPU.
DIE (die): an unpackaged block of semiconductor cut from a processed wafer, carrying one integrated circuit; it is the smallest physical unit of a chip. Kunpeng 920 packages three DIEs, two for compute and one for IO.
Core: the actual compute unit; the operating system sees each core as one CPU.
Cluster: a group of cores; Kunpeng 920 groups four cores into one cluster, with eight clusters per compute DIE.
SoC (System on Chip): integrates CPU, RoCE NIC, SAS controller, southbridge, etc., forming a complete system on a single chip.
2. Kunpeng 920 Chip Architecture
One SoC contains three DIEs: two compute DIEs and one IO DIE.
Each compute DIE has 8 clusters of 4 cores each, that is, 32 cores per DIE and 64 cores per Kunpeng 920 chip.
Each core in the compute DIE has private L1 and L2 caches, while all cores share an L3 cache.
The IO DIE integrates network and PCIe modules, and the DIEs are interconnected via a high‑speed internal bus.
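The core count above follows directly from the DIE and cluster layout. The following is a minimal Python sketch of that arithmetic. The flat core numbering used by `locate_core` (die first, then cluster, then core) is an assumption made for illustration only; the source does not say how the hardware or the operating system actually numbers cores.

```python
from typing import NamedTuple

# Figures taken from the text above.
COMPUTE_DIES = 2
CLUSTERS_PER_DIE = 8
CORES_PER_CLUSTER = 4

CORES_PER_DIE = CLUSTERS_PER_DIE * CORES_PER_CLUSTER   # 32
TOTAL_CORES = COMPUTE_DIES * CORES_PER_DIE             # 64


class CoreLocation(NamedTuple):
    die: int
    cluster: int
    core: int


def locate_core(core_id: int) -> CoreLocation:
    """Map a flat core index to (die, cluster, core).

    Hypothetical linear numbering, not a documented Kunpeng scheme.
    """
    if not 0 <= core_id < TOTAL_CORES:
        raise ValueError(f"core_id must be in [0, {TOTAL_CORES})")
    die, rest = divmod(core_id, CORES_PER_DIE)
    cluster, core = divmod(rest, CORES_PER_CLUSTER)
    return CoreLocation(die, cluster, core)


if __name__ == "__main__":
    print(TOTAL_CORES)
    print(locate_core(37))
```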
3. System Security & IMU
Security: supports Secure Boot and Trusted Execution Environment using ARM TrustZone combined with hardware mechanisms.
IMU (Intelligent Management Unit) is an on-chip management unit that works with the BMC (baseboard management controller) to provide data-center node monitoring, fault pre-processing, a root of trust, energy management, and other management functions.
4. Other Subsystems of Kunpeng 920
The processor includes compute, storage, device IO, interrupt, and virtualization subsystems.
Kunpeng 920 contains two compute DIEs, one IO DIE, and eight DDR4 channels, interconnected by an AMBA bus.
5. IO Subsystem
The IO DIE extends the processor with on‑chip accelerators such as 100 GbE NICs and SAS controllers, and supports PCIe 4.0 devices like NICs and GPUs.
High‑speed devices on the SoC are also PCIe‑based and can be configured via PCIe configuration space.
Subsystems (PCIe, CCIX, Hydra, Network, Storage, HAC, ME) follow industry standards and open‑source compatibility requirements.
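Because the on-chip high-speed devices are presented as PCIe devices, they can be identified through the standard PCIe configuration space header. The sketch below decodes the first 16 bytes of such a header. The field offsets follow the standard PCI header layout; the sample bytes are synthetic, and the device ID `0x1234` is a placeholder rather than a real Kunpeng device.

```python
import struct


def parse_config_header(cfg: bytes) -> dict:
    """Decode the common fields in the first 16 bytes of PCI configuration space."""
    if len(cfg) < 16:
        raise ValueError("need at least 16 bytes of configuration space")
    # Offsets 0x00..0x07: four little-endian 16-bit fields.
    vendor_id, device_id, command, status = struct.unpack_from("<HHHH", cfg, 0)
    return {
        "vendor_id": vendor_id,
        "device_id": device_id,
        "command": command,
        "status": status,
        "revision": cfg[0x08],
        "prog_if": cfg[0x09],
        "subclass": cfg[0x0A],
        "base_class": cfg[0x0B],
        "header_type": cfg[0x0E] & 0x7F,       # bit 7 is the multi-function flag
        "multi_function": bool(cfg[0x0E] & 0x80),
    }


# Synthetic sample: vendor 0x19e5 (Huawei), placeholder device ID,
# base class 0x02 / subclass 0x00 (network controller, Ethernet).
SAMPLE = (
    struct.pack("<HHHH", 0x19E5, 0x1234, 0x0006, 0x0010)
    + bytes([0x21, 0x00, 0x00, 0x02])
    + bytes([0x00, 0x00, 0x00, 0x00])
)

if __name__ == "__main__":
    for name, value in parse_config_header(SAMPLE).items():
        print(f"{name}: {value}")
```

On Linux, the same bytes can usually be read from the `config` file under `/sys/bus/pci/devices/<address>/` and passed to `parse_config_header`.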
6. Interrupt Subsystem
Implements wired (line-based) and message-based interrupts compatible with the ARM GIC specifications.
GIC (Generic Interrupt Controller) handles interrupt enable/disable, routing to any CPU core, priority configuration, and the AArch64 security and virtualization extensions.
Supports four interrupt types: SGI (Software Generated Interrupt), PPI (Private Peripheral Interrupt), SPI (Shared Peripheral Interrupt), and LPI (Locality-specific Peripheral Interrupt).
GICv3 introduces message-based interrupts (LPIs); the ITS (Interrupt Translation Service) translates them and routes them dynamically to the target core.
Kunpeng also adopts the MBIGEN (Message‑Based Interrupt Generator) technology.
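The interrupt types named in this section occupy fixed interrupt ID (INTID) ranges in the GICv3 architecture. The sketch below classifies an INTID using the base GICv3 ranges. It is a generic illustration of the GIC numbering scheme, not Kunpeng-specific code, and it treats the extended PPI/SPI ranges added in later GIC revisions as reserved.

```python
def classify_intid(intid: int) -> str:
    """Return the interrupt type for an INTID, using base GICv3 ranges."""
    if intid < 0:
        raise ValueError("INTID must be non-negative")
    if intid <= 15:
        return "SGI"       # software generated, used for inter-processor signalling
    if intid <= 31:
        return "PPI"       # private to a single core, e.g. per-core timers
    if intid <= 1019:
        return "SPI"       # shared peripheral, routable to any core
    if intid <= 1023:
        return "special"   # special interrupt numbers, e.g. 1023 = no pending interrupt
    if intid >= 8192:
        return "LPI"       # message-based, translated by the ITS
    return "reserved"      # 1024..8191 in the base architecture


if __name__ == "__main__":
    for sample in (3, 27, 100, 1023, 5000, 8192):
        print(sample, classify_intid(sample))
```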
7. Network Subsystem
Consists of the Network ICL and the RoCE engine.
Network ICL provides multiple 1 Gbps‑100 Gbps Ethernet controllers, DCB, MAC tables, VLAN filtering, flow tables, and PCIe integration.
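RoCE (RDMA over Converged Ethernet) offers low-latency, low-CPU-utilization remote memory access by carrying the InfiniBand transport protocol over Ethernet; RoCE v2 runs that transport over UDP/IP, which makes it routable.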
8. SAS Subsystem
Provides two x8 SAS 3.0 controllers, backward compatible with SAS 2.0/1.0 and SATA 3.0/2.0/1.0.
SAS supports 12/6/3/1.5 Gbps link rates; SATA supports 6/3/1.5 Gbps with auto-negotiation.
Directly connects up to eight SAS or SATA drives, with optional expander for more disks.
Direct connection: PHY of SAS controller connects straight to device.
Expander connection: devices connect via an expander.
The chip also provides flash interfaces: NOR flash (4 chip selects, up to 512 KB), SPI flash (2 chip selects, up to 32 MB), and NAND flash (4 chip selects).
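The link-rate auto-negotiation mentioned in this section can be pictured as choosing the fastest rate that both ends support. The sketch below models only that outcome, using the rates listed above. It is a deliberate simplification: real SAS and SATA PHYs settle the rate through a hardware speed-negotiation sequence, and the function name `negotiate_rate` is invented for this example.

```python
from typing import Iterable, Optional

# Link rates in Gbps, as listed in the text above.
SAS_RATES_GBPS = (12.0, 6.0, 3.0, 1.5)
SATA_RATES_GBPS = (6.0, 3.0, 1.5)


def negotiate_rate(
    controller_rates: Iterable[float], device_rates: Iterable[float]
) -> Optional[float]:
    """Return the highest rate supported by both ends, or None if none is shared."""
    common = set(controller_rates) & set(device_rates)
    return max(common) if common else None


if __name__ == "__main__":
    # A SAS 2.0 drive (6 Gbps and below) attached to the 12 Gbps controller.
    print(negotiate_rate(SAS_RATES_GBPS, (6.0, 3.0, 1.5)))
    # A SATA 2.0 drive (3 Gbps and below).
    print(negotiate_rate(SATA_RATES_GBPS, (3.0, 1.5)))
```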
9. PCIe Subsystem
Supports PCIe Gen1 through Gen4 on up to 40 lanes, split across three PCIe cores (Core0: 16 lanes, Core1: 16 lanes, Core2: 8 lanes).
Each core can act as a Root Port; only Core1 can function as an Endpoint.
Features embedded DMA engines.
Supports SRIS, SR‑IOV, shared virtual memory, CCIX, and Peer‑to‑Peer traffic.
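To put the lane counts above in perspective, the sketch below estimates the raw one-direction bandwidth of each PCIe core at a given generation. The per-lane transfer rates and line encodings are the standard PCIe figures (8b/10b for Gen1 and Gen2, 128b/130b for Gen3 and Gen4). The result ignores packet and protocol overhead, so achievable throughput is lower, and it assumes each core runs as a single full-width link, which the source does not state.

```python
# generation -> (transfer rate per lane in GT/s, line-encoding efficiency)
GENERATIONS = {
    1: (2.5, 8 / 10),
    2: (5.0, 8 / 10),
    3: (8.0, 128 / 130),
    4: (16.0, 128 / 130),
}

# Lane allocation per PCIe core, from the text above.
CORE_LANES = {"Core0": 16, "Core1": 16, "Core2": 8}


def link_bandwidth_gbytes(gen: int, lanes: int) -> float:
    """Raw one-direction bandwidth in GB/s after line encoding."""
    if gen not in GENERATIONS:
        raise ValueError(f"unsupported PCIe generation: {gen}")
    if lanes <= 0:
        raise ValueError("lanes must be positive")
    rate_gt, efficiency = GENERATIONS[gen]
    return rate_gt * efficiency * lanes / 8  # 8 bits per byte


if __name__ == "__main__":
    for core, lanes in CORE_LANES.items():
        print(f"{core}: x{lanes} Gen4 = {link_bandwidth_gbytes(4, lanes):.2f} GB/s")
    print("total lanes:", sum(CORE_LANES.values()))
```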
Source: Huawei Cloud BBS
Architects' Tech Alliance