Inside Huawei Atlas 900A3 SuperPoD (CM384) Supernode Wiring Scheme
The article provides a detailed technical analysis of Huawei's Atlas 900A3 SuperPoD (CM384) supernode wiring architecture, covering cabinet composition, NPU/CPU counts, 400 G optical interconnect design, cable installation practices, scalability to 24 Pods, and the performance benefits for AI workloads.
Cluster composition : The Atlas 900A3 SuperPoD (also called CloudMatrix 384) consists of 16 cabinets: 12 compute cabinets and 4 Lingqu‑bus cabinets. The system integrates 384 Ascend NPUs and 192 Kunpeng CPUs across 48 compute nodes; each compute cabinet houses 4 nodes, and each node carries 8 NPUs.
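The counts above can be cross-checked with a few lines of arithmetic. This is only a sanity-check sketch; the constant names are illustrative, and the CPUs-per-node figure is derived from the stated totals rather than given in the text.

```python
# Figures stated in the text.
COMPUTE_CABINETS = 12
BUS_CABINETS = 4
NODES_PER_CABINET = 4
NPUS_PER_NODE = 8
TOTAL_CPUS = 192

total_cabinets = COMPUTE_CABINETS + BUS_CABINETS    # 16
compute_nodes = COMPUTE_CABINETS * NODES_PER_CABINET  # 48
total_npus = compute_nodes * NPUS_PER_NODE            # 384

# Derived value (not stated in the text): CPUs per compute node,
# assuming the 192 CPUs are spread evenly over the 48 nodes.
cpus_per_node, leftover = divmod(TOTAL_CPUS, compute_nodes)

print(f"cabinets={total_cabinets}, nodes={compute_nodes}, "
      f"NPUs={total_npus}, CPUs per node={cpus_per_node}")
```

The totals agree with the stated 16 cabinets, 48 nodes, and 384 NPUs, and imply 4 Kunpeng CPUs per node under the even-split assumption.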
Node optical interfaces : Every compute node provides dedicated optical ports: 56 × 400 G ports for the Lingqu‑bus interconnect and 8 × 400 G ports for the parameter‑plane network, ensuring full‑mesh connectivity among all NPUs.
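As a rough illustration of what these port counts mean per NPU, the sketch below divides each node's ports evenly over its 8 NPUs. The even split is an assumption (the text does not say how ports map to NPUs), the function name is hypothetical, and the bandwidth is nominal line rate rather than measured throughput.

```python
# Figures stated in the text.
BUS_PORTS_PER_NODE = 56
PARAM_PORTS_PER_NODE = 8
NPUS_PER_NODE = 8
PORT_GBPS = 400  # nominal line rate of one 400 G port


def per_npu_share(ports_per_node, npus_per_node=NPUS_PER_NODE,
                  port_gbps=PORT_GBPS):
    """Return (ports, nominal Gb/s) per NPU.

    Assumption: ports are divided evenly among the NPUs of a node.
    """
    ports, remainder = divmod(ports_per_node, npus_per_node)
    if remainder:
        raise ValueError("ports do not divide evenly among NPUs")
    return ports, ports * port_gbps


bus_ports, bus_gbps = per_npu_share(BUS_PORTS_PER_NODE)
param_ports, param_gbps = per_npu_share(PARAM_PORTS_PER_NODE)

print(f"Lingqu bus: {bus_ports} ports, {bus_gbps} Gb/s per NPU")
print(f"Parameter plane: {param_ports} port, {param_gbps} Gb/s per NPU")
```

Under that assumption each NPU gets 7 bus ports (2 800 Gb/s nominal) and 1 parameter-plane port (400 Gb/s nominal).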
Interconnect design : Huawei selects 400 G QSFP‑DD SR8 optical modules combined with MPO (Base‑16) cables as the core “blood vessels” of the cluster. The sixteen cabinets are joined by optical links into an all‑to‑all topology. Each compute cabinet uplinks 224 × 400 G ports to the bus (4 nodes × 56 ports), so the 12 compute cabinets together use 2 688 × 400 G links to reach 56 bus devices (14 × 4), which are in turn organized into seven independent network planes. This design gives all 384 NPUs uniform high‑speed, low‑latency communication.
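The link counts in this paragraph follow from the per-node port count. The sketch below reproduces them and derives two values the text does not state, links per bus device and bus devices per plane, on the assumption that links and devices are spread evenly. The helper name is hypothetical.

```python
# Figures stated in the text.
NODE_BUS_PORTS = 56
NODES_PER_CABINET = 4
COMPUTE_CABINETS = 12
BUS_CABINETS = 4
BUS_DEVICES_PER_CABINET = 14
NETWORK_PLANES = 7


def split_evenly(total, parts):
    """Divide total by parts, failing loudly if it does not divide evenly."""
    quotient, remainder = divmod(total, parts)
    if remainder:
        raise ValueError(f"{total} does not divide evenly by {parts}")
    return quotient


uplinks_per_cabinet = NODE_BUS_PORTS * NODES_PER_CABINET   # 224
total_uplinks = uplinks_per_cabinet * COMPUTE_CABINETS     # 2688
bus_devices = BUS_DEVICES_PER_CABINET * BUS_CABINETS       # 56

# Derived under an even-distribution assumption (not stated in the text).
links_per_bus_device = split_evenly(total_uplinks, bus_devices)
devices_per_plane = split_evenly(bus_devices, NETWORK_PLANES)
links_per_plane = split_evenly(total_uplinks, NETWORK_PLANES)

print(uplinks_per_cabinet, total_uplinks, bus_devices)
print(links_per_bus_device, devices_per_plane, links_per_plane)
```

With an even spread, each bus device terminates 48 links, each plane has 8 bus devices, and each plane carries 384 links, which is one link per NPU per plane.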
Cable installation guidelines : Optical fibers are bundled in groups of 6‑10 and protected with sleeves; each protective bag holds 24 MPO connectors. Cables are routed vertically first, then horizontally. Lengths are controlled carefully to avoid excess slack, and fiber bundles are kept above the cable trays with clear spacing from downstream modules.
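To get a feel for the cabling volume, the sketch below estimates bundles and protective bags for one compute cabinet. It assumes one MPO cable per 400 G bus uplink (224 per cabinet, from the interconnect figures) and counts connectors at the cabinet end only; both are assumptions, and the function name is hypothetical.

```python
import math

# From the text: 224 bus uplinks per compute cabinet,
# bundles of 6 to 10 fibers, 24 MPO connectors per protective bag.
CABLES_PER_CABINET = 224  # assumption: one MPO cable per uplink
BUNDLE_MIN, BUNDLE_MAX = 6, 10
CONNECTORS_PER_BAG = 24


def groups_needed(items, group_size):
    """Number of groups of at most group_size needed to hold all items."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return math.ceil(items / group_size)


fewest_bundles = groups_needed(CABLES_PER_CABINET, BUNDLE_MAX)
most_bundles = groups_needed(CABLES_PER_CABINET, BUNDLE_MIN)
bags = groups_needed(CABLES_PER_CABINET, CONNECTORS_PER_BAG)

print(f"bundles: {fewest_bundles} to {most_bundles}, bags: {bags}")
```

Under these assumptions a compute cabinet needs between 23 and 38 bundles and 10 protective bags for its bus uplinks.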
Scalability : The wiring scheme supports horizontal expansion to 24 Pods (totaling 9 216 NPUs). When inter‑rack distances exceed 100 m, the 400 G SR8 modules can be replaced by FR4 modules paired with single‑mode LC fibers, extending the link reach up to 2 km while preserving the 400 G per‑port bandwidth.
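The scale-out figure and the distance rule can be expressed as a small helper. This is a minimal sketch of the rule as described here, not a Huawei planning tool; the function name and return labels are illustrative, and the 100 m and 2 km thresholds are the ones quoted in the text.

```python
# Figures stated in the text.
MAX_PODS = 24
NPUS_PER_POD = 384
SR8_MAX_M = 100    # beyond this, the text recommends FR4
FR4_MAX_M = 2000   # stated FR4 reach


def pick_module(distance_m):
    """Choose an optic/cable pairing for a link of the given length."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    if distance_m <= SR8_MAX_M:
        return "SR8 + MPO"
    if distance_m <= FR4_MAX_M:
        return "FR4 + single-mode LC"
    raise ValueError("distance exceeds the 2 km reach quoted for FR4")


max_npus = MAX_PODS * NPUS_PER_POD

print(max_npus)
for d in (30, 100, 150, 2000):
    print(d, "m ->", pick_module(d))
```

Links of 100 m or less stay on SR8 with MPO cabling, while anything longer, up to 2 km, moves to FR4 over single-mode LC fiber.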
Performance implications : The 400 G optical backbone targets three things at once: high compute bandwidth, ample data throughput, and low latency, which makes the cluster well‑suited to large‑model AI training and inference. The flat network topology avoids the bottlenecks of traditional multi‑layer networks, improving overall data‑center efficiency.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
Architects' Tech Alliance
Sharing project experiences, insights into cutting-edge architectures, focusing on cloud computing, microservices, big data, hyper-convergence, storage, data protection, artificial intelligence, industry practices and solutions.
