
How Alibaba Cloud’s AI‑Era Network Designs Redefine Data‑Center Architecture

The article traces the evolution of data‑center networking from general‑purpose cloud traffic to AI‑native designs: HPN for massive GPU scale‑out, UPN for super‑node scale‑up, and TPN for token‑centric inference. It highlights how the evaluation criteria shift from raw bandwidth to token throughput, token latency, and per‑token cost.

Network Intelligence Research Center (NIRC)

May 13, 2026

At the recent CCF Xiu Lake conference, Alibaba Cloud vice‑president Cai Dezhi introduced the concept of a Token Performance Network, naming a data‑center architecture after the token metric that dominates large‑model inference workloads.

01 From General Cloud Network to AI Training Network

Traditional cloud networks carry mixed multi‑tenant traffic—web, database, storage, and micro‑service calls—that varies smoothly with business peaks. In contrast, AI training exhibits a bursty pattern: the compute phase is quiet, but during the communication phase all GPUs exchange data simultaneously, creating periodic traffic spikes.

This change forces network designs to prioritize handling sudden bursts, synchronization waits, and tail‑latency sensitivity rather than merely providing stable, long‑term bandwidth.
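The burst and synchronization effects described above can be made concrete with a small back-of-the-envelope model. This is only a sketch under stated assumptions: the function names, the 90 ms / 10 ms phase split, the 400 Gbps burst rate, and the 0.1% tail probability are all hypothetical values chosen for illustration and do not come from Alibaba Cloud.

```python
def sync_step_ms(compute_ms, comm_ms_per_gpu):
    """Duration of one synchronous training step.

    Every GPU must finish its transfer before the next step can start,
    so the step is gated by the slowest transfer, not the average one.
    """
    return compute_ms + max(comm_ms_per_gpu)


def peak_to_average(compute_ms, comm_ms, burst_gbps):
    """Peak-to-average ratio of a link that is idle during the compute
    phase and sends at burst_gbps during the communication phase."""
    average_gbps = burst_gbps * comm_ms / (compute_ms + comm_ms)
    return burst_gbps / average_gbps


def prob_step_hits_tail(num_gpus, tail_prob):
    """Probability that at least one of num_gpus transfers lands in the
    latency tail, assuming independent transfers (a simplification)."""
    return 1.0 - (1.0 - tail_prob) ** num_gpus


if __name__ == "__main__":
    # Hypothetical numbers: 90 ms compute, 10 ms communication, 400 Gbps bursts.
    print(peak_to_average(90.0, 10.0, 400.0))
    # One slow transfer (50 ms) delays the whole step.
    print(sync_step_ms(90.0, [10.0, 10.0, 50.0]))
    # A rare per-GPU tail event becomes common at cluster scale.
    print(prob_step_hits_tail(1024, 0.001))
```

Under these assumptions the link is busy only a tenth of the time but must be provisioned for the full burst rate, and a tail event that affects one transfer in a thousand touches well over half of all steps once 1,024 GPUs synchronize. That is why tail latency, rather than average bandwidth, becomes the design target.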

02 HPN: High‑Performance Network for Scale‑Out

To support training that scales from single‑machine multi‑GPU to multi‑machine, even ten‑thousand‑GPU clusters, Alibaba Cloud proposes HPN. Its key features are:

Support for roughly 15K GPUs within a single pod.

Minimal hop count: GPU‑to‑GPU traffic is routed with as few detours as possible.

Dual‑plane architecture: if one plane congests or fails, the other continues to carry traffic.

Dual‑ToR design: a failed Top‑of‑Rack switch can be bypassed by the alternate ToR, reducing single‑point failures for large‑scale training.
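The redundancy idea behind the dual‑plane and dual‑ToR features listed above can be sketched as a simple uplink selection routine. This is a minimal illustration, not HPN's actual mechanism: the `Uplink` type, the switch names, the one-uplink-per-plane layout, and the hash-based selection are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Uplink:
    tor: str              # hypothetical ToR switch name
    plane: int            # hypothetical plane index
    healthy: bool = True  # set to False when the ToR or plane fails


def pick_uplink(uplinks, flow_hash):
    """Spread flows across healthy uplinks by hash.

    When one ToR (or its plane) is marked unhealthy, its flows are
    re-hashed onto the surviving uplink instead of being dropped.
    """
    alive = [u for u in uplinks if u.healthy]
    if not alive:
        raise RuntimeError("no healthy uplink: host is isolated")
    return alive[flow_hash % len(alive)]


if __name__ == "__main__":
    uplinks = [Uplink("tor-a", plane=0), Uplink("tor-b", plane=1)]
    print(pick_uplink(uplinks, flow_hash=0).tor)  # both ToRs healthy
    uplinks[0].healthy = False                    # simulate a ToR failure
    print(pick_uplink(uplinks, flow_hash=0).tor)  # flow moves to the survivor
```

With a single ToR, the failure in the second half of the example would isolate the host and stall every training job that includes it. With two, the job continues at reduced bandwidth.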

HPN embodies a scale‑out mindset: the network is organized around the communication rhythm, synchronization behavior, and fault sensitivity of large‑model training.

03 UPN: Ultra‑Performance Network for Scale‑Up

As AI workloads evolve, MoE models, long‑context inference, multimodal models, and agent‑based systems all increase how often accelerators must communicate with one another. This drives the need to connect many xPUs into a super‑node (scale‑up). UPN addresses this by providing a high‑bandwidth, low‑latency fabric inside a super‑node, focusing on xPU collaboration within the super‑node rather than inter‑rack connectivity.
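To see why frequent xPU-to-xPU exchange favors a dedicated scale‑up fabric, consider a rough cost model for one all-to-all exchange, the pattern typical of MoE expert routing. This is a sketch only: the function name, the assumption that each xPU is limited by its own egress bandwidth, and all numeric values are hypothetical and are not UPN specifications.

```python
def all_to_all_ms(num_xpus, bytes_per_peer, link_gbytes_per_s, base_latency_us):
    """Rough time for one all-to-all exchange.

    Assumes every xPU sends bytes_per_peer to each of the other xPUs and
    is limited only by its own egress bandwidth (no fabric contention).
    """
    egress_bytes = bytes_per_peer * (num_xpus - 1)
    transfer_ms = egress_bytes / (link_gbytes_per_s * 1e9) * 1e3
    return base_latency_us / 1e3 + transfer_ms


if __name__ == "__main__":
    # Hypothetical comparison: the same exchange over a slower, higher-latency
    # fabric and over a faster, lower-latency one.
    slow = all_to_all_ms(64, 1_000_000, link_gbytes_per_s=50, base_latency_us=10)
    fast = all_to_all_ms(64, 1_000_000, link_gbytes_per_s=400, base_latency_us=2)
    print(slow, fast)
```

Because such exchanges can occur many times per forward pass, the per-exchange difference is multiplied across the model, which is the motivation for optimizing the fabric inside the super‑node.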

04 TPN: Token‑Centric Network for the Inference Era

In inference, a token becomes both the basic unit of model output and a metric for AI service economics. TPN proposes a network that is aware of token‑level performance: token throughput determines how many useful results are produced per unit time, token latency affects user interaction smoothness, and per‑token cost governs the scalability of inference services.

Although detailed technical specifications are not yet public, TPN signals a shift from evaluating networks by raw bandwidth, latency, and loss to measuring their impact on token production.
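Since no TPN specification is public, the following is only an illustration of how the three token-level quantities named above (throughput, latency, and per-token cost) might be computed from inference request logs. The `Request` fields, the `token_metrics` function, the split of latency into time to first token and time per output token, and the hourly cluster cost are all assumptions for this sketch and are not part of any published TPN definition.

```python
from dataclasses import dataclass


@dataclass
class Request:
    start_s: float        # time the request arrived
    first_token_s: float  # time the first output token was produced
    end_s: float          # time the last output token was produced
    output_tokens: int    # number of output tokens generated


def token_metrics(requests, window_s, cost_per_hour):
    """Summarize token throughput, latency, and cost over one time window."""
    if not requests:
        raise ValueError("need at least one request")
    total_tokens = sum(r.output_tokens for r in requests)
    # Throughput: useful output produced per unit time.
    throughput = total_tokens / window_s
    # Latency, part 1: mean time to first token.
    ttft = sum(r.first_token_s - r.start_s for r in requests) / len(requests)
    # Latency, part 2: mean time per output token after the first.
    gaps = [
        (r.end_s - r.first_token_s) / (r.output_tokens - 1)
        for r in requests
        if r.output_tokens > 1
    ]
    tpot = sum(gaps) / len(gaps) if gaps else 0.0
    # Cost: infrastructure spend in the window divided by tokens produced.
    window_cost = cost_per_hour * window_s / 3600.0
    cost_per_million = window_cost / total_tokens * 1e6
    return {
        "tokens_per_s": throughput,
        "mean_ttft_s": ttft,
        "mean_tpot_s": tpot,
        "cost_per_million_tokens": cost_per_million,
    }


if __name__ == "__main__":
    log = [Request(0.0, 0.5, 2.5, 101), Request(1.0, 1.5, 3.5, 101)]
    print(token_metrics(log, window_s=10.0, cost_per_hour=36.0))
```

In this framing, a network change is judged by how it moves these numbers: for example, whether lower fabric latency shortens time per output token, or whether fewer stalls raise tokens per second at the same hourly cost.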

05 Summary

Through the lenses of HPN, UPN, and TPN, the evolution path is clear: HPN represents scale‑out for massive GPU clusters, UPN represents scale‑up for super‑node xPU interconnects, and TPN introduces a token‑centric perspective for inference workloads. The focus moves from connecting servers, to connecting compute power, to optimizing token production efficiency, and that efficiency is an advantage for any organization competing in AI infrastructure.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.

Written by

Network Intelligence Research Center (NIRC)

NIRC is based on the National Key Laboratory of Network and Switching Technology at Beijing University of Posts and Telecommunications. It has built a technology matrix across four AI domains—intelligent cloud networking, natural language processing, computer vision, and machine learning systems—dedicated to solving real‑world problems, creating top‑tier systems, publishing high‑impact papers, and contributing significantly to the rapid advancement of China's network technology.
