
Alibaba's High‑Performance Intelligent Data Center Network: Evolution, Programmable Forwarding, RDMA, Automation, and the Luoshen Cloud Network Engine

The article reviews Alibaba's large‑scale data‑center network advancements, covering its high‑performance evolution, programmable forwarding planes, massive RDMA deployment, automated control systems, AI‑driven self‑healing, and the Luoshen cloud network engine that underpins Alibaba Cloud services.

Alibaba Cloud Infrastructure

Alibaba's network is a core component of its infrastructure, supporting diverse, complex, and rapidly growing business demands with stringent requirements for scale, performance, cost, stability, and intelligent operation.

Over the past five to six years, Alibaba's data‑center network has transformed from a typical enterprise network into a massive cloud‑scale network that supports 50,000 to 100,000 servers with petabyte‑level bandwidth, built on a tightly integrated hardware‑software stack that uses machine learning for automated, intelligent operations.

At the Future Network session of the 2018 Yunqi Conference in Hangzhou, senior architects and researchers presented cutting‑edge technologies. The session included a keynote by Stanford professor Nick McKeown on programmable forwarding planes, which emphasized the shift from closed vendor systems to SDN and P4‑based programmable chips.

Alibaba pioneered large‑scale 25 Gbps and 100 Gbps deployments and has experimented with 400 Gbps QSFP‑DD modules, positioning itself as a leader in high‑bandwidth data‑center networking.

Senior expert Tang Lingbo, who leads the company's RDMA strategy, detailed the challenges and successes of deploying RDMA at Internet scale, where it enables ultra‑low‑latency communication for AI, HPC, and distributed cloud workloads.

To manage millions of heterogeneous devices, Alibaba built an automated control system based on structured configuration and state description, abstracting vendor differences and providing a global, model‑driven network view that integrates SDN concepts.
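The idea of a structured, vendor‑neutral description can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `InterfaceIntent` schema, the `vendor_a` and `vendor_b` renderers, and the command syntax they emit are hypothetical and do not reflect Alibaba's actual system or any real vendor CLI. The sketch only shows the pattern of keeping one model, rendering it per vendor, and comparing desired state against observed state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterfaceIntent:
    """Vendor-neutral description of one interface (hypothetical schema)."""
    name: str
    speed_gbps: int
    mtu: int
    enabled: bool = True


def render_vendor_a(intent: InterfaceIntent) -> str:
    # Illustrative multi-line syntax, not a real vendor CLI.
    lines = [
        f"interface {intent.name}",
        f" speed {intent.speed_gbps}g",
        f" mtu {intent.mtu}",
        " no shutdown" if intent.enabled else " shutdown",
    ]
    return "\n".join(lines)


def render_vendor_b(intent: InterfaceIntent) -> str:
    # Illustrative single-line syntax, not a real vendor CLI.
    admin = "up" if intent.enabled else "down"
    return (
        f"set port {intent.name} speed={intent.speed_gbps}G "
        f"mtu={intent.mtu} admin={admin}"
    )


# One model, many renderers: vendor differences live only in this table.
RENDERERS = {"vendor_a": render_vendor_a, "vendor_b": render_vendor_b}


def render(vendor: str, intent: InterfaceIntent) -> str:
    return RENDERERS[vendor](intent)


def diff_state(desired: dict, observed: dict) -> dict:
    """Return {field: (desired, observed)} for every field that has drifted."""
    drift = {}
    for key, want in desired.items():
        have = observed.get(key)
        if have != want:
            drift[key] = (want, have)
    return drift
```

Under these assumptions, `render("vendor_b", InterfaceIntent("eth1", 100, 9000))` yields a single configuration line, and `diff_state` reports only the fields where the device's observed state departs from the model, which is the input an automated control loop would act on.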

Senior data expert Zhou Baofang demonstrated a self‑healing network framework that uses big‑data analytics and machine‑learning models to detect, locate, and automatically remediate faults, improving SLA compliance during critical events such as the Double‑11 shopping festival.
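The detect, locate, and remediate stages can be sketched as a small pipeline. This is a simplified stand‑in, not the framework described in the talk: it replaces the machine‑learning models with a plain z‑score threshold on packet‑loss rates, and the function names, the link representation as `(device, device)` tuples, and the `isolate:` action string are all hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev


def detect_anomalies(history, latest, z_threshold=3.0):
    """Flag links whose latest loss rate is far above their own history.

    history: {link: [past loss rates]}, latest: {link: current loss rate}.
    """
    anomalous = []
    for link, samples in history.items():
        mu = mean(samples)
        # Guard against zero variance so the division is always defined.
        spread = max(pstdev(samples), 1e-6)
        if (latest[link] - mu) / spread > z_threshold:
            anomalous.append(link)
    return anomalous


def locate_fault(anomalous_links):
    """Return the device shared by at least two anomalous links, else None."""
    counts = Counter(device for link in anomalous_links for device in link)
    if not counts:
        return None
    device, hits = counts.most_common(1)[0]
    return device if hits >= 2 else None


def plan_remediation(device):
    """Map a located fault to an action string (hypothetical action format)."""
    return "no_action" if device is None else f"isolate:{device}"


history = {
    ("s1", "l1"): [0.001, 0.002, 0.001],
    ("s1", "l2"): [0.001, 0.001, 0.002],
    ("s2", "l1"): [0.001, 0.002, 0.001],
}
latest = {("s1", "l1"): 0.05, ("s1", "l2"): 0.04, ("s2", "l1"): 0.001}

suspect = locate_fault(detect_anomalies(history, latest))
action = plan_remediation(suspect)
```

In this toy topology both anomalous links share switch `s1`, so the pipeline proposes isolating it. A production system would presumably gate such an action behind safety checks such as remaining capacity and blast radius, which the sketch omits.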

Finally, senior cloud engineer Sun Chenghao unveiled the Luoshen network engine, the virtual networking backbone of Alibaba Cloud. Luoshen powers VPC, SLB, CEN, and over 100 other cloud products, and has evolved through classic, VPC, global, and network‑less stages to simplify the user experience.
