Inside Alibaba Cloud’s Apsara: How Massive Scale and Open‑Source Drive Innovation

Alibaba Cloud’s chief architect Tang Hong recounts the company’s evolution from its 2009 launch, detailing the Apsara operating system’s milestones, massive scaling achievements, virtualization and container innovations, and future directions in lightweight virtualization, high‑speed hardware, and heterogeneous security, illustrating how open‑source collaboration fuels its growth.

Alibaba Cloud Developer

LC3 Conference and Alibaba Cloud Overview

At the first LC3 (LinuxCon + ContainerCon + CloudOpen) conference held in China, Alibaba Cloud chief architect Tang Hong delivered a keynote reviewing Alibaba Cloud’s history, major technical breakthroughs, current architecture, ecosystem, and future goals.

Key Historical Milestones

Alibaba Cloud was founded on 10 September 2009; its first product, ECS, launched on 28 July 2011. Revenue grew at a triple‑digit year‑over‑year rate for eight quarters, reaching 6.6 billion CNY (≈ 1 billion USD) in 2017 with 870 k paying customers.

Apsara Operating System Evolution

The underlying cloud OS, named Apsara (飞天), began development in early 2009 and became Alibaba’s internal infrastructure in August 2010, supporting services such as search, email, image storage, and micro‑loan payments.

On 15 August 2013 the “5K” project broke the 5 000‑node barrier, making Alibaba the first Chinese company with a large‑scale general‑purpose compute cluster. In 2015 Alibaba Cloud sorted 100 TB of data in 377 seconds, setting a world record.
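To put the sort record in perspective, the two figures quoted above imply an aggregate throughput that can be worked out directly. This is a back-of-the-envelope sketch only: it assumes decimal terabytes, and the function name is ours, not part of any Alibaba tooling.

```python
# Rough aggregate throughput implied by "100 TB in 377 seconds".
# Assumption: decimal units (1 TB = 10**12 bytes, 1 GB = 10**9 bytes).
BYTES_PER_TB = 10**12
BYTES_PER_GB = 10**9


def sort_throughput_gb_per_s(data_tb: float, seconds: float) -> float:
    """Return the cluster-wide sort throughput in GB/s."""
    return data_tb * BYTES_PER_TB / seconds / BYTES_PER_GB


aggregate = sort_throughput_gb_per_s(100, 377)
print(f"{aggregate:.0f} GB/s aggregate")  # prints "265 GB/s aggregate"
```

The result is a cluster-wide figure; per-node throughput would depend on the number of machines used for the run, which is not given here.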

Community‑building efforts have drawn over 40 k developers to in‑person events and more than 7 million online viewers to the Yunqi Conference (云栖大会).

Infrastructure Scale and Architecture

Alibaba Cloud now operates six mainland regions (three in North China, two in East China, one in South China) and eleven overseas regions, with over 600 PoP nodes and a total bandwidth of 20 TB/s.

The Apsara stack is built on four foundational modules (distributed coordination, security, logging, monitoring) and two core management systems: Pangu (storage) and Fuxi (resource scheduling). A “Sky‑Base” layer handles infrastructure and service management. Above these sits tenant management (authentication, authorization, billing), followed by compute, storage, database, and network services, then middleware, Serverless, AI/ML, and security services. The topmost “Cloud Marketplace” layer works like an app store for cloud services.

Design Highlights

Apsara aims to be a general‑purpose compute platform for both latency‑sensitive and batch workloads. It now manages over 10 000 nodes, hundreds of PB of storage, and 100 k CPU cores with > 99.95 % availability, and triple‑replicated data reaches ten nines (99.99999999 %) of durability. Security follows the principle of a minimal trusted computing base.
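The availability and durability figures are easier to reason about when converted into concrete budgets. The sketch below is plain arithmetic under stated assumptions: availability is measured over a 365‑day year, and "ten nines" is read as an annual per‑object durability. Both interpretations, and the function names, are ours; the talk does not define them.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def max_downtime_hours(availability: float) -> float:
    """Yearly downtime budget implied by an availability target."""
    return (1.0 - availability) * HOURS_PER_YEAR


def expected_annual_losses(num_objects: int, durability: float) -> float:
    """Expected number of objects lost per year at a given durability.

    Assumes durability is an independent, annual, per-object probability.
    """
    return num_objects * (1.0 - durability)


# 99.95 % availability leaves roughly 4.38 hours of downtime per year.
print(f"{max_downtime_hours(0.9995):.2f} h/year")

# Ten nines of durability over a trillion objects: roughly 100 expected losses.
print(f"{expected_annual_losses(10**12, 0.9999999999):.0f} objects/year")
```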

5K Milestone and MaxCompute

The 5K capability let Apsara scale beyond Alibaba’s internal Hadoop clusters, leading to the “Moon Landing” project that migrated core business workloads to Apsara. MaxCompute opened to the public on 1 July 2014, providing the world’s first public 5K‑scale service.

Before MaxCompute, the Tianchi competition attracted over 7 000 global teams, showcasing the platform’s data‑science capabilities.

Virtualization Advances

All Alibaba Cloud physical servers run Linux (Fedora, CentOS) with a custom AliKernel based on Linux kernel 2.6.32. Since 2010, Alibaba has contributed nearly 300 kernel patches to the upstream community.

Resource isolation techniques allow latency‑sensitive and batch workloads to share machines with at most 5 % performance loss while raising CPU utilization from 35 % to > 65 %.

Network isolation reduces average latency by a factor of 6.8 and tail latency by a factor of 11.8 compared with non‑isolated runs. IO throttling holds IOPS steady at about 25 K for throttled files.
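The talk does not say how the IO throttling is implemented. As a generic illustration of how a rate limiter can hold IOPS near a fixed ceiling, here is a minimal token‑bucket sketch. The class, the burst size, and the simulated clock are all assumptions made for this example and are not taken from AliKernel.

```python
class TokenBucket:
    """Generic token-bucket limiter driven by an explicit clock."""

    def __init__(self, rate: float, burst: float) -> None:
        self.rate = rate        # tokens (IO operations) refilled per second
        self.capacity = burst   # maximum tokens that can accumulate
        self.tokens = burst
        self.last = 0.0         # timestamp of the previous call, in seconds

    def allow(self, now: float, cost: float = 1.0) -> bool:
        """Refill for the elapsed time, then admit the IO if tokens remain."""
        elapsed = max(0.0, now - self.last)
        self.last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


def simulate(offered_iops: int, limit_iops: int, seconds: float = 1.0) -> int:
    """Count IOs admitted when requests arrive evenly at offered_iops."""
    bucket = TokenBucket(rate=limit_iops, burst=limit_iops / 100)
    admitted = 0
    for i in range(int(offered_iops * seconds)):
        if bucket.allow(now=i / offered_iops):
            admitted += 1
    return admitted


# Offer 100 K IOPS against a 25 K limit: admitted IOs stay close to the limit.
print(simulate(100_000, 25_000))
# Offer less than the limit: nothing is throttled.
print(simulate(10_000, 25_000))
```

The small burst allowance lets short spikes through while the long‑run rate converges on the configured limit, which matches the "stable IOPS" behaviour described above.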

Server virtualization migrated from Xen to KVM in 2014, and Alibaba Cloud became a Linux Foundation gold member in 2017. Hypervisor hot‑upgrade technology enables full‑module upgrades (KMOD, QEMU) with millisecond‑level pauses; each VM goes through such an upgrade roughly every 1–2 months.

Container Technology

In October 2016 Alibaba Cloud partnered with Docker and launched Docker Hub in China, and in April 2017 it joined the CNCF as a gold member. It supports both Docker Swarm and Kubernetes, which the talk presented as unique among cloud providers.

Alibaba Cloud can deploy over 30 000 VM nodes in a single Docker cluster; on Singles’ Day 2016 more than 300 k containers were deployed, handling 175 k transactions per second.

Future Directions

Upcoming focus areas include lightweight virtualization for containers, adoption of ultra‑fast hardware such as NVMe storage and 25 GbE networking, and security enhancements for heterogeneous accelerators (FPGA, GPU, custom ASICs).
