
How Alibaba Scaled to 100% Containerization with PouchContainer: A Cloud‑Native Journey

Alibaba achieved full internal containerization by evolving from monolithic apps to the T4 container, integrating Docker images, and open‑sourcing PouchContainer. This article covers the architectural shifts, the resource isolation requirements, the large‑scale deployment strategies, and the lessons for building cloud‑native platforms.

Alibaba Cloud Native

Background

Alibaba Group migrated all production workloads to a container‑based, image‑driven model and released the open‑source container engine PouchContainer 1.0 GA. The engine runs on kernels as old as 2.6.32 and powers millions of instances across transaction, middleware, e‑commerce, search, advertising and big‑data services.

From Monolith to Distributed Services

Starting in 2008, the Taobao monolith was split into independent services (product, transaction, user, front‑end, back‑end) built on the HSF RPC framework, the TDDL distributed data layer and Notify messaging. As the number of services grew, the resources each instance needed fell, which prompted a move from bare metal to Xen/KVM VMs. VM utilization remained low (e.g., a 24‑core host could host only four 4‑core VMs), so the team began exploring process‑level isolation.
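The utilization example above can be worked through in a few lines. This is only an arithmetic sketch of the figures quoted in the text; the function name is hypothetical, and it measures cores handed out to VMs, not how busy those VMs actually were.

```python
def allocated_fraction(host_cores: int, vm_cores: int, vm_count: int) -> float:
    """Fraction of a host's cores that are assigned to fixed-size VMs."""
    allocated = vm_cores * vm_count
    if allocated > host_cores:
        raise ValueError("VMs request more cores than the host has")
    return allocated / host_cores


# Figures from the text: a 24-core host running four 4-core VMs.
fraction = allocated_fraction(host_cores=24, vm_cores=4, vm_count=4)
idle_cores = 24 - 4 * 4
print(f"{fraction:.1%} of cores allocated, {idle_cores} cores never assigned")
```

Even before looking at real load, a third of the host's cores are never assigned to any VM, which is the gap process‑level isolation was meant to close.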

Four Core Isolation Requirements

Each instance must have an independent IP address and SSH access.

Each instance must have an isolated filesystem.

CPU and memory usage must be isolated and visible only to the owning instance.

The isolation experience must be indistinguishable from a physical machine or VM.

These constraints ensured a seamless migration from VMs to containers without breaking existing operational tooling.
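The resource‑visibility requirement can be pictured as follows: a tool running inside an instance should see a CPU count derived from that instance's limits rather than from the host. The sketch below assumes cgroup v1 CFS bandwidth settings (`cpu.cfs_quota_us` and `cpu.cfs_period_us`) as the source of the limit. It is not how Alibaba's kernel patches worked, and `visible_cpus` is a hypothetical helper.

```python
import math


def visible_cpus(cfs_quota_us: int, cfs_period_us: int, host_cpus: int) -> int:
    """CPU count an instance should report, given cgroup v1 CFS settings.

    In cgroup v1 a quota of -1 means "no limit", so the host count applies.
    """
    if cfs_quota_us < 0 or cfs_period_us <= 0:
        return host_cpus
    # Round up so a fractional allowance still shows at least one CPU.
    limit = math.ceil(cfs_quota_us / cfs_period_us)
    return max(1, min(host_cpus, limit))


# A 4-CPU instance on a 24-core host: 400 ms of CPU time per 100 ms period.
print(visible_cpus(400_000, 100_000, host_cpus=24))
```

Monitoring agents and runtimes that size thread pools from the CPU count behave very differently depending on whether they see the host value or the limited one, which is why the requirement mattered for a transparent migration.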

First‑Generation Container – T4

Initial prototypes relied on direct modifications to the kernel and glibc. Later, the open‑source LXC project was adopted and extended with custom kernel patches for resource‑visibility isolation and directory‑based disk‑quota enforcement. The resulting product, named T4 (Taobao 4.0), launched in 2011. T4 removed the hypervisor layer, allowed flexible over‑commit, and was compatible with existing deployment pipelines, enabling transparent migration of millions of services.

Adoption of Docker Image Mechanism (2015)

In 2015 Alibaba integrated Docker's layered image format, replacing the previous “baseline” package model. Developers now author Dockerfiles that describe the full dependency stack, while ops focus on reliable image build, storage and distribution. This shift moved environment definition into source control and simplified the build‑deploy workflow.
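One reason layered images suit large fleets is that layers are content‑addressed, so a host only transfers the layers it does not already hold. The sketch below illustrates that idea with short byte strings standing in for layer tarballs. The helper names and sample contents are hypothetical, and real registries compute digests over the stored layer blobs.

```python
import hashlib


def layer_digest(content: bytes) -> str:
    """Content address of a layer, in the sha256:<hex> form used for image layers."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


def layers_to_pull(image_layers: list[bytes], local_store: set[str]) -> list[str]:
    """Digests of the image's layers that are not yet present on the host."""
    return [d for d in map(layer_digest, image_layers) if d not in local_store]


base_os = b"base OS packages"
runtime = b"language runtime and agents"
app_v1 = b"application build 1"
app_v2 = b"application build 2"

# The host already runs version 1, so it holds all three of its layers.
store = {layer_digest(layer) for layer in (base_os, runtime, app_v1)}

# Deploying version 2 only requires the changed application layer.
missing = layers_to_pull([base_os, runtime, app_v2], store)
print(len(missing))
```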

Operational Automation

After image adoption the platform emphasized:

Declarative configuration.

Automatic container restart on failure.

Live migration of containers across hosts.

Self‑healing mechanisms aimed at unattended operation.
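Declarative configuration and automatic restart are commonly implemented as a reconcile loop: compare the declared state with what is actually running, then act on the difference. The following is a minimal sketch of that pattern, not Alibaba's implementation. The `Container` type, `reconcile` function and service names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Container:
    name: str
    running: bool
    restarts: int = 0


def reconcile(desired: set[str], actual: dict[str, Container]) -> list[str]:
    """Bring `actual` in line with `desired` and return the actions taken."""
    actions = []
    for name in sorted(desired):
        container = actual.get(name)
        if container is None:
            actual[name] = Container(name, running=True)
            actions.append(f"create {name}")
        elif not container.running:
            container.running = True
            container.restarts += 1
            actions.append(f"restart {name}")
    # Anything running that is no longer declared gets removed.
    for name in sorted(set(actual) - desired):
        del actual[name]
        actions.append(f"remove {name}")
    return actions


state = {
    "cart": Container("cart", running=False),      # crashed instance
    "old-job": Container("old-job", running=True),  # no longer declared
}
actions = reconcile({"cart", "search"}, state)
print(actions)
```

Running `reconcile` again with the same desired set returns an empty list, which is the property that makes unattended operation safe: the loop can run continuously and only acts when something has drifted.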

Key Technical Features of PouchContainer

Isolation via Linux namespaces, cgroups and, originally, custom kernel patches (later replaced by lxcfs).

Disk‑space isolation using directory‑based quotas on older kernels and overlay2 on kernels ≥ 4.9.

Rich container mode with built‑in monitoring and management tools.

Scalable deployment supporting millions of instances.

Kernel compatibility down to 2.6.32.

Full Docker Engine API, CRI support and integration with Kubernetes, Docker Swarm and Alibaba’s Sigma scheduler.
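The kernel‑dependent choice of disk‑quota mechanism described in this list could be expressed as a version check. The sketch below follows the thresholds stated above (2.6.32 as the floor, 4.9 as the switch to overlay2). The function names and returned labels are hypothetical and do not reflect PouchContainer's actual code.

```python
import re


def kernel_version(release: str) -> tuple[int, int, int]:
    """Parse the leading numeric part of a `uname -r` style string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def quota_strategy(release: str) -> str:
    """Pick a disk-quota approach from the kernel release string."""
    version = kernel_version(release)
    if version < (2, 6, 32):
        raise RuntimeError("older than the oldest kernel the text lists as supported")
    return "overlay2" if version >= (4, 9, 0) else "directory-quota"


for release in ("2.6.32-220.el6.x86_64", "3.10.0-957.el7.x86_64", "4.9.0-8-amd64"):
    print(release, "->", quota_strategy(release))
```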

Two‑Tier Image Distribution Architecture

To avoid overloading a central registry when updating tens of thousands of hosts, Alibaba built a hierarchical distribution system:

Each geographic region hosts a local mirror.

Within a region, nodes exchange image layers via a peer‑to‑peer protocol named “Qingting” (Chinese for “dragonfly”; the system was later open‑sourced as Dragonfly).

This design dramatically reduces bandwidth consumption on the central registry.
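The effect of the two tiers can be seen in a toy simulation: the regional mirror fetches each layer from the central registry once, and every node after the first gets it from a peer. This is a deliberately simplified, sequential model with hypothetical names. A real P2P system splits layers into pieces and transfers them concurrently.

```python
class RegionalMirror:
    """Caches layers so each one crosses the link to the central registry once."""

    def __init__(self) -> None:
        self.cache: set[str] = set()
        self.central_fetches = 0

    def fetch(self, layer: str) -> None:
        if layer not in self.cache:
            self.central_fetches += 1  # cache miss: go to the central registry
            self.cache.add(layer)


def roll_out(layer: str, node_count: int, mirror: RegionalMirror) -> dict[str, int]:
    """Pull one layer onto every node in a region, preferring peers over the mirror."""
    holders = 0  # nodes in the region that already have the layer
    stats = {"from_peer": 0, "from_mirror": 0}
    for _ in range(node_count):
        if holders > 0:
            stats["from_peer"] += 1
        else:
            mirror.fetch(layer)
            stats["from_mirror"] += 1
        holders += 1
    return stats


mirror = RegionalMirror()
stats = roll_out("sha256:example", node_count=1000, mirror=mirror)
print(stats, mirror.central_fetches)
```

In this model a thousand‑node rollout costs the central registry a single fetch per region, instead of one fetch per node.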

Open‑Source Release

The source code was published on 19 Nov 2017 at https://github.com/alibaba/pouch. The repository implements Docker APIs, CRI, and supports multiple runtimes, including RunLXC for legacy kernels. Contribution guidelines are available at https://github.com/alibaba/pouch/blob/master/CONTRIBUTING.md. Over 2,300 commits from more than 80 contributors shape the project.

Technical Lessons

Define isolation requirements that match existing operational assumptions (IP, SSH, filesystem, resource visibility).

Leverage open‑source projects (LXC, Docker) and extend them only where necessary.

Gradually refactor the deployment pipeline to adopt image‑based delivery.

Automate fault handling to move toward self‑healing, unattended operation.

Design scalable image distribution (regional mirrors + P2P) for massive fleets.

Selected Technical Q&A

Q: How is persistent data handled inside containers?

A: Logs remain on the local host. Application data can be stored on local disks or on Alibaba’s distributed storage system “Pangu”. Pangu provides block devices that are attached to containers at launch, allowing data to reside remotely and be migrated independently of the container host.

Q: Is the PouchContainer image registry compatible with Docker?

A: Yes. PouchContainer’s registry fully supports the Docker image format and can interoperate with Docker Engine, Kubernetes and other Docker‑compatible tools.

