
How Alibaba’s AServer Gateway Evolved to a Cloud‑Native Architecture

Alibaba’s AServer access gateway, which serves billions of users at millions of QPS, moved from a monolithic tengine‑based system to a cloud‑native, containerized architecture built on Kubernetes, Pilot, and Envoy. The change reduced operational complexity and improved dynamic routing, traffic isolation, and scalability for massive e‑commerce traffic.

Alibaba Terminal Technology

Architecture Evolution Background

Every year the Double‑11 shopping festival tests Alibaba’s services, especially the AServer access gateway, which must withstand traffic surges, cleanse attack traffic, and support a massive cluster size.

Business Background

Initially the gateway was a tengine instance that forwarded requests by domain name. In the All‑in‑Wireless era, Alibaba built the MTOP (Mobile Taobao Open Platform) API gateway as a unified API platform for clients and servers, routing by URI. Over the years the number of routing rules grew to tens of thousands, and the business came to require fine‑grained control based on request parameters and headers.
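To make the idea of URI routing with header and parameter conditions concrete, here is a minimal sketch in Python. It is not MTOP's implementation: the `RouteRule` class, the "most specific rule wins" policy, and all paths and backend names are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RouteRule:
    """A hypothetical rule: a URI prefix plus optional exact-match conditions."""
    uri_prefix: str
    backend: str
    headers: dict = field(default_factory=dict)  # header name -> required value
    params: dict = field(default_factory=dict)   # query parameter -> required value

    def matches(self, uri: str, headers: dict, params: dict) -> bool:
        # Every condition on the rule must hold for the request.
        if not uri.startswith(self.uri_prefix):
            return False
        if any(headers.get(k) != v for k, v in self.headers.items()):
            return False
        return all(params.get(k) == v for k, v in self.params.items())

    def specificity(self) -> tuple:
        # Assumed policy: more conditions first, then the longer prefix.
        return (len(self.headers) + len(self.params), len(self.uri_prefix))


def select_backend(rules, uri, headers, params) -> Optional[str]:
    """Return the backend of the most specific matching rule, or None."""
    candidates = [r for r in rules if r.matches(uri, headers, params)]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.specificity()).backend


# Illustrative rules only; none of these names come from the article.
rules = [
    RouteRule("/api/", "fallback"),
    RouteRule("/api/trade/", "trade-default"),
    RouteRule("/api/trade/", "trade-gray", headers={"x-env": "gray"}),
]

gray = select_backend(rules, "/api/trade/order", {"x-env": "gray"}, {})
default = select_backend(rules, "/api/trade/order", {}, {})
other = select_backend(rules, "/api/user/profile", {}, {})
missing = select_backend(rules, "/static/logo.png", {}, {})
```

A linear scan like this is fine for a handful of rules. With tens of thousands of rules, a real gateway would likely index them (for example by prefix) rather than test each one per request.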

Business model diagram

Operations System Background

The early infrastructure was simple: the gateway ran as processes on physical machines, driven by configuration files. As traffic grew, configuration management became a bottleneck, which led to a custom configuration platform that split configuration into public, application, and certificate parts.

Public configuration: generated basic tengine runtime configuration via Git version control.

Application configuration: generated business‑specific tengine configuration from templates.

Certificate configuration: managed certificate lifecycle and automatic renewal.
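The template‑based generation of application configuration can be sketched as follows. This is an assumption‑laden illustration, not the platform's actual code: the template text, the `PUBLIC_DEFAULTS` values, the `render_app_config` helper, and the application fields are all hypothetical. Only the nginx‑style directives (`upstream`, `server`, `listen`, `server_name`, `location`, `proxy_pass`, `proxy_read_timeout`) are standard.

```python
from string import Template

# Hypothetical application template; a real platform would hold many of these.
SERVER_TEMPLATE = Template("""\
upstream $app_name {
$servers
}
server {
    listen $listen_port;
    server_name $domain;
    location / {
        proxy_pass http://$app_name;
        proxy_read_timeout ${read_timeout}s;
    }
}
""")

# Assumed shared defaults, standing in for the "public" part of the configuration.
PUBLIC_DEFAULTS = {"listen_port": 80, "read_timeout": 60}


def render_app_config(app: dict) -> str:
    """Merge public defaults with per-application values and render the template."""
    values = {**PUBLIC_DEFAULTS, **app}  # application values override defaults
    backends = values.pop("backends")
    values["servers"] = "\n".join(f"    server {addr};" for addr in backends)
    return SERVER_TEMPLATE.substitute(values)


config_text = render_app_config({
    "app_name": "demo_app",
    "domain": "demo.example.com",
    "backends": ["10.0.0.1:8080", "10.0.0.2:8080"],
    "read_timeout": 30,
})
print(config_text)
```

Keeping shared defaults apart from per‑application values means a change to a default only has to be made once, which is one plausible reason for splitting the configuration this way.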

Initial system deployment architecture:

Initial system deployment architecture

The physical‑machine deployment allowed rapid iteration but introduced operational complexity, inconsistency, and high risk of manual errors.

Core Issues

Operational complexity: special hardware requirements, intricate configuration management, and poor integration with Alibaba’s overall operations system.

Lack of dynamic orchestration: static routing policies could not meet the real‑time flexibility demanded by business.

High cost of traffic isolation: creating separate clusters for isolation was expensive.

Operations System Upgrade

Containerization solved release and environment‑consistency problems. By packaging immutable binaries and configurations into containers and keeping variable configurations in the custom platform, the gateway could be deployed via Kubernetes pods.

Containerized release and configuration flow

The containerized architecture simplified site creation and scaling, and added an approval workflow that could be linked to monitoring for automatic alerts.

Service Governance & Gateway Mesh

The legacy tengine approach could no longer keep up with tens of thousands of API routing rules and increasingly fine‑grained policies, and maintaining a custom tengine module for them was costly.

Legacy routing architecture

After evaluating solutions such as Kong and Ambassador, Envoy was chosen for its dynamic configuration (xDS), extensibility, and performance.
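As a rough illustration of what dynamic route configuration looks like, the sketch below converts a few hypothetical routing policies into a dictionary shaped like Envoy's v3 `RouteConfiguration` resource, the kind of payload a control plane can serve over xDS. The policy format, the `to_route_configuration` helper, and the cluster and domain names are assumptions made for this example; only the Envoy field names (`virtual_hosts`, `domains`, `routes`, `match`, `prefix`, `headers`, `string_match`, `route`, `cluster`) follow the Envoy API.

```python
import json


def to_route_configuration(name: str, domain: str, policies: list) -> dict:
    """Build a RouteConfiguration-shaped dict from hypothetical policies.

    Envoy evaluates routes in order and uses the first match, so policies
    with more header conditions (then longer prefixes) are emitted first.
    """
    ordered = sorted(
        policies,
        key=lambda p: (len(p.get("headers", {})), len(p["uri_prefix"])),
        reverse=True,
    )
    routes = []
    for policy in ordered:
        match = {"prefix": policy["uri_prefix"]}
        if policy.get("headers"):
            match["headers"] = [
                {"name": header, "string_match": {"exact": value}}
                for header, value in sorted(policy["headers"].items())
            ]
        routes.append({"match": match, "route": {"cluster": policy["cluster"]}})
    return {
        "name": name,
        "virtual_hosts": [
            {"name": f"{name}_vhost", "domains": [domain], "routes": routes}
        ],
    }


# Illustrative policies only; none of these names come from the article.
policies = [
    {"uri_prefix": "/api/", "cluster": "default_cluster"},
    {"uri_prefix": "/api/trade/", "cluster": "trade_cluster"},
    {
        "uri_prefix": "/api/trade/",
        "cluster": "trade_isolated_cluster",
        "headers": {"x-tenant": "vip"},
    },
]

route_config = to_route_configuration("gateway_routes", "api.example.com", policies)
print(json.dumps(route_config, indent=2))
```

Because the whole route table is plain data, a control plane can regenerate and push it when a policy changes, without rebuilding or restarting the proxy. Sending a header‑tagged slice of traffic to its own cluster, as in the third policy, is also one way to isolate traffic without standing up a separate gateway cluster.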

North‑South Split

The gateway has two jobs: keeping long‑lived user connections alive (north‑bound) and handling business‑side routing, isolation, and security (south‑bound). Splitting these functions into separate clusters lets a small, high‑performance north‑bound cluster act as a “dam” against traffic spikes, while the south‑bound cluster focuses on routing and security. The split also improves overall resource utilization.

North‑south split architecture

Overall Architecture

The three‑stage evolution results in a final architecture consisting of:

Unified control plane for service registration, discovery, and circuit‑breaker policies.

North‑bound layer based on tengine to handle billions of users and traffic spikes.

South‑bound routing layer built on Envoy with Pilot converting policies to xDS for dynamic routing and lightweight traffic isolation.

Cloud‑native foundation built on Alibaba’s ASI platform, abstracting gateway differences and reducing operational complexity.

AServer cloud‑native architecture diagram

Future

Each evolution step addressed long‑standing problems, but cloud‑native transformation is not the end point. The new engine enables further product innovation, allowing developers to benefit from a seamless, invisible gateway experience.
