How Alibaba Cloud RDS Powered Double 11: Cloud‑Native Innovations & 30% Cost Cut

Alibaba Cloud's RDS team detailed how cloud‑native transformations, multi‑active deployments, ARM localization, and advanced kernel optimizations enabled the 2021 Double 11 shopping festival to handle massive traffic, cut costs by over 30%, and achieve greener, highly available database services.

Alibaba Cloud Developer

Double 11 Review

Alibaba Cloud Database has supported Tmall Double 11 year after year. In 2021 it completed a comprehensive cloud‑native transformation that substantially improved the user experience, using technology to create value for consumers and pushing the underlying technology forward.

The promotion ended successfully, and the "Good Technology New Starting Point – 2021 Double 11 Alibaba Cloud Database Technical Secrets" series will share the behind‑the‑scenes stories.

RDS Group Business Support

Since 2019, 100% of the group's transaction‑related RDS instances have run on the cloud, with new regions in Shenzhen, Germany, and US East supporting global operations. With ample resources available, one‑click cloud migration succeeds 99% of the time.

Multi‑Active Deployment During Promotion

In the weeks before Double 11, the RDS team rapidly built new promotion units within three days, completing site construction, scaling, parameter tuning, and consistency checks to ensure stable traffic splitting.

Geographically distributed active‑active clusters allow each unit to handle transactions independently, sharing write capacity and balancing peak loads.

The core transaction database consists of two independent three‑node clusters replicated via DTS. During the promotion, a third three‑node cluster is added to spread read/write traffic further. To prevent primary‑key conflicts between clusters, the DTS Store and Writer components rely on different key step sizes and on Thread ID based replication.
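The article does not say how the step sizes are configured. As a minimal sketch of the general idea, assume every unit shares one step and takes a distinct starting offset, much like MySQL's `auto_increment_increment` and `auto_increment_offset` variables. The unit names and the `unit_key_stream` helper below are hypothetical.

```python
from itertools import islice
from typing import Iterator


def unit_key_stream(offset: int, step: int) -> Iterator[int]:
    """Yield offset, offset + step, offset + 2 * step, ... for one unit."""
    if not 1 <= offset <= step:
        raise ValueError("offset must be in the range 1..step")
    key = offset
    while True:
        yield key
        key += step


# Assumption: three units share step=3 and take offsets 1, 2, and 3.
STEP = 3
streams = {
    unit: unit_key_stream(offset, STEP)
    for unit, offset in (("unit_a", 1), ("unit_b", 2), ("unit_c", 3))
}

# Take the first four keys each unit would generate.
first_keys = {unit: list(islice(stream, 4)) for unit, stream in streams.items()}
```

Because each unit only produces keys congruent to its own offset modulo the step, rows written in different units never collide when they are replicated to each other.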

Global Read‑Only and Disaster Recovery

RDS provides global read‑only instances and remote read‑only learners that sync data without participating in leader election, using Xcluster native replication for consistency. Each read‑only instance offers two AZ replicas for RPO=0 high availability and cross‑region disaster recovery.

Kernel Xcluster Multi‑Point Write

The Xcluster kernel adds multi‑point write capability for inventory workloads: the primary and the standby accept writes simultaneously without lock conflicts, which puts standby resources to use and improves read/write performance. It implements group‑level one‑way replication, similar to MySQL channel replication, so that each group has an independent data channel and high availability.

ARM Localization Pilot

In 2020 the ARM localization pilot ran MySQL on Ext4. In 2021 it achieved a performance breakthrough by moving MySQL's file I/O from kernel space to user space, with MySQL interfacing directly with a POSIX file system interface to improve efficiency. ARM nodes now handle replication traffic in production.

Green Low‑Carbon Initiative

Through technical optimization, the Zhangbei promotion unit cut costs by more than 30% while remaining stable and running greener, lower‑carbon operations.

RDS Resource Scheduling: Exclusive‑Shared Mixed Deployment

A new exclusive‑shared mixed deployment co‑locates core cluster instances with long‑tail instances, saving 45,000 CPU cores and cutting overall promotion cost by 34.5%. CgroupController tracks which CPU cores are exclusive and which are shared, and dynamically adjusts the cores available to shared pods as exclusive pods come and go.
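The bookkeeping described above can be sketched as follows. This is an illustration only, not the CgroupController implementation: the `CpuPoolTracker` class, its method names, and the pod names are all hypothetical, and the sketch assumes the shared pool is simply whatever exclusive pods have not claimed.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class CpuPoolTracker:
    """Tracks exclusive cores per pod; the remainder is the shared pool."""

    total_cores: int
    exclusive: Dict[str, int] = field(default_factory=dict)

    def shared_cores(self) -> int:
        # Shared pods may only use cores not pinned to an exclusive pod.
        return self.total_cores - sum(self.exclusive.values())

    def add_exclusive(self, pod: str, cores: int) -> int:
        """Pin cores to an exclusive pod and return the new shared pool size."""
        if cores <= 0:
            raise ValueError("cores must be positive")
        if pod in self.exclusive:
            raise ValueError(f"{pod} already holds exclusive cores")
        if cores > self.shared_cores():
            raise ValueError("not enough free cores on this host")
        self.exclusive[pod] = cores
        return self.shared_cores()

    def remove_exclusive(self, pod: str) -> int:
        """Release a pod's cores back to the shared pool."""
        del self.exclusive[pod]  # raises KeyError for an unknown pod
        return self.shared_cores()


tracker = CpuPoolTracker(total_cores=64)
after_trade = tracker.add_exclusive("trade-db", 32)        # shared pool shrinks
after_inventory = tracker.add_exclusive("inventory-db", 16)
after_release = tracker.remove_exclusive("trade-db")       # shared pool grows back
```

In a real system the returned shared pool size would be written into the cgroup CPU settings of the shared pods; that step is left out here.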

Business‑Attribute Anti‑Affinity Flexible Scheduling

RDS introduces anti‑affinity tags allowing custom business‑driven placement, such as exclusive allocation for transaction databases and dual‑instance deployment for inventory, meeting diverse scheduling requirements.
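A minimal sketch of tag‑based anti‑affinity placement, under stated assumptions: the rule table, the tag names, and the `place` helper are hypothetical, and the rules are only one possible reading of the examples above (transaction databases share a host with nothing else, and two inventory instances never share a host).

```python
from typing import Dict, List, Optional, Set

# Hypothetical rules: tag -> tags it must not share a host with.
RULES: Dict[str, Set[str]] = {
    "trade": {"trade", "inventory", "long_tail"},
    "inventory": {"inventory"},
}


def conflicts(tag_a: str, tag_b: str, rules: Dict[str, Set[str]]) -> bool:
    """Anti-affinity is treated as symmetric: either side may forbid the pair."""
    return tag_b in rules.get(tag_a, set()) or tag_a in rules.get(tag_b, set())


def place(tag: str, hosts: Dict[str, List[str]],
          rules: Dict[str, Set[str]]) -> Optional[str]:
    """Put the instance on the first host with no conflicting tag, if any."""
    for name in sorted(hosts):
        if not any(conflicts(tag, placed, rules) for placed in hosts[name]):
            hosts[name].append(tag)
            return name
    return None  # no host satisfies the anti-affinity rules


hosts: Dict[str, List[str]] = {
    "host1": ["long_tail"],
    "host2": ["inventory"],
    "host3": [],
}
inventory_host = place("inventory", hosts, RULES)
trade_host = place("trade", hosts, RULES)
second_trade_host = place("trade", hosts, RULES)
```

Here the inventory instance avoids `host2`, the transaction instance only fits on the empty `host3`, and a second transaction instance cannot be placed at all until another empty host is added.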

RDS Kernel Features

How to Achieve RPO=0?

The RDS three‑node enterprise edition (Leader, Follower, Logger) uses the Paxos protocol to guarantee zero data loss. The Leader sends each transaction to the other nodes and commits it only after a majority has acknowledged it. If the Leader crashes, the new Leader rebuilds its log from its local log and the logs of its peers. The Logger stores only the binlog and minimal metadata, which reduces cost.
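The majority rule behind RPO=0 can be illustrated with a few lines of arithmetic. This is a sketch of the quorum check only, not of Paxos or of the Xcluster implementation. The node names are hypothetical, and the sketch assumes that read‑only learners, which do not vote, are also not counted toward the commit quorum.

```python
from typing import Set


def majority(voter_count: int) -> int:
    """Smallest number of voters that forms a majority."""
    return voter_count // 2 + 1


def can_commit(acks: Set[str], voters: Set[str]) -> bool:
    """A transaction commits once a majority of voting nodes hold its log."""
    # Acknowledgments from non-voters (for example learners) are ignored.
    return len(acks & voters) >= majority(len(voters))


VOTERS = {"leader", "follower", "logger"}

only_leader = can_commit({"leader"}, VOTERS)
leader_and_logger = can_commit({"leader", "logger"}, VOTERS)
leader_and_learner = can_commit({"leader", "learner_1"}, VOTERS)
```

With three voters the majority is two, so a committed transaction always exists on at least one surviving voter after any single node fails. That is also why the Logger, which keeps only the log, is enough to complete the quorum.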

How to Track and Throttle Slow Queries?

During Double 11, RDS optimized multi‑point write, Statement Queue, and slow‑SQL throttling. Sudden bursts of high‑resource queries can exhaust thread‑pool resources, degrading performance. RDS uses a thread‑pool model where each query first acquires a thread; slow queries consume CPU and thread slots, starving other queries.

RDS implements slow‑query detection and throttling based on Statement Digest, and every instance has been upgraded to the new kernel version that provides it.

The Slow Query Blocker (SQB) consists of a pattern‑matching module and a detection module. When queries that match a known pattern exceed the concurrency threshold, the matching module triggers the predefined throttling action. The detection module records slow queries, updates the pattern list, and applies custom throttling policies.

After enabling SQB throttling, normal SQL continues unaffected while slow queries no longer exhaust thread‑pool resources, improving stability.
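The detect‑then‑throttle loop described above can be sketched as follows. This is not the RDS kernel code: the `digest` function is a simplified stand‑in for a real statement digest (it only replaces string and numeric literals), and the `SlowQueryBlocker` class, its thresholds, and its method names are hypothetical.

```python
import re
from collections import Counter
from typing import Set

_STRING = re.compile(r"'(?:[^'\\]|\\.)*'")
_NUMBER = re.compile(r"\b\d+(?:\.\d+)?\b")
_SPACE = re.compile(r"\s+")


def digest(sql: str) -> str:
    """Normalize a statement so queries differing only in literals match."""
    text = _STRING.sub("?", sql)
    text = _NUMBER.sub("?", text)
    return _SPACE.sub(" ", text).strip().lower()


class SlowQueryBlocker:
    def __init__(self, slow_threshold_s: float, max_concurrency: int) -> None:
        self.slow_threshold_s = slow_threshold_s
        self.max_concurrency = max_concurrency
        self.slow_patterns: Set[str] = set()   # digests seen running slowly
        self.running: Counter = Counter()      # in-flight queries per digest

    def try_start(self, sql: str) -> bool:
        """Matching step: reject known-slow patterns over the concurrency cap."""
        d = digest(sql)
        if d in self.slow_patterns and self.running[d] >= self.max_concurrency:
            return False
        self.running[d] += 1
        return True

    def finish(self, sql: str, elapsed_s: float) -> None:
        """Detection step: record patterns that ran past the slow threshold."""
        d = digest(sql)
        if self.running[d] > 0:
            self.running[d] -= 1
        if elapsed_s >= self.slow_threshold_s:
            self.slow_patterns.add(d)


blocker = SlowQueryBlocker(slow_threshold_s=1.0, max_concurrency=1)
blocker.try_start("SELECT * FROM orders WHERE buyer = 'a'")
blocker.finish("SELECT * FROM orders WHERE buyer = 'a'", elapsed_s=2.5)

first_slow = blocker.try_start("SELECT * FROM orders WHERE buyer = 'b'")
second_slow = blocker.try_start("SELECT * FROM orders WHERE buyer = 'c'")
normal = blocker.try_start("SELECT 1")
```

Only the pattern that was observed to be slow is capped, so unrelated statements keep acquiring threads as usual.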

Future Outlook

All major cloud providers are advancing IPv6 support for RDS, and Alibaba Cloud will adopt dual‑stack mode for some group services. RDS localization trials are ongoing, and resource scheduling will keep evolving toward larger mixed‑deployment models to reduce promotion costs further.
