Inside Alibaba’s Database Hacks That Quadrupled Double‑11 Query Speed

The article details how Alibaba’s database team engineered multiple breakthroughs—PK_Access query optimization, a high‑performance Memcached plugin, batch‑processed inventory hotspot handling, full‑scale SQL collection, and an upgraded Data Replication Center—to dramatically boost Double‑11 transaction throughput, cut latency, and enhance overall system resilience.

Alibaba Cloud Developer

Jan 12, 2017

Preface

During Double 11 in 2016, massive e-commerce traffic put Alibaba Group's databases to the test. Behind the smooth performance was sustained engineering work by the database team.

Transaction Query Optimization

During the 2014 Double 11 event, transaction-level AliSQL queries had already reached an enormous scale. To ensure stability the following year, the team spent 2015 analyzing business characteristics and every SQL statement, and ultimately built the PK_Access optimization into the optimizer.

When Double 11 arrived in 2015, PK_Access improved query capability by 27% and reduced response time by 48%.

Beyond the optimization, the team put extensive effort into defining flow-control thresholds for each interface, which resulted in 125 external throttling plans for the 2015 transaction system.

To further reduce parsing overhead, the team revived the rarely used MySQL InnoDB Memcached plugin.

After the team fixed 15 bugs and added six new features, the plugin reached up to 420 k QPS compared with 120 k QPS for pure SQL, a roughly 3.5× increase. Single-query response time dropped by about 30%, and CPU load fell from 72% to 45% under the same traffic.
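In stock MySQL, the InnoDB memcached plugin lets clients read a row by key over the memcached protocol instead of sending SQL, which is why it avoids SQL parsing. The sketch below only illustrates what such a key lookup looks like on the wire in the memcached text protocol. It does not contact a server, the key `order:1001` is a hypothetical example, and nothing here reflects the features Alibaba added to its own version of the plugin.

```python
# Illustrative only: builds and parses memcached text-protocol messages.
# No network access; the key names used with it are hypothetical.

def build_get(key: str) -> bytes:
    """Encode a memcached text-protocol GET request for one key."""
    if not key or len(key) > 250 or any(c.isspace() for c in key):
        raise ValueError("key must be 1-250 characters with no whitespace")
    return f"get {key}\r\n".encode("ascii")


def parse_get_response(data: bytes) -> dict:
    """Decode a GET response into {key: value}; a miss yields an empty dict."""
    result = {}
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        header = data[pos:line_end]
        if header == b"END":
            return result
        # Header format: VALUE <key> <flags> <bytes>
        tag, key, _flags, length = header.split(b" ")[:4]
        if tag != b"VALUE":
            raise ValueError(f"unexpected line: {header!r}")
        start = line_end + 2
        end = start + int(length)
        result[key.decode("ascii")] = data[start:end]
        pos = end + 2  # skip the \r\n that terminates the data block
```

A request is a single short line and the response carries the value bytes directly, so the server has no statement to parse or plan for this kind of primary-key read.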

During the 2016 Double 11 event, peak transaction queries exceeded ten million per second for the first time.

Inventory Center Deduction Capability

Hotspot inventory has long been a pain point for flash‑sale events. Since 2013 the team has been improving single‑item deduction speed, raising it from 500 rows/s in native MySQL to 5 000 rows/s by 2015.

In 2016, a batch-processing patch called “hotspot monster” introduced two pipeline submissions with group commit, boosting single-item hotspot deduction to 100 k rows/s, a 20× improvement over the 2015 figure. It also added automatic hotspot detection, so non-hot items are not affected.
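The general idea behind batching hot-row updates can be sketched as follows: rather than taking and releasing the row lock once per deduction, queued requests for the same item are folded into a single stock write. This is a simplified, in-memory illustration under that assumption and is not the AliSQL patch. The `Deduction` type and `apply_batch` function are hypothetical names, and pipelining, group commit, and hotspot detection are all omitted.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Deduction:
    request_id: int
    item_id: str
    quantity: int


def apply_batch(stock: dict, batch: list) -> dict:
    """Apply queued deductions with one stock write per item.

    Returns {request_id: True/False}. Requests are served in arrival
    order; a request that would oversell the item is rejected.
    """
    by_item = defaultdict(list)
    for req in batch:
        by_item[req.item_id].append(req)

    outcome = {}
    for item_id, reqs in by_item.items():
        remaining = stock.get(item_id, 0)
        for req in reqs:
            if 0 < req.quantity <= remaining:
                remaining -= req.quantity
                outcome[req.request_id] = True
            else:
                outcome[req.request_id] = False
        stock[item_id] = remaining  # single write per item per batch
    return outcome
```

With a batch of N deductions against one hot item, the stock value is written once instead of N times, which is the kind of amortization that batching is meant to provide.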

The upgraded inventory system ran smoothly during the 2016 Double 11, effectively eliminating hotspot issues for future events.

Database Platform Upgrade

The database’s core consists of data and SQL; both are indispensable.

In early 2011, the first‑generation monitoring system Tianji was built, featuring the MyAWR SQL collection tool that captured 10 000 network packets via TCPDUMP, parsed them into SQL statements, and calculated per‑minute SQL counts.

This method suffered from accuracy limitations and was used until 2013.
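The counting step of such a tool can be sketched as below. This is a minimal illustration, assuming the packets have already been captured and decoded into `(timestamp, sql_text)` pairs. The `fingerprint` normalization, which replaces literals with `?` so that similar statements group together, is an assumption and is not described in the source.

```python
import re
from collections import Counter
from datetime import datetime

_STRING = re.compile(r"'(?:[^'\\]|\\.)*'")   # single-quoted string literals
_NUMBER = re.compile(r"\b\d+(?:\.\d+)?\b")   # standalone numeric literals
_SPACE = re.compile(r"\s+")


def fingerprint(sql: str) -> str:
    """Replace literals with '?' and normalize whitespace and case."""
    s = _STRING.sub("?", sql)
    s = _NUMBER.sub("?", s)
    return _SPACE.sub(" ", s).strip().lower()


def per_minute_counts(samples) -> Counter:
    """Count statements per (minute, fingerprint) from (datetime, sql) pairs."""
    counts = Counter()
    for ts, sql in samples:
        minute = ts.strftime("%Y-%m-%d %H:%M")
        counts[(minute, fingerprint(sql))] += 1
    return counts
```

Counts produced this way describe only the statements that appear in the captured packets, so they are estimates of the real workload rather than exact totals.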

From 2014 to 2015 the team developed a MySQL plugin called DAM (DataBase Activity Monitor) for event logging and security auditing, but it was eventually abandoned.

In 2016 the team set a new goal: full-scale SQL collection, even during Double 11, that captures not only the SQL text but also execution time, scanned rows, and other key metrics. This is comparable to running INFO-level logging under a load of 10 k requests per second.

Through intensive collaboration, a second‑generation full‑SQL collection system was delivered.

On Double 11 2016, the full‑SQL output impacted the database by less than 5%, while real‑time processing reached tens of millions of rows per second, with average latency under 100 ms. Collected data is stored in SLS/ODPS and analyzed via AliRocks.

CloudDBA now leverages this data for offline analysis, real‑time performance monitoring, and future machine‑learning‑driven diagnostics and resource forecasting.

The Data Replication Center (DRC) also saw major upgrades: supporting tens of thousands of sync processes per cluster, halving API response time, accelerating failover with Akka, merging regions for load balancing, and maintaining sub‑500 ms end‑to‑end sync latency under peak load.

DRC added row‑level verification, encrypted transmission, automatic migration handling, and full SQL query capability for downstream users, enabling rapid data quality checks, reconciliation, and lightweight incremental reporting.
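Row-level verification between a source and a replicated target can be illustrated with a digest comparison keyed by primary key. The sketch below is a generic example under that assumption, not DRC's actual mechanism. The function names and the in-memory `{primary_key: row}` snapshots are hypothetical stand-ins for reads from the two databases.

```python
import hashlib


def row_digest(row: dict) -> str:
    """Hash a row's columns in sorted-name order so column order is irrelevant."""
    h = hashlib.sha256()
    for col in sorted(row):
        h.update(col.encode("utf-8"))
        h.update(b"\x00")  # separator between column name and value
        h.update(repr(row[col]).encode("utf-8"))
        h.update(b"\x01")  # separator between columns
    return h.hexdigest()


def diff_rows(source: dict, target: dict) -> dict:
    """Compare two {primary_key: row} snapshots by digest."""
    missing = sorted(k for k in source if k not in target)
    extra = sorted(k for k in target if k not in source)
    mismatched = sorted(
        k for k in source
        if k in target and row_digest(source[k]) != row_digest(target[k])
    )
    return {"missing": missing, "extra": extra, "mismatched": mismatched}
```

Comparing digests rather than full rows keeps the amount of data exchanged between the two sides small, and the three result lists map directly onto reconciliation actions: re-sync, delete, or repair.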

Conclusion

These examples illustrate only a fraction of the Alibaba database team’s optimizations for Double 11; the team continues to pursue extreme performance, embrace new challenges, and deliver innovative database solutions for the future.
