How Xiaomi Scaled Its E‑Commerce Platform: From Monolith to Cloud‑Native Architecture
This article chronicles Xiaomi's e‑commerce platform evolution, detailing the shift from a simple monolithic design to a modular, sharded, and cloud‑native architecture that leverages async messaging, horizontal database partitioning, flash‑sale systems, dual‑data‑center caching, and sophisticated monitoring to handle massive traffic spikes.
Overview
The first generation of Xiaomi's website had a very simple architecture, as shown in Figure 1.
It implemented basic e‑commerce components: an online sales system, order processing, and warehousing/logistics (integrated with a partner). All business systems shared a single database, which soon became a bottleneck as traffic surged during new product launches.
Changes and Iterations
In early 2012, after six months of operation, the team split the business systems, first extracting the sales system and then progressively separating other subsystems. Each subsystem obtained its own database, eliminating resource contention and clarifying module boundaries (Figure 2).
As subsystems grew, the number of inter‑service interfaces exploded (Figure 3), making maintenance increasingly difficult.
To decouple services, Xiaomi built an asynchronous message service (Notify) that acted as a central broker, turning the mesh of calls into a star topology and greatly reducing communication overhead (Figure 4).
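The effect of a central broker can be sketched in a few lines. The following is a minimal in-process illustration, not Notify's actual design: the `Broker` class, its method names, and the topic string are all hypothetical, and a real broker would persist messages and deliver them over the network.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple

Handler = Callable[[dict], None]


class Broker:
    """Toy stand-in for a central message broker (hypothetical API)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = {}
        self._queue: Deque[Tuple[str, dict]] = deque()

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic: str, message: dict) -> None:
        # The producer only enqueues; it never calls consumers directly.
        self._queue.append((topic, message))

    def deliver_pending(self) -> int:
        """Drain the queue and return the number of handler invocations."""
        delivered = 0
        while self._queue:
            topic, message = self._queue.popleft()
            for handler in self._subscribers.get(topic, []):
                handler(message)
                delivered += 1
        return delivered


def mesh_links(services: int) -> int:
    # Worst case: every service calls every other service directly.
    return services * (services - 1)


def star_links(services: int) -> int:
    # With a broker, each service needs one connection to the hub.
    return services
```

With eight subsystems, a full mesh needs up to `mesh_links(8) == 56` directed interfaces, while the star needs `star_links(8) == 8` connections to the broker.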
The upgraded architecture adopted a three‑layer model: a scheduling layer (LVS, HAProxy) for traffic forwarding and failover, a heterogeneous business layer (multiple languages and frameworks), and a data layer (MySQL, NoSQL, Redis, Memcached).
When traffic continued to grow, especially during flash‑sale events, the team introduced the open‑source Cobar middleware to horizontally shard the database across 32 instances, each with dual‑master high availability (Figure 5).
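Horizontal sharding comes down to mapping a sharding key to one of the 32 instances. The sketch below assumes a hash-modulo rule on a user ID. The article does not say which column or rule Xiaomi configured in Cobar, so the key, the CRC32 hash, and the host naming scheme are illustrative assumptions only.

```python
import zlib

SHARD_COUNT = 32  # the article's figure: 32 database instances


def shard_for(user_id: str, shard_count: int = SHARD_COUNT) -> int:
    """Map a sharding key to a shard index in [0, shard_count).

    CRC32 is used because it is stable across processes, unlike
    Python's built-in hash() for strings.
    """
    return zlib.crc32(user_id.encode("utf-8")) % shard_count


def dsn_for(user_id: str) -> str:
    # Hypothetical naming scheme; real hosts would come from configuration.
    return f"mysql://shard-{shard_for(user_id):02d}.example.internal/orders"
```

In a middleware such as Cobar this routing is hidden from the application, which keeps talking to what looks like a single MySQL endpoint.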
For massive purchase spikes, Xiaomi built a large‑scale flash‑sale system called BigTap, which works like a bank ticket‑issuing machine: it validates users, grants purchase eligibility, and processes successful or failed purchases (Figure 6).
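The ticket-machine analogy can be illustrated with a small dispenser that hands out a fixed number of purchase tickets, at most one per user. This is a single-process sketch under stated assumptions: the class name, the result strings, and the one-ticket-per-user rule are hypothetical, and BigTap itself is a distributed system whose internals the article does not describe.

```python
import threading
from typing import Set


class TicketDispenser:
    """Grants purchase eligibility until stock runs out (illustrative only)."""

    def __init__(self, stock: int) -> None:
        self._remaining = stock
        self._holders: Set[str] = set()
        self._lock = threading.Lock()

    def request(self, user_id: str, is_valid_user: bool = True) -> str:
        # Step 1: validate the user before touching shared state.
        if not is_valid_user:
            return "rejected"
        # Step 2: grant eligibility atomically so stock is never oversold.
        with self._lock:
            if user_id in self._holders:
                return "duplicate"
            if self._remaining <= 0:
                return "sold_out"
            self._remaining -= 1
            self._holders.add(user_id)
            return "granted"
```

Only users who receive `"granted"` would proceed to the order system, which keeps the spike away from the transactional database.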
BigTap was deployed on AWS, scaled up a day before a sale and torn down afterward, achieving cost‑effective elasticity.
Beyond flash sales, Xiaomi created a high‑performance cache service (MCC) based on Redis and Twemproxy, handling up to 140k QPS per node and operating in a dual‑data‑center active‑active setup (Figure 7).
In normal operation both data centers serve reads and writes; if one center fails, the other takes over by promoting its replica, ensuring continuity though with reduced redundancy.
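One way to picture this failover is to treat each data center as the primary for some partitions while holding replicas of the other center's partitions. The model below is an assumption made for illustration. The article does not describe how MCC divides data, and the `DataCenter`, `route`, and `fail_over` names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class DataCenter:
    name: str
    healthy: bool = True
    primaries: Set[str] = field(default_factory=set)  # partitions served as master
    replicas: Set[str] = field(default_factory=set)   # copies of the peer's partitions


def route(partition: str, centers: List[DataCenter]) -> str:
    """Return the name of the healthy center that is primary for a partition."""
    for dc in centers:
        if dc.healthy and partition in dc.primaries:
            return dc.name
    raise LookupError(f"no healthy primary for {partition}")


def fail_over(failed: DataCenter, survivor: DataCenter) -> Set[str]:
    """Mark one center as down and promote the survivor's replicas of its data."""
    failed.healthy = False
    promoted = survivor.replicas & failed.primaries
    survivor.primaries |= promoted
    survivor.replicas -= promoted  # promoted copies are no longer spare replicas
    return promoted
```

After `fail_over`, the survivor is primary for every partition but holds no spare replicas, which is the reduced redundancy mentioned above.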
The inventory system was redesigned as a virtual allocation platform that aggregates multiple warehouses into flexible channels. This improves inventory turnover, at the cost of accepting some cross‑warehouse shipments (Figure 8).
Cross‑warehouse transfers are modeled as a multi‑objective linear programming problem that considers current and forecast demand, routes, and timing.
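The article does not give the actual model, so the following is only one plausible formulation, using a weighted sum to combine two objectives. All symbols are assumptions introduced here: $x_{ij}$ is the quantity moved from warehouse $i$ to warehouse $j$, $c_{ij}$ and $t_{ij}$ are the per-unit cost and transit time of that route, $h_k$ is stock on hand at warehouse $k$, $d_k$ and $\hat{d}_k$ are its current and forecast demand, and $\lambda \in [0, 1]$ weights cost against time.

```latex
\begin{aligned}
\min_{x \ge 0} \quad
  & \lambda \sum_{i,j} c_{ij}\, x_{ij}
    + (1 - \lambda) \sum_{i,j} t_{ij}\, x_{ij} \\
\text{s.t.} \quad
  & h_k + \sum_{i} x_{ik} - \sum_{j} x_{kj} \;\ge\; d_k + \hat{d}_k
    \qquad \forall k
\end{aligned}
```

The constraint says that each warehouse, after inbound and outbound transfers, must hold enough stock to cover its current and forecast demand. The problem is feasible only when total stock is at least total demand.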
Monitoring is critical. Xiaomi built an alerting system that distinguishes between anomalies and alerts. Anomaly rules are written with functions such as val(), count(), and exist(), for example val()>3. Alert rules decide when a detected anomaly is escalated and how often, for example times()>3 && limit(0) && snooze(30), which throttles notifications.
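The split between detecting an anomaly and raising an alert can be sketched as follows. The semantics here are assumptions, since the article does not define the functions: val() is taken to be the current metric value, times() the number of consecutive anomalous observations, and snooze(30) a 30-minute quiet period after an alert. limit() is left out because its meaning is not stated.

```python
from typing import Optional


class AlertRule:
    """Illustrative evaluator for 'val()>T' plus 'times()>N && snooze(M)'."""

    def __init__(self, threshold: float, min_times: int, snooze_seconds: float) -> None:
        self.threshold = threshold
        self.min_times = min_times
        self.snooze_seconds = snooze_seconds
        self._streak = 0
        self._last_alert_at: Optional[float] = None

    def observe(self, value: float, now: float) -> bool:
        """Record one sample; return True only when an alert should be sent."""
        is_anomaly = value > self.threshold           # assumed meaning of val() > T
        self._streak = self._streak + 1 if is_anomaly else 0
        if self._streak <= self.min_times:            # assumed meaning of times() > N
            return False
        snoozed = (
            self._last_alert_at is not None
            and now - self._last_alert_at < self.snooze_seconds
        )
        if snoozed:                                   # assumed meaning of snooze(M)
            return False
        self._last_alert_at = now
        return True
```

With `AlertRule(threshold=3, min_times=3, snooze_seconds=30 * 60)`, a metric that stays above 3 triggers an alert on the fourth consecutive anomalous sample and then stays quiet for the next 30 minutes.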
Summary
Xiaomi is currently migrating to a service‑oriented architecture built on Thrift, etcd, Go, and PHP. The core is a custom SOA framework written in Go, with lightweight adapters for non‑Go services. The team sees service orientation as the future direction for its expanding technical ecosystem.
21CTO
21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.
