How Xiaomi Scaled Its E‑Commerce Platform: From Monolith to Cloud‑Native Architecture

This article chronicles Xiaomi's e‑commerce platform evolution, detailing the shift from a simple monolithic design to a modular, sharded, and cloud‑native architecture that leverages async messaging, horizontal database partitioning, flash‑sale systems, dual‑data‑center caching, and sophisticated monitoring to handle massive traffic spikes.

21CTO
Overview

The first generation of Xiaomi's website had a very simple architecture, as shown in Figure 1.

Figure 1: First‑generation architecture

It implemented basic e‑commerce components: an online sales system, order processing, and warehousing/logistics (integrated with a partner). All business systems shared a single database, which soon became a bottleneck as traffic surged during new product launches.

Changes and Iterations

In early 2012, after six months of operation, the team split the business systems, first extracting the sales system and then progressively separating other subsystems. Each subsystem obtained its own database, eliminating resource contention and clarifying module boundaries (Figure 2).

Figure 2: Split business systems

As subsystems grew, the number of inter‑service interfaces exploded (Figure 3), making maintenance increasingly difficult.

Figure 3: Interface call graph

To decouple services, Xiaomi built an asynchronous message service (Notify) that acted as a central broker, turning the mesh of calls into a star topology and greatly reducing communication overhead (Figure 4).

Figure 4: Notify asynchronous messaging system
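
The broker idea can be illustrated with a toy in-process publish/subscribe sketch. It is not Notify's actual design, which the article does not describe; the class name Broker, the topic "order.created", and the message fields are all hypothetical. The point it shows is that a producer only talks to the broker, so adding a consumer means one new subscription rather than a new point-to-point interface.

```python
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, List, Tuple

Handler = Callable[[dict], None]


class Broker:
    """Toy in-process stand-in for a central message broker (hypothetical)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)
        self._queue: Deque[Tuple[str, dict]] = deque()

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        # The producer returns immediately; delivery happens later in drain().
        self._queue.append((topic, message))

    def drain(self) -> int:
        """Deliver all queued messages; return the number of deliveries made."""
        delivered = 0
        while self._queue:
            topic, message = self._queue.popleft()
            for handler in list(self._subscribers.get(topic, [])):
                handler(message)
                delivered += 1
        return delivered


broker = Broker()
warehouse_log: List[dict] = []
billing_log: List[dict] = []
broker.subscribe("order.created", warehouse_log.append)
broker.subscribe("order.created", billing_log.append)
broker.publish("order.created", {"order_id": 1001})
```

Calling broker.drain() afterwards delivers the one published message to both subscribers. A production broker would also need persistence, retries, and delivery across processes, none of which are modelled here.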

The upgraded architecture adopted a three‑layer model: a scheduling layer (LVS, HAProxy) that forwards traffic and handles failover, a heterogeneous business layer in which services are written in multiple languages and frameworks, and a data layer (MySQL, NoSQL, Redis, Memcached).

When traffic continued to grow, especially during flash‑sale events, the team introduced the open‑source Cobar middleware to horizontally shard the database across 32 instances, each with dual‑master high availability (Figure 5).

Figure 5: Cobar database sharding
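
As a rough illustration of horizontal sharding, the sketch below maps a sharding key to one of 32 shards with a stable hash. This is not Cobar's routing algorithm (Cobar's rules are defined in its own configuration); the choice of CRC32, the key format, and the name pattern orders_shard_NN are assumptions made for the example.

```python
import zlib

SHARD_COUNT = 32  # the article mentions 32 database instances


def shard_for(key: str, shard_count: int = SHARD_COUNT) -> int:
    """Map a sharding key to a shard index in [0, shard_count)."""
    # crc32 is stable across processes, unlike Python's built-in hash() for str.
    return zlib.crc32(key.encode("utf-8")) % shard_count


def dsn_for(key: str) -> str:
    """Return a hypothetical connection name for the shard that owns `key`."""
    return f"orders_shard_{shard_for(key):02d}"
```

Because the mapping depends only on the key, every application server routes the same user or order to the same shard without coordination. The trade-off of plain modulo hashing is that changing the shard count remaps most keys.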

For massive purchase spikes, Xiaomi built a large‑scale flash‑sale system called BigTap. It works like the queue‑ticket machine at a bank: it validates users, grants purchase eligibility to a limited number of them, and then handles each purchase as a success or a failure (Figure 6).

Figure 6: BigTap flash‑sale system
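
The ticket-machine analogy can be sketched as a small single-process model. It only illustrates the validate, grant, and reject flow described above; the class TicketMachine, its fields, and the status strings are hypothetical, and the real BigTap implementation is not described in the article.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class TicketMachine:
    stock: int                 # units available for this sale
    registered: Set[str]       # users who passed validation beforehand
    issued: Dict[str, int] = field(default_factory=dict)  # user -> ticket number

    def request(self, user_id: str) -> str:
        if user_id not in self.registered:
            return "rejected"          # failed validation
        if user_id in self.issued:
            return "already_issued"    # at most one ticket per user
        if len(self.issued) >= self.stock:
            return "sold_out"          # no eligibility left to grant
        self.issued[user_id] = len(self.issued) + 1
        return "granted"


machine = TicketMachine(stock=2, registered={"alice", "bob", "carol"})
```

Only users holding a ticket would go on to the ordering system, which keeps the load on downstream services bounded by the stock size rather than by the number of visitors. A real deployment would need an atomic counter shared across many servers instead of a local dictionary.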

BigTap ran on AWS: capacity was scaled up a day before each sale and torn down afterward, so peak capacity was paid for only when it was needed.

Beyond flash sales, Xiaomi created a high‑performance cache service (MCC) based on Redis and Twemproxy, handling up to 140k QPS per node and operating in a dual‑data‑center active‑active setup (Figure 7).

Figure 7: Dual‑data‑center cache architecture

In normal operation both data centers serve reads and writes. If one center fails, the other promotes its replica and takes over all traffic, which preserves continuity but leaves the system with reduced redundancy until the failed center recovers.
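
A minimal sketch of this failover behaviour follows. It assumes that each key has an owning data center and that the peer keeps a replica of it; the article does not say how MCC partitions or replicates data, so the ownership rule (CRC32 modulo 2), the synchronous replication, and all class names are assumptions.

```python
import zlib
from typing import Dict, List, Optional


class DataCenter:
    def __init__(self, name: str) -> None:
        self.name = name
        self.healthy = True
        self.primary: Dict[str, str] = {}  # keys this center owns
        self.replica: Dict[str, str] = {}  # copy of the peer's keys


class DualCenterCache:
    def __init__(self) -> None:
        self.centers: List[DataCenter] = [DataCenter("dc-a"), DataCenter("dc-b")]

    def owner_index(self, key: str) -> int:
        return zlib.crc32(key.encode("utf-8")) % 2

    def set(self, key: str, value: str) -> None:
        i = self.owner_index(key)
        owner, peer = self.centers[i], self.centers[1 - i]
        if owner.healthy:
            owner.primary[key] = value
            if peer.healthy:
                peer.replica[key] = value  # synchronous here for simplicity
        elif peer.healthy:
            peer.replica[key] = value      # promoted replica now takes writes
        else:
            raise RuntimeError("both data centers are down")

    def get(self, key: str) -> Optional[str]:
        i = self.owner_index(key)
        owner, peer = self.centers[i], self.centers[1 - i]
        if owner.healthy:
            return owner.primary.get(key)
        if peer.healthy:
            return peer.replica.get(key)   # served from the promoted replica
        raise RuntimeError("both data centers are down")
```

Marking a center as unhealthy makes reads and writes for its keys fall through to the peer's replica. Resynchronizing the failed center once it comes back is deliberately left out of the sketch.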

The inventory system was redesigned as a virtual allocation platform that aggregates multiple warehouses into flexible channels, improving turnover at the cost of some cross‑warehouse shipments (Figure 8).

Figure 8: Virtual inventory allocation

Cross‑warehouse transfers are modeled as a multi‑objective linear programming problem that considers current and forecast demand, routes, and timing.
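
The article does not give the actual model, so the following is only one plausible shape for such a problem, written as a weighted-sum scalarization of two objectives (shipping cost and transit time). All symbols are assumptions: x_{ij} is the quantity moved from warehouse i to warehouse j, s_j is current stock, d_j is current plus forecast demand, c_{ij} and t_{ij} are per-unit cost and transit time on route (i, j), and \lambda weights time against cost.

```latex
\begin{aligned}
\min_{x}\quad & \sum_{i \neq j} \left( c_{ij} + \lambda\, t_{ij} \right) x_{ij} \\
\text{s.t.}\quad
& s_j + \sum_{i \neq j} x_{ij} - \sum_{k \neq j} x_{jk} \;\ge\; d_j
  && \forall j \quad \text{(demand is covered after transfers)} \\
& \sum_{k \neq j} x_{jk} \;\le\; s_j
  && \forall j \quad \text{(a warehouse cannot ship more than it holds)} \\
& x_{ij} \;\ge\; 0
  && \forall i \neq j
\end{aligned}
```

Varying \lambda trades cheaper routes against faster ones, which is one standard way to reduce a multi-objective linear program to a single-objective one.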

Monitoring is critical. Xiaomi built an alert system that treats anomaly detection and alerting as separate steps. Rules are written with functions such as val(), count(), and exist(): an expression like val()>3 flags an anomaly, while an expression like times()>3 && limit(0) && snooze(30) decides when anomalies are escalated into a throttled alert.
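
The split between anomalies and alerts can be sketched as below. The article does not define the rule functions, so the semantics here are assumptions: val() is taken as the latest sample, times() as the number of consecutive anomalous samples, and snooze(30) as a 30-minute quiet period after an alert fires. limit(0) is not modelled because its meaning is not stated. The class AlertRule and its fields are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlertRule:
    anomaly_threshold: float = 3.0  # mirrors val() > 3 (assumed meaning)
    min_times: int = 3              # mirrors times() > 3 (assumed meaning)
    snooze_minutes: float = 30.0    # mirrors snooze(30) (assumed meaning)
    consecutive: int = 0
    last_alert_at: Optional[float] = None

    def observe(self, value: float, now_minutes: float) -> bool:
        """Record one sample; return True if an alert should be sent."""
        if value <= self.anomaly_threshold:
            self.consecutive = 0    # not an anomaly: reset the streak
            return False
        self.consecutive += 1
        if self.consecutive <= self.min_times:
            return False            # anomaly recorded, but not yet an alert
        snoozed = (
            self.last_alert_at is not None
            and now_minutes - self.last_alert_at < self.snooze_minutes
        )
        if snoozed:
            return False            # an alert fired recently; stay quiet
        self.last_alert_at = now_minutes
        return True


rule = AlertRule()
results = [rule.observe(5.0, float(t)) for t in range(6)]
```

With one anomalous sample per minute, the first three samples only count as anomalies, the fourth raises an alert, and later samples stay silent until the snooze window has passed.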

Summary

Currently, Xiaomi is migrating to a service‑oriented architecture (SOA) built on Thrift, etcd, Go, and PHP. Its core is a custom SOA framework written in Go, with lightweight adapters for non‑Go services. Xiaomi sees service orientation as the future direction for its expanding technical ecosystem.
