How Xiaomi Built a Million‑User Flash‑Sale System in One Week

This article recounts how Xiaomi built its first flash‑sale system in a single week at the end of 2011 and then rebuilt it for the 2014 "Mi Fan Festival", detailing the design choices, trade‑offs, and performance optimizations that let the platform absorb traffic from millions of simultaneous users.

ITFLY8 Architecture Home

2014 Mi Fan Festival

In the early hours of April 9, 2014, the Xiaomi team performed final checks and rehearsals for the website's flash‑sale system, just hours before the most important large‑scale event of the year, the "Mi Fan Festival".

The festival was a milestone for Xiaomi e‑commerce, serving as a comprehensive stress test for the front‑end, back‑end, warehousing, logistics, and after‑sales processes.

At 10:00 am, a massive traffic surge of millions of users was expected to flood the servers, with the newly redeveloped flash‑sale system bearing the brunt of the load.

The freshly launched system faced its first real‑world trial: could it sustain the pressure and execute the business logic correctly?

By 9:50 am traffic was already climbing. At 10:00 am the flash‑sale system opened automatically, and users were able to add the sale items to their shopping carts.

Within minutes the popular items sold out and the system stopped the sale, proving its resilience.

How the Flash‑Sale System Was Born

At the end of 2011 Xiaomi released its first phone, which drew 300,000 pre‑orders in just over a day. The first open sale on the Xiaomi store quickly overwhelmed the servers, causing database deadlocks and page timeouts.

With only a week to prepare for the next round of sales, the team faced a dilemma: continue optimizing the existing store or build a dedicated flash‑sale system. They chose the latter and set out to develop an independent solution within one week.

The goals were:

Complete design, development, testing, and deployment within one week.

Ensure the system runs smoothly; failure was not an option.

Guarantee reliable sale results.

Prevent overselling under massive concurrent demand.

Limit each user to one phone.

Provide the best possible user experience.

Given the need for simplicity and reliability, the team selected proven technologies: PHP for the back‑end and Redis for fast key/value storage.

In CAP terms, strong consistency would have been the bottleneck, so the team gave it up in favor of availability and partition tolerance and reconciled data asynchronously. The resulting design principles were:

Use PHP as the primary development language.

Simplify the purchase process to a single button click.

Minimize I/O per request.

Eliminate single‑point bottlenecks to allow linear scaling.

Handle data consistency asynchronously.

The core principle is illustrated in the first‑version architecture diagram:

The system uses a file on the PHP server to indicate whether a product is sold out. When a request arrives, the program checks the user's reservation status and the presence of the sold‑out flag. Successful reservations are logged and asynchronously sent to a central node for counting.
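The request path described above can be sketched as follows. This is a minimal Python sketch rather than the team's PHP code. The function names, the file paths, and the in‑memory set standing in for the reservation lookup are all assumptions made for illustration, and the article does not say which component creates the sold‑out file.

```python
import os
import tempfile


def handle_request(user_id, reserved_users, sold_out_flag, success_log):
    """Decide one flash-sale request using only cheap local checks."""
    # Users without a reservation are rejected first.
    if user_id not in reserved_users:
        return "not_reserved"
    # The mere presence of the flag file means the product is sold out.
    if os.path.exists(sold_out_flag):
        return "sold_out"
    # Record the success locally; a separate process would ship this log
    # to the central counting node asynchronously (not shown here).
    with open(success_log, "a", encoding="utf-8") as log:
        log.write(f"{user_id}\n")
    return "success"


def mark_sold_out(sold_out_flag):
    """Create the flag file. Which component calls this is an assumption."""
    with open(sold_out_flag, "w", encoding="utf-8"):
        pass


# Example run in a temporary directory (hypothetical file names).
workdir = tempfile.mkdtemp()
flag_path = os.path.join(workdir, "sold_out.flag")
log_path = os.path.join(workdir, "success.log")
reserved = {"user-1", "user-2"}

first = handle_request("user-1", reserved, flag_path, log_path)
stranger = handle_request("user-9", reserved, flag_path, log_path)
mark_sold_out(flag_path)
late = handle_request("user-2", reserved, flag_path, log_path)
```

Because each request touches only a local file and a local log, the per‑request I/O stays small, which matches the design principles listed earlier.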

Successful purchase lists are later imported into the main store for order placement, shielding the store from the traffic spike.
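One way the central counting step could turn the collected success logs into a final purchase list is sketched below. The article does not describe the counting logic, so the deduplication and cut‑off rules here are assumptions chosen to match the stated goals of no overselling and one phone per user, and `count_successes` is a hypothetical name.

```python
def count_successes(log_entries, inventory):
    """Return (winners, sold_out) from success entries in arrival order.

    Assumed rules: each user can win at most once, and the list of
    winners never exceeds the available inventory.
    """
    winners = []
    seen = set()
    for user_id in log_entries:
        if len(winners) >= inventory:
            break  # inventory exhausted; later entries are dropped
        if user_id in seen:
            continue  # enforce one phone per user
        seen.add(user_id)
        winners.append(user_id)
    return winners, len(winners) >= inventory


# "a" appears twice but wins only once; "d" arrives after stock runs out.
winners, sold_out = count_successes(["a", "b", "a", "c", "d"], inventory=3)
```

In this sketch, `sold_out` turning true would be the signal to raise the sold‑out flag on the front‑end servers, and `winners` is the list that would later be imported into the main store.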

Redis was chosen because the data fits a simple key/value model, offers in‑memory speed, and provides sufficient replication and persistence options.

Reads and writes to Redis were the most frequent I/O operations. To keep them from becoming a bottleneck, the team split reads from writes: slave nodes served reads, and a single process handled writes to the master.

Short‑lived PHP connections could block Redis under peak load, so the team added Redis slaves to spread the connections. Writes were handled asynchronously by a dedicated process, and persistence was disabled on the read slaves to prevent latency spikes.
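The read‑write split can be illustrated with a small sketch. It uses in‑memory stand‑ins instead of a real Redis client, so `FakeStore`, `SplitClient`, and the key name are hypothetical, and copying writes to the replicas inside the writer thread is only a stand‑in for Redis replication. A thread plays the role of the dedicated writer process.

```python
import itertools
import queue
import threading


class FakeStore:
    """In-memory stand-in for one Redis node (hypothetical)."""

    def __init__(self):
        self.data = {}


class SplitClient:
    """Reads go to replicas (the article's slaves); writes go through one writer."""

    def __init__(self, master, replicas):
        self.master = master
        self.replicas = replicas
        self._next_replica = itertools.cycle(replicas)
        self._writes = queue.Queue()
        threading.Thread(target=self._drain, daemon=True).start()

    def get(self, key):
        # Round-robin over replicas; the master never serves reads.
        return next(self._next_replica).data.get(key)

    def set_async(self, key, value):
        # Callers return immediately; the writer applies the change later.
        self._writes.put((key, value))

    def _drain(self):
        # Single writer: all writes are applied one at a time, in order.
        while True:
            key, value = self._writes.get()
            self.master.data[key] = value
            for replica in self.replicas:  # stand-in for replication
                replica.data[key] = value
            self._writes.task_done()

    def flush(self):
        """Block until every queued write has been applied."""
        self._writes.join()


master = FakeStore()
replicas = [FakeStore(), FakeStore()]
client = SplitClient(master, replicas)

before_write = client.get("reserved:user-1")
client.set_async("reserved:user-1", "1")
client.flush()
after_write = client.get("reserved:user-1")
```

Adding entries to `replicas` raises read capacity without touching the write path, which is the scaling property the split is meant to provide.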

The deployment comprised roughly 30 servers, 20 running PHP and 10 running Redis, and it handled the flash‑sale traffic successfully.

Second‑Generation Flash‑Sale System

Two years later, the system was rebuilt for a larger "Mi Fan Festival" with multiple sale rounds and diverse products. The new design emphasized flexibility, operability, and scalability.

Key improvements included:

Isolation from the main store with a lightweight data exchange format.

Core components rewritten in Go for better memory efficiency and concurrency.

A two‑layer architecture: an HTTP service layer and a business‑logic layer, communicating via message queues.

The HTTP layer validates URLs, filters malicious traffic, provides captchas, enqueues user requests, and returns results from the business layer.

The business layer processes queued requests and pushes outcomes back to the HTTP layer.

Requests flow through the queue to Go workers, which process them sequentially and return the results. Sequential processing keeps product inventory consistent; in CAP terms, the design gives up partition tolerance to get that consistency.
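The queue‑based hand‑off between the two layers can be sketched as follows. The production system was written in Go; this Python sketch only illustrates the pattern, and `BusinessLayer`, `http_layer_submit`, and the product name are hypothetical. A single worker thread drains the queue, so inventory is only ever changed by one request at a time and needs no lock.

```python
import queue
import threading
from concurrent.futures import Future


class BusinessLayer:
    """Processes queued purchase requests strictly one at a time."""

    def __init__(self, inventory):
        self.inventory = dict(inventory)
        self.requests = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            user_id, product, result = self.requests.get()
            # Only this thread touches inventory, so updates cannot race.
            if self.inventory.get(product, 0) <= 0:
                result.set_result("sold_out")
            else:
                self.inventory[product] -= 1
                result.set_result("success")


def http_layer_submit(business, user_id, product):
    """Enqueue a request and return a Future for its outcome."""
    result = Future()
    business.requests.put((user_id, product, result))
    return result


# 50 requests arrive for 10 units; exactly 10 should succeed.
business = BusinessLayer({"phone": 10})
futures = [http_layer_submit(business, f"user-{i}", "phone") for i in range(50)]
outcomes = [f.result(timeout=5) for f in futures]
successes = outcomes.count("success")
```

The HTTP layer never reads or writes inventory itself; it only enqueues requests and waits for the outcome, which is what lets the business layer serialize all stock changes.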

Additional modules handle strategy control, anti‑scraping, and system management, as shown in the second‑generation architecture diagram.

During development, the team found that Go's HTTP package consumed excessive memory. Each request allocated 8 KB of buffers, far more than GET‑only traffic needed, and under load the allocations created GC pressure that could trigger an "avalanche" effect.

Solutions included:

Scaling out servers to keep memory usage below 50 %.

Customizing the HTTP package to reduce TCP read buffer size to 1 KB.

Expanding the buffer pool to one million entries.

Setting the Connection header to "close" to terminate idle TCP connections promptly.

These changes allowed the HTTP front‑end to sustain over one million stable connections.
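A rough calculation shows why the buffer size mattered at this scale. The sketch below multiplies the connection count by a per‑connection buffer size. The 8 KB and 1 KB figures come from the text above, but treating 1 KB as the whole per‑connection footprint is an assumption: the article only says the read buffer was reduced to 1 KB, so the second figure is a lower bound.

```python
def buffer_memory_gib(connections, bytes_per_connection):
    """Total buffer memory in GiB for the given number of connections."""
    return connections * bytes_per_connection / 2**30


CONNECTIONS = 1_000_000

# Default behavior described in the article: 8 KB of buffers per request.
default_gib = buffer_memory_gib(CONNECTIONS, 8 * 1024)  # about 7.63 GiB

# Lower bound after tuning, counting only the 1 KB read buffer.
tuned_gib = buffer_memory_gib(CONNECTIONS, 1 * 1024)  # about 0.95 GiB
```

At one million connections, the default buffers alone would occupy several gigabytes of heap that the garbage collector has to manage, which is consistent with the GC pressure the team reported.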

The second‑generation system successfully passed the Mi Fan Festival stress test.

Conclusion

Technical solutions must be driven by concrete problems; without a real‑world scenario, even the flashiest technology loses value. The flash‑sale system continues to evolve as new challenges arise.

Author: Han Zhupeng, Xiaomi programmer. He initially worked on MIUI release and operations and later led the design and development of Xiaomi’s flash‑sale system.

Source: http://www.csdn.net/article/2014-11-07/2822545
