Design and Evolution of Xiaomi's Notify Asynchronous Message System
This article traces the three‑stage evolution of Xiaomi's e‑commerce architecture, introduces the design of Notify, an asynchronous message system built on Redis and MySQL, and explains the later upgrades (an Agent proxy, Go‑based modules, and MyCAT integration) that improved its scalability, reliability, and performance.
To keep up with rapid business growth, Xiaomi's e‑commerce platform went through several architectural changes that culminated in an in‑house asynchronous messaging system called Notify.
Xiaomi Web Architecture Development
The platform's evolution falls into three phases. In the initial startup stage, the whole site ran on a simple setup of two Web servers and one DB server. In the development stage, services were split into independent subsystems and a SOA approach built on a custom X5 protocol was adopted. In the refinement stage, an asynchronous message queue became the central hub connecting most subsystems.
The development stage reduced coupling, but the subsystems still depended on synchronous interfaces, which made the platform fragile and prompted the move to an asynchronous solution.
Notify Message System Design
After analyzing business changes, Xiaomi decided to build its own async message system based on Redis queues, MySQL storage, and a set of APIs for receiving messages. The design addresses four core problems: receiving, storing, delivering, and monitoring messages.
The system defines five main tables (biz, receive, biz_receive, biz_msg, receive_msg) and introduces a message‑splitting mechanism that allows a single business message to be duplicated for multiple subscribers.
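The splitting step can be sketched as follows. This is a minimal illustration, not Xiaomi's actual schema: the source names only the five tables, so every column name, the `split_message` helper, and the example subscribers and URLs are assumptions, and SQLite stands in for MySQL so the sketch runs on its own.

```python
import sqlite3

# Column names are assumptions; the article names only the five tables.
SCHEMA = """
CREATE TABLE biz         (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE receive     (id INTEGER PRIMARY KEY, name TEXT UNIQUE, url TEXT);
CREATE TABLE biz_receive (biz_id INTEGER, receive_id INTEGER);
CREATE TABLE biz_msg     (id INTEGER PRIMARY KEY, biz_id INTEGER, body TEXT);
CREATE TABLE receive_msg (id INTEGER PRIMARY KEY, biz_msg_id INTEGER,
                          receive_id INTEGER, status TEXT DEFAULT 'pending');
"""


def split_message(conn, biz_name, body):
    """Store one business message, then add one receive_msg row per subscriber."""
    biz_id = conn.execute(
        "SELECT id FROM biz WHERE name = ?", (biz_name,)).fetchone()[0]
    cur = conn.execute(
        "INSERT INTO biz_msg (biz_id, body) VALUES (?, ?)", (biz_id, body))
    msg_id = cur.lastrowid
    subscribers = conn.execute(
        "SELECT receive_id FROM biz_receive WHERE biz_id = ?", (biz_id,)).fetchall()
    conn.executemany(
        "INSERT INTO receive_msg (biz_msg_id, receive_id) VALUES (?, ?)",
        [(msg_id, receive_id) for (receive_id,) in subscribers])
    conn.commit()
    return msg_id


def demo():
    """Register one business type with two hypothetical subscribers and split a message."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO biz (id, name) VALUES (1, 'order_paid')")
    conn.executemany(
        "INSERT INTO receive (id, name, url) VALUES (?, ?, ?)",
        [(1, "warehouse", "http://warehouse.example/notify"),
         (2, "invoice", "http://invoice.example/notify")])
    conn.executemany(
        "INSERT INTO biz_receive (biz_id, receive_id) VALUES (1, ?)", [(1,), (2,)])
    msg_id = split_message(conn, "order_paid", '{"order_id": 1001}')
    return conn.execute(
        "SELECT receive_id, status FROM receive_msg "
        "WHERE biz_msg_id = ? ORDER BY receive_id", (msg_id,)).fetchall()
```

Keeping one `receive_msg` row per subscriber means each subscriber's delivery status can be tracked and retried independently of the others.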
The architecture includes an Api.notify interface for ingestion, a multi‑process Maker that copies messages into per‑subscriber Redis queues, a Sender that pushes messages to target systems and updates status via a Marker queue, and a Marker component that asynchronously writes back delivery results.
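The flow through these components can be sketched in a few lines. This is a single-process illustration under stated assumptions: in-memory deques stand in for the Redis queues, the class and method names (`NotifyPipeline`, `run_maker`, and so on) are hypothetical, and the `deliver` callback replaces the real push to a target system.

```python
from collections import defaultdict, deque


class NotifyPipeline:
    """Toy model of the ingestion -> Maker -> Sender -> Marker flow."""

    def __init__(self, subscriptions, deliver):
        self.subscriptions = subscriptions           # biz name -> subscriber names
        self.deliver = deliver                       # callable(subscriber, body) -> bool
        self.inbox = deque()                         # filled by the ingestion API
        self.subscriber_queues = defaultdict(deque)  # one queue per subscriber
        self.marker_queue = deque()                  # delivery results awaiting write-back
        self.status = {}                             # (msg_id, subscriber) -> result
        self._next_id = 1

    def notify(self, biz, body):
        """Ingestion: accept a business message and return its id."""
        msg_id = self._next_id
        self._next_id += 1
        self.inbox.append((msg_id, biz, body))
        return msg_id

    def run_maker(self):
        """Maker: copy each message into the queue of every subscriber."""
        while self.inbox:
            msg_id, biz, body = self.inbox.popleft()
            for subscriber in self.subscriptions.get(biz, []):
                self.subscriber_queues[subscriber].append((msg_id, body))

    def run_sender(self):
        """Sender: push messages out and hand the outcome to the Marker queue."""
        for subscriber, queue in self.subscriber_queues.items():
            while queue:
                msg_id, body = queue.popleft()
                ok = self.deliver(subscriber, body)
                self.marker_queue.append(
                    (msg_id, subscriber, "done" if ok else "failed"))

    def run_marker(self):
        """Marker: write delivery results back, decoupled from sending."""
        while self.marker_queue:
            msg_id, subscriber, result = self.marker_queue.popleft()
            self.status[(msg_id, subscriber)] = result
```

Routing status updates through a separate queue keeps the Sender from blocking on storage writes, which is the point of the asynchronous Marker described above.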
Additional features comprise message splitting, cold‑storage backup, multi‑process delivery, retry logic with exponential back‑off, asynchronous processing, and a query UI for debugging.
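As an illustration of retry with exponential back‑off, the sketch below computes a delay schedule and retries a delivery against it. The source gives no concrete parameters, so the base delay, growth factor, attempt count, and cap are all assumed values, and both function names are hypothetical.

```python
import time


def backoff_delays(base_seconds=1.0, factor=2.0, max_retries=5, cap_seconds=60.0):
    """Return the wait before each retry: base, base*factor, ... capped at cap_seconds."""
    return [min(base_seconds * factor ** i, cap_seconds) for i in range(max_retries)]


def deliver_with_retry(send, delays, sleep=time.sleep):
    """Try send() once, then once more after each delay.

    Returns the number of attempts used on success, or None if every attempt failed.
    """
    attempts = 0
    for delay in [0.0] + list(delays):
        if delay:
            sleep(delay)  # wait before a retry; the first attempt is immediate
        attempts += 1
        if send():
            return attempts
    return None
```

Passing `sleep` as a parameter makes the schedule easy to test without waiting; a production sender would more likely re-enqueue the message with a due time than block a worker.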
Notify System Upgrades
Massive traffic spikes during events such as Double‑11 and the Mi Fan Festival exposed the limits of the original PHP‑only implementation, so it was refactored in three ways. An Agent proxy now buffers messages in a local DB before forwarding them. Go replaces the performance‑critical modules, boosting API throughput by 21× and Sender and Marker capacity by 4×. MyCAT provides distributed MySQL storage, eliminating the single‑instance bottleneck.
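The Agent's buffer‑then‑forward behaviour can be sketched as a small local outbox. This is only an illustration under assumptions: the source does not describe the Agent's storage or interface, so SQLite serves as the local DB, and the `AgentBuffer` class, its `outbox` table, and the `forward` callback are hypothetical.

```python
import sqlite3


class AgentBuffer:
    """Persist messages locally first, then forward them in order."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS outbox ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT NOT NULL)")
        self.conn.commit()

    def accept(self, body):
        """Store the message locally so the caller never waits on the remote side."""
        self.conn.execute("INSERT INTO outbox (body) VALUES (?)", (body,))
        self.conn.commit()

    def flush(self, forward):
        """Forward buffered messages oldest first; stop at the first failure.

        A row is deleted only after forward(body) returns True, so messages
        survive an outage of the central system. Returns the number sent.
        """
        rows = self.conn.execute(
            "SELECT id, body FROM outbox ORDER BY id").fetchall()
        sent = 0
        for row_id, body in rows:
            if not forward(body):
                break
            self.conn.execute("DELETE FROM outbox WHERE id = ?", (row_id,))
            self.conn.commit()
            sent += 1
        return sent

    def pending(self):
        """Number of messages still waiting to be forwarded."""
        return self.conn.execute("SELECT COUNT(*) FROM outbox").fetchone()[0]
```

Deleting only after a confirmed forward gives at‑least‑once delivery: if the process dies between forwarding and deleting, the message is sent again, so receivers would need to tolerate duplicates.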
These changes dramatically improved the system’s ability to handle peak order volumes while maintaining reliability.
Conclusion
The article shares Xiaomi’s practical experiences in building and evolving an asynchronous message queue system, offering valuable insights for other companies seeking similar scalability and fault‑tolerance improvements.
Architecture Digest
Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.