How Ele.me Achieved Sub‑Second MySQL Multi‑Active Replication with DRC
This article details Ele.me's design and implementation of a MySQL bidirectional replication component (DRC) that enables sub‑second, high‑throughput data synchronization across Beijing and Shanghai data centers, addressing latency, consistency, and failover challenges in a multi‑active environment.
Background of Multi‑Active Deployment
Ele.me's business services were originally hosted in a single Beijing data center with over 4,000 servers, while disaster‑recovery machines ran in the cloud. As order volume approached ten million per day, the Beijing IDC could no longer expand, prompting the construction of a new Shanghai data center slated for April.
The goal was to route northern user traffic to Beijing and southern traffic to Shanghai, while handling users who move between regions and ensuring rapid failover if a data center becomes unavailable.
Requirements for the Underlying Data Synchronization
Latency below one second for cross‑data‑center writes.
Strong data consistency: no data loss or corruption.
High throughput to handle bursty batch operations, large DDL changes, and archival jobs.
These requirements drove the design of the DRC (Data Replication Component) that performs MySQL bidirectional replication.
Scale of the Existing MySQL Cluster (Pre‑Migration)
Before the multi‑active project, the Beijing data center hosted more than 250 MySQL clusters, over 1,000 MySQL instances, and more than 400 Redis clusters. DRC needed to support all of these clusters in both data centers.
Classification of MySQL Usage in a Multi‑Active Setup
Ele.me categorized databases into three types:
Multi‑Active DB : Services such as payment, order, and checkout that must remain available when traffic switches between data centers. DRC replicates these databases bidirectionally.
Global DB : Databases requiring strong consistency (e.g., user registration with unique constraints). These are writable only in a single data center and read‑only elsewhere; conflicts are resolved via a correction platform.
Non‑Multi‑Active DB : Backend management platforms that continue using traditional primary‑secondary replication.
Overall Architecture of DRC
The system consists of three main components:
Replicator Server : Extracts binlog events from MySQL masters, parses them into a custom, compact data format, and stores them in a terabyte‑scale event buffer.
Applier Server : Maintains a long‑lived TCP connection to the Replicator, receives events, replays transactions, and writes them to the target MySQL cluster via JDBC, employing concurrent processing and idempotence.
Console Control Node : Handles configuration, deployment, and high‑availability coordination for the components.
Key Design Details
The Replicator includes a MetaDB module to manage historical binlog parsing.
After parsing, events are filtered to remove unnecessary types (e.g., table_map_event) and data, reducing the payload to roughly 70% of the original size.
Loop‑replication is prevented by inserting a checkpoint record containing the source server_id into each transaction; the Applier uses this to identify and discard self‑generated events.
Idempotent processing ensures that repeated handling of the same event does not corrupt data.
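The loop‑prevention check described above can be sketched as a simple filter. This is a minimal illustration, not DRC's actual code: the `Transaction` class, the `origin_server_id` field, and the value of `LOCAL_SERVER_ID` are hypothetical names chosen for the example, and the sketch assumes the checkpoint record has already been parsed out of the transaction.

```python
from dataclasses import dataclass, field

LOCAL_SERVER_ID = 1  # hypothetical id of the cluster this Applier writes to


@dataclass
class Transaction:
    # Assumed to be read from the checkpoint record inside the transaction.
    origin_server_id: int
    rows: list = field(default_factory=list)


def should_apply(txn: Transaction, local_server_id: int = LOCAL_SERVER_ID) -> bool:
    """Return False for transactions that originated on the local cluster."""
    return txn.origin_server_id != local_server_id


def filter_incoming(txns, local_server_id: int = LOCAL_SERVER_ID):
    """Keep only transactions that came from another data center."""
    return [t for t in txns if should_apply(t, local_server_id)]
```

Without such a check, a write replicated from Beijing to Shanghai would appear in Shanghai's binlog and be shipped back to Beijing, and so on indefinitely.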
Data Consistency Guarantees
Consistency is achieved through three mechanisms:
Traffic Partitioning : Users are routed based on geographic or business dimensions to minimize concurrent updates across data centers.
Loss‑Free Replication : When data loss is suspected, DRC resumes from an earlier binlog position instead of the last recorded one. Some events may then be processed twice, which idempotent apply makes harmless.
Conflict Resolution : Each table includes a hidden last‑update timestamp; when conflicts arise, the newer timestamp wins, ensuring deterministic resolution.
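How timestamp‑based conflict resolution and idempotent replay fit together can be illustrated with an in‑memory table. This is a sketch under stated assumptions: the event layout (`pk`, `updated_at`, `values`) and the function name `apply_row_event` are hypothetical, and a real Applier would express the same comparison in SQL against MySQL rather than in a Python dict.

```python
def apply_row_event(table: dict, event: dict) -> bool:
    """Apply one replicated row image to an in-memory table.

    `table` maps primary key -> row dict; every row carries `updated_at`.
    Returns True if the table changed, False if the event was skipped.
    """
    pk = event["pk"]
    current = table.get(pk)
    # Skip when the stored row is as new as, or newer than, the incoming one.
    # Using >= also turns a replayed (duplicate) event into a no-op.
    if current is not None and current["updated_at"] >= event["updated_at"]:
        return False
    table[pk] = {"updated_at": event["updated_at"], **event["values"]}
    return True


# Usage: a newer write wins, and replaying it changes nothing.
orders = {}
newer = {"pk": 7, "updated_at": 100, "values": {"status": "paid"}}
older = {"pk": 7, "updated_at": 90, "values": {"status": "new"}}
apply_row_event(orders, newer)
apply_row_event(orders, newer)  # duplicate after a rewind: skipped
apply_row_event(orders, older)  # stale write from the other side: skipped
```

The sketch does not handle two different writes that carry exactly the same timestamp; a real system needs a tie‑breaker for that case, which the source does not describe.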
Low‑Latency Strategies
Sub‑second latency is maintained by aggressive concurrency in the Applier. The initial table‑level concurrency model proved insufficient for skewed workloads, where a few hot tables receive most of the writes. The current row‑level model spreads those writes across workers and sustains high throughput.
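One common way to implement row‑level concurrency is to hash each row's identity to a worker, so that different rows are applied in parallel while changes to the same row stay in order. The sketch below assumes this hashing approach; the source does not say how DRC assigns rows to workers, and `worker_for`, `partition_events`, and the event fields are hypothetical.

```python
import zlib
from collections import defaultdict


def worker_for(table: str, pk, num_workers: int) -> int:
    """Map a row to a worker deterministically, so one row has one owner."""
    key = f"{table}:{pk}".encode("utf-8")
    return zlib.crc32(key) % num_workers


def partition_events(events, num_workers: int):
    """Split an ordered event stream into per-worker queues.

    Events for the same row keep their relative order because they are
    appended to the same queue in arrival order.
    """
    queues = defaultdict(list)
    for ev in events:
        queues[worker_for(ev["table"], ev["pk"], num_workers)].append(ev)
    return queues
```

With table‑level partitioning, every event for a hot table would land in one queue. Keying on the row spreads that table's load across workers. The sketch ignores transaction boundaries; a transaction that touches several rows needs additional coordination.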
Master Failover Handling
When a MySQL master switches (e.g., via MHA), the DBA system notifies DRC's control center, which re‑routes the replication link to the new master. The old master is first set to read‑only to prevent stray writes.
Production Experience
After deployment, DRC operates roughly 400 unidirectional replication links, serving 17 business teams and processing over 100 million messages daily.
Performance snapshots show most links staying under 1 second latency, with occasional spikes up to 4 seconds during heavy archival or DDL operations.
Q&A Highlights
PK conflicts are avoided by configuring auto‑increment offsets and increments so that each data center generates a disjoint set of keys.
DRC's writes consume IOPS on the target database, which has some impact on its performance.
DRC writes via JDBC; alternative binlog‑server approaches were explored but abandoned due to complexity.
A checksum tool developed by the DBA team monitors real‑time data consistency.
Operational overhead increases for DBAs, who must consider DRC during archival jobs and DDL releases.
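The primary‑key scheme mentioned in the first answer can be shown with a little arithmetic. MySQL provides the `auto_increment_increment` and `auto_increment_offset` system variables for this purpose; the concrete values below (a step of 2, offset 1 for Beijing and offset 2 for Shanghai) are assumptions for illustration, not Ele.me's actual settings.

```python
def generated_ids(offset: int, increment: int, count: int):
    """Return the first `count` ids a server would generate on an empty table."""
    return [offset + i * increment for i in range(count)]


# Assumed settings: both sides step by 2, starting from different offsets.
beijing_ids = generated_ids(offset=1, increment=2, count=5)   # odd ids
shanghai_ids = generated_ids(offset=2, increment=2, count=5)  # even ids

# The two data centers never generate the same key.
overlap = set(beijing_ids) & set(shanghai_ids)
```

Because the two sequences never intersect, rows inserted concurrently in both data centers can be replicated in either direction without colliding on an auto‑increment key.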
The presentation concluded with thanks and an invitation for further questions.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
