How Ctrip Finance Built a Real‑Time Binlog‑Based Data Lake with MySQL‑Hive Sync
This article details Ctrip Finance's end‑to‑end data‑foundation architecture that uses MySQL binlog collection via Canal, Kafka streaming, Spark‑Streaming persistence to HDFS, and a merge process to produce timely MySQL‑Hive snapshots, addressing performance, consistency, and delete‑handling challenges.
Background
In September 2017 Ctrip Finance needed a unified data center to synchronize thousands of MySQL tables across multiple data centers into both offline and online warehouses. The existing DataX‑based MySQL‑Hive batch sync could not satisfy latency, accuracy, idempotence, or DELETE‑support requirements.
Batch size growth caused performance bottlenecks and delayed downstream tasks.
Heavy batch reads slowed down the online MySQL service.
Inability to guarantee idempotent results for rapidly changing rows.
No native DELETE handling, forcing full‑table reloads.
Solution Overview
A binlog‑driven real‑time data foundation was built with the following components:
Web UI for configuring binlog collection, MySQL‑to‑Hive, real‑time warehouse, and cache mirroring.
Canal for capturing MySQL binlog and publishing to Kafka (multi‑region deployment with dedicated links).
Spark‑Streaming to persist binlog messages to HDFS.
Merge job to combine incremental HDFS data with existing snapshots.
Mirror service to update real‑time warehouses and caches.
Core services for historical replay, data validation, end‑to‑end monitoring, and plaintext detection.
Detailed Design
Binlog Collection
Canal (open‑source MySQL binlog subscriber) was extended to emit a lightweight SimpleBinlog format. Canal‑admin provides a Web UI for instance‑level management; high availability is achieved via Zookeeper temporary nodes.
Kafka topics are created per MySQL instance (not per database) to limit partition count. The partition key is schemaName+tableName, ensuring ordered delivery per table.
max.in.flight.requests.per.connection=1 retries=0 acks=allTopic configuration example:
topic partition 3 replicas, min.insync.replicas=2SimpleBinlog schema fields: binlogOffset: global sequence ID (timestamp+seq) for total ordering. executeTime: binlog execution timestamp. eventType: INSERT / UPDATE / DELETE. schemaName and tableName: logical database and table identifiers (prefixes can be stripped later). source: BINLOG (real) or MOCK (historical replay). version: message version. content: column list (after for INSERT/UPDATE, before for DELETE).
Four Canal clusters (each with two nodes) are deployed across DR sites, handling hundreds of MySQL instances. Prometheus metrics are scraped and pushed to an internal Watcher system; binlog latency is a key KPI. A split‑brain issue was mitigated by terminating the stale active instance on Zookeeper timeout.
Historical Data Replay
Two scenarios require replaying historical data: initial snapshot creation and recovery after data loss. Two approaches were evaluated:
Batch extract from MySQL, load to HDFS, then convert to SimpleBinlog.
Stream the batch as SimpleBinlog messages into Kafka, reusing the real‑time pipeline.
The streaming approach was chosen. A binlog‑mock service queries MySQL (e.g., 10,000 rows per batch), assembles SimpleBinlog messages, and publishes them to Kafka with timestamps shifted 5 minutes earlier to avoid overwriting real‑time data.
Key considerations for mock data:
Use a binlogOffset timestamp earlier than real data to preserve ordering.
Assign partition based on executeTime, defaulting to ${yesterday}.
Write to HDFS
Spark‑Streaming consumes Kafka SimpleBinlog messages and writes them to HDFS in 5‑minute micro‑batches. Offsets are committed only after successful persistence, providing an at‑least‑once guarantee.
Data is partitioned by logical database/table prefixes to mask sharding:
base_path/ods_binlog_source.db/${database_prefix}_${table_prefix}/dt={binlogDt}/binlog-{timestamp}-{rdd.id}To mitigate data skew, large tables detected in a batch are split across multiple HDFS files:
base_path/ods_binlog_source.db/${database_prefix}_${table_prefix}/dt={binlogDt}/binlog-{timestamp}-{rdd.id}-[${randomInt}]Snapshot Generation
Data Readiness Check
Every day at 00:15 the system verifies that all SimpleBinlog data for the previous day (T‑1) has landed in HDFS. Three latency metrics are monitored: t1: binlog capture latency. t2: Kafka replication latency. t3: Spark‑Streaming consumer lag.
If the current time t4 satisfies t4 > t1 + t2 + t3, the downstream merge job can start.
Merge Process
The daily merge job performs:
Load SimpleBinlog data for partition T‑1.
Fetch the latest MySQL schema; if changed, update the Hive snapshot.
Extract incremental rows (delta) from SimpleBinlog based on primary key.
Identify DELETE events that belong to normal business edits versus bulk archival (threshold‑based).
Union delta with the previous snapshot (snap T‑2), de‑duplicate by id using row_number ordered by binlogOffset descending, keeping the newest record as the new snapshot (snap T‑1).
Data Check
After merge, a consistency check compares Hive and MySQL row counts and key fields (typically a 7‑day window on createTime). Known edge cases include:
Binlog of day T falling into partition T‑1, causing off‑by‑one‑second mismatches.
Table migration where the source stops updating, leading to silent drift detectable via volatility monitoring.
Other Enhancements
Additional data‑governance features can be integrated:
Plaintext detection to prevent sensitive data from entering the warehouse.
Standardization such as ID mapping or ciphertext conversion.
Metadata management enabling bidirectional lookup between MySQL and Hive for lineage tracking.
Conclusion
The binlog‑based foundation achieved full MySQL‑to‑Hive ODS replication within 1.5 hours, real‑time Kudu mirrors for operational analytics, and a multi‑active Redis cache serving up to one million requests per minute. Future work includes one‑click configuration, intelligent anomaly detection and recovery, and richer metadata management.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
