Design and Practice of Qunar Data Synchronization Platform: ES Multi‑Version Migration, High Availability, and Data Consistency
The article details Qunar's data synchronization platform that aggregates MySQL data into Elasticsearch, covering its architecture, component choices, ES5‑to‑ES7 migration, hot‑plugging, reindexing, high‑availability design, consistency guarantees, operational optimizations, and future roadmap.
Qunar's domestic ticket after‑sale services require complex queries across many MySQL tables; to serve these scenarios a data synchronization platform was built that aggregates data from MySQL into Elasticsearch, providing low‑latency, eventually consistent query capabilities.
Platform Overview
The platform consists of three layers: a data‑sync module (Otter, Canal, and a custom DTS system); a data middle platform called Crab, which offers unified ES read/write, authentication, and Hystrix‑based flow control; and a management module for configuration, node lifecycle, and traffic distribution.
Key Components
Otter : an Alibaba open‑source distributed DB sync system, extended to publish messages to Kafka.
DTS : implements an SFTL pipeline (Source → Filter → Transform → Load) with node, task, and DB reverse‑lookup components.
Crab : provides ES read/write APIs, unified auth, circuit‑breaker, and traffic‑shaping based on appcode+index dimensions.
Management : maintains sync configurations, DTS node status, crab auth/limit‑rate settings, and ES cluster flow control.
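The Source → Filter → Transform → Load flow of DTS can be pictured with a minimal sketch. The names below (ChangeEvent, run_pipeline, the order_info table, the order_no field) are hypothetical and chosen only for illustration; the real DTS consumes from Kafka and writes through Crab, which this sketch replaces with plain Python callables.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class ChangeEvent:
    """One row change, as it might arrive from the binlog stream (assumed shape)."""
    table: str
    op: str    # "INSERT", "UPDATE" or "DELETE"
    row: dict


def run_pipeline(
    source: Iterable[ChangeEvent],
    keep: Callable[[ChangeEvent], bool],
    transform: Callable[[ChangeEvent], dict],
    load: Callable[[dict], None],
) -> int:
    """Push every event through Filter, Transform and Load; return how many were loaded."""
    loaded = 0
    for event in source:              # Source
        if not keep(event):           # Filter: drop events the index does not need
            continue
        load(transform(event))        # Transform, then Load
        loaded += 1
    return loaded


# Usage: only order_info changes reach the sink, keyed by order number.
events = [
    ChangeEvent("order_info", "UPDATE", {"order_no": "A1", "status": 2}),
    ChangeEvent("audit_log", "INSERT", {"id": 7}),
]
sink = []
count = run_pipeline(
    source=events,
    keep=lambda e: e.table == "order_info",
    transform=lambda e: {"_id": e.row["order_no"], **e.row},
    load=sink.append,
)
```

Keeping each stage a separate callable is what would let a sync task be described by configuration rather than code, which matches the stated goal of making DTS aggregation configurable.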
Technical Evolution Background
With over ten business lines and fourteen ES indices, the platform faced four major pain points: ES cluster single‑point failures, inability of Otter to switch DB IPs automatically, unclear end‑to‑end monitoring, and index‑level fault propagation. The goals were flexible scaling for ES5‑to‑ES7 migration and high‑availability across the whole sync chain.
Evolution Practices
ES 5.x and 7.x Parallel Support : The gateway now detects the ES version and routes requests to the appropriate endpoint, supporting both versions simultaneously.
Hot‑Plugging ES Clusters : A step‑by‑step procedure (reindex, Crab write, diff back‑fill, validation, query switch) enables smooth migration and failover.
REST API and DSL Differences : The Elasticsearch low‑level Java REST client was chosen for compatibility across versions; the Search, Document, and Script APIs were aligned between 5.x and 7.x, with special handling for nested types and script storage.
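Two well-known differences between ES 5.x and 7.x show what such alignment involves: 7.x addresses documents through the typeless /{index}/_doc/{id} endpoint, while 5.x requires a mapping type in the path, and 7.x reports hits.total as an object where 5.x returns a plain integer. The helpers below are a hypothetical sketch of how a gateway might hide both; they are not Crab's actual code, and the function names are assumptions.

```python
def document_path(es_major: int, index: str, doc_id: str, doc_type: str = "doc") -> str:
    """Build the document endpoint for the given ES major version."""
    if es_major >= 7:
        # 7.x: typeless API, the type segment is always "_doc"
        return f"/{index}/_doc/{doc_id}"
    # 5.x: the mapping type is part of the path
    return f"/{index}/{doc_type}/{doc_id}"


def total_hits(search_response: dict) -> int:
    """Return the hit count from either a 5.x or a 7.x search response."""
    total = search_response["hits"]["total"]
    if isinstance(total, dict):
        # 7.x shape: {"value": N, "relation": "eq" | "gte"}.
        # With relation "gte" the value is only a lower bound.
        return total["value"]
    return total  # 5.x shape: plain integer
```

Callers above the gateway then see one path builder and one integer count, regardless of which cluster version serves the request.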
Reindex Example :
curl -H "Content-Type: application/json" -XPOST http://ip:port/_reindex -d'{
"source": {
"remote": {
"host": "http://ip:port"
},
"index": "order_info_beta_tts8"
},
"dest": {
"index": "order_info_beta_tts8"
}
}'
Canal offset migration and scheduled diff tasks are used for partial or full back‑fill when recent data needs to be synchronized.
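The source does not say how the diff task decides which documents to back‑fill. One plausible approach, sketched below under that assumption, compares a per‑document version (for example an update timestamp) between the source of truth and the target index and returns the ids that are missing or stale. The function name and the id → version maps are hypothetical.

```python
def diff_ids(source_versions: dict, target_versions: dict) -> list:
    """Return ids that are missing from the target or older there than in the source.

    Both arguments map a document id to a comparable version,
    e.g. an update timestamp or a monotonically increasing counter.
    """
    stale = []
    for doc_id, version in source_versions.items():
        target_version = target_versions.get(doc_id)
        if target_version is None or target_version < version:
            stale.append(doc_id)
    return sorted(stale)


# "b" is older in the target and "c" is missing, so both need back-fill.
to_backfill = diff_ids({"a": 1, "b": 3, "c": 2}, {"a": 1, "b": 2})
```

Only the returned ids would then be re‑read from MySQL and rewritten, which keeps a scheduled diff much cheaper than a full reindex.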
High Availability Design
The sync chain (Otter → Kafka → DTS → Crab → ES) achieves HA at each layer: Otter runs pipelines on multiple nodes with master‑slave Canal; Kafka uses replicated partitions; DTS runs multiple consumer nodes; Crab isolates traffic per index via Hystrix thread pools; ES indices are stored in multiple clusters with load‑balancing.
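The per‑index isolation in Crab follows the bulkhead idea: each index gets its own bounded capacity, so a slow or failing index cannot exhaust resources shared with the others. Hystrix implements this with dedicated thread pools; the sketch below swaps in a simpler semaphore per index to show the same effect. IndexBulkhead, BulkheadRejected, and the index names are hypothetical.

```python
import threading


class BulkheadRejected(RuntimeError):
    """Raised when an index has no free capacity left."""


class IndexBulkhead:
    """Caps concurrent calls per index; semaphore-based, not a Hystrix thread pool."""

    def __init__(self, max_concurrent: int):
        self._max = max_concurrent
        self._slots = {}
        self._lock = threading.Lock()

    def _slot(self, index: str) -> threading.BoundedSemaphore:
        # Create the per-index semaphore lazily, under a lock.
        with self._lock:
            if index not in self._slots:
                self._slots[index] = threading.BoundedSemaphore(self._max)
            return self._slots[index]

    def call(self, index: str, fn, *args):
        slot = self._slot(index)
        # Fail fast instead of queueing, so one saturated index cannot block callers.
        if not slot.acquire(blocking=False):
            raise BulkheadRejected(f"index {index!r} is saturated")
        try:
            return fn(*args)
        finally:
            slot.release()
```

With a limit of one, a second concurrent call against the same index is rejected immediately, while calls against any other index still go through.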
Data Consistency Guarantees
Ordered processing across the chain ensures per‑order data stays sequential.
Failed writes are sent to a retry Kafka topic and reprocessed after DB reverse‑lookup.
Diff‑based back‑fill tasks periodically reconcile minute‑level discrepancies for critical indices.
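The retry path can be sketched as follows. The point of the DB reverse‑lookup is that a retried write carries the current database row rather than the payload of the failed event, so a late retry cannot overwrite newer data with a stale snapshot. In this sketch an in‑memory deque stands in for the retry Kafka topic; the function name, the max_attempts parameter, and the dead‑letter list are assumptions that the source does not describe.

```python
from collections import deque


def sync_with_retry(events, lookup_row, write_to_es, max_attempts=3):
    """Write each (doc_id, row) event; on failure, retry with a fresh DB read.

    Returns the ids that still failed after max_attempts tries.
    """
    retry_queue = deque()   # stands in for the retry Kafka topic
    dead = []
    for doc_id, row in events:
        try:
            write_to_es(doc_id, row)
        except Exception:
            retry_queue.append((doc_id, 1))   # one attempt used so far
    while retry_queue:
        doc_id, attempts = retry_queue.popleft()
        try:
            # Reverse lookup: re-read the current row instead of replaying the old payload.
            write_to_es(doc_id, lookup_row(doc_id))
        except Exception:
            if attempts + 1 >= max_attempts:
                dead.append(doc_id)
            else:
                retry_queue.append((doc_id, attempts + 1))
    return dead
```

If the row changed between the failed write and the retry, the index ends up with the newer state, which is what eventual consistency requires here.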
Operational Optimizations
Binlog deduplication by service‑order key reduces redundant writes.
Custom MySQL master‑slave switching for the PXC architecture.
Dynamic batch size adjustment in Otter to avoid network saturation during large DDL operations.
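Binlog deduplication can be illustrated with a short sketch. It assumes that, within one batch, only the last event per service order matters, because the downstream write rebuilds the whole document for that order; the source states only that deduplication is keyed by service order. The function name and the order_no field are hypothetical.

```python
def dedupe_by_order(events, key="order_no"):
    """Keep only the last event per order, ordered by each key's final occurrence."""
    latest = {}
    for event in events:
        k = event[key]
        latest.pop(k, None)   # re-insert so the key moves to the end of the dict
        latest[k] = event
    return list(latest.values())


batch = [
    {"order_no": "A1", "status": 1},
    {"order_no": "B2", "status": 1},
    {"order_no": "A1", "status": 2},
]
deduped = dedupe_by_order(batch)
```

Three binlog events collapse into two writes here; for an order that touches many tables in one transaction the saving is correspondingly larger.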
Summary and Future Plans
The migration reduced query latency from 68 ms to 21 ms and write latency from 34 ms to 6 ms. Incident handling (ES node failure, Kafka disk full) demonstrated the platform's resilience. Future work includes making DTS aggregation fully configurable and automating failover migrations.
Qunar Tech Salon
Qunar Tech Salon is a learning and exchange platform for Qunar engineers and industry peers. We share cutting-edge technology trends and topics, providing a free platform for mid-to-senior technical professionals to exchange and learn.
