Mastering MySQL‑to‑Elasticsearch Sync: 4 Strategies & Top Migration Tools
This guide compares four MySQL‑to‑Elasticsearch synchronization methods—synchronous dual‑write, asynchronous MQ‑based dual‑write, timer‑driven SQL extraction, and real‑time Binlog replication—and reviews popular CDC tools such as Canal, Alibaba Cloud DTS, Databus, and others to help you choose the right solution.
1. Introduction
In many projects MySQL serves as the primary business database while Elasticsearch (ES) is used for query workloads to achieve read‑write separation and handle massive, complex queries.
The key challenge is keeping MySQL and ES data synchronized. Below are four common synchronization approaches.
2. Synchronization Approaches
2.1 Synchronous Dual‑Write
Write to MySQL and ES simultaneously.
Advantages:
Simple business logic
High real‑time performance
Disadvantages:
Hard‑coded; every MySQL write must also contain ES code
Strong coupling between business and search layers
Risk of data loss if one write fails
Performance degradation due to extra ES writes
2.2 Asynchronous Dual‑Write (via MQ)
Use a message queue to decouple writes from multiple data sources.
Advantages:
High performance
Reduced data‑loss risk thanks to MQ consumption guarantees
Isolated source writes, easier to add new sources
Disadvantages:
Hard‑coded integration for new sources
Increased system complexity with a message broker
Potential latency because of asynchronous consumption
2.3 SQL‑Based Extraction (Timer)
For scenarios with lower real‑time requirements, a timer extracts changed rows based on a timestamp column.
Add a timestamp column that updates on any CRUD operation.
Leave existing CRUD code unchanged.
Run a scheduled job that scans the table for rows changed within the interval.
Write each changed row to ES.
Advantages:
No code intrusion, no hard‑coding.
No strong business coupling; original performance unchanged.
Simple worker implementation.
Disadvantages:
Lower timeliness; fixed polling interval introduces delay.
Additional load on the database; can be mitigated by polling a replica.
Classic solution: use Logstash to periodically query MySQL via SQL and write incremental changes to ES.
2.4 Real‑Time Binlog Synchronization
Leverages MySQL’s binary log to achieve real‑time, non‑intrusive sync.
Typical steps:
Read Binlog entries for the target tables.
Convert entries to MQ messages.
Implement an MQ consumer.
Consume each message and write it to ES.
Advantages:
No code intrusion or hard‑coding.
Existing systems remain unchanged.
High performance and business decoupling.
Disadvantages:
Complexity of building a Binlog‑based system.
If MQ is used, it inherits the same latency risk as the asynchronous approach.
3. Data Migration Tool Selection
The Binlog real‑time method is the most widely adopted, spawning several migration tools that subscribe to Binlog changes (CDC).
3.1 Canal
Parses database incremental logs and provides subscription/consumption, primarily for MySQL.
Canal pretends to be a MySQL slave, receives the master’s Binlog via the dump protocol, parses it into JSON, and forwards it to ES via TCP or MQ.
Core workflow: Binlog Parser extracts and pushes events; EventSink filters, routes, and processes data.
3.2 Alibaba Cloud DTS
Data Transmission Service supports migration, real‑time subscription, and synchronization across RDBMS, NoSQL, and OLAP sources.
Features include multi‑source support, high performance (up to 70 MB/s, 200 k TPS), high availability, and a visual management console. It is a paid service.
3.3 Databus
LinkedIn’s open‑source CDC system captures changes from MySQL and Oracle with low latency, supports transactional consistency, and offers unlimited replay for lagging consumers.
3.4 Other Tools
Flink : Distributed stream processing engine for bounded and unbounded data streams.
CloudCanal : Commercial data sync/migration product.
Maxwell : Emits row‑level changes as JSON without client code.
DRD : Alibaba’s distributed database middleware.
yugong : Migrates data from Oracle to MySQL.
4. Conclusion
This article presented four MySQL‑to‑ES synchronization strategies and introduced several popular migration tools to help you choose the most suitable solution.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Su San Talks Tech
Su San, former staff at several leading tech companies, is a top creator on Juejin and a premium creator on CSDN, and runs the free coding practice site www.susan.net.cn.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
