Databases 14 min read

Mastering MySQL‑to‑Elasticsearch Sync: 4 Strategies & Top Migration Tools

This article compares four MySQL‑to‑Elasticsearch synchronization methods—synchronous dual‑write, asynchronous dual‑write, SQL extraction, and Binlog‑based real‑time sync—evaluates their pros and cons, and reviews popular migration tools such as Canal, Alibaba DTS, Databus, Flink, CloudCanal, Maxwell, and DRDS.

Su San Talks Tech
Su San Talks Tech
Su San Talks Tech
Mastering MySQL‑to‑Elasticsearch Sync: 4 Strategies & Top Migration Tools

1. Introduction

In many projects MySQL serves as the primary business database while Elasticsearch (ES) is used as a query engine to achieve read‑write separation and handle massive, complex queries.

A critical challenge is keeping MySQL and ES data synchronized. This article outlines four common synchronization approaches and introduces widely used migration tools.

2. Data Synchronization Solutions

2.1 Synchronous Dual‑Write

The simplest method: write to MySQL and ES simultaneously.

Advantages:

Simple business logic

High real‑time performance

Disadvantages:

Hard‑coded; every MySQL write must also contain ES write code

Tightly coupled business logic

Risk of data loss if one write fails

Performance degradation as ES adds load to the system

2.2 Asynchronous Dual‑Write

For multiple data sources, a message queue (MQ) can be used to perform asynchronous writes.

Advantages:

High performance

Reduced data loss risk thanks to MQ consumption guarantees (e.g., retry on ES failure)

Isolation between data sources, facilitating future extensions

Disadvantages:

Hard‑coded integration; new sources require new consumer code

Increased system complexity due to the message middleware

Potential latency because MQ consumption is asynchronous

2.3 SQL Extraction (Timer‑Based)

When real‑time is not critical, a scheduled job can extract changed rows based on a timestamp column.

Add a timestamp column to relevant tables; any CRUD operation updates this field.

Leave existing application code unchanged.

Run a timer program that periodically scans the tables for rows whose timestamp changed within the interval.

Write the extracted rows to ES one by one.

Advantages:

No code intrusion; original application remains untouched

No strong business coupling; system performance unchanged

Simple worker implementation without CRUD considerations

Disadvantages:

Lower timeliness; synchronization delay depends on the timer interval

Additional polling load on the database; can be mitigated by using a replica for polling

Classic solution: use Logstash to periodically query MySQL via SQL and write incremental data to ES.

2.4 Binlog‑Based Real‑Time Sync

To achieve real‑time synchronization without code intrusion, MySQL's binary log (Binlog) can be leveraged.

Steps:

Read Binlog entries for the target tables.

Convert the entries into MQ messages.

Implement an MQ consumer.

Continuously consume messages and write each to ES.

Advantages:

No code intrusion or hard‑coding.

No changes required in the existing system.

High performance.

Business decoupling; no need to understand original business logic.

Disadvantages:

Complexity of building a Binlog processing system.

If MQ is used for delivery, it inherits the latency risk of asynchronous consumption.

3. Data Migration Tool Selection

The Binlog real‑time sync approach is the most common, spawning many migration tools that subscribe to Binlog changes and implement CDC (Change Data Capture) to forward INSERT, UPDATE, DELETE events downstream.

3.1 Canal

Canal parses incremental database logs and provides subscription/consumption capabilities, primarily for MySQL.

It mimics a MySQL slave to receive Binlog from the master, parses the binary data into JSON, and forwards it via TCP or MQ to downstream consumers.

Canal server requests a dump protocol from MySQL master.

Master pushes Binlog to Canal, which converts it to JSON.

Canal client listens via TCP or MQ and syncs data to ES.

The core flow includes a Binlog parser for extraction and an EventSink for filtering, routing, and processing.

3.2 Alibaba Cloud DTS

Data Transmission Service (DTS) supports data migration, real‑time subscription, and synchronization across RDBMS, NoSQL, and OLAP sources.

Features include multi‑source support, multiple transmission modes, high performance (up to 70 MB/s and 200 k TPS), high availability via clustered nodes, and a visual management console. It is a paid service.

3.3 Databus

Databus, open‑sourced by LinkedIn in 2013, is a low‑latency, reliable, transactional change‑capture system supporting multiple data sources such as MySQL and Oracle.

It pulls changes from database logs, delivers them to custom clients in real time, and offers features like unlimited replay, high scalability, and server‑side filtering.

3.4 Other Tools

Flink : Distributed stream processing engine for stateful computations on bounded and unbounded data streams.

CloudCanal : Commercial data synchronization product.

Maxwell : Simple tool that outputs database changes as JSON strings without requiring custom client code.

DRDS : Alibaba's distributed relational database middleware focusing on scalability, lightweight, flexible, stable, and efficient characteristics.

Yugong : Tool for migrating data from Oracle to MySQL.

4. Conclusion

This article presented MySQL‑to‑Elasticsearch synchronization strategies and surveyed common migration tools, helping readers choose the most suitable solution for their scenarios.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

ElasticsearchmysqlBinlogdata synchronizationCDCMigration Tools
Su San Talks Tech
Written by

Su San Talks Tech

Su San, former staff at several leading tech companies, is a top creator on Juejin and a premium creator on CSDN, and runs the free coding practice site www.susan.net.cn.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.