How to Eliminate Double‑Write Consistency Problems with Message Queues and CDC
This article explores common data‑synchronization challenges such as double‑write consistency and atomicity issues across databases, Redis, Elasticsearch and Hadoop, and presents a generic solution using ordered message queues and change‑data‑capture middleware to ensure reliable, consistent updates.
Introduction
One day, Ah Xiong goes to an interview and is asked how he ensures atomicity when writing to both a database and Elasticsearch.
Interviewer: "Ah Xiong, introduce yourself!"
Ah Xiong: "I work at an international e-commerce company."
Interviewer: "You mentioned Elasticsearch. How do you sync data to it? Do you write to the database and Elasticsearch simultaneously? What if one succeeds and the other fails?"
Ah Xiong: "I wait for the notification."
The scenario is fictional, but it raises a real problem: how to keep multiple data stores consistent.
The article discusses a generic data‑synchronization strategy, divided into three parts:
(1) Background introduction
(2) Drawbacks of double‑write
(3) Improved solution
Background
Initially, Ah Xiong's company used a single database. As traffic grew, they added Redis for caching and later Elasticsearch for full‑text search. When analytical workloads increased, they exported data to Hadoop.
All these stores contain related data, just in different formats. For example, in the database a product is stored as a table row with columns such as pId and productName.
In Redis the same product is stored as the key product:pId:1 with the value:
{
  "pId": "1",
  "productName": "macbook"
}
Thus the data is identical; only the representation differs.
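As a minimal sketch of that mapping, assume the database row arrives as a dict with pId and productName fields. The helper names to_redis_key and to_redis_value are hypothetical; only the key format and the JSON fields come from the example above.

```python
import json


def to_redis_key(row: dict) -> str:
    """Build the Redis key for a product row, e.g. product:pId:1."""
    return f"product:pId:{row['pId']}"


def to_redis_value(row: dict) -> str:
    """Serialize the same row as the JSON document stored under that key."""
    return json.dumps({"pId": str(row["pId"]), "productName": row["productName"]})


# Hypothetical row as it might be read from the database.
row = {"pId": 1, "productName": "macbook"}
print(to_redis_key(row))
print(to_redis_value(row))
```

The same row could be reshaped for Elasticsearch or Hadoop in the same way; the content stays the same and only the container changes.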
Drawbacks of Double‑Write
Consistency issue
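When two clients concurrently write different values to two data sources, the writes can reach each source in a different order. Suppose one client writes 1 and the other writes 5: one source may apply 1 and then 5, while the other applies 5 and then 1. The sources then diverge (one stores 1, the other stores 5) and stay inconsistent until a later update happens to correct them.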
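The race can be reproduced with a small simulation. This is a sketch under assumed conditions: the two stores are plain dictionary entries, and the interleaving below is one hypothetical ordering in which each client writes to the database first and Redis second.

```python
def apply_writes(interleaving):
    """Apply (store, value) writes in arrival order and return each store's final value."""
    stores = {"database": None, "redis": None}
    for store, value in interleaving:
        stores[store] = value
    return stores


# Client A writes 1, client B writes 5. Each client writes the database
# first and Redis second, but A's Redis write arrives after B's.
interleaving = [
    ("database", 1),  # A -> database
    ("database", 5),  # B -> database
    ("redis", 5),     # B -> redis
    ("redis", 1),     # A -> redis (delayed)
]

final = apply_writes(interleaving)
print(final)  # {'database': 5, 'redis': 1}
```

Neither client did anything wrong on its own; the divergence comes purely from the two stores seeing the writes in different orders.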
Atomicity issue
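Both writes must either succeed together or fail together. With naïve double-write this cannot be guaranteed: the two stores do not share a transaction, so if the first write succeeds and the second fails, nothing rolls the first one back.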
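A short sketch of the failure mode, assuming a hypothetical in-memory FlakyStore class that stands in for both the database and Elasticsearch (no real client library is used):

```python
class FlakyStore:
    """In-memory stand-in for a data store that can be made to fail."""

    def __init__(self, fail=False):
        self.data = {}
        self.fail = fail

    def put(self, key, value):
        if self.fail:
            raise ConnectionError("store unavailable")
        self.data[key] = value


def double_write(db, es, key, value):
    """Naive double-write: two independent writes with no shared transaction."""
    db.put(key, value)
    es.put(key, value)  # if this raises, the database write has already happened


database = FlakyStore()
elasticsearch = FlakyStore(fail=True)

try:
    double_write(database, elasticsearch, "product:1", "macbook")
except ConnectionError:
    pass  # the caller sees an error, but the database still holds the new value

print(database.data)       # {'product:1': 'macbook'}
print(elasticsearch.data)  # {}
```

Wrapping the second write in a retry or a manual rollback only moves the problem, because the retry or the rollback can fail as well.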
Improved Solution
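Instead of writing to every store directly, the application writes only to the database. Every data change is recorded in order and pushed to a message queue; the other systems consume the queue and apply the changes in that same order. Because every consumer sees the same sequence, concurrent writes can no longer leave the stores with different final values, although a consumer may briefly lag behind the database.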
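The idea can be sketched with an in-memory log. ChangeLog and Consumer are hypothetical names for illustration; a real deployment would use a message queue rather than a Python list.

```python
class ChangeLog:
    """In-memory stand-in for an ordered message queue."""

    def __init__(self):
        self.entries = []

    def append(self, key, value):
        """Record one change and return its position in the log."""
        self.entries.append((key, value))
        return len(self.entries) - 1


class Consumer:
    """A downstream store that applies changes strictly in log order."""

    def __init__(self, name):
        self.name = name
        self.state = {}
        self.offset = 0  # position of the next entry to apply

    def poll(self, log):
        while self.offset < len(log.entries):
            key, value = log.entries[self.offset]
            self.state[key] = value
            self.offset += 1


log = ChangeLog()
log.append("x", 1)  # change from client A
log.append("x", 5)  # change from client B

redis = Consumer("redis")
elasticsearch = Consumer("elasticsearch")
redis.poll(log)
elasticsearch.poll(log)

print(redis.state, elasticsearch.state)  # {'x': 5} {'x': 5}
```

Whichever client's change lands second in the log wins in every store, so the stores cannot disagree about the final value.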
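If a consumer fails to apply a message (e.g., because of a network error), it keeps its position in the queue and retries from that point once the problem clears. No change is skipped, so every store eventually receives every update; this takes the place of the all-or-nothing guarantee that double-write could not provide. Because a retry can deliver the same change twice, applying a change should be idempotent.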
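A minimal sketch of retrying from a recorded position, assuming the queue is a plain list and the failure is a single simulated network error. The function consume and the failures counter are hypothetical helpers for this illustration.

```python
def consume(entries, offset, apply):
    """Apply entries starting at offset; return the offset of the next unapplied entry."""
    while offset < len(entries):
        try:
            apply(entries[offset])
        except ConnectionError:
            break  # stop here and keep the offset so this entry is retried later
        offset += 1
    return offset


entries = [("x", 1), ("x", 5), ("y", 7)]
target = {}                  # the downstream store
failures = {"remaining": 1}  # simulate one transient error on the second entry


def apply(entry):
    key, value = entry
    if entry == ("x", 5) and failures["remaining"] > 0:
        failures["remaining"] -= 1
        raise ConnectionError("network error")
    target[key] = value  # an upsert by key, so re-applying it is harmless


first_offset = consume(entries, 0, apply)        # stops at the failing entry
offset = consume(entries, first_offset, apply)   # resumes from the same position

print(first_offset, offset, target)  # 1 3 {'x': 5, 'y': 7}
```

The consumer advances its position only after a change has been applied, which is why no change is lost across the failure.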
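In practice, the change extraction is performed by change-data-capture middleware such as Oracle GoldenGate, or Canal for MySQL, which reads the MySQL binlog, and the queue is typically Kafka. Kafka preserves order only within a partition, so changes to the same record should be routed to the same partition. This setup avoids direct double-write and, with it, the consistency and atomicity problems described above.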
Conclusion
The article examined common data‑synchronization problems in projects and presented a generic, queue‑based approach that can be applied across databases, caches, search engines, and big‑data platforms.
Java Backend Technology