
Understanding Distributed Data Consistency: CAP, BASE, and Transaction Solutions

This article explains why achieving data consistency in modern distributed systems is challenging, reviews the ACID properties of local databases, discusses the CAP theorem and the BASE model, examines event ordering mechanisms, and compares practical solutions such as two‑phase commit, XA, local message tables, and MQ‑based transaction models.

JD Retail Technology

In large‑scale distributed software systems, guaranteeing data consistency is a fundamental problem; classic solutions like Paxos, Raft, and ZAB were created to address this challenge.

Local database transactions achieve consistency through the ACID properties—Atomicity, Consistency, Isolation, and Durability—implemented in MySQL InnoDB using binlog, redo log, undo log, and MVCC.
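The all‑or‑nothing guarantee of Atomicity can be shown with a minimal sketch. SQLite stands in for MySQL InnoDB here purely for illustration; the transactional contract (commit everything or roll everything back) is the same:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()

try:
    with conn:  # transaction scope: commit on success, roll back on exception
        conn.execute("UPDATE account SET balance = balance - 30 WHERE name = 'alice'")
        raise RuntimeError("simulated crash before bob is credited")
except RuntimeError:
    pass  # the debit above was rolled back with the whole transaction

# Atomicity: the half-finished transfer left no trace.
print(conn.execute("SELECT balance FROM account WHERE name = 'alice'").fetchone()[0])  # 100
```

In InnoDB the same rollback is driven by the undo log rather than SQLite's journal, but the observable behavior is identical.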

Hardware I/O characteristics (sequential vs. random read/write speeds of HDDs and SSDs) explain why databases employ various log files and buffer pools to transform random accesses into sequential ones for performance.
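A toy sketch of that idea (illustrative names, not InnoDB internals): dirty pages stay in an in‑memory buffer pool, while durability comes from a single sequential append to a redo‑style log, deferring the scattered random page writes:

```python
import os
import tempfile

# Write-ahead sketch: random page updates become one sequential log append.
log_path = os.path.join(tempfile.mkdtemp(), "redo.log")
buffer_pool = {}  # page_id -> bytes, dirtied in memory first

def update_page(page_id: int, payload: bytes):
    buffer_pool[page_id] = payload        # random-access change stays in RAM
    with open(log_path, "ab") as log:     # durability via sequential append only
        log.write(f"{page_id}:".encode() + payload + b"\n")

for pid, data in [(42, b"a"), (7, b"b"), (42, b"c")]:
    update_page(pid, data)

with open(log_path, "rb") as log:
    print(log.read().count(b"\n"))  # 3 sequential log records, 0 random page writes
```

A background flush would later write the dirty pages back in bulk, which is exactly the pattern the buffer pool and redo log enable.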

The CAP theorem states that a distributed system cannot simultaneously provide all three of Consistency, Availability, and Partition tolerance; since network partitions are unavoidable in practice, the real trade‑off during a partition is between consistency (CP) and availability (AP).

Building on CAP, the BASE model (Basically Available, Soft state, Eventually consistent) favors partition tolerance and availability, accepting eventual consistency in place of strict consistency.

Event ordering in distributed systems relies on logical clocks (e.g., Lamport timestamps) or physical timestamps, with logical clocks preferred due to clock synchronization challenges.
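A Lamport clock can be sketched in a few lines: every local event ticks the counter, and receiving a message merges the sender's timestamp so that causally related events stay ordered without any wall‑clock synchronization:

```python
class LamportClock:
    """Logical clock: orders causally related events without synced wall clocks."""
    def __init__(self):
        self.time = 0

    def tick(self):               # local event
        self.time += 1
        return self.time

    def send(self):               # timestamp attached to an outgoing message
        return self.tick()

    def receive(self, msg_time):  # merge rule: max(local, remote) + 1
        self.time = max(self.time, msg_time) + 1
        return self.time

a, b = LamportClock(), LamportClock()
t_send = a.send()            # a's clock: 1
b.tick()                     # b's clock: 1 (concurrent local event)
t_recv = b.receive(t_send)   # b's clock: max(1, 1) + 1 = 2
print(t_send, t_recv)        # 1 2 — the receive is ordered after the send
```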

Common transaction solutions include two‑phase commit (2PC) and three‑phase commit (3PC), which suffer from coordinator failures and performance overhead, leading to alternatives such as XA, TCC, Saga, and reliable message‑based approaches.
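The 2PC control flow reduces to a small sketch (toy classes, not a real resource manager): the coordinator commits only if every participant votes yes in the prepare phase, and a single "no" aborts all of them:

```python
class Participant:
    """Toy resource manager that votes in phase 1 and applies phase 2."""
    def __init__(self, name, healthy=True):
        self.name, self.healthy, self.state = name, healthy, "init"

    def prepare(self):  # phase 1: vote yes/no
        self.state = "prepared" if self.healthy else "aborted"
        return self.healthy

    def commit(self):   # phase 2a
        self.state = "committed"

    def rollback(self):  # phase 2b
        self.state = "aborted"

def two_phase_commit(participants):
    if all(p.prepare() for p in participants):  # unanimous yes required
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.rollback()
    return "aborted"

ok = two_phase_commit([Participant("orders"), Participant("stock")])
bad = two_phase_commit([Participant("orders"), Participant("stock", healthy=False)])
print(ok, bad)  # committed aborted
```

The sketch also makes the weakness visible: if the coordinator dies between the two phases, prepared participants are left blocked holding locks, which is exactly what motivates 3PC and the message‑based alternatives.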

The local message table pattern stores business data and messages in a single DB transaction, then asynchronously delivers messages via MQ, achieving eventual consistency through retries.
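A minimal outbox sketch of the pattern (SQLite and invented table names for illustration): the business row and its message are written in one local transaction, and a relay marks a message SENT only after the MQ accepts it, retrying anything still PENDING:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT);
    CREATE TABLE outbox (id INTEGER PRIMARY KEY, payload TEXT, status TEXT);
""")

def place_order(item):
    with conn:  # atomic: the order and its message commit together, or neither does
        conn.execute("INSERT INTO orders (item) VALUES (?)", (item,))
        conn.execute("INSERT INTO outbox (payload, status) VALUES (?, 'PENDING')",
                     (f"order-created:{item}",))

def relay(send_to_mq):
    """Poll pending messages; mark SENT only after the MQ accepts them."""
    pending = conn.execute(
        "SELECT id, payload FROM outbox WHERE status = 'PENDING'").fetchall()
    for mid, payload in pending:
        if send_to_mq(payload):  # on failure it stays PENDING and is retried
            conn.execute("UPDATE outbox SET status = 'SENT' WHERE id = ?", (mid,))
    conn.commit()

place_order("book")
relay(lambda payload: True)  # stand-in for a real MQ producer
print(conn.execute("SELECT status FROM outbox").fetchone()[0])  # SENT
```

Because the relay may deliver a message more than once, consumers must be idempotent; that is the price of eventual consistency here.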

MQ‑based transactional messaging (e.g., RocketMQ) uses a three‑stage process—prepare, local transaction execution, and commit/rollback—to ensure atomicity between message sending and local DB changes.
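The three stages can be sketched with a toy broker; the class and method names below are illustrative, not the RocketMQ client API. A "half" (prepared) message is invisible to consumers until the local transaction's outcome decides commit or rollback:

```python
class Broker:
    """Toy stand-in for an MQ broker that supports half messages."""
    def __init__(self):
        self.half_messages, self.delivered = {}, []

    def prepare(self, msg_id, body):  # stage 1: half message, not consumable yet
        self.half_messages[msg_id] = body

    def commit(self, msg_id):         # stage 3a: make it visible to consumers
        self.delivered.append(self.half_messages.pop(msg_id))

    def rollback(self, msg_id):       # stage 3b: discard the half message
        self.half_messages.pop(msg_id, None)

def send_transactionally(broker, msg_id, body, local_txn):
    broker.prepare(msg_id, body)
    try:
        local_txn()                   # stage 2: the local DB transaction
    except Exception:
        broker.rollback(msg_id)
        return False
    broker.commit(msg_id)
    return True

def failing_txn():
    raise RuntimeError("local transaction failed")

broker = Broker()
send_transactionally(broker, "m1", "order-paid", lambda: None)
send_transactionally(broker, "m2", "order-cancelled", failing_txn)
print(broker.delivered, broker.half_messages)  # ['order-paid'] {}
```

Real RocketMQ adds one piece this sketch omits: if the producer crashes after stage 2, the broker calls back to check the local transaction's status before deciding the fate of the half message.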

Concurrency control issues arise when caching is combined with database writes and MQ calls, leading to pitfalls like long‑running transactions and cache‑database inconsistency.

Two mitigation strategies are presented: (1) cache‑through with explicit invalidation after DB writes, and (2) read‑write separation using change events or binlog subscription to asynchronously update the cache.
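Strategy (1) can be sketched with in‑memory dictionaries standing in for the database and cache: write the database first, then invalidate (rather than update) the cache entry so the next read repopulates it from fresh data:

```python
# Toy DB and cache; key names are illustrative.
db, cache = {"price:42": 100}, {}

def read_price(key):
    if key not in cache:      # cache miss -> read through to the DB
        cache[key] = db[key]
    return cache[key]

def write_price(key, value):
    db[key] = value           # 1. durable write first
    cache.pop(key, None)      # 2. invalidate the stale entry, don't rewrite it

read_price("price:42")        # warms the cache with 100
write_price("price:42", 120)  # DB updated, cache entry dropped
print(read_price("price:42"))  # 120, re-read from the DB
```

Invalidating instead of updating sidesteps the race where two concurrent writers update the cache in the opposite order of their DB writes; strategy (2) goes further by letting a binlog/change‑event subscriber do the cache refresh asynchronously.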

In summary, there is no universal solution for distributed data consistency; the appropriate approach depends on business requirements, with strict ACID‑based transactions suited for finance, while flexible, message‑driven models are often sufficient for e‑commerce scenarios.

Tags: distributed systems, CAP theorem, message queues, data consistency, BASE theorem, transaction protocols
Written by

JD Retail Technology

Official platform of JD Retail Technology, delivering insightful R&D news and a deep look into the lives and work of technologists.
