Why Distributed Databases Face Consistency Challenges and How to Address Them
The article explains how distributed database systems appear as a single logical database but operate over a network of nodes, detailing failures, stale reads, consistency models, architecture anomalies, sharding, replication, and various read/write scenarios that illustrate the complexity of achieving strong consistency.
Distributed Database Systems Overview
A distributed database appears as a single logical system to the client, but physically consists of multiple network‑connected nodes that exchange data via protocols.
Failure Modes and Stale Reads
Network partitions (the “P” in CAP) reduce availability because nodes become isolated.
Node failures can create single‑point‑of‑failure situations when a data item has only one replica.
Stale reads occur when a read follows an update but returns an older value because there is no global clock.
Consistency Models
Research proposes several consistency guarantees to mitigate distributed inconsistency:
Linearizability
Sequential consistency
Causal consistency
External consistency (e.g., Google Spanner)
Architecture Anomaly Example
In a master‑slave MySQL deployment, transaction T writes value 95 on the master, then the master crashes before replication. After recovery the master commits 95 and asynchronously replicates it to the slave. Meanwhile the former slave may have been promoted and accepted a new write 98. If a front‑end proxy hides the dual‑master state, the client sees a single MySQL service, loses 95, and violates the “read‑your‑writes” guarantee, potentially causing phantom reads.
Sharding and Replication
To scale to massive data volumes, distributed databases split data into shards . Each shard is replicated; one replica acts as the Leader (or primary) and the others as Followers (e.g., Raft). Replication introduces additional consistency challenges.
Write Scenarios
W1 – Write a Single Shard : The transaction touches only one node, so traditional single‑node concurrency controls (2‑Phase Locking, Timestamp Ordering, MVCC) provide ACID guarantees.
W2 – Write Multiple Shards : This is a distributed transaction. Typical techniques include:
Two‑Phase Commit (2PC) for atomic cross‑shard commits.
Strong concurrency controls such as Strong Strict 2PL (SS2PL) or Optimistic Concurrency Control (OCC).
Systems like Google Spanner, CockroachDB, and Percolator combine linearizability or sequential consistency with 2PC to achieve varying levels of strong consistency.
Read Scenarios
R1 – Read a Single Shard : Consistency is guaranteed by the node’s own mechanisms. When multiple nodes are involved, R1 splits into:
R11 – Read from Leader : The Leader holds the latest committed version. MVCC may return an older version for uncommitted transactions; lock‑based protocols block until commit, ensuring the read sees the committed value.
R12 – Read from Follower : Two cases:
Strong synchronization (Leader waits for Followers before acknowledging) yields identical data on Leader and Followers.
Weak synchronization (e.g., majority‑based protocols) may return stale data from Followers, but because the read touches only one node it does not create cross‑node inconsistency.
R2 – Read Multiple Shards : Cross‑shard reads can be inconsistent without global coordination. Four patterns are identified:
All Leaders (LL) : Requires a global transaction manager or strong consistency to avoid anomalies.
Leader‑Follower (LF) and Follower‑Leader (FL) : Latency between Leader and Follower can cause divergent reads; a shared transaction state is needed.
All Followers (FF) : Similar coordination challenges as LF/FL.
When both transaction‑level and system‑level inconsistencies appear, strong consistency mechanisms (e.g., linearizability, Paxos/Raft consensus) must be employed.
Illustrative Diagrams
Figure 1 shows a multi‑replica scenario where network partitions or delays lead to divergent reads.
Figure 2 visualizes a NewSQL architecture with four shards (A‑D), each having three replicas (one Leader, two Followers), illustrating the complexity of consistency across reads and writes.
Conclusion
Distributed databases must address network and node failures, stale reads, and a spectrum of consistency anomalies introduced by sharding and replication. Designers need to understand the interaction between transaction‑level consistency (ACID, 2PC) and system‑level consistency (linearizability, Raft/Paxos) to build highly available, strongly consistent data services.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
