Data Consistency in Distributed Systems: Master‑Slave, Master‑Master, and Leaderless Architectures
The article compares master‑slave, master‑master, and leaderless distributed architectures, explaining how synchronous, semi‑synchronous, and asynchronous replication affect consistency, latency, and scalability, and showing that each pattern trades write throughput, conflict‑resolution complexity, and availability against strong consistency.
Software development faces the classic "impossible triangle" of quality, cost, and time. Distributed systems break CPU and storage bottlenecks but inevitably introduce data‑consistency challenges.
When applications grow from monolithic to distributed deployments, concurrency improves dramatically, yet the shift from a single node to many nodes creates the risk that a write performed on one node is not immediately visible to reads on another node.
1.1 Data Storage and Reading
Databases are the most common middleware for persisting data. In a distributed scenario, the write path and the read path can diverge across nodes, leading to situations such as a user sending a red envelope (a digital cash gift) that the recipient cannot yet see, because the write has not propagated to the replica serving the read.
The article uses this illustration to motivate a deeper analysis of how storage nodes interact in distributed environments.
2 Data Storage Consistency
Ensuring that a read returns the "correct write" requires understanding the replication and coordination mechanisms among nodes. The discussion is organized by three architectural patterns.
2.1 Master‑Slave Architecture
In the classic master‑slave setup, the master handles writes while slaves serve read traffic. Three replication modes are common:
Synchronous replication – the master waits for all slaves to acknowledge before confirming the write.
Semi‑synchronous replication – the master waits for at least one slave to acknowledge.
Asynchronous replication – the master returns success immediately and slaves replicate later.
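The three modes differ only in how many slave acknowledgements the master waits for before confirming a write. A minimal sketch of that rule (names are illustrative, not tied to any particular database):

```python
# Sketch: how many slave acknowledgements the master waits for before
# reporting a write as successful, under each replication mode.
# Illustrative only; real systems (MySQL, Kafka, ...) add many details.

def acks_required(mode: str, num_slaves: int) -> int:
    """Slave acknowledgements the master waits for before confirming."""
    if mode == "synchronous":        # wait for every slave
        return num_slaves
    if mode == "semi-synchronous":   # wait for at least one slave
        return min(1, num_slaves)
    if mode == "asynchronous":       # confirm immediately, replicate later
        return 0
    raise ValueError(f"unknown mode: {mode}")

for mode in ("synchronous", "semi-synchronous", "asynchronous"):
    print(mode, acks_required(mode, num_slaves=3))
```

This makes the trade-off explicit: synchronous mode's wait grows with the number of slaves, while asynchronous mode never waits at all.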
2.1.1 Synchronous Replication
Pros: strong consistency and high reliability. Cons: high network latency sensitivity; write throughput decreases as the number of slaves grows.
2.1.2 Semi‑Synchronous Replication
Pros: higher write efficiency and less sensitivity to network latency than full synchronous replication. Cons: durability is weaker; if the master and the single acknowledging slave both fail before the remaining slaves replicate, the write is lost even though the client saw it succeed.
2.1.3 Asynchronous Replication
Pros: minimal impact on write latency, suitable for high‑throughput scenarios and cross‑region disaster‑recovery. Cons: potential for stale reads because slaves may lag behind the master.
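The stale-read risk of asynchronous replication can be shown with a toy master/slave pair (this is an illustration, not a real database): the master confirms immediately, and the slave applies the replication log with a delay.

```python
# Toy illustration of a stale read under asynchronous replication:
# the master confirms writes immediately; the slave applies the
# replication log later, so reads from the slave can lag behind.

class Master:
    def __init__(self):
        self.data = {}
        self.log = []                  # replication log shipped to slaves

    def write(self, key, value):
        self.data[key] = value         # write succeeds immediately
        self.log.append((key, value))

class Slave:
    def __init__(self):
        self.data = {}
        self.applied = 0               # position reached in the master's log

    def replicate(self, master, batch=1):
        for key, value in master.log[self.applied:self.applied + batch]:
            self.data[key] = value
            self.applied += 1

master, slave = Master(), Slave()
master.write("red_envelope", "100 RMB")
print(slave.data.get("red_envelope"))  # None: not yet replicated (stale read)
slave.replicate(master)
print(slave.data.get("red_envelope"))  # now visible on the slave
```

This is exactly the red-envelope scenario from section 1.1: the sender's write succeeded on the master, but a read routed to the lagging slave sees nothing.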
2.1.4 Handling Node Failures
For slave failures, techniques such as Kafka's ISR (in-sync replica) list can be used to drop lagging replicas from the synced set and let them rejoin once they catch up. For master failures, election mechanisms must avoid split-brain; for example, Redis uses the min-slaves-to-write and min-slaves-max-lag parameters to stop accepting writes when too few sufficiently fresh replicas remain.
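The ISR idea can be sketched as a lag check against the leader's log position; the threshold and names below are made up for illustration, and Kafka's real mechanism is time-based and more involved:

```python
# Sketch of an ISR-style membership check, loosely modeled on Kafka's
# in-sync replica list: replicas that fall too far behind the leader
# are dropped from the in-sync set and may rejoin after catching up.
# MAX_LAG and all names here are illustrative assumptions.

MAX_LAG = 10  # maximum tolerated offset lag behind the leader

def in_sync_replicas(leader_offset: int, replica_offsets: dict) -> set:
    """Return the replicas whose lag behind the leader is within MAX_LAG."""
    return {replica for replica, offset in replica_offsets.items()
            if leader_offset - offset <= MAX_LAG}

offsets = {"replica-1": 100, "replica-2": 95, "replica-3": 70}
isr = in_sync_replicas(leader_offset=100, replica_offsets=offsets)
# replica-3 lags by 30 offsets and is excluded from the in-sync set
```

A write acknowledged by the whole in-sync set stays durable even if the lagging replica-3 is lost, which is why dropping laggards preserves both availability and safety.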
2.2 Master‑Master Architecture
Multiple nodes can accept both reads and writes, eliminating the single‑point‑of‑write limitation of master‑slave. Typical use cases include multi‑data‑center deployments, offline client synchronization, and collaborative editing.
However, concurrent writes can conflict. Conflict‑resolution strategies include:
Write‑conflict detection and blocking.
MVCC (multi‑version concurrency control) to isolate reads from writes.
Routing users to the nearest data center to reduce conflict probability.
Assigning timestamps or unique replica IDs to order writes.
Automatic merge rules or user‑driven merge decisions.
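The timestamp/replica-ID strategy is often implemented as last-write-wins (LWW). A minimal sketch, with made-up tags and values; note that LWW silently discards concurrent updates, which is why real systems often prefer merge rules or user-driven decisions:

```python
# Sketch of last-write-wins conflict resolution for a master-master setup:
# each write carries a (timestamp, replica_id) tag, and the highest tag
# wins; the replica ID breaks ties between equal timestamps.
# All tags and values below are illustrative.

def resolve(versions):
    """versions: list of (timestamp, replica_id, value); highest tag wins."""
    return max(versions, key=lambda v: (v[0], v[1]))[2]

conflicting = [
    (1700000005, "dc-east", "title: Draft v2"),
    (1700000005, "dc-west", "title: Draft v3"),  # same timestamp: ID breaks tie
    (1700000001, "dc-east", "title: Draft v1"),
]
winner = resolve(conflicting)  # "title: Draft v3"
```

The tie-break makes the outcome deterministic across data centers, but "Draft v2" is dropped without anyone being asked, illustrating the data-loss hazard of automatic LWW.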
2.3 Leaderless (No‑Master) Architecture
All replicas can accept reads and writes. A write is considered successful once a configurable quorum of replicas acknowledges it (w out of n nodes). Reads also query a quorum (r nodes). The condition w + r > n guarantees that at least one node in the read quorum has the latest write.
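The overlap guarantee of w + r > n can be verified by brute force over all possible read and write quorums (a sketch for small n; real systems do not enumerate, they rely on the arithmetic):

```python
# Sketch of the quorum-overlap rule for leaderless replication: with n
# replicas, writes acknowledged by w and reads querying r, the condition
# w + r > n forces every read quorum to intersect every write quorum,
# so every read sees at least one replica holding the latest write.

from itertools import combinations

def quorums_overlap(n: int, w: int, r: int) -> bool:
    """Check by enumeration that every size-r read set intersects
    every size-w write set among n replicas."""
    replicas = range(n)
    return all(set(reads) & set(writes)
               for reads in combinations(replicas, r)
               for writes in combinations(replicas, w))

print(quorums_overlap(n=5, w=3, r=3))  # True:  3 + 3 > 5, overlap guaranteed
print(quorums_overlap(n=5, w=3, r=2))  # False: 3 + 2 == 5, disjoint quorums possible
```

With n = 5, w = 3, r = 2, the write quorum {A, B, C} and the read quorum {D, E} are disjoint, so a read can miss the latest write entirely.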
Challenges include:
If fewer than w nodes can acknowledge a write, or fewer than r nodes can serve a read, the system must reject the request, sacrificing availability for consistency (the CAP trade‑off).
Clock skew can make ordering ambiguous.
Partial write successes can leave “dirty” data that cannot be rolled back.
Relaxed ("sloppy") quorum strategies can improve availability at the cost of consistency.
3 Summary
The article examined three architectural patterns for achieving data consistency in distributed systems:
Master‑Slave – strong consistency with synchronous replication but limited write scalability.
Master‑Master – higher write scalability, but requires sophisticated conflict resolution.
Leaderless – maximizes availability and write throughput, yet consistency depends on quorum settings and may still suffer from edge‑case anomalies.
Choosing the appropriate architecture requires weighing business requirements against the trade‑offs of reliability, latency, and complexity.
Tencent Cloud Developer