Fundamentals of Distributed Systems: Models, Replicas, Consistency, and Core Protocols
This article provides a comprehensive overview of distributed system fundamentals, covering system models, replica concepts, consistency levels, performance metrics, data distribution strategies, basic replica protocols, lease mechanisms, quorum, logging, MVCC, Paxos, and the CAP theorem.
1 Concepts
1.1 Model
In this model, a node is an indivisible unit of the system. In practice a node is usually a process on an operating system; conversely, a process made up of relatively independent parts can be modeled as multiple nodes.
1.2 Replica
A replica provides redundancy for data or services. Data replicas store the same data on different nodes, enabling recovery when a node fails. Service replicas provide identical services without relying on local storage.
Replica Consistency
Replica control protocols ensure that reads from different replicas return the same data under certain constraints, defining various consistency levels.
Strong consistency: every read sees the most recent successful write.
Monotonic consistency: once a user has read a newer value, no subsequent read by that user returns older data.
Session consistency: the monotonic guarantee holds only within a single session; across sessions a user may see older data again.
Eventual consistency: replicas converge to the same state eventually, without a guaranteed time bound.
Weak consistency: reads may see stale data, with no guarantee of when (or whether) the latest data becomes visible.
1.3 Metrics for Distributed Systems
Performance: throughput (e.g., QPS), latency, and concurrency (the number of requests served simultaneously).
Availability: proportion of time the system provides correct service despite failures.
Scalability: ability to increase performance linearly by adding more machines.
Consistency: degree to which replicas present the same data.
2 Distributed System Principles
2.1 Data Distribution Methods
Distributed systems solve problems that single machines cannot by partitioning data across multiple machines.
Hash Partitioning
Simple hash modulo mapping; suffers from poor scalability and data skew.
Range Partitioning
Data is divided by key ranges; dynamic splitting helps balance load.
Chunk (Data‑Volume) Partitioning
Data is split into fixed‑size chunks independent of content, avoiding skew.
Consistent Hashing
Maps nodes onto a ring; virtual nodes improve load balance and scalability.
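A minimal Python sketch of a consistent-hash ring with virtual nodes (class and method names are illustrative, not from the original). Each physical node is hashed onto the ring many times; a key is served by the first virtual node clockwise from its hash, so removing a node only remaps the keys that landed on that node's virtual positions.

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Consistent-hash ring; vnodes copies of each node smooth the load."""

    def __init__(self, nodes, vnodes=100):
        # Sorted list of (hash, node) pairs: the ring.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._keys = [h for h, _ in self._ring]

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def lookup(self, key):
        """Return the first node clockwise from the key's position."""
        idx = bisect.bisect(self._keys, self._hash(key)) % len(self._keys)
        return self._ring[idx][1]
```

Because the virtual nodes of the surviving machines do not move, any key that mapped to a surviving node keeps its placement when another node leaves the ring.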
Replica and Data Distribution
Replica placement affects fault tolerance, recovery speed, and scalability; managing replicas at the data‑segment level decouples them from physical machines.
Locality‑Aware Computation
Placing computation on the same machine as its data reduces network overhead.
2.2 Basic Replica Protocols
Replica control protocols coordinate reads and writes to achieve the desired availability and consistency; per the CAP theorem, consistency, availability, and partition tolerance must be traded off against one another.
Centralized Protocols
A single coordinator manages updates and concurrency, simplifying design but creating a single point of failure.
Primary‑Secondary Protocol
The primary handles all writes and concurrency; secondaries replicate the data. Reads can be directed to the primary for strong consistency or to any replica for weaker guarantees.
Primary Election and Failover
When the primary fails, a new primary is selected via metadata servers; failover may incur a detection delay (≈10 s).
Data Synchronization
Secondary nodes catch up via log replay, snapshot copy, or discarding dirty data.
Decentralized Protocols
All nodes are peers; achieving strong consistency requires more complex algorithms (e.g., Paxos).
2.3 Lease Mechanism
Servers grant time‑bounded leases to clients when sending data; during the lease period the server promises not to modify the data, so clients can cache it safely. To modify the data, the server first waits for every outstanding lease to expire (stopping the issue of new leases in the meantime), which makes updates safe even if clients crash or the network partitions.
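A toy server-side sketch of this contract (names and the 10-second duration are assumptions for illustration). The server tracks the latest expiry it granted per key and blocks a write until that moment has passed; it never needs to contact clients, so the scheme tolerates crashed or unreachable cache holders.

```python
import time

class LeaseServer:
    """Reads hand out a lease; writes wait out all outstanding leases."""

    LEASE_SECONDS = 10  # assumed lease duration

    def __init__(self):
        self._data = {}
        self._lease_expiry = {}  # key -> latest expiry granted

    def read(self, key):
        """Return the value plus a lease: the key is immutable until expiry."""
        expiry = time.monotonic() + self.LEASE_SECONDS
        self._lease_expiry[key] = max(self._lease_expiry.get(key, 0.0), expiry)
        return self._data.get(key), expiry

    def write(self, key, value):
        """Block until every lease on the key has lapsed, then update."""
        remaining = self._lease_expiry.get(key, 0.0) - time.monotonic()
        if remaining > 0:
            time.sleep(remaining)  # safe even if lease holders crashed
        self._data[key] = value
```

The blocking write is the availability price of the mechanism; real systems shorten it by ceasing to grant new leases as soon as an update is pending.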
2.4 Quorum Mechanism
Writes succeed after being persisted on W out of N replicas; reads query R replicas with W+R>N to guarantee reading the latest committed version.
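A small sketch of the read side (function name and the version/value tuples are illustrative). Because W + R > N, any R replicas must overlap the W replicas that acknowledged the latest write, so picking the highest version among the R responses recovers the latest committed value.

```python
def quorum_read(responses):
    """Given (version, value) responses from R replicas, return the pair
    with the highest version. With W + R > N, the latest committed
    version is guaranteed to appear among the R responses."""
    return max(responses, key=lambda vv: vv[0])

# N = 5, W = 3: a write at version 2 reached three replicas, leaving
# two replicas at version 1. Even the worst-case choice of R = 3
# replicas overlaps the write set in at least one replica.
worst_case_sample = [(2, "new"), (1, "old"), (1, "old")]
```

Note that the highest version returned may still be uncommitted if fewer than W replicas hold it; full quorum protocols handle that case with read repair or by completing the write.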
2.5 Logging Techniques
Redo logs record the result of each update before applying it to memory, enabling crash recovery by replaying logs. Checkpoints periodically dump the full state to reduce replay time.
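A minimal sketch of redo logging for a key-value store (class and file-format choices are assumptions; real systems use binary logs and batch their fsyncs). Each update is forced to the log before the in-memory state changes, so replaying the log after a crash reconstructs exactly the applied updates.

```python
import json
import os

class RedoLogStore:
    """Append each update to a redo log before applying it in memory."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.state = {}

    def put(self, key, value):
        # Write-ahead: persist the record before touching memory.
        with open(self.log_path, "a") as f:
            f.write(json.dumps({"k": key, "v": value}) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self.state[key] = value

    def recover(self):
        """Rebuild in-memory state by replaying the log from the start."""
        self.state = {}
        if os.path.exists(self.log_path):
            with open(self.log_path) as f:
                for line in f:
                    rec = json.loads(line)
                    self.state[rec["k"]] = rec["v"]
```

A checkpoint would dump `state` to disk and truncate the log, so recovery replays only the records written after the checkpoint.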
0/1 Directory
Two directory copies with a master pointer allow atomic switches of active metadata.
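A sketch of the 0/1-directory idea (names are illustrative, and the atomic pointer flip is modeled as a single in-memory assignment; on disk it would be one small pointer write). Updates go entirely into the inactive copy; readers only ever see a complete old or complete new image.

```python
class ZeroOneDirectory:
    """Two directory copies plus a master pointer for atomic switchover."""

    def __init__(self):
        self.slots = [{}, {}]  # copy 0 and copy 1
        self.active = 0        # master pointer: which copy is live

    def read(self):
        return self.slots[self.active]

    def update(self, changes):
        inactive = 1 - self.active
        # Build the full new image in the inactive copy first.
        self.slots[inactive] = {**self.slots[self.active], **changes}
        # A single pointer flip publishes the new image atomically.
        self.active = inactive
```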
2.6 Two‑Phase Commit (2PC)
A coordinator asks participants to prepare; if all vote commit, the coordinator sends a global‑commit, otherwise a global‑abort. 2PC provides strong consistency but suffers from poor availability and performance.
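The coordinator's two phases can be sketched as follows (in-process participants with illustrative `prepare`/`commit`/`abort` methods; a real implementation persists votes and decisions to a log so both sides survive crashes).

```python
class Participant:
    """A participant that votes in phase 1 and obeys the decision in phase 2."""

    def __init__(self, can_commit=True):
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    # Phase 1: ask every participant to prepare and collect votes.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only if every vote was yes; otherwise abort all.
    if all(votes):
        for p in participants:
            p.commit()
        return "commit"
    for p in participants:
        p.abort()
    return "abort"
```

The blocking problem is visible in the sketch: a prepared participant cannot decide on its own and must wait for the coordinator's phase-2 message.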
2.7 Multi‑Version Concurrency Control (MVCC)
Each transaction creates a new version; reads can select appropriate versions to achieve snapshot isolation.
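A minimal MVCC sketch (names are illustrative; commit-time version assignment is simplified to a single counter). Writes append new versions rather than overwriting; a reader holding snapshot id `t` sees, for each key, the newest version with id at most `t`, so its view never changes mid-read.

```python
class MVCCStore:
    """Append-only versions per key; reads resolve against a snapshot id."""

    def __init__(self):
        self._versions = {}   # key -> list of (txid, value), txid ascending
        self._next_txid = 1

    def write(self, key, value):
        """Commit a new version and return its transaction id."""
        txid = self._next_txid
        self._next_txid += 1
        self._versions.setdefault(key, []).append((txid, value))
        return txid

    def read(self, key, snapshot_txid):
        """Return the newest value whose version is <= snapshot_txid."""
        for txid, value in reversed(self._versions.get(key, [])):
            if txid <= snapshot_txid:
                return value
        return None
```

Because old versions are retained, readers never block writers and vice versa; garbage-collecting versions older than every live snapshot is what keeps the store bounded.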
2.8 Paxos Protocol
Decentralized consensus where a majority of acceptors must agree on a value; ensures strong consistency with high availability.
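A sketch of single-decree Paxos in-process (names are illustrative; real deployments add persistence, messaging, and leader election). An acceptor promises in phase 1 to ignore lower-numbered proposals, and in phase 2 accepts unless it has promised a higher number; a proposer that sees a previously accepted value must adopt it, which is what makes a chosen value stable.

```python
class Acceptor:
    def __init__(self):
        self.promised = 0          # highest proposal number promised
        self.accepted = (0, None)  # highest (number, value) accepted

    def prepare(self, n):
        """Phase 1b: promise to ignore proposals below n; report any accepted value."""
        if n > self.promised:
            self.promised = n
            return True, self.accepted
        return False, self.accepted

    def accept(self, n, value):
        """Phase 2b: accept unless a higher number has been promised."""
        if n >= self.promised:
            self.promised = n
            self.accepted = (n, value)
            return True
        return False


def propose(acceptors, n, value):
    """One proposer round; returns the chosen value or None on failure."""
    majority = len(acceptors) // 2 + 1
    granted = [acc for ok, acc in (a.prepare(n) for a in acceptors) if ok]
    if len(granted) < majority:
        return None
    # Adopt the highest-numbered previously accepted value, if any.
    _, prior_value = max(granted)
    chosen = prior_value if prior_value is not None else value
    acks = sum(1 for a in acceptors if a.accept(n, chosen))
    return chosen if acks >= majority else None
```

Note how a later proposer with a higher number cannot overwrite an already-chosen value: the majority it contacts in phase 1 must include an acceptor that reports it.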
2.9 CAP Theorem
It is impossible for a distributed system to simultaneously provide strong consistency, high availability, and partition tolerance; real systems make trade‑offs. For example, Lease favors consistency and partition tolerance at some cost to availability; Quorum can be tuned between consistency and availability by the choice of W and R; 2PC favors consistency at the cost of availability and performance; Paxos provides strong consistency and remains available as long as a majority of nodes can communicate.