Mastering Distributed System Fundamentals: Models, Replication, Consistency, and Protocols
This article provides a comprehensive overview of distributed system fundamentals, covering node modeling, replica concepts, consistency levels, data distribution strategies, centralized and decentralized replica protocols, lease mechanisms, quorum, two‑phase commit, MVCC, Paxos, and the CAP theorem, while analyzing their trade‑offs in availability, consistency, and partition tolerance.
Concepts
Model
In engineering projects a node is typically a process running on an operating system. The model treats a node as an indivisible whole; if a process consists of several relatively independent components it can be split into multiple logical nodes.
Replica
A replica (copy) provides redundancy for data or services. Data replicas store identical copies of the same data on different nodes so that a failed node can recover the data from another replica. Service replicas provide the same service without relying on local storage; they obtain required data from other nodes. Replica control protocols are the theoretical core of distributed systems.
Consistency Levels
Strong consistency : every read sees the most recent successful write.
Monotonic consistency : once a client reads a value, it never reads an older value later.
Session consistency : within a single client session, reads are monotonic, but there is no guarantee across sessions.
Eventual consistency : all replicas converge to the same state eventually, without a bounded time guarantee.
Weak consistency : reads may return stale values; the system provides no ordering guarantees.
Distributed System Principles
Data Distribution Methods
Three common ways to partition data across a cluster are:
Hash‑based distribution: a hash function maps a key to a bucket; the bucket determines the target node. Simple but suffers from poor scalability when the cluster grows and from data skew.
Range‑based distribution: keys are divided into contiguous intervals; each interval is assigned to a node or a group of nodes. Allows easy range queries and balanced load through dynamic splitting of hot intervals.
Chunk‑based (size‑based) distribution: the data set is treated as a sequential file and split into fixed‑size chunks that are placed on different nodes. Independent of key characteristics, thus avoiding data skew.
Consistent hashing improves hash‑based distribution by placing nodes on a logical ring and assigning each key to the next clockwise node. Virtual nodes (multiple logical nodes per physical machine) further smooth load balancing and simplify node addition or removal.
Basic Replica Protocols
Replica control protocols can be classified as centralized or decentralized.
Centralized Replica Control (Primary‑Secondary)
The primary‑secondary (leader‑follower) protocol uses a single coordinator (the primary) to order updates and propagate them to secondaries.
Clients send update requests to the primary.
The primary performs concurrency control and determines the order of updates.
The primary propagates the update to all secondary nodes.
Each secondary acknowledges the update.
The primary replies to the client with success or failure based on the acknowledgments.
Reading from the primary yields strong consistency; reading from any replica provides eventual consistency. If the primary fails, a new primary must be elected, which may cause a service pause of several seconds.
Lease Mechanism
A lease grants a node exclusive rights to cache data for a bounded time interval. The server issues a lease together with the data; while the lease is valid the server will not modify the data. After lease expiry the client must discard the cached copy. This design provides high fault tolerance because lease validity does not depend on continuous communication.
Quorum Mechanism
Writes must be acknowledged by at least W replicas; reads must contact at least R replicas, with the constraint W + R > N (where N is the total number of replicas). This guarantees that a read overlaps with a successful write, allowing the client to see the latest committed value.
Two‑Phase Commit (2PC)
2PC is a classic centralized protocol that provides strong consistency but suffers from poor availability and performance. The coordinator logs begin_commit, sends prepare to participants, collects votes, and then broadcasts either global_commit or global_abort. Failure handling requires log‑based recovery on both coordinator and participants.
Multi‑Version Concurrency Control (MVCC)
Each transaction creates a new version of the data. Reads can select a suitable version, allowing concurrent reads and writes without locking. MVCC is widely used in databases and can be applied to distributed storage.
Paxos Consensus
Paxos achieves strong consistency in a decentralized setting. A proposer suggests a value; a majority of acceptors must promise to accept it; the value is then chosen and learned by learners. The protocol proceeds in numbered rounds to avoid deadlock.
CAP Theorem
The CAP theorem states that a distributed system cannot simultaneously provide strong consistency (C), high availability (A), and partition tolerance (P). Different protocols make different trade‑offs:
Lease sacrifices some availability for consistency and partition tolerance.
Quorum balances all three properties.
2PC offers full consistency but poor availability and partition tolerance.
Paxos provides strong consistency with reasonable availability and partition tolerance.
Understanding these building blocks helps architects design systems that meet specific requirements for latency, durability, and scalability.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Architects' Tech Alliance
Sharing project experiences, insights into cutting-edge architectures, focusing on cloud computing, microservices, big data, hyper-convergence, storage, data protection, artificial intelligence, industry practices and solutions.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
