Fundamentals of Distributed Systems: Models, Replicas, Consistency, and Core Protocols
This article provides a comprehensive overview of distributed system fundamentals, covering system models, replica concepts, consistency levels, performance metrics, data distribution strategies, basic replica protocols, lease mechanisms, quorum, logging, MVCC, Paxos, and the CAP theorem.
1 Concepts
1.1 Model
In this model, a node is an indivisible unit of the system. In practice a node is usually a process on an operating system; conversely, a process made up of relatively independent parts can be modeled as multiple nodes.
1.2 Replica
A replica provides redundancy for data or services. Data replicas store the same data on different nodes, enabling recovery when a node fails. Service replicas provide identical services without relying on local storage.
Replica Consistency
Replica control protocols ensure that reads from different replicas return the same data under certain constraints, defining various consistency levels.
Strong consistency: every read sees the most recent successful write.
Monotonic consistency: once a user has read a newer value, no subsequent read by that user returns older data.
Session consistency: the monotonic guarantee holds only within a single session; across sessions a user may see older data again.
Eventual consistency: replicas converge to the same state eventually, without a guaranteed time bound.
Weak consistency: reads may see stale data, with no guarantee of when (or whether) the latest data becomes visible.
1.3 Metrics for Distributed Systems
Performance: throughput (e.g., QPS), latency, and concurrency (the number of requests served simultaneously).
Availability: proportion of time the system provides correct service despite failures.
Scalability: ability to increase performance linearly by adding more machines.
Consistency: degree to which replicas present the same data.
2 Distributed System Principles
2.1 Data Distribution Methods
Distributed systems solve problems that single machines cannot by partitioning data across multiple machines.
Hash Partitioning
Simple hash modulo mapping; suffers from poor scalability and data skew.
Range Partitioning
Data is divided by key ranges; dynamic splitting helps balance load.
Chunk (Data‑Volume) Partitioning
Data is split into fixed‑size chunks independent of content, avoiding skew.
Consistent Hashing
Maps nodes onto a ring; virtual nodes improve load balance and scalability.
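A minimal Python sketch of a consistent-hash ring with virtual nodes (class and method names are illustrative, not from the original). Each physical node is hashed onto the ring many times; a key is served by the first virtual node clockwise from its hash, so removing a node only remaps the keys that landed on that node's virtual positions.

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Consistent-hash ring; vnodes copies of each node smooth the load."""

    def __init__(self, nodes, vnodes=100):
        # Sorted list of (hash, node) pairs: the ring.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._keys = [h for h, _ in self._ring]

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def lookup(self, key):
        """Return the first node clockwise from the key's position."""
        idx = bisect.bisect(self._keys, self._hash(key)) % len(self._keys)
        return self._ring[idx][1]
```

Because the virtual nodes of the surviving machines do not move, any key that mapped to a surviving node keeps its placement when another node leaves the ring.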
Replica and Data Distribution
Replica placement affects fault tolerance, recovery speed, and scalability; managing replicas at the data‑segment level decouples them from physical machines.
Locality‑Aware Computation
Placing computation on the same machine as its data reduces network overhead.
2.2 Basic Replica Protocols
Replica control protocols coordinate reads and writes to achieve the desired availability and consistency; per the CAP theorem, consistency, availability, and partition tolerance must be traded off against one another.
Centralized Protocols
A single coordinator manages updates and concurrency, simplifying design but creating a single point of failure.
Primary‑Secondary Protocol
The primary handles all writes and concurrency; secondaries replicate the data. Reads can be directed to the primary for strong consistency or to any replica for weaker guarantees.
Primary Election and Failover
When the primary fails, a new primary is selected via metadata servers; failover may incur a detection delay (≈10 s).
Data Synchronization
Secondary nodes catch up via log replay, snapshot copy, or discarding dirty data.
Decentralized Protocols
All nodes are peers; achieving strong consistency requires more complex algorithms (e.g., Paxos).
2.3 Lease Mechanism
Servers grant time‑bounded leases to clients when sending data; during the lease period the server promises not to modify the data, so clients can cache it safely. To modify the data, the server first waits for every outstanding lease to expire (stopping the issue of new leases in the meantime), which makes updates safe even if clients crash or the network partitions.
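A toy server-side sketch of this contract (names and the 10-second duration are assumptions for illustration). The server tracks the latest expiry it granted per key and blocks a write until that moment has passed; it never needs to contact clients, so the scheme tolerates crashed or unreachable cache holders.

```python
import time

class LeaseServer:
    """Reads hand out a lease; writes wait out all outstanding leases."""

    LEASE_SECONDS = 10  # assumed lease duration

    def __init__(self):
        self._data = {}
        self._lease_expiry = {}  # key -> latest expiry granted

    def read(self, key):
        """Return the value plus a lease: the key is immutable until expiry."""
        expiry = time.monotonic() + self.LEASE_SECONDS
        self._lease_expiry[key] = max(self._lease_expiry.get(key, 0.0), expiry)
        return self._data.get(key), expiry

    def write(self, key, value):
        """Block until every lease on the key has lapsed, then update."""
        remaining = self._lease_expiry.get(key, 0.0) - time.monotonic()
        if remaining > 0:
            time.sleep(remaining)  # safe even if lease holders crashed
        self._data[key] = value
```

The blocking write is the availability price of the mechanism; real systems shorten it by ceasing to grant new leases as soon as an update is pending.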
2.4 Quorum Mechanism
Writes succeed after being persisted on W out of N replicas; reads query R replicas with W+R>N to guarantee reading the latest committed version.
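A small sketch of the read side (function name and the version/value tuples are illustrative). Because W + R > N, any R replicas must overlap the W replicas that acknowledged the latest write, so picking the highest version among the R responses recovers the latest committed value.

```python
def quorum_read(responses):
    """Given (version, value) responses from R replicas, return the pair
    with the highest version. With W + R > N, the latest committed
    version is guaranteed to appear among the R responses."""
    return max(responses, key=lambda vv: vv[0])

# N = 5, W = 3: a write at version 2 reached three replicas, leaving
# two replicas at version 1. Even the worst-case choice of R = 3
# replicas overlaps the write set in at least one replica.
worst_case_sample = [(2, "new"), (1, "old"), (1, "old")]
```

Note that the highest version returned may still be uncommitted if fewer than W replicas hold it; full quorum protocols handle that case with read repair or by completing the write.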
2.5 Logging Techniques
Redo logs record the result of each update before applying it to memory, enabling crash recovery by replaying logs. Checkpoints periodically dump the full state to reduce replay time.
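A minimal sketch of redo logging for a key-value store (class and file-format choices are assumptions; real systems use binary logs and batch their fsyncs). Each update is forced to the log before the in-memory state changes, so replaying the log after a crash reconstructs exactly the applied updates.

```python
import json
import os

class RedoLogStore:
    """Append each update to a redo log before applying it in memory."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.state = {}

    def put(self, key, value):
        # Write-ahead: persist the record before touching memory.
        with open(self.log_path, "a") as f:
            f.write(json.dumps({"k": key, "v": value}) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self.state[key] = value

    def recover(self):
        """Rebuild in-memory state by replaying the log from the start."""
        self.state = {}
        if os.path.exists(self.log_path):
            with open(self.log_path) as f:
                for line in f:
                    rec = json.loads(line)
                    self.state[rec["k"]] = rec["v"]
```

A checkpoint would dump `state` to disk and truncate the log, so recovery replays only the records written after the checkpoint.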
0/1 Directory
Two directory copies with a master pointer allow atomic switches of active metadata.
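A sketch of the 0/1-directory idea (names are illustrative, and the atomic pointer flip is modeled as a single in-memory assignment; on disk it would be one small pointer write). Updates go entirely into the inactive copy; readers only ever see a complete old or complete new image.

```python
class ZeroOneDirectory:
    """Two directory copies plus a master pointer for atomic switchover."""

    def __init__(self):
        self.slots = [{}, {}]  # copy 0 and copy 1
        self.active = 0        # master pointer: which copy is live

    def read(self):
        return self.slots[self.active]

    def update(self, changes):
        inactive = 1 - self.active
        # Build the full new image in the inactive copy first.
        self.slots[inactive] = {**self.slots[self.active], **changes}
        # A single pointer flip publishes the new image atomically.
        self.active = inactive
```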
2.6 Two‑Phase Commit (2PC)
A coordinator asks participants to prepare; if all vote commit, the coordinator sends a global‑commit, otherwise a global‑abort. 2PC provides strong consistency but suffers from poor availability and performance.
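The coordinator's two phases can be sketched as follows (in-process participants with illustrative `prepare`/`commit`/`abort` methods; a real implementation persists votes and decisions to a log so both sides survive crashes).

```python
class Participant:
    """A participant that votes in phase 1 and obeys the decision in phase 2."""

    def __init__(self, can_commit=True):
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    # Phase 1: ask every participant to prepare and collect votes.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only if every vote was yes; otherwise abort all.
    if all(votes):
        for p in participants:
            p.commit()
        return "commit"
    for p in participants:
        p.abort()
    return "abort"
```

The blocking problem is visible in the sketch: a prepared participant cannot decide on its own and must wait for the coordinator's phase-2 message.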
2.7 Multi‑Version Concurrency Control (MVCC)
Each transaction creates a new version; reads can select appropriate versions to achieve snapshot isolation.
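A minimal MVCC sketch (names are illustrative; commit-time version assignment is simplified to a single counter). Writes append new versions rather than overwriting; a reader holding snapshot id `t` sees, for each key, the newest version with id at most `t`, so its view never changes mid-read.

```python
class MVCCStore:
    """Append-only versions per key; reads resolve against a snapshot id."""

    def __init__(self):
        self._versions = {}   # key -> list of (txid, value), txid ascending
        self._next_txid = 1

    def write(self, key, value):
        """Commit a new version and return its transaction id."""
        txid = self._next_txid
        self._next_txid += 1
        self._versions.setdefault(key, []).append((txid, value))
        return txid

    def read(self, key, snapshot_txid):
        """Return the newest value whose version is <= snapshot_txid."""
        for txid, value in reversed(self._versions.get(key, [])):
            if txid <= snapshot_txid:
                return value
        return None
```

Because old versions are retained, readers never block writers and vice versa; garbage-collecting versions older than every live snapshot is what keeps the store bounded.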
2.8 Paxos Protocol
Decentralized consensus where a majority of acceptors must agree on a value; ensures strong consistency with high availability.
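A sketch of single-decree Paxos in-process (names are illustrative; real deployments add persistence, messaging, and leader election). An acceptor promises in phase 1 to ignore lower-numbered proposals, and in phase 2 accepts unless it has promised a higher number; a proposer that sees a previously accepted value must adopt it, which is what makes a chosen value stable.

```python
class Acceptor:
    def __init__(self):
        self.promised = 0          # highest proposal number promised
        self.accepted = (0, None)  # highest (number, value) accepted

    def prepare(self, n):
        """Phase 1b: promise to ignore proposals below n; report any accepted value."""
        if n > self.promised:
            self.promised = n
            return True, self.accepted
        return False, self.accepted

    def accept(self, n, value):
        """Phase 2b: accept unless a higher number has been promised."""
        if n >= self.promised:
            self.promised = n
            self.accepted = (n, value)
            return True
        return False


def propose(acceptors, n, value):
    """One proposer round; returns the chosen value or None on failure."""
    majority = len(acceptors) // 2 + 1
    granted = [acc for ok, acc in (a.prepare(n) for a in acceptors) if ok]
    if len(granted) < majority:
        return None
    # Adopt the highest-numbered previously accepted value, if any.
    _, prior_value = max(granted)
    chosen = prior_value if prior_value is not None else value
    acks = sum(1 for a in acceptors if a.accept(n, chosen))
    return chosen if acks >= majority else None
```

Note how a later proposer with a higher number cannot overwrite an already-chosen value: the majority it contacts in phase 1 must include an acceptor that reports it.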
2.9 CAP Theorem
It is impossible for a distributed system to simultaneously provide strong consistency, high availability, and partition tolerance; real systems make trade‑offs. For example, Lease favors consistency and partition tolerance at some cost to availability; Quorum can be tuned between consistency and availability by the choice of W and R; 2PC favors consistency at the cost of availability and performance; Paxos provides strong consistency and remains available as long as a majority of nodes can communicate.