
Unlocking Distributed System Design: 20 Core Patterns Explained

This article distills the key design patterns behind distributed systems—covering replication, partitioning, consensus, and fault‑tolerance—by presenting each pattern’s problem statement, concrete solution, trade‑offs, and technical considerations, all illustrated with real‑world examples from projects like Kafka and Cassandra.

BirdNest Tech Talk

Introduction

The book analyzes open‑source projects such as Apache Kafka and Cassandra to extract a catalog of reusable design patterns for building reliable, scalable distributed systems. It treats each pattern as a Lego‑like building block, defining the problem it solves, the concrete implementation, comparisons with alternatives, and practical engineering concerns.

Data Replication Patterns

Write‑Ahead Log (WAL)

Problem: How to achieve atomic, durable updates on a single server.

Solution: Record every operation in a log before mutating data; apply changes only after the log entry is successfully persisted, enabling recovery after crashes.

Implementation notes: WAL provides atomicity, while consistency and isolation require additional concurrency controls such as locks.

Comparison: Unlike Event Sourcing, WAL focuses on durability rather than preserving a complete history of events.
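The append-then-apply discipline can be sketched in a few lines. This is a minimal illustration, not a production WAL: the class name, the JSON record format, and the single `set` operation are all assumptions of this sketch; the essential point is that the entry is flushed to disk before the in-memory state changes, so `recover` can rebuild state after a crash.

```python
import json
import os

class WriteAheadLog:
    """Minimal WAL sketch: persist each operation before applying it."""

    def __init__(self, path):
        self.path = path
        self.store = {}

    def set(self, key, value):
        # 1. Append the operation to the log and force it to disk first.
        with open(self.path, "a") as f:
            f.write(json.dumps({"op": "set", "key": key, "value": value}) + "\n")
            f.flush()
            os.fsync(f.fileno())
        # 2. Only after the entry is durable, mutate the in-memory state.
        self.store[key] = value

    def recover(self):
        # Replay the log from the beginning to rebuild state after a crash.
        self.store = {}
        if os.path.exists(self.path):
            with open(self.path) as f:
                for line in f:
                    entry = json.loads(line)
                    if entry["op"] == "set":
                        self.store[entry["key"]] = entry["value"]
```

A crashed process simply constructs a new instance against the same file and calls `recover()`.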

Segmented Log

Problem: Managing a large write‑ahead log efficiently.

Solution: Split the log into multiple independent segments that can be archived or deleted individually.
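A sketch of the roll-over logic, with in-memory lists standing in for segment files (the class name, the entry-count threshold, and the `drop_segments_before` helper are illustrative assumptions): once a segment is full, new entries go to a fresh segment, and old segments can later be deleted as whole units.

```python
class SegmentedLog:
    """Sketch: roll to a new segment every `segment_size` entries so old
    segments can be archived or deleted wholesale."""

    def __init__(self, segment_size=3):
        self.segment_size = segment_size
        self.segments = [[]]  # each inner list models one segment file

    def append(self, entry):
        if len(self.segments[-1]) >= self.segment_size:
            self.segments.append([])  # current segment is full: roll over
        self.segments[-1].append(entry)

    def drop_segments_before(self, count):
        # Discard whole old segments, e.g. once a low-water mark passes them.
        self.segments = self.segments[count:]
```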

Low‑Water Mark

Problem: Safely deleting old log entries after they have been replicated.

Solution: Maintain a low‑water mark: the log index before which entries are no longer needed and can be safely discarded. The mark is typically derived from the index covered by the latest snapshot, or from time‑based retention.

Heartbeat

Problem: Detecting node failures in a distributed cluster.

Solution: Nodes periodically broadcast heartbeat messages; missing heartbeats within a timeout trigger failure detection. Suitable for both small consensus‑based clusters and large gossip‑based clusters.

Technical note: If heartbeats share a single‑socket channel with regular requests, a large request ahead of them can delay delivery (head‑of‑line blocking) and trigger false failure detection; keeping heartbeat traffic on a separate channel avoids this.
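The timeout side of the pattern is small enough to show directly. This is a hypothetical failure detector (the class name and the injectable `now` parameter, used here to make the logic testable without real sleeps, are assumptions of the sketch): each heartbeat refreshes a last-seen timestamp, and any node silent for longer than the timeout is reported as failed.

```python
import time

class FailureDetector:
    """Sketch of heartbeat-based failure detection: a node is suspected
    failed when no heartbeat has arrived within `timeout` seconds."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}  # node -> time of most recent heartbeat

    def heartbeat(self, node, now=None):
        self.last_seen[node] = now if now is not None else time.monotonic()

    def failed_nodes(self, now=None):
        now = now if now is not None else time.monotonic()
        return [n for n, t in self.last_seen.items() if now - t > self.timeout]
```

Real implementations tune the timeout to several heartbeat intervals so a single delayed message does not mark a healthy node dead.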

Leader and Followers

Problem: Achieving consistent replication across nodes.

Solution: Elect a single leader to handle all writes and replicate them to followers, guaranteeing that all nodes see the same state.

Technical note: A single update queue can serialize requests, eliminating the need for complex locking.

Paxos

Problem: Reaching consensus over an unreliable network.

Solution: Execute the three‑phase protocol—Prepare, Accept, Commit—where proposers gather promises from acceptors before finalizing a value.

Core concept: Majority arbitration ensures that only one value is chosen.

Replicated Log

Problem: Ensuring all nodes apply updates in the same order.

Solution: Record updates in a replicated log and combine it with the Leader‑Follower pattern so the leader streams the log to followers.

Technical note: The leader tracks the high‑water mark: the highest log entry known to be replicated on a quorum of followers. Only entries up to this mark are exposed to clients.

Singular Update Queue

Problem: Serializing concurrent updates from multiple producer threads.

Solution: Funnel all update requests into a single queue processed by a dedicated thread, preserving FIFO order.

Technical note: Queue selection and back‑pressure handling are critical for throughput.
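The core of the pattern fits in one class. In this sketch (class and method names are illustrative), a bounded standard-library queue carries every update to a single worker thread, so the state-mutating function runs strictly one update at a time, in FIFO order, with no locks; the bounded queue also gives natural back-pressure, since `submit` blocks when the queue is full.

```python
import queue
import threading

class SingularUpdateQueue:
    """Sketch: a single worker thread drains a FIFO queue, so shared state
    is only ever mutated by one thread."""

    def __init__(self, apply_fn, max_size=1024):
        self.q = queue.Queue(maxsize=max_size)  # bounded: gives back-pressure
        self.apply_fn = apply_fn
        self.worker = threading.Thread(target=self._run, daemon=True)
        self.worker.start()

    def submit(self, update):
        self.q.put(update)  # blocks when the queue is full

    def _run(self):
        while True:
            update = self.q.get()
            if update is None:  # sentinel: shut down the worker
                break
            self.apply_fn(update)

    def close(self):
        self.q.put(None)
        self.worker.join()
```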

Request Waiting List

Problem: Tracking outstanding requests sent to many nodes and matching responses.

Solution: Maintain a waiting list where each request carries a correlation ID; callbacks are invoked when the matching response arrives.
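A minimal sketch of the correlation mechanism (names are hypothetical): `register` stores a callback under a fresh correlation ID that the caller attaches to the outgoing request, and `handle_response` pops the matching entry and invokes it. Responses for unknown or already-completed IDs are simply dropped.

```python
class RequestWaitingList:
    """Sketch: match asynchronous responses to requests via correlation IDs."""

    def __init__(self):
        self.pending = {}  # correlation ID -> callback
        self.next_id = 0

    def register(self, callback):
        self.next_id += 1
        self.pending[self.next_id] = callback
        return self.next_id  # attach this ID to the outgoing request

    def handle_response(self, correlation_id, response):
        callback = self.pending.pop(correlation_id, None)
        if callback is not None:
            callback(response)
        # Unknown or duplicate IDs are ignored.
```

A production version would also expire entries whose responses never arrive, typically by failing the callback after a timeout.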

Idempotent Receiver

Problem: Handling duplicate requests over an unreliable network.

Solution: Ensure the server can safely process the same request multiple times by checking if it has already been handled or by using unique identifiers.

Technical note: Expiration of stored client requests must be considered.
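The stored-response approach can be sketched as follows (the class name and the `(client_id, request_id)` key shape are assumptions): the first delivery of a request executes the handler and caches the result; any retry with the same IDs replays the cached response instead of executing again.

```python
class IdempotentReceiver:
    """Sketch: remember the response for each client request ID so retries
    return the stored answer instead of re-executing the operation."""

    def __init__(self, handler):
        self.handler = handler
        self.responses = {}  # (client_id, request_id) -> response

    def receive(self, client_id, request_id, request):
        key = (client_id, request_id)
        if key in self.responses:
            return self.responses[key]  # duplicate: replay stored response
        response = self.handler(request)
        self.responses[key] = response
        return response
```

As the technical note says, the `responses` map must be bounded in practice, e.g. by expiring entries once a client acknowledges them or after a retention window.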

Follower Reads

Problem: Allowing follower nodes to serve reads while preserving consistency.

Solution: Followers may read data if the requested version is present; otherwise they wait until the leader has replicated the needed version.

Technical note: Reads must include a timeout to avoid indefinite blocking.

Versioned Value

Problem: Preventing data loss when concurrent writes occur.

Solution: Assign a monotonically increasing version number to each value; concurrent updates are distinguished by their version.
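A small sketch of a versioned store (names and the single global counter are illustrative): each `put` records a new `(version, value)` pair instead of overwriting, so a later write never silently destroys an earlier one, and reads can ask for the value as of any version.

```python
class VersionedKVStore:
    """Sketch: every write gets a monotonically increasing version; reads
    can request the value at or before a given version."""

    def __init__(self):
        self.version = 0
        self.values = {}  # key -> list of (version, value), ascending

    def put(self, key, value):
        self.version += 1
        self.values.setdefault(key, []).append((self.version, value))
        return self.version

    def get(self, key, at_version=None):
        # Walk history newest-first and return the latest version that
        # is visible at `at_version` (or the newest overall if None).
        for version, value in reversed(self.values.get(key, [])):
            if at_version is None or version <= at_version:
                return value
        return None
```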

Version Vector

Problem: Detecting concurrent updates on different nodes.

Solution: Track per‑node version counters in a vector; the vector can determine causal relationships (happened‑before, concurrent, or after).

Core concept: Comparing vectors reveals the ordering of updates.
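The comparison rule is compact enough to show directly. In this sketch a version vector is a plain dict of node to counter, and a missing node counts as zero: if one vector dominates in some entries while the other dominates elsewhere, the updates were concurrent.

```python
def compare(v1, v2):
    """Sketch: classify two version vectors (dicts of node -> counter)
    as 'before', 'after', 'equal', or 'concurrent'."""
    nodes = set(v1) | set(v2)
    less = any(v1.get(n, 0) < v2.get(n, 0) for n in nodes)
    greater = any(v1.get(n, 0) > v2.get(n, 0) for n in nodes)
    if less and greater:
        return "concurrent"  # each side saw updates the other did not
    if less:
        return "before"      # v1 happened-before v2
    if greater:
        return "after"       # v2 happened-before v1
    return "equal"
```

Stores like Dynamo-style databases use exactly this classification to decide when two replicas hold conflicting values that need reconciliation.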

Data Partitioning Patterns

Fixed Partitions

Problem: Evenly distributing data across many nodes.

Solution: Divide data into a fixed number of logical partitions (chosen to be much larger than the expected node count) and map each key to a partition with a hash function; when nodes join or leave, whole partitions are reassigned rather than rehashing individual keys.

Technical note: Clients obtain the partition‑to‑node mapping via the Consistent Core.
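The two-step mapping, key to partition and partition to node, looks like this (the hash choice, the partition count of 64, and the helper names are assumptions of the sketch). Because the partition count never changes, the key-to-partition mapping is stable; cluster changes only edit the partition-to-node table.

```python
import hashlib

def partition_for(key, num_partitions=64):
    """Sketch: deterministically map a key to one of a fixed number of
    logical partitions."""
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions

def node_for(key, partition_to_node):
    """Look up the owning node via the partition-to-node table
    (in practice fetched from the Consistent Core)."""
    return partition_to_node[partition_for(key, len(partition_to_node))]
```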

Proportional Partitions

Problem: Maintaining balanced partitions when the number of nodes changes.

Solution: Dynamically adjust the number of partitions based on the current cluster size.

Key‑Range Partitions

Problem: Supporting range queries while partitioning by key.

Solution: Split the key space into contiguous ranges, each forming a partition.

Technical note: Automatic splitting of hot ranges handles data growth.

Transactional Key‑Value Store

Problem: Providing ACID guarantees in a distributed environment.

Solution: Use a two‑phase commit protocol to coordinate transactions across nodes.

Technical note: Deadlock‑avoidance strategies such as Wound‑Wait or Wait‑Die decide which transaction yields when lock requests conflict.

Multi‑Version Concurrency Control (MVCC)

Problem: Allowing concurrent writes without blocking reads.

Solution: Assign a timestamp to each version and keep multiple versions; reads can access older timestamps while writes create new ones.

Lamport Clock

Problem: Maintaining causal ordering of events.

Solution: Attach a logical timestamp to each event; if event A precedes B, then timestamp(A) < timestamp(B).

Technical note: Provides a partial ordering of events.
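The whole algorithm is two rules, shown here as a small sketch: increment the counter on every local event, and on receiving a message take the maximum of the local and received times plus one. This guarantees that causally related events get increasing timestamps, though unrelated events may still tie or interleave (hence the partial order).

```python
class LamportClock:
    """Sketch of a Lamport logical clock."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # Rule 1: advance on every local event (including sends).
        self.time += 1
        return self.time

    def receive(self, received_time):
        # Rule 2: on receive, jump past both clocks to preserve causality.
        self.time = max(self.time, received_time) + 1
        return self.time
```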

Hybrid Clock

Problem: Generating monotonic timestamps that also reflect real time.

Solution: Combine physical system time with a logical counter to produce hybrid timestamps, enabling both monotonicity and approximate real‑time ordering.

Technical note: Hybrid clocks can underpin multi‑version storage.
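A simplified sketch of the idea (the `(wall_time, counter)` pair and the injectable `wall_fn`, used so the behavior is testable without real clocks, are assumptions of this sketch): when physical time advances, it is adopted with a zero counter; when it stalls or moves backwards, the logical counter is bumped instead, so timestamps stay strictly monotonic while tracking real time closely.

```python
class HybridClock:
    """Simplified hybrid logical clock producing monotonic
    (wall_time, counter) timestamps."""

    def __init__(self, wall_fn):
        self.wall_fn = wall_fn  # source of physical time, injectable for tests
        self.latest = (0, 0)

    def now(self):
        wall = self.wall_fn()
        t, c = self.latest
        if wall > t:
            self.latest = (wall, 0)   # physical time moved forward: adopt it
        else:
            self.latest = (t, c + 1)  # stalled or backwards: bump the counter
        return self.latest
```

A full HLC (as used in CockroachDB, for example) also merges timestamps received from other nodes, analogous to a Lamport clock's receive rule.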

Clock‑Bound Wait

Problem: Ensuring read/write consistency when clocks are uncertain.

Solution: Each node tracks an uncertainty bound on its clock and delays completing a write until the write’s timestamp is guaranteed to be in the past on every node’s clock, so that timestamp comparisons across nodes are safe.

Technical note: Reads also wait until they can safely observe the latest committed data.

Distributed System Patterns

Consistent Core

Problem: Implementing a strongly consistent metadata store.

Solution: Deploy a consensus algorithm such as Raft to maintain a consistent key‑value store for metadata, membership, leases, and service discovery.

Technical note: Communicate with the core via a single‑socket channel.

Lease

Problem: Granting time‑bounded access to distributed resources.

Solution: Issue leases that expire after a fixed interval and must be refreshed periodically; useful for node registration and failure detection.

Technical note: Leases are typically stored in the Consistent Core.

State Watch

Problem: Notifying nodes when cluster state changes.

Solution: Register listeners for specific state changes; the Consistent Core triggers callbacks upon updates.

Technical note: Implemented by the core service.

Gossip Dissemination

Problem: Propagating node status information throughout the cluster.

Solution: Nodes periodically exchange state with peers; the gossip protocol achieves eventual consistency and aids fault detection.

Technical note: Gossip is an eventual‑consistency mechanism.

Emergent Leader

Problem: Electing a leader in a fully decentralized cluster.

Solution: Nodes communicate peer‑to‑peer to converge on a leader without a central coordinator.

Technical note: Must handle split‑brain scenarios.

Single‑Socket Channel

Problem: Building an efficient communication path between servers.

Solution: Use one socket for both sending and receiving requests, eliminating the overhead of multiple connections.

Technical note: Sending requests serially over one connection preserves ordering, but a slow request delays everything queued behind it (head‑of‑line blocking).

Request Batch

Problem: Reducing the overhead of many small network requests.

Solution: Aggregate multiple client requests into a single batch, transmit the batch, and have the server unpack and process each request individually.
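A size-triggered batcher is the simplest form of the pattern, sketched below (the class name, the size threshold, and the pluggable `send_fn` are assumptions; real systems such as Kafka producers also flush on a timer so a half-full batch is not delayed indefinitely).

```python
class RequestBatcher:
    """Sketch: buffer requests and send them as one batch when the buffer
    reaches `max_batch` entries."""

    def __init__(self, send_fn, max_batch=3):
        self.send_fn = send_fn      # transmits one batch over the network
        self.max_batch = max_batch
        self.buffer = []

    def submit(self, request):
        self.buffer.append(request)
        if len(self.buffer) >= self.max_batch:
            self.flush()

    def flush(self):
        if self.buffer:
            self.send_fn(list(self.buffer))
            self.buffer.clear()
```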

Overall, the book provides a systematic toolbox of patterns that address specific challenges in distributed system design, offering concrete implementations, trade‑offs, and real‑world usage scenarios that developers can apply to build robust, scalable services.

Tags: design-patterns, distributed-systems, fault-tolerance, replication, consensus, partitioning
Written by BirdNest Tech Talk

Author of the rpcx microservice framework, original book author, and chair of Baidu's Go CMC committee.