Fundamentals 13 min read

Key Distributed System Techniques: Bloom Filter, Consistent Hashing, Quorum

This article explains fundamental distributed‑system mechanisms—including Bloom filters for space‑efficient membership tests, consistent hashing for scalable data placement, quorum requirements for operation safety, leader‑follower coordination, heartbeats, fencing, write‑ahead logging, log segmentation, high‑water marks, leases, gossip protocols, failure detection, split‑brain resolution, checksums, CAP and PACELC theorems, hinted handoff, read repair, and Merkle trees—providing a comprehensive overview for engineers.

21CTO

Nov 22, 2022

Key Distributed System Techniques: Bloom Filter, Consistent Hashing, Quorum

1. Bloom Filter

Bloom filter is a space‑efficient probabilistic data structure used to test whether an element belongs to a set, useful when only membership checks are needed.

In BigTable (and Cassandra), every read must access SSTables; if they are not in memory, many disk accesses occur. Bloom filters reduce disk I/O.

2. Consistent Hashing

Consistent hashing enables easy scaling by distributing data items across a ring based on hashed keys, assigning each item to the first node clockwise from its position. It provides incremental stability: only a node’s immediate neighbors are affected when a node joins or leaves.

3. Quorum

In distributed environments, a quorum is the minimum number of servers that must successfully complete an operation before it is considered successful.

Cassandra can be configured to require writes to be acknowledged by at least a quorum of replica nodes. Leader election in Chubby uses Paxos with quorum to ensure strong consistency. Dynamo uses a lax quorum for writes, unlike Paxos’s strict majority.

4. Leader and Follower

To achieve fault tolerance, data is replicated across multiple servers. One server is elected as the leader, making decisions for the cluster and propagating them to followers. In a three‑to‑five‑node cluster, leader election occurs at server startup; the system rejects client requests until a leader is chosen.

5. Heartbeat

The heartbeat mechanism detects whether the current leader has failed, triggering a new leader election if needed.

6. Fencing

When a leader fails, it may still appear active. Fencing isolates the former leader by preventing it from accessing cluster resources, either by resource fencing or node fencing (e.g., powering off the node).

Resource fencing: the system blocks the previously active leader from accessing resources needed for basic tasks.

Node fencing: the system blocks the previously active leader from accessing any resources, often by shutting down or resetting the node.

7. Write‑Ahead Log (WAL)

WAL records operations before they are applied to disk, inspired by database systems, allowing the OS to recover by replaying the log after a crash.

8. Log Segmentation

Splitting a log into multiple smaller files avoids performance bottlenecks of a single large file and simplifies cleanup; old segments are rolled over once a size limit is reached.

9. High‑Water Mark

The high‑water mark tracks the last log entry on the leader that has been replicated to a quorum of followers. Kafka uses this concept to expose only messages before the high‑water mark to consumers.

10. Lease

A lease acts like a lock with an expiration time; clients can renew it before it expires. Chubby clients hold time‑bounded session leases with the leader.

11. Gossip Protocol

Gossip is a peer‑to‑peer communication mechanism where each node periodically exchanges state information with a random peer, spreading updates throughout the cluster.

12. Phi Accrual Failure Detection

This algorithm adapts its failure threshold based on historical heartbeat intervals, outputting a suspicion level rather than a binary up/down status. Cassandra uses it to assess node health.

13. Split‑Brain

Split‑brain occurs when a distributed system has two or more active leaders. Using a generation clock (a monotonically increasing number) allows nodes to agree on the leader with the highest generation.

Kafka uses epoch numbers; HDFS relies on ZooKeeper to ensure a single active NameNode.

14. Checksum

Checksums detect data corruption during transfer. Cryptographic hash functions (MD5, SHA‑1, SHA‑256, SHA‑512) generate a fixed‑length checksum stored alongside the data. HDFS and Chubby store checksums with each file.

15. CAP Theorem

CAP states that a distributed system can provide at most two of the three guarantees: Consistency, Availability, and Partition tolerance. Dynamo is an AP system, sacrificing strong consistency for high availability; BigTable is a CP system, offering strict consistency.

16. PACELC Theorem

PACELC extends CAP: during a partition (P) a system trades Availability vs Consistency; otherwise (E) it trades Latency vs Consistency. The “ELC” part adds latency considerations to the classic CAP trade‑off.

17. Hinted Handoff

If a node is down, the leader records hints (the missed writes) on local disk. When the node recovers, the leader forwards the stored writes to it.

18. Read Repair

When reading, the system compares replicas and updates stale copies, ensuring eventual consistency. Cassandra and Dynamo employ read repair.

19. Merkle Trees

Merkle trees are hash‑based binary trees that enable efficient comparison of large data sets. By comparing root hashes and recursing only on mismatched subtrees, systems like Dynamo can detect and resolve inconsistencies.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

distributed systems scalability bloom filter consistency Consistent Hashing

Written by

21CTO

21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.