Mastering Consistent Hashing: Balance, Monotonicity, and Minimal Data Shifts

Consistent hashing, introduced by researchers at MIT in 1997, addresses hotspot issues in distributed systems. It is evaluated against four properties (balance, monotonicity, spread, and load) and uses a ring-shaped hash space and virtual nodes to keep data movement minimal when nodes are added or removed.

Programmer DD

Consistent hashing, proposed at MIT in 1997, is a distributed hash table (DHT) algorithm designed to solve hotspot problems on the Internet. Its goal is similar to that of CARP (Cache Array Routing Protocol), but it corrects the shortcomings of CARP's simple hashing so that DHTs can be applied in P2P environments.

The algorithm defines four criteria for evaluating a hash function:

Balance: the hash results should be evenly distributed across all caches, so that the storage space of every cache is fully used.

Monotonicity: when new caches are added, an existing item either stays on its original cache or moves to one of the new caches. It never moves from one old cache to another old cache.

Spread: in a distributed setting, different clients may see only a subset of the caches. The hash should minimize the inconsistent mappings that cause the same content to be stored in different caches.

Load: this is spread seen from the cache's side. Because different clients may map different items to the same cache, the algorithm should keep the number of distinct items assigned to any one cache low.

In a distributed cluster, adding or removing machines (or handling machine failures) is a basic management operation. A simple hash(object) % N algorithm violates monotonicity: as soon as N changes, most existing objects map to a different machine, often a different old machine, so their previous locations become invalid.
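To make the cost of hash(object) % N concrete, here is a minimal sketch that grows a cluster from 4 to 5 caches and counts how many keys change cache. The helper names (stable_hash, modulo_placement), the use of MD5 truncated to 32 bits, and the object names are assumptions made for illustration; they are not from the original article. With a uniform hash, a key keeps its index only when h % 4 == h % 5, which holds for about one hash value in five, so roughly four keys in five should move.

```python
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic 32-bit hash; Python's built-in hash() is salted per process."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def modulo_placement(keys, n):
    """Assign each key to cache index hash(key) % n."""
    return {key: stable_hash(key) % n for key in keys}


keys = [f"object{i}" for i in range(10_000)]  # made-up object names
before = modulo_placement(keys, 4)  # four caches: indices 0..3
after = modulo_placement(keys, 5)   # a fifth cache (index 4) is added

moved = [k for k in keys if before[k] != after[k]]
# Monotonicity violations: keys that moved to an OLD cache, not the new one.
old_to_old = [k for k in moved if after[k] != 4]

print(f"moved: {len(moved) / len(keys):.1%}")
print(f"moved between old caches: {len(old_to_old) / len(keys):.1%}")
```

Any key counted in old_to_old is a direct violation of monotonicity, since it left one old cache for another old cache.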

Ring Hash Space

Keys are hashed into a space of 2^32 buckets (0 to 2^32‑1). By connecting the ends of this numeric range, we obtain a closed ring.

Objects are hashed and placed on the ring:

Hash(object1) = key1;
Hash(object2) = key2;
Hash(object3) = key3;
Hash(object4) = key4;

Machines are also hashed onto the same ring (typically using the machine’s IP or a unique identifier):

Hash(NODE1) = KEY1;
Hash(NODE2) = KEY2;
Hash(NODE3) = KEY3;
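As a rough illustration of the two lists above, the sketch below hashes both object names and node names into the same 0 to 2^32 - 1 range. The article does not say which hash function to use; MD5 truncated to four bytes is an assumption here, and ring_hash is a hypothetical helper name.

```python
import hashlib

RING_SIZE = 2 ** 32  # ring positions run from 0 to 2^32 - 1


def ring_hash(identifier: str) -> int:
    """Map any string (object name, node IP, node ID) to a ring position."""
    digest = hashlib.md5(identifier.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % RING_SIZE


object_keys = {name: ring_hash(name)
               for name in ["object1", "object2", "object3", "object4"]}
node_keys = {name: ring_hash(name) for name in ["NODE1", "NODE2", "NODE3"]}

# Print objects and nodes together, ordered by their position on the ring.
for name, pos in sorted({**object_keys, **node_keys}.items(), key=lambda kv: kv[1]):
    print(f"{pos:>10}  {name}")
```

Because objects and nodes go through the same function, their positions are directly comparable, which is what the clockwise lookup relies on.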

Objects and machines share the same hash space; each object is stored on the first machine encountered when moving clockwise from its position. This placement remains stable as long as the ring does not change.
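The clockwise rule can be sketched with a sorted list and a binary search. This is one possible implementation, not the article's: ring_hash (MD5 truncated to 32 bits), build_ring, and locate are hypothetical names, and an object that hashes exactly onto a node's position is assumed to belong to that node.

```python
import bisect
import hashlib


def ring_hash(identifier: str) -> int:
    """Assumed hash: first 4 bytes of MD5, giving a position in 0 .. 2^32 - 1."""
    digest = hashlib.md5(identifier.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def build_ring(nodes):
    """Return (position, node) pairs sorted by position on the ring."""
    return sorted((ring_hash(node), node) for node in nodes)


def locate(ring, obj):
    """Return the first node at or after the object's position, wrapping around."""
    positions = [pos for pos, _ in ring]
    i = bisect.bisect_left(positions, ring_hash(obj))
    return ring[i % len(ring)][1]  # i == len(ring) wraps to the first node


ring = build_ring(["NODE1", "NODE2", "NODE3"])
for obj in ["object1", "object2", "object3", "object4"]:
    print(obj, "->", locate(ring, obj))
```

The modulo on the index is what closes the ring: an object past the last node wraps around to the node with the smallest position.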

Node Deletion and Addition

When a node fails (e.g., NODE2), only the objects that were mapped to that node need to move, clockwise, to the next live node. Objects on every other node stay where they are, so data movement is minimal.
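A small experiment can check this claim. The sketch below places 1,000 made-up keys on a three-node ring, removes NODE2, and verifies that every key that changed owner was previously on NODE2. The hash (MD5 truncated to 32 bits) and the owner helper are assumptions for illustration; rebuilding the ring on every call is done for brevity, not efficiency.

```python
import bisect
import hashlib


def ring_hash(identifier: str) -> int:
    """Assumed hash: first 4 bytes of MD5 as a 32-bit ring position."""
    digest = hashlib.md5(identifier.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def owner(nodes, obj):
    """Node that stores obj: the first node clockwise from the object's position."""
    ring = sorted((ring_hash(n), n) for n in nodes)
    positions = [pos for pos, _ in ring]
    i = bisect.bisect_left(positions, ring_hash(obj)) % len(ring)
    return ring[i][1]


keys = [f"object{i}" for i in range(1_000)]  # made-up object names
all_nodes = ["NODE1", "NODE2", "NODE3"]
survivors = ["NODE1", "NODE3"]               # NODE2 has failed

before = {k: owner(all_nodes, k) for k in keys}
after = {k: owner(survivors, k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]

# Only keys that lived on the failed node are allowed to move.
assert all(before[k] == "NODE2" for k in moved)
print(f"{len(moved)} of {len(keys)} keys moved, all from NODE2")
```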

When a new node (e.g., NODE4) joins, only the objects that fall between the new node's position and its predecessor on the ring need to migrate to the new node. All other objects keep their current node, so movement is again minimal.
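The same kind of experiment works for a join. In this sketch, NODE4 is added to a three-node ring and every key that changes owner is checked to have landed on NODE4, which is the monotonicity property in action. As before, the MD5-based ring_hash and the owner helper are illustrative assumptions rather than the article's code.

```python
import bisect
import hashlib


def ring_hash(identifier: str) -> int:
    """Assumed hash: first 4 bytes of MD5 as a 32-bit ring position."""
    digest = hashlib.md5(identifier.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def owner(nodes, obj):
    """Node that stores obj: the first node clockwise from the object's position."""
    ring = sorted((ring_hash(n), n) for n in nodes)
    positions = [pos for pos, _ in ring]
    i = bisect.bisect_left(positions, ring_hash(obj)) % len(ring)
    return ring[i][1]


keys = [f"object{i}" for i in range(1_000)]  # made-up object names
old_nodes = ["NODE1", "NODE2", "NODE3"]
new_nodes = old_nodes + ["NODE4"]            # NODE4 joins the ring

before = {k: owner(old_nodes, k) for k in keys}
after = {k: owner(new_nodes, k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]

# Monotonicity: a key either stays put or moves to the new node.
assert all(after[k] == "NODE4" for k in moved)
donors = {before[k] for k in moved}
print(f"{len(moved)} of {len(keys)} keys moved to NODE4, taken from {donors}")
```

All migrated keys come from a single donor, the node that sits clockwise after NODE4, because that node previously owned the whole arc NODE4 now covers.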

Achieving Balance with Virtual Nodes

A virtual node is a replica of a real node in the hash space. Each real node corresponds to several virtual nodes, and these replicas are placed on the hash ring according to their own hash values.

Introducing virtual nodes distributes objects more evenly. For example, with two real nodes (NODE1 and NODE3) and two replicas per real node, the ring contains four virtual nodes, and the mapping becomes:

object1 → NODE1‑1

object2 → NODE1‑2

object3 → NODE3‑2

object4 → NODE3‑1

Object lookup proceeds from the object’s hash to the nearest virtual node clockwise, then to the corresponding real node.
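The two-step lookup (object to virtual node, then virtual node to real node) might look like the sketch below. The virtual node labels follow the article's NODE1-1 style, but hashing those labels directly with truncated MD5 is an assumption, so the resulting assignments will not necessarily match the mapping listed above.

```python
import bisect
import hashlib

# Each virtual node label points back to the real node it stands in for.
VIRTUAL_TO_REAL = {
    "NODE1-1": "NODE1",
    "NODE1-2": "NODE1",
    "NODE3-1": "NODE3",
    "NODE3-2": "NODE3",
}


def ring_hash(identifier: str) -> int:
    """Assumed hash: first 4 bytes of MD5 as a 32-bit ring position."""
    digest = hashlib.md5(identifier.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


# The ring holds virtual nodes only, sorted by position.
ring = sorted((ring_hash(label), label) for label in VIRTUAL_TO_REAL)
positions = [pos for pos, _ in ring]


def locate(obj):
    """Return (virtual node, real node) for obj using the clockwise rule."""
    i = bisect.bisect_left(positions, ring_hash(obj)) % len(ring)
    virtual = ring[i][1]
    return virtual, VIRTUAL_TO_REAL[virtual]


for obj in ["object1", "object2", "object3", "object4"]:
    virtual, real = locate(obj)
    print(f"{obj} -> {virtual} -> {real}")
```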

Virtual node hashes can be computed by appending a numeric suffix to the node’s IP, e.g.:

Hash("192.168.1.100#1"); // NODE1‑1
Hash("192.168.1.100#2"); // NODE1‑2
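To see how the number of replicas affects balance, the following sketch builds rings from "IP#i" labels, as in the snippet above, and counts how many of 20,000 made-up keys land on each real node. Only 192.168.1.100 comes from the article; the other two IPs, the MD5-based hash, and the function names are assumptions. Per-node counts typically even out as the replica count grows, though the exact numbers depend on the hash function.

```python
import bisect
import hashlib
from collections import Counter


def ring_hash(identifier: str) -> int:
    """Assumed hash: first 4 bytes of MD5 as a 32-bit ring position."""
    digest = hashlib.md5(identifier.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def build_ring(node_ips, replicas):
    """One ring entry per virtual node, keyed by the hash of '<ip>#<i>'."""
    ring = [
        (ring_hash(f"{ip}#{i}"), ip)
        for ip in node_ips
        for i in range(1, replicas + 1)
    ]
    ring.sort()
    return ring


def load_per_node(ring, keys):
    """Count how many keys each real node receives under the clockwise rule."""
    positions = [pos for pos, _ in ring]
    counts = Counter()
    for key in keys:
        i = bisect.bisect_left(positions, ring_hash(key)) % len(ring)
        counts[ring[i][1]] += 1
    return counts


node_ips = ["192.168.1.100", "192.168.1.101", "192.168.1.102"]  # last two are made up
keys = [f"object{i}" for i in range(20_000)]

for replicas in (1, 10, 200):
    counts = load_per_node(build_ring(node_ips, replicas), keys)
    print(f"replicas={replicas:>3}", {ip: counts[ip] for ip in node_ips})
```

Storing the real node's IP alongside each virtual position means the lookup resolves to the real node in a single step, at the cost of a ring that is replicas times larger.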