Testing Consistent Hashing: Real‑World Node Scaling, Load Balance & Avalanche Risks

This article experimentally validates consistent hashing by adding and removing nodes in a simulated cluster, examines key redistribution, reveals load‑imbalance and avalanche vulnerabilities, and proposes solutions such as virtual nodes and adaptive replica factors, providing Python code examples and detailed observations.

ITPUB
ITPUB
ITPUB
Testing Consistent Hashing: Real‑World Node Scaling, Load Balance & Avalanche Risks

Node Expansion Test

A four‑node cluster (192.168.1.1‑192.168.1.4) is expanded with a fifth node (192.168.1.5). The script prints the number of keys that must be migrated and the specific key‑node pairs.

print '上线一个节点,需要迁移节点个数:{}'.format(len(list(set(addNodeMap).difference(set(oldMap)))))
print '需要迁移的key-node信息:',list(set(addNodeMap).difference(set(oldMap)))

Result: three keys are moved to the new node 192.168.1.5.

上线一个节点,需要迁移节点个数:3
需要迁移的key-node信息: ['testKey15_192.168.1.5', 'testKey36_192.168.1.5', 'testKey23_192.168.1.5']

Node Shrink Test

The cluster now has five nodes. Removing node 192.168.1.5 triggers a similar migration.

print '需要迁移节点个数:{}' .format(len(list(set(removeNodeMap).difference(set(addNodeMap)))) )
print '等待迁移的key信息:{}' .format(list(set(removeNodeMap).difference(set(addNodeMap))))

Result: three keys are migrated back to node 192.168.1.2.

下线节点['192.168.1.5'],需要迁移节点个数:3
等待迁移的key信息:['testKey23_192.168.1.2', 'testKey36_192.168.1.2', 'testKey15_192.168.1.2']

Conclusion from Basic Tests

The experiments confirm that adding or removing a single node only affects the keys whose hash falls in the interval between the departing/arriving node and its immediate neighbor on the hash ring. This matches the theoretical expectation of consistent hashing.

Discovering Issues

Issue 1 – Uneven Key Distribution

When different nodes are removed, the number of affected keys varies dramatically (e.g., 14 keys when removing 192.168.1.2 versus 3 keys when removing 192.168.1.5). This leads to load imbalance: some nodes become hot while others remain idle.

Root cause: too few physical nodes (few shards) for the given key set.

Issue 2 – Avalanche Risk

When a node goes down, all its keys are reassigned to the next node clockwise on the ring. If that node is already heavily loaded, the sudden influx can cause it to fail, triggering a cascade of failures across the cluster.

Proposed Solutions

Solution 1 – Physical Expansion

Adding more physical machines improves balance but may be cost‑inefficient for small key volumes.

Solution 2 – Virtual Nodes

Introduce multiple virtual nodes per physical machine. For example, with a replica factor of 2, each physical node appears twice on the ring (e.g., A‑01, A‑02). This increases the number of points on the ring, making key distribution more uniform without adding hardware.

class ConsistentHash(object):
    def __init__(self, nodes=None, replicaFactor=10):
        self.nodes = nodes
        self.replicaFactor = replicaFactor
        self.ring = {}
        self.sortedKeys = []
        self.addNodes(nodes)

    def addNodes(self, nodes):
        if nodes:
            for node in nodes:
                for number in range(self.replicaFactor):
                    replicaNode = "%s%s" % (node, number)
                    replicaNodeHashResult = hashlib.sha1(replicaNode).hexdigest()
                    intReplicaNodeHashResult = int(replicaNodeHashResult, 16)
                    modIntReplicaNodeHashResult = intReplicaNodeHashResult % (2 ** 32)
                    self.ring[modIntReplicaNodeHashResult] = node
                    self.sortedKeys.append(modIntReplicaNodeHashResult)
            self.sortedKeys.sort()

    def removeNodes(self, nodes):
        if nodes:
            for node in nodes:
                for number in range(self.replicaFactor):
                    replicaNode = "%s%s" % (node, number)
                    replicaNodeHashResult = hashlib.sha1(replicaNode).hexdigest()
                    intReplicaNodeHashResult = int(replicaNodeHashResult, 16)
                    modIntReplicaNodeHashResult = intReplicaNodeHashResult % (2 ** 32)
                    self.ring.pop(modIntReplicaNodeHashResult)
                    self.sortedKeys.remove(modIntReplicaNodeHashResult)

    def getNode(self, modKeyHashResult):
        for _modIntNodeHashResult in self.sortedKeys:
            if modKeyHashResult < _modIntNodeHashResult:
                return self.ring[_modIntNodeHashResult]
        return self.ring[self.sortedKeys[0]]

    def allocateKey(self, number):
        keyNodeMap = []
        for i in range(number):
            keyName = 'testKey' + str(i)
            keyHashResult = hashlib.sha1(keyName).hexdigest()
            intKeyHashResult = int(keyHashResult, 16)
            modKeyHashResult = intKeyHashResult % (2 ** 32)
            _node = self.getNode(modKeyHashResult)
            keyNodeMap.append(keyName + '_' + _node)
        return keyNodeMap

Re‑evaluation with Virtual Nodes

Running the same removal tests after adding virtual nodes shows that keys are now redistributed across multiple physical nodes, mitigating both load imbalance and avalanche scenarios.

下线节点['192.168.1.1'],需要迁移节点个数:15
等待迁移的key信息:['testKey27_192.168.1.5', 'testKey15_192.168.1.5', ...]
下线节点['192.168.1.2'],需要迁移节点个数:6
等待迁移的key信息:['testKey2_192.168.1.1', 'testKey10_192.168.1.1', ...]

Further Discussion

Even with virtual nodes, perfect uniformity is unattainable; the goal is relative balance. When key request rates differ, additional strategies such as per‑node replicaFactor tuning or dynamic key‑migration based on load metrics may be required. The optimal number of virtual nodes depends on hardware, workload, and acceptable computation overhead, and should be determined experimentally.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

Pythonload balancingconsistent hashingvirtual nodes
ITPUB
Written by

ITPUB

Official ITPUB account sharing technical insights, community news, and exciting events.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.