How Consistent Hashing Eliminates Cache Avalanches in Distributed Systems
This article uses a classic distributed caching scenario to explain consistent hashing, detailing its principles, advantages, challenges like cache avalanche, and solutions such as virtual nodes, while illustrating each step with clear diagrams and examples.
Abstract: This article first presents a classic distributed cache scenario, then vividly introduces the consistent hashing algorithm, clearly outlining its benefits, existing problems, and corresponding solutions.
1. Introduction
Before learning consistent hashing, understand a caching scenario: three cache servers (0, 1, 2) store 30,000 images, aiming for roughly 10,000 images per server to balance load and improve speed, user experience, and reduce backend pressure.
Using a simple modulo hash ( hash(image_name) % N) assigns each image to a server, but when server count changes, all cached entries may become invalid, causing a cache avalanche.
2. Basic Concept of Consistent Hashing
Consistent hashing also uses modulo, but instead of modulo the number of servers, it takes modulo 2^32, visualized as a large hash ring.
The ring’s points represent values from 0 to 2^32-1. Each server’s IP address is hashed and mapped onto the ring, e.g., hash(serverA_IP) % 2^32. The same is done for servers B and C.
Objects (e.g., image names) are also hashed onto the ring using hash(image_name) % 2^32. An object is stored on the first server encountered when moving clockwise from its position.
3. Advantages of Consistent Hashing
When a server fails or is removed, only the objects that were mapped to that server need to be reassigned to the next clockwise server, leaving the majority of cached data untouched. This limits cache invalidation to a small subset, preventing a full‑scale cache avalanche and reducing pressure on backend services.
4. Skew in the Hash Ring
In practice, servers may not be evenly spaced on the ring, leading to skew where many objects map to a single server, causing load imbalance and higher risk of failure.
5. Virtual Nodes
To mitigate skew, each physical server is represented by multiple virtual nodes on the hash ring. Adding virtual nodes spreads objects more evenly across servers, improving load distribution and resilience.
With virtual nodes, the example shows a balanced allocation: images 1 and 3 on server A, images 5 and 4 on server B, and images 6 and 2 on server C. Adding more virtual nodes further reduces the chance of skew.
—
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
ITFLY8 Architecture Home
ITFLY8 Architecture Home - focused on architecture knowledge sharing and exchange, covering project management and product design. Includes large-scale distributed website architecture (high performance, high availability, caching, message queues...), design patterns, architecture patterns, big data, project management (SCRUM, PMP, Prince2), product design, and more.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
