Unveiling Memcached’s Distributed Caching: Algorithms and Implementation

Under high‑concurrency loads, disk I/O becomes a bottleneck, prompting the use of caches; this article explains the fundamentals of caching, the role of memcached, and details its distributed implementation, covering simple modulo hashing, consistent hashing, and optimized virtual‑node techniques.

21CTO

Abstract

In high‑concurrency environments, massive read/write requests flood the database, making disk I/O a bottleneck and causing high response latency; therefore caching emerges as a solution. Both single‑node and distributed caches have their scenarios, with Redis and memcached being the most common. This article focuses on the distributed implementation principles of memcached.

Essence of Caching

Computer System Cache

According to the von Neumann architecture, a computer consists of five parts: an arithmetic unit, a control unit, memory, and input and output devices. Modern CPUs combine the arithmetic and control units and rely on storage that is organized in multiple levels. For example, a typical PC may have:

356 GB disk

4 GB RAM

3 MB L3 cache

256 KB L2 cache (per‑core)

Besides these, the CPU contains registers and typically an L1 cache. When the processor needs data, it looks first in the storage level closest to it (registers, then L1, then L2, and so on). The closer a level is to the processor, the faster it is and, because of its high cost, the smaller it is.

These faster storage layers are collectively called cache, which speeds up data access.

Cache Application System

Extending the storage model to applications, data is first sought in the cache (fast storage) and, if missed or expired, retrieved from the database (slower storage). The workflow is illustrated below:
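As a rough sketch of that workflow in code, the following Python example looks in a cache first and falls back to slower storage on a miss or expiry. Everything here is an assumption made for illustration: `TTLCache` is a hypothetical in-process stand-in for a cache server, `read_through` is a hypothetical helper name, and a plain dictionary plays the role of the database.

```python
import time


class TTLCache:
    """Tiny in-process stand-in for a cache server (hypothetical helper)."""

    def __init__(self):
        self._data = {}  # key -> (value, expires_at)

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        entry = self._data.get(key)
        if entry is None:
            return None              # miss: key was never cached
        value, expires_at = entry
        if now >= expires_at:
            del self._data[key]      # expired entries are treated as a miss
            return None
        return value

    def set(self, key, value, ttl, now=None):
        now = time.monotonic() if now is None else now
        self._data[key] = (value, now + ttl)


def read_through(cache, database, key, ttl=60.0, now=None):
    """Look in the cache first; on a miss or expiry, read the database."""
    value = cache.get(key, now=now)
    if value is not None:
        return value, "cache"
    value = database[key]            # slower storage
    cache.set(key, value, ttl, now=now)
    return value, "database"
```

The `now` parameter exists only so the expiry behaviour can be exercised without waiting; a real application would talk to an actual cache server and database instead.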

Introduction to memcached

What is memcached

memcached was originally developed by Brad Fitzpatrick of Danga Interactive, the company behind LiveJournal. It is now used by services such as mixi, Hatena, Facebook, Vox, and LiveJournal to improve web application scalability. Traditional web apps store data in an RDBMS; as data volume grows and requests concentrate, the database becomes a performance bottleneck.

memcached is a high‑performance distributed in‑memory cache server designed to reduce database load, accelerate response times, and enhance scalability.

memcached Features

Simple protocol

Event handling based on libevent

In‑memory storage

Distributed architecture without inter‑node communication

memcached Distributed Principles

Since memcached nodes do not communicate with each other, distribution is achieved on the client side. The client determines which server stores a given key based on an algorithm, and uses the same algorithm for retrieval.

Modulo Hashing (Remainder Method)

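memcached's standard distribution method is the remainder method: the client computes a CRC of the key and takes the remainder after dividing by the number of servers N, i.e. CRC($key) % N. Two points are worth noting: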

If a selected server is unreachable, the client can append the attempt count to the key and re‑hash (rehash).

When servers are added or removed, N changes, so most keys map to a different server than before. The cache must then be largely rebuilt, which is costly.
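The remainder method and its two caveats can be sketched in Python as follows. This is a minimal illustration, not the code of any particular client: `zlib.crc32` stands in for the CRC function, and the helper names, the `is_up` callback, and the `max_tries` limit are all hypothetical.

```python
import zlib


def pick_server(key, servers):
    """Remainder method: CRC of the key modulo the number of servers."""
    return servers[zlib.crc32(key.encode("utf-8")) % len(servers)]


def pick_server_with_rehash(key, servers, is_up, max_tries=5):
    """If the chosen server is down, append the attempt count and hash again.

    `is_up` is a hypothetical callback that reports whether a server is
    reachable; `max_tries` is an assumed limit on the number of attempts.
    """
    for attempt in range(max_tries):
        candidate = key if attempt == 0 else f"{key}{attempt}"
        server = pick_server(candidate, servers)
        if is_up(server):
            return server
    return None  # no reachable server found within max_tries


def moved_fraction(keys, old_servers, new_servers):
    """Fraction of keys whose server changes when the server list changes."""
    moved = sum(
        1 for k in keys
        if pick_server(k, old_servers) != pick_server(k, new_servers)
    )
    return moved / len(keys)


if __name__ == "__main__":
    keys = [f"key{i}" for i in range(10_000)]
    four = ["node0", "node1", "node2", "node3"]
    five = four + ["node4"]
    print(f"moved when growing 4 -> 5: {moved_fraction(keys, four, five):.1%}")
```

Going from 4 to 5 servers, a key keeps its server only when its hash has the same remainder modulo 4 and modulo 5, which happens for 4 out of every 20 hash values. With evenly spread hashes, roughly four in five keys therefore land on a different server, which is why the cache has to be largely rebuilt.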

Consistent Hashing Algorithm

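Consistent hashing maps each server to a point on a ring of hash values running from 0 to 2³², then maps each key onto the same ring and stores the data on the first server encountered clockwise from the key's position. If no server lies between the key and the end of the ring, the search wraps around to the first server on the ring.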

When a new node (e.g., node5) is added, only the keys that fall between node5 and the first server counter‑clockwise from it are affected: they now belong to node5, while all other keys stay where they were. This minimizes redistribution.
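A minimal Python sketch of such a ring is shown below. The class name `HashRing`, the helper `ring_point`, and the use of the first four bytes of an MD5 digest as the ring position are assumptions chosen for illustration; real clients may use a different hash function and data structure.

```python
import bisect
import hashlib


def ring_point(label):
    """Map a string to a point in [0, 2**32) (MD5 is an arbitrary choice)."""
    digest = hashlib.md5(label.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class HashRing:
    """Consistent-hash ring with one point per server (illustrative only)."""

    def __init__(self, servers=()):
        self._points = []   # sorted ring positions
        self._owner = {}    # ring position -> server name
        for server in servers:
            self.add(server)

    def add(self, server):
        point = ring_point(server)
        if point in self._owner:
            raise ValueError(f"ring position collision for {server!r}")
        bisect.insort(self._points, point)
        self._owner[point] = server

    def remove(self, server):
        point = ring_point(server)
        if self._owner.get(point) != server:
            raise KeyError(server)
        self._points.remove(point)
        del self._owner[point]

    def lookup(self, key):
        """Return the first server at or clockwise from the key's position."""
        if not self._points:
            raise LookupError("ring is empty")
        index = bisect.bisect_left(self._points, ring_point(key))
        if index == len(self._points):
            index = 0       # wrap around to the start of the ring
        return self._owner[self._points[index]]
```

After building a ring from four servers and then calling `add("node5")`, any key whose result from `lookup` changes is now owned by `node5`; keys are never shuffled between the servers that were already present.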

Optimized Consistent Hashing

To avoid uneven key distribution, virtual nodes are introduced: each physical server is represented by multiple points on the ring, improving load balance even with few physical servers.
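The effect of virtual nodes can be explored with a short Python sketch. The names `build_ring`, `lookup`, and `load_per_server`, the `server#i` labelling of virtual nodes, and the MD5-based ring position are all assumptions made for this illustration rather than the scheme of any specific client.

```python
import bisect
import hashlib
from collections import Counter


def ring_point(label):
    """Map a string to a point in [0, 2**32) (MD5 is an arbitrary choice)."""
    digest = hashlib.md5(label.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def build_ring(servers, replicas):
    """Place `replicas` virtual nodes per physical server on the ring.

    Returns two parallel lists: sorted ring positions and their owners.
    """
    ring = sorted(
        (ring_point(f"{server}#{i}"), server)
        for server in servers
        for i in range(replicas)
    )
    points = [point for point, _ in ring]
    owners = [server for _, server in ring]
    return points, owners


def lookup(points, owners, key):
    """First virtual node clockwise from the key, wrapping at the end."""
    index = bisect.bisect_left(points, ring_point(key)) % len(points)
    return owners[index]


def load_per_server(servers, replicas, keys):
    """Count how many of `keys` each physical server would store."""
    points, owners = build_ring(servers, replicas)
    return Counter(lookup(points, owners, key) for key in keys)


if __name__ == "__main__":
    servers = ["node1", "node2", "node3"]
    keys = [f"key{i}" for i in range(10_000)]
    for replicas in (1, 10, 200):
        print(replicas, dict(load_per_server(servers, replicas, keys)))
```

Running the script prints the per-server key counts for 1, 10, and 200 virtual nodes per server. The counts generally become more even as the number of virtual nodes grows, at the cost of a larger ring to search.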

Conclusion

With an understanding of basic caching concepts, this article explained memcached’s distributed algorithms, which are implemented on the client side.
