Uncovering Memcached’s Slab Allocator: How Memory Is Managed and Optimized
This article explains Memcached’s slab allocator (items, chunks, slab classes, and pages) and how memory is allocated, fragmented, and evicted. It then explores master‑slave double‑layer architectures, high‑concurrency challenges, and scaling strategies such as L1 caching for robust, high‑availability deployments.
1. Memcached Memory Allocation Principles
Knowing how to install Memcached and use its basic commands is enough for most development tasks, but diagnosing production issues requires a deeper understanding of how it manages memory.
Memcached uses a slab allocator by default. It divides memory into fixed‑size blocks (chunks), which avoids the external fragmentation caused by repeated allocation and freeing, at the cost of some unused space inside each chunk.
Key terminology:
Item
A cache element measured in bytes, analogous to an object.
Chunk
The memory space used to store an item, similar to a storage compartment.
Slab Class
A group of chunks of a specific size, e.g., 80 B, 96 B, etc.
Page
A memory region (default 1 MB) allocated to a slab class, then split into chunks; think of a cabinet divided by slab class.
When a request arrives (e.g., to store a 123 B item), Memcached selects the smallest slab class whose chunks can hold the item (180 B in this example). If that class has no free chunk, a page (1 MB) is allocated to it.
The page is divided into 1 MB / 180 B ≈ 5825 chunks, and the 123 B item is stored in one of them. The remaining 57 B of that chunk go unused.
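The arithmetic above can be checked with a minimal Python sketch. The list of chunk sizes is hypothetical and chosen only to match the sizes mentioned in this article; real values depend on the Memcached version and its growth‑factor setting.

```python
import bisect

PAGE_SIZE = 1024 * 1024  # default page size: 1 MB = 1,048,576 bytes

# Hypothetical chunk sizes in bytes, sorted ascending (an assumption for
# illustration, not Memcached's actual slab class table).
CHUNK_SIZES = [80, 96, 120, 180, 240]


def pick_slab_class(item_size, chunk_sizes=CHUNK_SIZES):
    """Return the smallest chunk size that can hold item_size bytes."""
    idx = bisect.bisect_left(chunk_sizes, item_size)
    if idx == len(chunk_sizes):
        raise ValueError("item too large for any slab class")
    return chunk_sizes[idx]


def chunks_per_page(chunk_size, page_size=PAGE_SIZE):
    """Number of whole chunks that fit into one page."""
    return page_size // chunk_size


chunk = pick_slab_class(123)
print(chunk, chunks_per_page(chunk), chunk - 123)  # prints: 180 5825 57
```

The last number printed is the space left unused inside the chunk, which is the internal waste the slab design accepts in exchange for predictable allocation.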
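Because pages are handed out on demand, slab classes that see little traffic early on may receive few or no pages once memory fills, so memory ends up unevenly distributed across classes.
When all slabs are full and a new item arrives, Memcached’s eviction mechanism activates. It first checks the relevant slab class for expired items; if there are none, it evicts the least recently used (LRU) item within that class. Once a page is assigned to a slab class, it is not handed back to other classes until Memcached restarts (unless slab reassignment is enabled in newer versions), a phenomenon known as slab calcification.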
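The eviction order described above (expired items first, then LRU within the same slab class) can be illustrated with a toy model. This is a simplified sketch, not Memcached's implementation: the class name `SlabClassLRU` and its methods are hypothetical, and time is passed in explicitly so the behavior is easy to follow.

```python
from collections import OrderedDict


class SlabClassLRU:
    """Toy model of one slab class holding a fixed number of chunks."""

    def __init__(self, max_chunks):
        self.max_chunks = max_chunks
        # key -> (value, expires_at); least recently used entry comes first
        self.items = OrderedDict()

    def get(self, key, now):
        entry = self.items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at <= now:          # lazily drop an expired item
            del self.items[key]
            return None
        self.items.move_to_end(key)    # mark as most recently used
        return value

    def set(self, key, value, ttl, now):
        if key in self.items:
            del self.items[key]        # overwrite reuses the chunk
        elif len(self.items) >= self.max_chunks:
            self._free_one_chunk(now)
        self.items[key] = (value, now + ttl)

    def _free_one_chunk(self, now):
        # Step 1: prefer reclaiming an item that has already expired.
        expired = next(
            (k for k, (_, exp) in self.items.items() if exp <= now), None
        )
        if expired is not None:
            del self.items[expired]
            return expired
        # Step 2: otherwise evict the least recently used item.
        evicted, _ = self.items.popitem(last=False)
        return evicted
```

Note that eviction happens only inside this one class: a full 180 B class cannot borrow chunks from a half‑empty 96 B class, which is why calcified pages hurt hit rates.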
High Concurrency & High Availability
2. Master‑Slave Double‑Layer Structure
Data sharding expands a single Memcached instance into a cache cluster, addressing the capacity and throughput limits of a single node. However, when a node fails, requests for its keys still fall through to the backend database.
Consistent hashing limits how many keys are remapped when a node is lost, but two challenges remain: some workloads require a very high cache hit rate (e.g., >99% for feed streams), and requests drift to other nodes when a node becomes temporarily unreachable.
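To make the consistent hashing behavior concrete, here is a minimal hash ring sketch. The class name `HashRing`, the node names, and the choice of MD5 with 100 virtual nodes per server are illustrative assumptions, not what any particular Memcached client uses.

```python
import bisect
import hashlib


class HashRing:
    """Minimal consistent hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=100):
        self.vnodes = vnodes
        self.ring = []  # sorted list of (hash, node) points
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def add(self, node):
        for i in range(self.vnodes):
            bisect.insort(self.ring, (self._hash(f"{node}#{i}"), node))

    def remove(self, node):
        self.ring = [(h, n) for h, n in self.ring if n != node]

    def node_for(self, key):
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect(self.ring, (self._hash(key), ""))
        return self.ring[idx % len(self.ring)][1]


ring = HashRing(["cache-a", "cache-b", "cache-c"])
before = {f"user:{i}": ring.node_for(f"user:{i}") for i in range(200)}
ring.remove("cache-b")
moved = [k for k, n in before.items() if ring.node_for(k) != n]
# Only keys that lived on the removed node are remapped.
print(all(before[k] == "cache-b" for k in moved))  # prints: True
```

Keys on the surviving nodes keep their placement, but every key that lived on the failed node still misses, which is why hashing alone cannot meet a >99% hit‑rate target.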
To solve single‑point failures, a master‑slave cache structure is introduced.
In the write path, the application performs dual writes to both master and slave. In the read path, the master is queried first; if it returns empty or fails, the slave is consulted.
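The read and write paths can be sketched as follows. `Node` is a hypothetical in‑memory stand‑in for a cache server (its `up` flag simulates an outage); a real deployment would use a Memcached client instead, and would log or repair failed writes rather than ignore them.

```python
class Node:
    """Hypothetical stand-in for one cache node."""

    def __init__(self):
        self.data = {}
        self.up = True  # set to False to simulate an unreachable node

    def get(self, key):
        if not self.up:
            raise ConnectionError("node unreachable")
        return self.data.get(key)

    def set(self, key, value):
        if not self.up:
            raise ConnectionError("node unreachable")
        self.data[key] = value


def dual_write(master, slave, key, value):
    """Write to both layers; a failure on one does not block the other."""
    for node in (master, slave):
        try:
            node.set(key, value)
        except ConnectionError:
            pass  # assumption: real code would log and schedule a repair


def read(master, slave, key):
    """Query the master first; fall back to the slave on a miss or failure."""
    for node in (master, slave):
        try:
            value = node.get(key)
        except ConnectionError:
            continue
        if value is not None:
            return value
    return None  # both layers missed: caller loads from the database
```

With this shape, a master outage costs one extra round trip to the slave instead of a database query.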
To maintain consistency, updates are performed on the master using CAS (compare‑and‑swap); only after a successful CAS are the slave (and later the L1 caches) updated. If CAS repeatedly fails, the key is deleted from both master and slave, allowing subsequent requests to repopulate the cache.
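A minimal sketch of this update rule is shown below. `VersionedNode` is a hypothetical in‑memory model whose `gets` and `cas` methods mimic the semantics of the Memcached commands of the same names; the retry count and the `update` helper are assumptions for illustration.

```python
class VersionedNode:
    """Hypothetical cache node that tracks a version per key for CAS."""

    def __init__(self):
        self.data = {}  # key -> (value, version)

    def gets(self, key):
        return self.data.get(key, (None, None))

    def cas(self, key, value, version):
        current = self.data.get(key)
        if current is None or current[1] != version:
            return False  # someone else changed the key since gets()
        self.data[key] = (value, version + 1)
        return True

    def set(self, key, value):
        _, version = self.data.get(key, (None, 0))
        self.data[key] = (value, version + 1)

    def delete(self, key):
        self.data.pop(key, None)


def update(master, slave, key, transform, max_retries=3):
    """Apply transform via CAS on the master, then propagate to the slave."""
    for _ in range(max_retries):
        value, version = master.gets(key)
        if version is None:
            return False  # nothing cached; a later read repopulates it
        new_value = transform(value)
        if master.cas(key, new_value, version):
            slave.set(key, new_value)  # only after the master accepted it
            return True
    # Repeated conflicts: drop the key everywhere and let reads rebuild it.
    master.delete(key)
    slave.delete(key)
    return False
```

Deleting on repeated failure trades a temporary miss for the guarantee that the two layers never hold diverging values for long.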
3. Horizontal Linear Scaling
The double‑layer architecture resolves single‑point failures, but bandwidth saturation and request volume still limit scalability.
Increasing the number of data replicas distributes load across multiple nodes. Adding an L1 cache layer in front of the master further improves linear scaling.
Write operations update the master, then the slave, then L1; if a write fails, the key is deleted so that a later read‑through repopulates it.
Read operations first select an L1 group, then hash to a specific node; if L1 misses, the system falls back to master and then slave, caching successful reads back into L1.
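The read path above can be sketched as follows. Each tier is modeled as a list of plain dicts standing in for cache nodes; picking the L1 group at random and hashing with CRC32 are assumptions, since the article does not say how the group or node is chosen.

```python
import random
import zlib


def pick_node(nodes, key):
    """Hash the key to one node within a tier (simple modulo hashing)."""
    return nodes[zlib.crc32(key.encode()) % len(nodes)]


def read_through(key, l1_groups, master, slave, rng=random):
    """Read from L1, then master, then slave; backfill L1 on a hit."""
    group = rng.choice(l1_groups)      # 1. select an L1 group
    l1_node = pick_node(group, key)    # 2. hash to a node in that group
    if key in l1_node:
        return l1_node[key]
    for tier in (master, slave):       # 3. fall back to master, then slave
        node = pick_node(tier, key)
        if key in node:
            l1_node[key] = node[key]   # 4. cache the value back into L1
            return node[key]
    return None                        # full miss: caller queries the database


l1_groups = [[{}, {}], [{}, {}]]       # two L1 groups with two nodes each
master, slave = [{}, {}], [{}, {}]
pick_node(master, "feed:42")["feed:42"] = "timeline"
print(read_through("feed:42", l1_groups, master, slave))  # prints: timeline
```

Because each L1 group holds its own copy of hot keys, adding a group adds read capacity without touching the master or slave tiers.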
When traffic reaches a threshold, additional L1 groups are added, achieving linear capacity growth.
Even with the double‑layer design, a slave that receives little read traffic stays cold and can become a bottleneck. To address this, slaves are also used as L1 resources, and masters are occasionally accessed through the L1 path so that they share hot traffic.
