Why Does Redis Crash? Understanding Eviction Strategies, Their Internals, and Monitoring Metrics
This article explains how Redis eviction policies work and why configuring maxmemory with a proper policy is essential to avoid OOM crashes; it details each of the eight built-in policies, shows practical configuration and monitoring commands, and walks through the source-code implementation of LRU/LFU eviction.
Redis Memory Limits and the Need for Eviction
Redis runs on finite physical memory; when used_memory reaches maxmemory, Redis must free space or the Linux OOM killer will terminate the process.
If maxmemory is left at its default of 0, Redis grows without bound until the OS intervenes; and even with a limit set, the default policy noeviction makes writes fail with an error instead of freeing keys, which is effectively a self-inflicted outage.
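The gate Redis applies before accepting a write can be sketched in a few lines. This is a minimal illustration, not Redis source: check_write and the verdict enum are invented names.

```c
#include <stddef.h>

enum write_verdict { WRITE_OK, WRITE_NEEDS_EVICTION, WRITE_REJECTED };

/* Illustrative sketch of the pre-write memory gate.
   maxmemory == 0 means "no limit": Redis keeps growing until the OS
   (for example the Linux OOM killer) intervenes. */
enum write_verdict check_write(size_t used_memory, size_t maxmemory,
                               int noeviction) {
    if (maxmemory == 0 || used_memory < maxmemory) return WRITE_OK;
    /* Over the limit: either evict to make room, or refuse the write. */
    return noeviction ? WRITE_REJECTED : WRITE_NEEDS_EVICTION;
}
```

With noeviction in force the third case is what clients observe as write errors under memory pressure.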
Eviction vs. Expiration
Expiration (EXPIRE) only attaches a TTL tag to a key; the key is not necessarily freed the instant the TTL elapses, since removal happens lazily on access or via the periodic active-expire cycle. Eviction is a separate mechanism: when memory reaches the limit, the configured eviction policy decides which keys to delete.
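A toy sketch of the difference (entry and lookup are illustrative names, not Redis code): expiration is a per-key deadline checked when the key is touched, while eviction is driven purely by total memory.

```c
/* Lazy expiration in miniature: a TTL-tagged key is reclaimed only when
   it is accessed (or by the periodic active-expire cycle), independent
   of how much memory the server is using. */
typedef struct {
    long expire_at;   /* 0 means no TTL */
    int exists;
} entry;

/* Returns 1 if the key is alive, 0 if missing or just expired. */
int lookup(entry *e, long now) {
    if (!e->exists) return 0;
    if (e->expire_at > 0 && now >= e->expire_at) {
        e->exists = 0;    /* reclaimed on access, not when the TTL passed */
        return 0;
    }
    return 1;
}
```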
Eight Built‑in Eviction Policies
noeviction: never deletes keys; writes fail with an error when memory is full.
allkeys-lru: evicts the least-recently-used key among all keys.
volatile-lru: evicts the least-recently-used key among keys with an explicit TTL.
allkeys-lfu: evicts the least-frequently-used key among all keys (Redis 4.0+).
volatile-lfu: evicts the least-frequently-used key among keys with a TTL (Redis 4.0+).
allkeys-random: randomly evicts any key.
volatile-random: randomly evicts a key that has a TTL.
volatile-ttl: evicts the key whose TTL will expire soonest.
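As a concrete illustration of one policy, here is a hedged sketch of the choice volatile-ttl makes. volatile_ttl_victim is a made-up name, and real Redis samples a few random candidates rather than scanning every key.

```c
#include <limits.h>

/* ttls[i] > 0 means key i carries a TTL (an absolute expiry time);
   0 means no TTL, so the key is never a candidate under volatile-ttl.
   Returns the index to evict, or -1 if no key is eligible. */
int volatile_ttl_victim(const long *ttls, int n) {
    int victim = -1;
    long soonest = LONG_MAX;
    for (int i = 0; i < n; i++) {
        if (ttls[i] > 0 && ttls[i] < soonest) {
            soonest = ttls[i];   /* this key expires earlier than any seen */
            victim = i;
        }
    }
    return victim;
}
```

Note that when no key has a TTL, every volatile-* policy degenerates like this: there is nothing to evict, and writes fail just as under noeviction.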
Choosing the Right Policy
For pure cache scenarios where every key has a TTL, use a volatile-* policy; volatile‑lfu is recommended for most cases because it bases eviction on access frequency rather than recency.
When you store permanent data (no TTL), choose an allkeys-* policy; allkeys‑lfu gives the best hit‑rate preservation.
Practical Configuration Steps
# Set a safe memory limit (e.g., 60% of a 16 GB machine)
config set maxmemory 10G
# Choose the eviction policy
config set maxmemory-policy volatile-lfu # or allkeys-lfu
# Increase sampling precision (default 5, 10 is a good trade‑off)
config set maxmemory-samples 10
# Verify settings
info stats | grep evicted_keys # cumulative count since server start; a rising value means eviction is happening
info memory | grep -E "used|peak|mem"
For a permanent configuration, edit redis.conf and set the same three directives, then restart the service:
maxmemory 10G
maxmemory-policy volatile-lfu
maxmemory-samples 10
# If AOF is enabled, ensure "appendonly yes" is present
Monitoring Eviction
evicted_keys – rising values indicate memory pressure.
mem_fragmentation_ratio – >1.5 suggests significant fragmentation.
Regularly check used_memory_human, mem_peak_human and evicted_keys via redis-cli info.
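The fragmentation metric above is simply a quotient of two INFO fields, used_memory_rss divided by used_memory. A sketch of the arithmetic (the function name is illustrative):

```c
/* mem_fragmentation_ratio = used_memory_rss / used_memory.
   used_memory_rss is what the OS says the process occupies;
   used_memory is what the allocator handed out for data.
   Values well above 1.0 mean the process holds memory it is
   not actually using for keys and values. */
double fragmentation_ratio(double used_memory_rss, double used_memory) {
    if (used_memory <= 0) return 0.0;   /* guard against division by zero */
    return used_memory_rss / used_memory;
}
```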
How Redis Implements Eviction (Source‑Code Walk‑through)
Redis does not run a dedicated eviction thread. The memory check is embedded in the command path: before executing each command, Redis compares used_memory against maxmemory and, if the limit is exceeded, runs the eviction logic inline (freeMemoryIfNeeded(), renamed performEvictions() in recent versions).
The eviction process can be described as four cooperating roles:
Memory Sentinel – the pre-command memory check decides when eviction is needed.
LRU Selector – the LRU branch (evictLRUKeys() in the simplified code below) samples a few random keys, computes each one's idle time from the lru field, and deletes the oldest.
LFU Selector – the LFU branch (evictLFUKeys() below) samples keys, reads the 8-bit access counter stored in the low bits of the lru field, and deletes the least frequently used.
Heat Accountant – lookupKey() refreshes the lru field (LRU timestamp or LFU counter) on every read/write, providing lazy updates without extra threads.
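The trigger logic can be condensed into a small loop. This is an illustrative sketch, not the real eviction function: freed_per_round stands in for one sampling-and-delete round, and the name is invented.

```c
#include <stddef.h>

/* Keep freeing one victim's worth of memory at a time until usage drops
   back under the limit. maxmemory == 0 disables the limit entirely;
   freed_per_round == 0 models "nothing evictable" (e.g. noeviction). */
size_t free_memory_if_needed(size_t used, size_t maxmemory,
                             size_t freed_per_round) {
    while (maxmemory > 0 && used > maxmemory && freed_per_round > 0) {
        used -= (freed_per_round < used) ? freed_per_round : used;
    }
    return used;   /* memory usage after eviction */
}
```

Because this loop runs inline before a command, its cost is bounded by the sampling work per round, which is why eviction shows up as a small per-request overhead rather than a background pause.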
The key data structure (redisObject) contains a 24-bit lru field that stores either the LRU timestamp or, in LFU mode, a 16-bit decay timestamp in the high bits plus an 8-bit access counter in the low bits:
typedef struct redisObject {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:24; /* LRU: access timestamp; LFU: 16-bit time + 8-bit counter */
    int refcount;
    void *ptr;
} robj;
LRU Eviction Code (Simplified)
int evictLRUKeys(redisDb *db, int policy) {
    /* volatile-lru samples from keys with a TTL; allkeys-lru from all keys.
       (Simplified: in the real code db->expires maps keys to TTLs, and the
       value object is looked up in db->dict.) */
    dict *candidate = (policy == MAXMEMORY_VOLATILE_LRU) ? db->expires : db->dict;
    robj *oldest_key = NULL;
    unsigned long oldest_idle = 0;
    for (int i = 0; i < server.maxmemory_samples; i++) {
        dictEntry *de = dictGetRandomKey(candidate);
        if (!de) continue;
        robj *key = dictGetKey(de);
        robj *val = dictGetVal(de);
        unsigned long idle = estimateObjectIdleTime(val);
        if (!oldest_key || idle > oldest_idle) {
            oldest_idle = idle;
            oldest_key = key;
        }
    }
    if (oldest_key) {
        dbDelete(db, oldest_key);
        return 1;
    }
    return 0;
}
LFU Eviction Code (Simplified)
int evictLFUKeys(redisDb *db, int policy) {
    dict *candidate = (policy == MAXMEMORY_VOLATILE_LFU) ? db->expires : db->dict;
    robj *coldest_key = NULL;
    uint8_t coldest_counter = UINT8_MAX;   /* the LFU counter is 8 bits */
    for (int i = 0; i < server.maxmemory_samples; i++) {
        dictEntry *de = dictGetRandomKey(candidate);
        if (!de) continue;
        robj *key = dictGetKey(de);
        robj *val = dictGetVal(de);
        uint8_t counter = val->lru & 255;  /* low 8 bits of the lru field */
        if (!coldest_key || counter < coldest_counter) {
            coldest_counter = counter;
            coldest_key = key;
        }
    }
    if (coldest_key) {
        dbDelete(db, coldest_key);
        return 1;
    }
    return 0;
}
LFU counters are updated lazily on each access via updateLFU(), which applies a probabilistic logarithmic increment; the counter also decays, with the decrement computed lazily from the time elapsed since the last access rather than by a background job.
Design Philosophy: Approximate Sampling + Lazy Updates
Approximate Sampling – instead of scanning the whole keyspace, Redis samples a small configurable number of keys (maxmemory-samples, default 5, often raised to 10) and evicts the best candidate among them, keeping the cost of each eviction constant regardless of keyspace size.
Lazy Updates – the lru / lfu metadata is refreshed only when a key is accessed, avoiding background threads and lock contention.
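A toy version of the sampling step (pick_oldest_of_sample is an invented name): evicting the oldest key in a small random sample costs O(samples) per eviction no matter how many keys exist, at the price of sometimes missing the globally oldest key.

```c
#include <stdlib.h>

/* idle[i] is key i's idle time. Draw `samples` random candidates and
   return the index of the oldest one seen; with 10 samples the result
   is usually close to the true LRU victim. */
int pick_oldest_of_sample(const unsigned long *idle, int n, int samples) {
    int best = -1;
    unsigned long best_idle = 0;
    for (int i = 0; i < samples; i++) {
        int j = rand() % n;                 /* random candidate */
        if (best == -1 || idle[j] > best_idle) {
            best = j;
            best_idle = idle[j];
        }
    }
    return best;
}
```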
This combination lets Redis sustain millions of operations per second while keeping per-command eviction overhead negligible, which is why it powers high-traffic services at companies like ByteDance, Alibaba, and Meituan.
Key Takeaways
Always set maxmemory to a realistic limit; the default of 0 is unsafe.
Prefer volatile‑lfu for TTL‑based caches and allkeys‑lfu for non‑TTL data.
Monitor evicted_keys and mem_fragmentation_ratio to catch pressure early.
Understand that eviction is a fast, request‑embedded operation, not a background cleanup thread.