Understanding Cache Penetration, Breakdown, and Avalanche with Redis and Bloom Filters

This article explains the concepts of cache penetration, cache breakdown, and cache avalanche in Redis, presents common mitigation techniques such as request validation, empty‑value caching, Bloom filters, mutex locks, and high‑availability strategies, and includes Java code examples for practical implementation.

Architect

Redis is frequently used as a cache to absorb heavy read/write traffic that disk-backed storage cannot serve quickly enough.

1. Cache Penetration

Cache penetration occurs when requests target data that exists in neither the cache nor the database. Every such request misses the cache and falls through to the database, so an attacker can overload the database by repeatedly requesting nonexistent keys.

Common Solutions

Validation interception: check request parameters (e.g., reject IDs <= 0) at the API layer.

Cache empty data: store a short‑lived placeholder for nonexistent records.

Bloom filter: use a probabilistic data structure to quickly test key existence before querying the cache.

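public Student getStudentsByID(Long id) {
    String key = String.valueOf(id);
    // Look up the student in Redis first
    Student student = redisTemplate.opsForValue().get(key);
    if (student != null) {
        // Assumed convention: a placeholder Student with a null ID means "known to be absent"
        // (getId() and the no-arg constructor are assumptions about the Student class)
        return student.getId() == null ? null : student;
    }
    // Cache miss: query the database and store the result in Redis
    student = studentDao.selectByStudentId(id);
    if (student != null) {
        redisTemplate.opsForValue().set(key, student, 60, TimeUnit.MINUTES);
    } else {
        // Cache a short-lived placeholder even when the record does not exist.
        // A raw null is avoided because it would read back as an ordinary cache miss.
        redisTemplate.opsForValue().set(key, new Student(), 60, TimeUnit.SECONDS);
    }
    return student;
}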

Bloom Filter Details

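A Bloom filter combines a bit array with several hash functions. A lookup costs O(k) for k hash functions, which is constant with respect to the number of stored keys, and the structure consumes far less memory than a HashMap when holding billions of keys.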

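A lookup reports possible existence when all of a key's bits are 1, and definite non‑existence when any of them is 0. False positives are therefore possible, but false negatives are not. Note that standard Bloom filters do not support deletion, because clearing a bit could affect other keys that hash to the same position.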
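To make the bit-array mechanics concrete, here is a minimal sketch of a Bloom filter in plain Java. It is not the implementation used by any particular library: the class name SimpleBloomFilter, the double-hashing scheme, and the choice of FNV-1a as the second hash are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

/** Minimal Bloom filter sketch: each key maps to numHashes bit positions. */
class SimpleBloomFilter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    SimpleBloomFilter(int numBits, int numHashes) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    /** Sets every bit position derived from the key. */
    void add(String key) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(position(key, i));
        }
    }

    /** false means definitely absent; true means possibly present. */
    boolean mightContain(String key) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(position(key, i))) {
                return false; // any 0 bit proves the key was never added
            }
        }
        return true;
    }

    // Double hashing (an illustrative choice): position_i = (h1 + i * h2) mod numBits.
    private int position(String key, int i) {
        long combined = (long) key.hashCode() + (long) i * fnv1a(key);
        return (int) Math.floorMod(combined, (long) numBits);
    }

    // FNV-1a over the UTF-8 bytes, used here only as a second hash function.
    private static int fnv1a(String key) {
        int hash = 0x811c9dc5;
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
            hash ^= (b & 0xff);
            hash *= 0x01000193;
        }
        return hash;
    }
}
```

In a read path like the one above, the filter would be filled with every existing ID ahead of time, and a request would be rejected early when mightContain(String.valueOf(id)) returns false. Newly inserted records must also be added to the filter, otherwise valid keys would be rejected.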

Comparison of Empty‑Data Caching and Bloom Filters

When malicious requests generate many distinct keys, caching empty data is inefficient; Bloom filters are preferable.

When the set of missing keys is limited and frequently repeated, caching empty data works well.

2. Cache Breakdown

Cache breakdown happens when a hot key expires and many threads simultaneously query the database, causing a sudden load spike.

Solutions

Set hot data to never expire.

Use a mutex lock to ensure only one thread loads the data from the database while others wait.

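public String get(String key) throws InterruptedException {
    // `redis` and `db` stand for the application's cache and database clients
    // (placeholders, not the API of a specific library)
    String value = redis.get(key);
    if (value != null) {
        return value; // cache hit
    }
    // cache miss: derive a lock key from the data key (the naming is an assumption)
    String keyMutex = "mutex:" + key;
    // the lock expires after 3 minutes so a crashed holder cannot block others forever
    if (redis.setnx(keyMutex, 1, 3 * 60) == 1) { // lock acquired
        try {
            value = db.get(key);
            redis.set(key, value, expire_secs);
        } finally {
            redis.del(keyMutex); // release the lock even if loading fails
        }
        return value;
    }
    // another thread is loading the data, retry after a short pause
    Thread.sleep(50);
    return get(key);
}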
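The same "one loader, everyone else waits" idea can be tried out inside a single JVM without Redis. The sketch below is an in-process variant, not the distributed lock shown above: it uses ConcurrentHashMap.putIfAbsent as the mutex and a CompletableFuture to hand the result to waiting threads. The class name SingleFlightCache and the loader function (standing in for the database query) are hypothetical, and the loader is assumed to return a non-null value.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** In-process sketch: at most one thread loads a given key at a time. */
class SingleFlightCache {
    private final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<String, CompletableFuture<String>> inFlight =
            new ConcurrentHashMap<>();
    private final Function<String, String> loader; // stands in for the database query

    SingleFlightCache(Function<String, String> loader) {
        this.loader = loader;
    }

    String get(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return cached; // cache hit
        }
        CompletableFuture<String> mine = new CompletableFuture<>();
        CompletableFuture<String> existing = inFlight.putIfAbsent(key, mine);
        if (existing != null) {
            return existing.join(); // another thread holds the "lock"; wait for its result
        }
        try {
            // Re-check: a previous loader may have filled the cache just before we won.
            String value = cache.get(key);
            if (value == null) {
                value = loader.apply(key); // only the lock holder reaches the database
                cache.put(key, value);
            }
            mine.complete(value);
            return value;
        } catch (RuntimeException e) {
            mine.completeExceptionally(e); // wake waiters with the failure
            throw e;
        } finally {
            inFlight.remove(key); // release the "lock"
        }
    }
}
```

Compared with the sleep-and-retry loop, waiting on a future avoids polling, but it only coordinates threads within one process. Multiple application instances would still need a shared lock such as the Redis mutex.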

3. Cache Avalanche

A cache avalanche occurs when a large portion of cached data expires at the same time, or when the Redis cluster itself fails. In either case massive traffic hits the database directly and can crash the system.

Mitigation Strategies

Pre‑incident: Deploy high‑availability Redis (Sentinel or Cluster) to survive node or data‑center failures.

During incident: Apply cache degradation or circuit‑breaker patterns (e.g., Hystrix) to limit load on downstream services.

Post‑incident: Perform Redis backup and fast cache warm‑up to restore normal operation.
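As a rough illustration of the "during incident" step, the following sketch shows the basic circuit-breaker idea in plain Java. It is not Hystrix and does not use its API: the class name SimpleCircuitBreaker, the consecutive-failure threshold, and the fixed open period are assumptions chosen to keep the example small.

```java
import java.util.function.Supplier;

/** Minimal circuit-breaker sketch: opens after N consecutive failures. */
class SimpleCircuitBreaker {
    private final int failureThreshold;
    private final long openMillis;
    private int consecutiveFailures = 0;
    private boolean open = false;
    private long openedAt = 0L;

    SimpleCircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    /** True while calls are being short-circuited to the fallback. */
    synchronized boolean isOpen() {
        return open && System.currentTimeMillis() - openedAt < openMillis;
    }

    /**
     * Runs primary unless the breaker is open; uses fallback on failure or while open.
     * The method is synchronized for simplicity, which serializes callers.
     */
    synchronized <T> T call(Supplier<T> primary, Supplier<T> fallback) {
        if (isOpen()) {
            return fallback.get(); // degraded path: do not touch the database
        }
        try {
            T result = primary.get();
            consecutiveFailures = 0; // a success closes the breaker
            open = false;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                open = true;
                openedAt = System.currentTimeMillis();
            }
            return fallback.get();
        }
    }
}
```

A caller might wrap the database query as the primary supplier and return a default or stale value as the fallback. Once the open period has elapsed, the next call tries the primary again, and a single further failure reopens the breaker.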
