How to Prevent Cache Penetration Attacks with Bloom Filters and Null Caching
This article explains what cache penetration is, why it can crash databases under malicious traffic, and presents practical mitigations for Redis-backed systems: Bloom filters, null-value caching, data pre-warming, and request validation.
What Is Cache Penetration?
Cache penetration occurs when a client requests data that is absent both in the cache (e.g., Redis) and the underlying database. Each request first checks Redis, finds no key, and then queries the database, which also returns nothing. If an attacker repeatedly requests non‑existent keys, the database can become overwhelmed, leading to severe performance degradation or crashes.
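The vulnerable read path can be sketched as follows. Plain HashMaps stand in for Redis and the database, and all names are illustrative; the point is that a null result is never cached, so every request for an absent key falls through to the database:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative cache-aside read path WITHOUT penetration protection.
public class PenetrationDemo {
    static final Map<String, String> cache = new HashMap<>();    // stands in for Redis
    static final Map<String, String> database = new HashMap<>(); // stands in for the DB
    static int dbQueries = 0; // counts how often the database is hit

    static String get(String key) {
        String value = cache.get(key);
        if (value != null) {
            return value;             // cache hit: database untouched
        }
        dbQueries++;                  // cache miss: fall through to the DB
        value = database.get(key);
        if (value != null) {
            cache.put(key, value);    // populate cache for next time
        }
        return value;                 // null results are NOT cached, so misses repeat
    }

    public static void main(String[] args) {
        database.put("user:1", "Alice");
        get("user:1");               // first lookup hits the DB once
        get("user:1");               // second lookup is served from cache
        for (int i = 0; i < 100; i++) {
            get("user:missing");     // every request for an absent key hits the DB
        }
        System.out.println(dbQueries); // 101: 1 for user:1 + 100 penetrating misses
    }
}
```

Under an attack that cycles through random non-existent keys, `dbQueries` grows linearly with request volume, which is exactly the overload described above.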
Mitigation Strategies
2.1 Bloom Filter
The Bloom filter, introduced by Burton Howard Bloom in 1970, uses a long bit array and multiple hash functions to quickly test whether an element might be in a set. By pre-loading all legitimate query keys into the filter, a request can be rejected early when the filter indicates the key definitely does not exist, avoiding unnecessary cache and database lookups.
The filter works by setting bits to 1 for each hashed position of stored keys. During a query, the same hash functions are applied; if any corresponding bit is 0, the key is guaranteed absent. If all bits are 1, the key may exist (false positives are possible but rare).
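A minimal Bloom filter along these lines might look like the sketch below. This is an illustrative in-process implementation, not the Guava or Redisson classes mentioned later; it derives its k positions from two base hashes (double hashing), a common simplification:

```java
import java.util.BitSet;

// Minimal Bloom filter sketch: k hashed positions per key are set to 1.
// A query is "definitely absent" if any position is 0, "possibly present" if all are 1.
public class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;      // length of the bit array (m)
    private final int numHashes; // number of hash functions (k)

    public SimpleBloomFilter(int size, int numHashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.numHashes = numHashes;
    }

    // Derive the i-th position from two base hashes (double hashing).
    private int position(String key, int i) {
        int h1 = key.hashCode();
        int h2 = (h1 >>> 16) | 1; // cheap second hash, forced odd
        return Math.floorMod(h1 + i * h2, size);
    }

    public void add(String key) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(position(key, i)); // mark every hashed position
        }
    }

    public boolean mightContain(String key) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(position(key, i))) {
                return false; // any 0 bit proves the key was never added
            }
        }
        return true; // all bits 1: key may exist (false positive possible)
    }

    public static void main(String[] args) {
        SimpleBloomFilter filter = new SimpleBloomFilter(1 << 16, 4);
        filter.add("user:1");
        filter.add("user:2");
        System.out.println(filter.mightContain("user:1"));      // true (no false negatives)
        System.out.println(filter.mightContain("no:such:key")); // almost certainly false
    }
}
```

Note the asymmetry: a negative answer is always correct, while a positive answer may occasionally be wrong, which is why the filter can only sit in front of the cache as a pre-check rather than replace it.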
To reduce false‑positive rates, you can:
Increase the length of the binary bit array, which spreads hashed values more sparsely.
Increase the number of hash functions, adding more distinguishing features.
Bloom filters store only hashed bit positions, not the original data, making them highly space-efficient and fast. Ready-made implementations include Guava's BloomFilter class (in-process) and Redisson's RBloomFilter (backed by Redis). When configuring the false-positive rate, a typical value is around 1%; pushing it much lower requires a larger bit array and more hash functions, which costs memory and extra hashing on every lookup.
2.2 Null‑Value Caching
If a query returns no data from the database, cache the empty result with a short TTL. Subsequent requests for the same missing key will be served directly from Redis, preventing repeated database hits.
Drawbacks include:
Massive caching of empty values under heavy attack can consume significant memory; a suitable TTL helps limit this impact.
When the underlying data later appears, the cached empty value may cause temporary inconsistency until it expires.
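The approach can be sketched as follows, again with HashMaps standing in for Redis and the database. The sentinel value and the 60-second TTL are illustrative choices:

```java
import java.util.HashMap;
import java.util.Map;

// Null-value caching sketch: a miss is cached as a sentinel with a short TTL,
// so repeated requests for the same absent key stop reaching the database.
public class NullCacheDemo {
    private static final String NULL_SENTINEL = "__NULL__";
    private static final long TTL_MILLIS = 60_000; // short TTL limits memory use and staleness

    record Entry(String value, long expiresAt) {}

    static final Map<String, Entry> cache = new HashMap<>();    // stands in for Redis
    static final Map<String, String> database = new HashMap<>(); // stands in for the DB
    static int dbQueries = 0;

    static String get(String key) {
        long now = System.currentTimeMillis();
        Entry e = cache.get(key);
        if (e != null && e.expiresAt() > now) {
            // A cached sentinel means "known missing": answer without touching the DB.
            return NULL_SENTINEL.equals(e.value()) ? null : e.value();
        }
        dbQueries++;
        String value = database.get(key);
        // Cache the miss itself so the next lookup short-circuits.
        cache.put(key, new Entry(value != null ? value : NULL_SENTINEL, now + TTL_MILLIS));
        return value;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 100; i++) {
            get("user:missing");
        }
        System.out.println(dbQueries); // 1: only the first miss reached the DB
    }
}
```

With real Redis, the same effect is typically achieved by storing the sentinel with an expiring SET (e.g. a TTL option), so eviction is handled server-side instead of by the timestamp check above.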
2.3 Additional Techniques
Other complementary methods to mitigate cache penetration:
Data Pre-warming: Load hot data into the cache proactively to avoid cache misses for popular keys.
Access Validation: Perform request, data, permission, or blacklist checks before allowing a query to reach the cache layer.
Ma Wei Says
Follow me! Discussing software architecture and development, AIGC and AI Agents... Sometimes sharing insights on IT professionals' life experiences.
