Understanding Cache: Principles, Consistency, Penetration, and Avalanche
This article explains the evolution and core concepts of caching in backend services, covering service‑cache interaction, hit rate, consistency challenges, granularity choices, cache penetration risks, and strategies to prevent cache avalanche through high availability and traffic control.
1. Service and Cache
Caching is introduced to bridge the speed gap between fast in-memory processing and slower persistent storage (files or databases). By keeping frequently accessed data in memory, applications achieve lower latency. The hit rate is the fraction of requests whose data is found in the cache; a miss penetrates the cache and the data must be fetched from the persistent layer.
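The read path and hit-rate bookkeeping described above can be sketched as follows. This is a minimal illustration, not a production cache: the class name `ReadThroughCache`, the `load_from_store` callback, and the dictionary standing in for a database are all assumptions made for the example.

```python
from typing import Callable, Dict, Optional


class ReadThroughCache:
    """In-memory cache in front of a slower loader, with hit-rate tracking."""

    def __init__(self, load_from_store: Callable[[str], Optional[str]]) -> None:
        self._load_from_store = load_from_store  # hypothetical slow lookup
        self._data: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> Optional[str]:
        if key in self._data:
            self.hits += 1
            return self._data[key]
        # Miss: fall through to the persistent layer and populate the cache.
        self.misses += 1
        value = self._load_from_store(key)
        if value is not None:
            self._data[key] = value
        return value

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


store = {"user:1": "alice"}  # stand-in for a database
cache = ReadThroughCache(store.get)
cache.get("user:1")  # miss, loaded from the store
cache.get("user:1")  # hit
cache.get("user:1")  # hit
print(f"hit rate: {cache.hit_rate():.2f}")
```

Note that a key missing from the store is never cached in this sketch, so every request for it reaches the store again.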
When data is updated, both the persistent store and the cache must be brought in sync, which raises the question of which one to update first. This is the cache consistency problem. Updating the cache first can leave it holding a value that was never persisted if the store write fails, while updating the store first leaves a window in which the cache still serves the old value.
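One commonly used ordering, which the article does not prescribe, is to write the store first and then invalidate the cached copy rather than overwrite it. The sketch below assumes two plain dictionaries as stand-ins for the database and the cache; `update_value` and `read_value` are hypothetical helper names.

```python
from typing import Dict

database: Dict[str, str] = {"order:7": "PENDING"}  # stand-in for persistent storage
cache: Dict[str, str] = {"order:7": "PENDING"}     # stand-in for the cache


def update_value(key: str, value: str) -> None:
    # 1. Write the source of truth first; if this raises, the cache is untouched.
    database[key] = value
    # 2. Drop the cached copy instead of overwriting it.
    cache.pop(key, None)


def read_value(key: str) -> str:
    # The next read after an invalidation reloads the fresh value.
    if key not in cache:
        cache[key] = database[key]
    return cache[key]


update_value("order:7", "PAID")
print(read_value("order:7"))
```

Between steps 1 and 2 a concurrent reader can still see the old cached value, which is the window described above; this ordering shortens that window but does not remove it.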
2. Cache and Updates
Consistency requirements can be classified into three levels:
Strong Consistency
Critical data such as transaction status must be updated atomically to guarantee immediate consistency across cache and storage, often using distributed transaction mechanisms.
Weak Consistency
For less critical information, short‑term inconsistencies (seconds or minutes) are acceptable; techniques like asynchronous updates or time‑based expiration can be employed.
Eventual Consistency
The system guarantees only that, once updates stop, cache and storage converge to the same state after some period.
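The time-based expiration mentioned under weak consistency can be sketched as a small TTL cache. The class name `TTLCache` and the injectable `clock` parameter are assumptions for illustration; the clock is injectable only so the example can be exercised without sleeping.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Cache whose entries stop being served after ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def put(self, key: str, value: str) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: evict so the caller reloads from storage.
            del self._entries[key]
            return None
        return value


now = [0.0]  # fake clock, in seconds
cache = TTLCache(ttl_seconds=30, clock=lambda: now[0])
cache.put("price:42", "9.99")
now[0] = 29.0
fresh = cache.get("price:42")    # still within the TTL
now[0] = 30.0
expired = cache.get("price:42")  # TTL reached, entry evicted
```

With this approach the cache can be stale for at most the TTL, so the TTL is effectively the bound on the inconsistency window.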
3. Cache Granularity
Granularity refers to the size of cached data blocks. Choosing the right granularity depends on the application architecture and use case—for example, caching whole user objects versus individual attributes, or storing data as binary blobs, JSON strings, or key‑value pairs.
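To make the trade-off concrete, the sketch below stores the same user at two granularities: one key holding the whole object as a JSON string, and one key per attribute. The key naming scheme (`user:1`, `user:1:name`) and the helper functions are assumptions for this example.

```python
import json
from typing import Dict

cache: Dict[str, str] = {}
user = {"id": 1, "name": "alice", "email": "alice@example.com"}

# Coarse granularity: one key holds the whole object as a JSON string.
cache["user:1"] = json.dumps(user)

# Fine granularity: one key per attribute.
for field, value in user.items():
    cache[f"user:1:{field}"] = str(value)


def rename_coarse(user_id: int, new_name: str) -> None:
    # Changing one field means reading, decoding, and rewriting the whole object.
    key = f"user:{user_id}"
    obj = json.loads(cache[key])
    obj["name"] = new_name
    cache[key] = json.dumps(obj)


def rename_fine(user_id: int, new_name: str) -> None:
    # Only the affected attribute is touched.
    cache[f"user:{user_id}:name"] = new_name
```

The coarse form fetches everything in one lookup but rewrites the whole value on any change; the fine form updates cheaply but needs several lookups to rebuild the object.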
4. Harm of Cache Penetration
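Cache penetration occurs when requests miss the cache and hit the persistent layer, which can overload the backend. Misses happen either because the data is temporarily absent from the cache (lazy loading or expiration) or because the data does not exist in the store at all. A high volume of requests for non-existent keys is especially dangerous: such keys can never be cached from a successful load, so every request reaches the backend and can crash the service.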
Mitigation means intercepting these requests before they reach the store: bloom filters that reject keys known not to exist, soft-delete flags for logically removed records, and monitoring that distinguishes normal miss traffic from malicious attacks.
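The bloom filter idea can be sketched in a few lines. This is an illustrative implementation built on `hashlib`, not a tuned one: the bit-array size, the number of hashes, and the `lookup` helper are assumptions. A bloom filter can report false positives but never false negatives, so a "no" answer safely skips the store.

```python
import hashlib
from typing import Dict, Iterator, Optional


class BloomFilter:
    """Probabilistic set: 'no' is definite, 'yes' may be a false positive."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3) -> None:
        self._size = size_bits
        self._num_hashes = num_hashes
        self._bits = bytearray((size_bits + 7) // 8)

    def _positions(self, key: str) -> Iterator[int]:
        # Derive several bit positions by salting one hash function.
        for i in range(self._num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self._size

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self._bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        return all(self._bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


database: Dict[str, str] = {"user:1": "alice", "user:2": "bob"}  # stand-in store
known_keys = BloomFilter()
for existing_key in database:
    known_keys.add(existing_key)


def lookup(key: str) -> Optional[str]:
    if not known_keys.might_contain(key):
        return None  # definitely absent: the store is never queried
    return database.get(key)
```

The filter has to be updated whenever a key is inserted into the store, and a plain bloom filter cannot remove keys, which is one reason to pair it with soft-delete flags.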
5. Cache Avalanche
A cache avalanche happens when a large portion of cached entries expire at the same moment, or the cache layer itself becomes unavailable, so that traffic normally absorbed by the cache floods the storage layer and can cause a system outage.
Prevention measures include:
High Availability
Deploy master-slave replication with automatic failover (e.g., Redis Sentinel or cluster mode), read-write separation, dynamic scaling, consistent hashing, and multi-region disaster recovery, so that the loss of a single cache node does not send all traffic to storage.
Service Governance (Rate Limiting & Circuit Breaking)
Control abnormal traffic spikes and protect core services by applying rate limits, circuit breakers, and degradation strategies for both cache and storage resources.
Staggered Expiration
Distribute expiration times of cached items to avoid a sudden surge of requests to the backend.
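Staggered expiration is often implemented by adding random jitter to a base TTL. The sketch below assumes a hypothetical helper `jittered_ttl` and a 10% jitter fraction; both are illustrative choices, not values from the article.

```python
import random


def jittered_ttl(base_seconds: float, jitter_fraction: float = 0.1,
                 rng=random) -> float:
    """Return base_seconds plus or minus up to jitter_fraction of it."""
    spread = base_seconds * jitter_fraction
    return base_seconds + rng.uniform(-spread, spread)


# Entries written in the same batch get different expiration times.
rng = random.Random(0)  # seeded only to make the example repeatable
ttls = [jittered_ttl(600, 0.1, rng) for _ in range(5)]
print([round(t, 1) for t in ttls])
```

With a 600-second base and 10% jitter, expirations spread over a two-minute window (540 to 660 seconds) instead of all landing on the same second.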
By combining these techniques, systems can maintain fast response times while safeguarding the underlying data stores.
Selected Java Interview Questions
