
Common Cache Design Issues and Their Solutions in High‑Concurrency Systems

This article explains the evolution from Memcached to Redis, outlines seven classic cache problems (centralized expiration, cache penetration, avalanche, hot keys, large keys, data consistency, and concurrent pre-warming), and provides practical mitigation strategies for each scenario.


Cache Knowledge Overview

Cache design is a well-known topic: early implementations used Memcached, while modern systems generally prefer Redis. Choosing the right store for a given business scenario is straightforward, but understanding the nuances is essential.

In practice, developers often add a Redis client library and initialize a RedisTemplate bean, making the setup trivial.

If the workload is only tens or hundreds of concurrent requests, simple cache design may suffice, but for billion‑scale systems more careful planning is required.

Seven Classic Cache Problems

1. Centralized Cache Expiration

When a request misses the cache, the system queries the DB, warms the cache, and returns the data. If many keys share the same expiration time, they may all expire simultaneously, causing a sudden surge of DB traffic.

Solution: Randomize the expiration time by adding a random offset to the base TTL, so keys expire gradually instead of all at once.
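A minimal sketch of this jitter (class and method names here are illustrative, not from any library):

```java
import java.util.concurrent.ThreadLocalRandom;

public class TtlJitter {
    // Base TTL plus a random offset, so keys written in the same batch
    // expire spread out over a window instead of all at once.
    public static long ttlWithJitter(long baseSeconds, long maxJitterSeconds) {
        return baseSeconds + ThreadLocalRandom.current().nextLong(maxJitterSeconds + 1);
    }
}
```

With Spring's RedisTemplate, the jittered value would simply be passed as the timeout argument of a set call, e.g. a one-hour base TTL with up to five minutes of jitter.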

2. Cache Penetration

Requests for non‑existent keys (e.g., malicious queries for missing forum posts) miss both cache and DB, leading to repeated DB hits.

Solution:

Store a special placeholder value in the cache for missing keys, so subsequent requests hit the cache.

Use a BloomFilter to pre‑check key existence; if the filter says the key does not exist, skip the DB lookup.
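The placeholder approach can be sketched as follows, with a ConcurrentHashMap standing in for Redis and a function standing in for the DB lookup (all names are illustrative):

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class PenetrationGuard {
    private static final String NULL_PLACEHOLDER = "\u0000NULL\u0000"; // sentinel for "no such row"
    private final Map<String, String> cache = new ConcurrentHashMap<>(); // stands in for Redis
    private final Function<String, String> db; // stands in for the DB query; returns null on miss

    public PenetrationGuard(Function<String, String> db) { this.db = db; }

    public Optional<String> get(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            // A placeholder hit means the DB was already consulted and has no row.
            return NULL_PLACEHOLDER.equals(cached) ? Optional.empty() : Optional.of(cached);
        }
        String fromDb = db.apply(key);
        // Cache the miss as a placeholder so repeated bad keys never reach the DB again.
        cache.put(key, fromDb == null ? NULL_PLACEHOLDER : fromDb);
        return Optional.ofNullable(fromDb);
    }
}
```

In a real deployment the placeholder entry would also get a short TTL, so a key that is later created in the DB becomes visible without manual invalidation.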

3. Cache Avalanche

A partial loss of cache nodes can make the entire caching layer unavailable.

Distributed caches typically use consistent hashing; when nodes fail, a rehash distributes load to remaining nodes. However, a traffic spike focused on a few keys can overload those nodes, causing a cascade of failures.

Solution:

Implement real‑time monitoring and automatic failover to restore service quickly.

Deploy multiple cache replicas across different racks to reduce single‑point overload.

4. Cache Hotspot

Sudden spikes on a popular key can overload the cache node that stores it.

Solution:

Detect hot keys using real-time analytics (e.g., Spark) and split them into multiple sub-keys like key#01, key#02, … distributed across nodes.

Clients randomly select one of the sub‑keys for each request.
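A sketch of that splitting scheme (writer replicates the value under N sub-keys, readers pick one at random; the class name is illustrative):

```java
import java.util.concurrent.ThreadLocalRandom;

public class HotKeySplitter {
    // Writer side: replicate the hot value under N sub-keys, which a
    // distributed cache will hash onto different nodes.
    public static String[] subKeys(String key, int copies) {
        String[] keys = new String[copies];
        for (int i = 0; i < copies; i++) {
            keys[i] = String.format("%s#%02d", key, i + 1); // key#01, key#02, ...
        }
        return keys;
    }

    // Reader side: pick one sub-key at random, spreading reads across nodes.
    public static String pickSubKey(String key, int copies) {
        int i = ThreadLocalRandom.current().nextInt(copies) + 1;
        return String.format("%s#%02d", key, i);
    }
}
```

The trade-off is write amplification: every update to the hot key must be applied to all N replicas, so the copy count should stay small.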

5. Large Cache Keys

Storing overly large values can cause timeouts and network congestion; frequent updates to many fields also increase load.

Solution:

Set a size threshold and compress values that exceed it.

Assess the proportion of large keys and consider a store with pre-allocated memory pools (e.g., Memcached's slab allocation) to reduce fragmentation.

Split large keys into smaller ones and manage them separately.

Assign reasonable TTLs to avoid unnecessary eviction of large keys.
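The threshold-based compression can be sketched with GZIP and a one-byte marker so the reader knows whether to decompress (the format and names are illustrative, not a standard encoding):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class ValueCompressor {
    private static final byte MARKER_PLAIN = 0, MARKER_GZIP = 1;

    // Compress only when the payload exceeds the threshold; small values
    // are stored as-is to avoid pointless CPU cost.
    public static byte[] encode(String value, int thresholdBytes) throws IOException {
        byte[] raw = value.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (raw.length <= thresholdBytes) {
            out.write(MARKER_PLAIN);
            out.write(raw);
        } else {
            out.write(MARKER_GZIP);
            try (GZIPOutputStream gz = new GZIPOutputStream(out)) { gz.write(raw); }
        }
        return out.toByteArray();
    }

    public static String decode(byte[] stored) throws IOException {
        ByteArrayInputStream in = new ByteArrayInputStream(stored, 1, stored.length - 1);
        if (stored[0] == MARKER_PLAIN) return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        try (GZIPInputStream gz = new GZIPInputStream(in)) {
            return new String(gz.readAllBytes(), StandardCharsets.UTF_8);
        }
    }
}
```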

6. Cache Data Consistency

Since cache is a transient store, data exists both in the DB and the cache, raising consistency challenges, especially with replicated hot keys.

Solution:

On cache update failure, retry; if still failing, push the key to a message queue for asynchronous compensation.

Use short TTLs and self‑healing mechanisms to reload fresh data after expiration.
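The retry-then-queue path can be sketched like this, with a BiConsumer standing in for the Redis write and an in-memory queue standing in for the message queue (all names are illustrative):

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.BiConsumer;

public class CacheWriteRepair {
    private final BiConsumer<String, String> cacheWriter; // stands in for the Redis SET call
    private final Queue<String> compensationQueue = new ConcurrentLinkedQueue<>(); // stands in for an MQ

    public CacheWriteRepair(BiConsumer<String, String> cacheWriter) {
        this.cacheWriter = cacheWriter;
    }

    // Try the cache update a few times; on persistent failure, enqueue the key
    // so an async consumer can re-read the DB and repair the cache later.
    public boolean update(String key, String value, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                cacheWriter.accept(key, value);
                return true;
            } catch (RuntimeException e) {
                // transient failure: fall through and retry
            }
        }
        compensationQueue.offer(key);
        return false;
    }

    public Queue<String> pendingRepairs() { return compensationQueue; }
}
```

Note that enqueuing only the key (not the value) keeps the compensation consumer simple: it always re-reads the current DB state, so stale values cannot be replayed.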

7. Concurrent Pre‑warming (Cache Stampede)

When a cached entry expires, many concurrent requests may simultaneously query the DB, overwhelming it.

Solution:

Introduce a global lock; only the request that acquires the lock queries the DB and warms the cache, while others wait.

Create multiple cache replicas so that if one expires, others can still serve requests.
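Within a single process, the lock-and-load pattern can be sketched with ConcurrentHashMap.computeIfAbsent, which runs the loader at most once per absent key while concurrent callers block; in a distributed setup the equivalent would be a Redis lock (e.g., SET with the NX option and an expiry). Names below are illustrative:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class StampedeGuard {
    private final Map<String, String> cache = new ConcurrentHashMap<>(); // stands in for Redis
    private final Function<String, String> db; // stands in for the DB query

    public StampedeGuard(Function<String, String> db) { this.db = db; }

    // computeIfAbsent gives per-key mutual exclusion: only the first caller
    // for an absent key runs the DB loader and warms the cache; the rest
    // wait for that result instead of stampeding the DB.
    public String get(String key) {
        return cache.computeIfAbsent(key, db);
    }
}
```

A production version would also need expiry handling (this sketch never evicts), but the single-flight property is the point: one DB load per missing key, no matter how many concurrent readers arrive.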

Conclusion

Effective cache design involves many techniques, but the core principle remains: maximize cache hits while ensuring data consistency.

Recommended reading:

MySQL Index Principles

Deep Dive into Java Concurrency (AQS)

ThreadLocal Memory Leak Demo and Analysis

Written by

Full-Stack Internet Architecture

Introducing full-stack Internet architecture technologies centered on Java
