
Cache System Overview, Architecture Evolution, Pain Points, and Best Practices

This article explains the fundamentals of cache systems, describes the evolution from no‑cache to distributed and local caches, analyzes common challenges such as consistency, hot‑key detection, and cache avalanche, and provides practical guidelines and real‑world lessons for designing effective backend caching solutions.


1. Cache System Overview

As shown in the diagram, a basic network request flows from the client (browser or app) through the network, application service, and finally to storage (database or file system) before returning the result to the client.

With the rapid growth of internet traffic, applications face increasing concurrency and computational load, while server and database resources remain limited. Introducing a cache breaks the standard flow, allowing data to be retrieved directly from the cache at various stages, reducing computation and improving response speed.

Caching can be applied at any of the four stages shown, with slightly different strategies for each. This article focuses on stages 3 and 4.

2. Cache Architecture Evolution

2.1. No‑Cache Architecture

Without a cache, every request goes straight through to the database, and the main bottleneck is the database's disk I/O.

2.2. Introducing Distributed Cache (e.g., Redis)

Adding a cache database such as Redis bypasses the database's disk I/O for cache hits, allowing fast data returns. The cache shields the database from most read traffic and improves query speed.

2.2.1 Why Choose Redis?

Pure in‑memory operations, no disk I/O latency.

Key‑value store with O(1) access time, faster than a typical database index's O(log n) lookup.

An I/O‑multiplexing event loop that stays non‑blocking during network I/O.

The new bottleneck becomes the network communication between the application service and the cache database.

2.3. Introducing JVM Local Cache

To solve the network latency of the distributed cache, a local cache can be added at the JVM level, creating a multi‑level cache hierarchy (local → Redis → DB).

Local cache returns data directly within the application, avoiding further Redis calls, but it increases memory usage and consistency complexity.

Requires redundant caches on multiple JVM machines, raising memory demands.

Maintaining data consistency across JVM instances is costly.

Consider the following three factors before adopting local cache:

Business QPS.

Available memory resources.

Frequency of data changes.

2.3.1 Data Retrieval Flow

Data is read in order of priority: local cache, Redis, then DB, achieving a three‑level cache architecture.
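The three-level read order can be sketched as below. This is a minimal illustration, not a production implementation: the Redis tier is simulated with an in-process map (a real setup would use a client such as Jedis or Lettuce), and `dbLoader` stands in for a database query.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Sketch of the local -> Redis -> DB read priority, with backfill of the
// faster tiers on a miss. The "redisStub" map simulates the distributed cache.
class MultiLevelCache {
    private final Map<String, String> localCache = new ConcurrentHashMap<>();
    private final Map<String, String> redisStub = new ConcurrentHashMap<>();
    private final Function<String, String> dbLoader; // hypothetical DB lookup

    MultiLevelCache(Function<String, String> dbLoader) {
        this.dbLoader = dbLoader;
    }

    String get(String key) {
        String value = localCache.get(key);          // 1) JVM local cache
        if (value != null) return value;
        value = redisStub.get(key);                  // 2) distributed cache
        if (value != null) {
            localCache.put(key, value);              // backfill the faster tier
            return value;
        }
        value = dbLoader.apply(key);                 // 3) database
        if (value != null) {
            redisStub.put(key, value);               // populate both cache tiers
            localCache.put(key, value);
        }
        return value;
    }
}
```

In practice the local tier would be a bounded cache with eviction (e.g. Caffeine) rather than an unbounded map, for the memory reasons discussed above.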

3. Pain Points and Optimizations

3.1 Data Consistency Issues

Multi‑level caches scatter copies of the same data across tiers, making consistency challenging, especially for local caches spread over multiple JVM instances. One asynchronous solution uses Canal plus broadcast messages:

DB updates data.

Canal listens to DB changes and triggers cache updates.

Redis cache is synchronized directly.

Local caches are synchronized via broadcast MQ.
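The four steps above can be sketched as a single change handler. This is a hypothetical skeleton: the Canal subscription and the broadcast MQ are out of scope, so MQ delivery is simulated by iterating a listener list, and the Redis tier is again a plain map.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the Canal-driven sync flow: a DB change event updates the Redis
// tier directly and broadcasts an eviction so every JVM instance drops its
// stale local copy. Each entry in localCaches stands for one JVM's cache.
class CacheSyncHandler {
    static final Map<String, String> redisStub = new ConcurrentHashMap<>();
    static final List<Map<String, String>> localCaches = new ArrayList<>();

    // Register a new per-JVM local cache (simulating an MQ subscriber).
    static Map<String, String> newLocalCache() {
        Map<String, String> m = new ConcurrentHashMap<>();
        localCaches.add(m);
        return m;
    }

    // Called when Canal reports a row change for the given key.
    static void onDbChange(String key, String newValue) {
        redisStub.put(key, newValue);      // 1) synchronize Redis directly
        for (Map<String, String> local : localCaches) {
            local.remove(key);             // 2) broadcast eviction via "MQ"
        }
    }
}
```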

3.1.1 Cache Synchronization Strategies

Delete the cache entry directly; subsequent queries reload the data (simple, but the first query after deletion is slow).

Pre‑load cache to keep it warm (higher query efficiency).

3.2 Hot‑Key Monitoring

Passive caching only loads keys after they are accessed, which can cause cache penetration under high concurrency. An effective system should proactively detect hot keys and pre‑warm them.

3.2.1 Hot‑Key Detection

Introduce a middle‑service for hot‑key detection.

Business instances report key access statistics to the detection service.

The service identifies hot keys and notifies instances.

Hot keys are pre‑warmed; non‑hot keys are released to save memory.
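The reporting side of such a detection service can be sketched as a simple access counter with a promotion threshold. This is an assumption-laden miniature: window rotation, the RPC between instances and the middle service, and the pre-warm/release actions are all omitted; the class and threshold are illustrative, not from the original system.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.stream.Collectors;

// Sketch of hot-key detection: instances report key accesses, and any key
// whose count in the current window reaches the threshold is flagged as hot.
class HotKeyDetector {
    private final Map<String, LongAdder> counts = new ConcurrentHashMap<>();
    private final long threshold;

    HotKeyDetector(long threshold) { this.threshold = threshold; }

    // Business instances call this on every key access.
    void report(String key) {
        counts.computeIfAbsent(key, k -> new LongAdder()).increment();
    }

    // Keys the service would notify instances to pre-warm.
    Set<String> hotKeys() {
        return counts.entrySet().stream()
                .filter(e -> e.getValue().sum() >= threshold)
                .map(Map.Entry::getKey)
                .collect(Collectors.toSet());
    }
}
```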

4. Cache Considerations

4.1 Key Design

Keep keys short to reduce memory usage.

Ensure high hit rate; low hit rate diminishes cache value.

4.1.1 Value Design

Keep values small; avoid big keys that block Redis (single‑threaded).

Cache only necessary fields and trim unnecessary data.

Prefer read‑heavy, write‑light data for caching.

4.1.2 Cache Penetration

Requests for non‑existent keys repeatedly hit the DB.

Cache empty values or default objects.

Use Bloom filters.
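The empty-value strategy can be sketched as follows, assuming a map-backed cache and a `dbLoader` stand-in for the real query; the sentinel constant is an illustrative choice. A Bloom filter placed in front of the cache is the alternative when the key space is too large to cache misses individually.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Sketch of null caching: a miss on a non-existent key stores a sentinel so
// repeated requests for that key stop reaching the DB.
class NullCachingLoader {
    private static final String NULL_SENTINEL = "\u0000NULL";
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> dbLoader;
    int dbHits = 0; // exposed only to demonstrate the DB is queried once

    NullCachingLoader(Function<String, String> dbLoader) {
        this.dbLoader = dbLoader;
    }

    String get(String key) {
        String v = cache.get(key);
        if (v != null) return NULL_SENTINEL.equals(v) ? null : v;
        dbHits++;
        v = dbLoader.apply(key);
        cache.put(key, v == null ? NULL_SENTINEL : v); // cache the emptiness too
        return v;
    }
}
```

In a real deployment the sentinel entry should carry a short TTL so that a key created later becomes visible once the sentinel expires.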

4.1.3 Cache Breakdown

When a hot key expires suddenly, massive traffic may flood the DB.

Apply rate limiting or circuit breaking (e.g., Hystrix protection).

Use mutex locks so only one thread accesses the DB and repopulates the cache.
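The mutex approach can be sketched with a per-key lock, assuming the same map-plus-loader miniature as before: only the thread that wins the lock queries the DB, while the others re-check the cache after acquiring it.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Sketch of breakdown protection: when a hot key has expired, one thread
// reloads it from the DB under a per-key mutex instead of a thread stampede.
class BreakdownGuard {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Map<String, Object> locks = new ConcurrentHashMap<>();
    private final Function<String, String> dbLoader;
    int dbHits = 0;

    BreakdownGuard(Function<String, String> dbLoader) {
        this.dbLoader = dbLoader;
    }

    String get(String key) {
        String v = cache.get(key);
        if (v != null) return v;
        Object lock = locks.computeIfAbsent(key, k -> new Object());
        synchronized (lock) {
            v = cache.get(key);       // double-check: another thread may have
            if (v != null) return v;  // repopulated the cache while we waited
            dbHits++;
            v = dbLoader.apply(key);
            if (v != null) cache.put(key, v);
            return v;
        }
    }
}
```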

4.1.4 Cache Avalanche

Simultaneous expiration of many keys overwhelms the DB.

Assign different expiration times or add random jitter.
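The jitter idea is a one-liner: instead of one fixed TTL for a whole batch of keys, add a random offset so they do not all expire in the same instant. The numbers below are illustrative, not recommendations.

```java
import java.util.concurrent.ThreadLocalRandom;

// Sketch of expiration jitter for avalanche avoidance.
class TtlJitter {
    // baseSeconds is the nominal TTL; jitterSeconds bounds the random spread.
    static long withJitter(long baseSeconds, long jitterSeconds) {
        return baseSeconds + ThreadLocalRandom.current().nextLong(jitterSeconds + 1);
    }
}
```

For example, `withJitter(3600, 300)` spreads a batch's expirations across a five-minute window instead of a single second.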

5. Practical Experience

Estimate cache size to avoid exhausting Redis clusters or JVM memory.

Assess expected QPS; high Redis access may increase CPU due to serialization.

Avoid big keys; they degrade Redis throughput.

Always set expiration times.

6. Pitfalls

6.1 Local Cache Pollution

Modifying an object retrieved from the local cache mutates the shared instance, corrupting the cached data for every subsequent reader.

Copy objects when retrieving (CPU‑intensive, not recommended).

Make cached objects immutable (recommended).
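The recommended approach can be sketched with a hypothetical `CachedUser` value type: with no setters, a caller that wants a modified variant gets a fresh copy, and the instance held by the cache is never touched.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the immutable-object approach to local cache pollution: cached
// values expose no setters, so callers cannot mutate the shared copy.
final class CachedUser {
    private final String name;
    CachedUser(String name) { this.name = name; }
    String name() { return name; }
    // "Mutation" yields a new object; the cached instance is untouched.
    CachedUser withName(String newName) { return new CachedUser(newName); }
}

class UserCache {
    private final Map<String, CachedUser> cache = new ConcurrentHashMap<>();
    void put(String id, CachedUser u) { cache.put(id, u); }
    CachedUser get(String id) { return cache.get(id); }
}
```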

6.2 Caching Response Objects Instead of Computation Results

Caching a failed response leads to subsequent cache hits returning failures.

Cache the actual computation result rather than the response wrapper.
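The pitfall and the fix can be shown side by side with a hypothetical `Response` wrapper: caching the wrapper would replay its failure flag on every hit, whereas caching only the successful payload never does.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical response wrapper carrying a success flag plus payload.
class Response {
    final boolean success;
    final String data;
    Response(boolean success, String data) { this.success = success; this.data = data; }
}

// Sketch of the fix: store the computation result (the payload), never the
// wrapper, and only when the computation actually succeeded.
class ResultCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    Response get(String key, Function<String, Response> compute) {
        String cached = cache.get(key);
        if (cached != null) return new Response(true, cached);
        Response r = compute.apply(key);
        if (r.success) cache.put(key, r.data); // failures are never cached
        return r;
    }
}
```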

6.3 JVM Memory Spike and Frequent Full GC

Introducing extensive local cache can cause high JVM memory usage and frequent full GC.

Evaluate memory impact before adding local cache.

Cache only hot keys.

6.4 Redis Cache Causing High CPU

When local cache is removed and traffic falls back to Redis under high QPS, serialization and young‑GC pressure increase CPU load.

Assess QPS before downgrading.

Cache only hot keys.

7. Summary

In computing, caching is ubiquitous and fundamentally trades space for time to accelerate data retrieval.

There is no universally optimal cache architecture; the best design aligns with business needs, considering cache hit rate, database pressure, data consistency, and overall system throughput.

Written by

政采云技术

ZCY Technology Team (Zero), based in Hangzhou, is a growth-oriented team passionate about technology and craftsmanship. With around 500 members, we are building comprehensive engineering, project management, and talent development systems. We are committed to innovation and creating a cloud service ecosystem for government and enterprise procurement. We look forward to your joining us.
