How to Solve Cache‑Database Consistency Issues in High‑Concurrency Systems
This article examines common cache‑database consistency problems, explains why naive double‑write approaches fail, introduces the Cache‑Aside pattern, and proposes a queue‑based serialization solution with lazy cache updates to maintain data integrity under high‑traffic, concurrent read‑write workloads.
When data is written to both the cache and the database (a double write), the two can drift out of sync. How can this consistency problem be resolved?
Interview Question Analysis
If the system can tolerate occasional cache-database inconsistency, do not enforce strict read-write serialization. Serialization does guarantee consistency, but it funnels requests through an in-memory queue and drastically reduces throughput.
Cache Aside Pattern
The classic cache‑database read/write model is the Cache Aside Pattern.
Read: first check the cache; if missing, read from the database, store the result in the cache, and return the response.
Write: update the database first, then delete the cache.
Cache deletion is preferred over cache update because in complex scenarios the cached value may be derived from multiple tables, and updating the cache can be expensive.
For example, if a field in one table changes, the corresponding cache may require data from two other tables and computation; updating such a cache on every change would be costly, especially when the cache is rarely accessed.
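The read and write paths described above can be sketched as follows. This is a minimal illustration, not a production implementation: the class name CacheAsideStore and the two in-memory maps standing in for a real cache and a real database are assumptions made for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal Cache Aside sketch. Both maps are in-memory stand-ins
// (assumptions) for a real cache and a real database.
class CacheAsideStore {
    final Map<String, String> cache = new ConcurrentHashMap<>();
    final Map<String, String> database = new ConcurrentHashMap<>();

    // Read: check the cache first; on a miss, load from the database
    // and store the result in the cache before returning it.
    String read(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        String fromDb = database.get(key);
        if (fromDb != null) {
            cache.put(key, fromDb);
        }
        return fromDb;
    }

    // Write: update the database, then delete (not update) the cache entry.
    // The next read recomputes the cached value only if someone asks for it.
    void write(String key, String value) {
        database.put(key, value);
        cache.remove(key);
    }
}
```

Deleting the entry on write keeps the expensive recomputation lazy: it happens on the next read of that key, and never happens for keys nobody reads.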
Basic Cache Inconsistency Issues and Solutions
Problem: with the order "update the database, then delete the cache", a failed cache deletion leaves new data in the database and stale data in the cache.
Solution: delete the cache first, then update the database. If the database update fails, the database still holds the old data and the cache is empty, so the two do not disagree; the next read fetches the old data from the database and repopulates the cache.
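A small sketch of the delete-first ordering, including the failure case. The class DeleteFirstStore, the in-memory maps, and the failNextDbWrite switch are all hypothetical and exist only to simulate a failed database update.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Delete-first write ordering with a simulated database failure.
// The maps and the fault-injection flag are illustrative assumptions.
class DeleteFirstStore {
    final Map<String, String> cache = new ConcurrentHashMap<>();
    final Map<String, String> database = new ConcurrentHashMap<>();
    boolean failNextDbWrite = false; // hypothetical fault-injection switch

    // Write: delete the cache entry first, then update the database.
    void write(String key, String value) {
        cache.remove(key);
        if (failNextDbWrite) {
            failNextDbWrite = false;
            throw new IllegalStateException("simulated database failure");
        }
        database.put(key, value);
    }

    // Read: cache first; on a miss, load from the database and repopulate.
    String read(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        String fromDb = database.get(key);
        if (fromDb != null) {
            cache.put(key, fromDb);
        }
        return fromDb;
    }
}
```

If write throws after the cache entry has been removed, the next read simply reloads the unchanged old value, so cache and database still agree.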
Analysis of Complex Data Inconsistency
A subtler race remains. A write deletes the cache entry but has not yet committed its database update. A concurrent read then misses the cache, reads the old value from the database, and stores it in the cache. Once the database update commits, the database holds the new value while the cache holds the old one.
This scenario appears under high‑traffic, concurrent read‑write workloads where many requests access the same data.
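The interleaving can be replayed step by step on a single thread to make the outcome concrete. This is only an illustration: the key "stock", the values, and the class name StaleReadRace are invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Replays the problematic interleaving deterministically on one thread.
// Key name and values are illustrative assumptions.
class StaleReadRace {
    // Returns { database value, cache value } after the interleaving.
    static String[] replay() {
        Map<String, String> database = new HashMap<>();
        Map<String, String> cache = new HashMap<>();
        database.put("stock", "10");
        cache.put("stock", "10");

        // Writer, step 1: delete the cache entry.
        cache.remove("stock");

        // Reader: misses the cache, reads the old value, repopulates the cache.
        String seen = cache.get("stock");
        if (seen == null) {
            seen = database.get("stock");
            cache.put("stock", seen);
        }

        // Writer, step 2: the database update finally commits.
        database.put("stock", "9");

        return new String[] { database.get("stock"), cache.get("stock") };
    }
}
```

After replay() the database holds "9" and the cache holds "10", and nothing corrects the cache until the entry is deleted or expires.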
Solution Overview
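Route every update operation to an internal JVM queue chosen by the data's unique key. A read that misses the cache enqueues a "read from the database, then refresh the cache" operation under the same key, so it lands in the same queue.
Each queue is serviced by a single worker thread that processes operations sequentially. For an update it first deletes the cache, then updates the database. A read that arrives while the update is still pending waits in the queue behind it, and its cache refresh runs only after the update has completed.
Duplicate cache-refresh requests can be filtered: if a refresh for the same key is already waiting in the queue, a later read does not need to enqueue another one and can simply wait for the pending refresh to finish.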
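One way to sketch the queue-based approach is with an array of single-thread executors, each acting as one queue plus its worker. This is a sketch under stated assumptions: SerializedCache and its method names are hypothetical, the maps stand in for a real cache and database, and the key is routed by a simple hash.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Per-key serialization sketch. Names are hypothetical; the maps are
// in-memory stand-ins for a real cache and database.
class SerializedCache {
    final Map<String, String> cache = new ConcurrentHashMap<>();
    final Map<String, String> database = new ConcurrentHashMap<>();
    // Refreshes that are queued or running, kept so duplicates can be filtered.
    private final Map<String, CompletableFuture<String>> pendingRefresh =
            new ConcurrentHashMap<>();
    private final ExecutorService[] workers;

    SerializedCache(int queueCount) {
        workers = new ExecutorService[queueCount];
        for (int i = 0; i < queueCount; i++) {
            // One thread per queue, so its operations run strictly in order.
            workers[i] = Executors.newSingleThreadExecutor(r -> {
                Thread t = new Thread(r, "cache-queue-worker");
                t.setDaemon(true);
                return t;
            });
        }
    }

    // The same key always maps to the same queue.
    private ExecutorService workerFor(String key) {
        return workers[Math.floorMod(key.hashCode(), workers.length)];
    }

    // Update: delete the cache entry, then update the database, on the worker.
    CompletableFuture<Void> update(String key, String value) {
        return CompletableFuture.runAsync(() -> {
            cache.remove(key);
            database.put(key, value);
        }, workerFor(key));
    }

    // Read: serve from the cache, or queue a refresh behind the operations
    // already waiting in this key's queue.
    CompletableFuture<String> read(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return CompletableFuture.completedFuture(cached);
        }
        CompletableFuture<String> fresh = new CompletableFuture<>();
        CompletableFuture<String> existing = pendingRefresh.putIfAbsent(key, fresh);
        if (existing != null) {
            return existing; // a refresh for this key is already queued; share it
        }
        workerFor(key).execute(() -> {
            try {
                String value = database.get(key);
                if (value != null) {
                    cache.put(key, value);
                }
                fresh.complete(value);
            } catch (RuntimeException e) {
                fresh.completeExceptionally(e);
            } finally {
                pendingRefresh.remove(key, fresh);
            }
        });
        return fresh;
    }
}
```

Because a refresh and an update for the same key run on the same thread, a refresh can never read the database in the middle of an update. One trade-off of sharing a pending refresh in this sketch: a read that joins an already queued refresh may receive the value from before a later update, although that update still deletes the cache entry when it runs.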
High‑Concurrency Considerations
1. Read request blocking – Reads that miss the cache now wait asynchronously on the queue, so each must still complete within its timeout. If updates arrive too frequently, the queue backs up, waiting reads time out, and they fall back to reading the database directly.
2. Read request overload – Sudden spikes of reads may cause many requests to wait; capacity planning and load testing are needed to ensure acceptable latency.
3. Request routing across multiple service instances – Updates and corresponding cache refreshes must be routed to the same instance (e.g., via Nginx hash routing) to maintain ordering.
4. Hot‑item routing skew – If a hot item’s read/write traffic concentrates on a single instance, that instance may become a bottleneck; distributing load or sharding keys can mitigate the issue.
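Points 1 and 3 above can be illustrated with two small helpers. This is a hedged sketch: ReadPathHelpers, awaitOrFallback, and instanceFor are hypothetical names, the database read is passed in as a supplier, and the modulo hash only stands in for whatever hash routing the load balancer (for example Nginx) actually performs.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical helpers for the read timeout and the routing concern.
class ReadPathHelpers {
    // Point 1: wait a bounded time for the queued cache refresh; if it does
    // not finish in time (or fails), read the database directly instead.
    // The fallback value is returned to the caller but not written to the cache.
    static String awaitOrFallback(CompletableFuture<String> queuedRefresh,
                                  long timeoutMillis,
                                  Supplier<String> databaseRead) {
        try {
            return queuedRefresh.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException e) {
            return databaseRead.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return databaseRead.get();
        }
    }

    // Point 3: a stable hash sends every request for the same key to the
    // same instance index, so its updates and refreshes share one queue.
    static int instanceFor(String key, int instanceCount) {
        return Math.floorMod(key.hashCode(), instanceCount);
    }
}
```

The timeout value is workload-specific and should come from the load testing mentioned in point 2. Point 4 still applies to this kind of routing: a stable hash sends all traffic for a hot key to one instance.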
Programmer DD
A tinkering programmer and author of "Spring Cloud Microservices in Action"
