Why Direct DB‑Cache Updates Fail and How Cache‑Aside Solves Consistency Issues
The article explains the pitfalls of updating a database and a cache as two separate steps, illustrates the race conditions that leave them inconsistent, and introduces the Cache Aside strategy as a safer approach: on a write, the cached entry is deleted rather than updated, and the next read repopulates it from the database.
Scenario Description
Suppose a piece of data exists both in a database and a cache. When you need to update this data, should you update the database first or the cache first? Both approaches have problems.
(1) Update the database first, then the cache
This can cause data inconsistency.
Request A updates the database to 123, but network latency delays its cache update. Meanwhile, request B updates the database to 456 and immediately updates the cache to 456. When A's cache update finally arrives, it sets the cache back to 123, leaving the database at 456 and the cache at the stale value 123.
Because the database update and cache update are not atomic, concurrent operations can interleave and cause inconsistency.
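The interleaving above can be replayed deterministically. The following is a minimal sketch, assuming plain Python dicts stand in for the database and the cache; the key name `k` and the explicit step list are illustrative assumptions, not a real client API.

```python
# Hypothetical in-memory stand-ins for a database and a cache.
db = {}
cache = {}

# Each request performs two non-atomic steps: write the DB, then write the cache.
# The list replays the interleaving from the scenario:
# A's cache write is delayed until after B has finished both of its steps.
steps = [
    ("A", "db", 123),
    ("B", "db", 456),
    ("B", "cache", 456),
    ("A", "cache", 123),  # arrives late and overwrites the newer value
]

for request, target, value in steps:
    store = db if target == "db" else cache
    store["k"] = value

print(db["k"], cache["k"])  # database holds 456, cache holds the stale 123
```

Nothing in either request is wrong on its own; the stale value comes purely from the order in which the four steps land.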
(2) Update the cache first, then the database
This can also lead to inconsistency.
The cache update succeeds, holding the latest data, but the subsequent database update fails and rolls back, leaving the database with old data while the cache shows the new value.
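The cache-first failure can be sketched the same way. This is a minimal illustration, assuming dicts as stand-ins for both stores and a hypothetical `failing_db_write` helper that simulates a transaction that fails and rolls back.

```python
class DatabaseError(Exception):
    """Hypothetical error raised when the database write fails."""


db = {"k": "old"}
cache = {"k": "old"}


def failing_db_write(key, value):
    # Simulates a transaction that fails and rolls back: db is left unchanged.
    raise DatabaseError("write failed, transaction rolled back")


def update_cache_then_db(key, value):
    cache[key] = value            # step 1 succeeds
    failing_db_write(key, value)  # step 2 fails; nothing undoes step 1


try:
    update_cache_then_db("k", "new")
except DatabaseError:
    pass

print(db["k"], cache["k"])  # database still holds "old", cache shows "new"
```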
Both cases suffer from non‑atomic operations.
Cache Aside Strategy
When updating data, instead of updating the cache, we can delete the cached entry. On the next read, if the entry is missing from the cache, we fetch the data from the database and repopulate the cache. This is known as the Cache Aside (or cache‑bypass) strategy.
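Read strategy steps
1. Read from the cache; on a hit, return the cached value.
2. On a miss, read the value from the database.
3. Write that value into the cache and return it.
Write strategy steps
1. Update the database.
2. Delete the corresponding cache entry.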
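The read and write paths of Cache Aside can be sketched as follows. This is a minimal sketch, assuming dict-like objects stand in for the database and the cache; the `CacheAside` class and its method names are hypothetical and do not correspond to any particular library.

```python
class CacheAside:
    """Sketch of Cache Aside over two dict-like stores (hypothetical stand-ins)."""

    def __init__(self, db, cache):
        self.db = db
        self.cache = cache

    def read(self, key):
        # 1. Try the cache first.
        if key in self.cache:
            return self.cache[key]
        # 2. On a miss, load from the database.
        value = self.db.get(key)
        # 3. Repopulate the cache so later reads hit it.
        if value is not None:
            self.cache[key] = value
        return value

    def write(self, key, value):
        # 1. Update the database first.
        self.db[key] = value
        # 2. Delete (do not update) the cached entry.
        self.cache.pop(key, None)


store = CacheAside(db={"k": 123}, cache={})
first = store.read("k")    # miss: loads 123 and fills the cache
store.write("k", 456)      # updates the DB and drops the cached 123
second = store.read("k")   # miss again: loads 456 from the DB
```

Because the write path never puts a value into the cache, a delayed writer cannot overwrite a newer cached value with an older one, which is the failure seen when the cache is updated directly.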
Do not delete the cache entry before the database update completes. A concurrent read that misses the cache in that window loads the old value from the database and writes it back into the cache, where it stays stale after the update finishes.
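The delete-first race can be replayed step by step. The sketch below assumes dicts as stand-ins for both stores; the writer W and reader R are hypothetical requests whose steps are written out in the problematic order.

```python
db = {"k": "old"}
cache = {"k": "old"}

# Writer W deletes the cache entry first, then updates the database.
# Reader R runs between those two steps.
cache.pop("k", None)          # W: delete the cache entry
value_seen_by_r = db["k"]     # R: cache miss, reads the DB (still "old")
cache["k"] = value_seen_by_r  # R: repopulates the cache with "old"
db["k"] = "new"               # W: database update completes

print(db["k"], cache["k"])    # database holds "new", cache holds stale "old"
```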
Note: The Cache Aside strategy does not guarantee absolute data consistency; it only makes inconsistency much less likely. Where strict consistency is mandatory, a distributed lock can be added so that reads and writes of the same data do not interleave.
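The locking idea can be sketched in a single process. This is only an illustration under stated assumptions: `threading.Lock` stands in for a distributed lock, dicts stand in for the database and the cache, and the `read`, `write`, `writer`, and `reader` functions are hypothetical. A real deployment would need a lock service shared by all application instances.

```python
import threading

db = {"k": 0}
cache = {}
lock = threading.Lock()  # stand-in for a distributed lock (assumption)


def read(key):
    # The miss-and-fill sequence runs under the lock, so no write can
    # slip in between reading the DB and filling the cache.
    with lock:
        if key in cache:
            return cache[key]
        value = db[key]
        cache[key] = value
        return value


def write(key, value):
    # The DB update and cache delete run as one critical section.
    with lock:
        db[key] = value
        cache.pop(key, None)


def writer(n):
    for i in range(200):
        write("k", n * 1000 + i)


def reader():
    for _ in range(200):
        read("k")


threads = [threading.Thread(target=writer, args=(n,)) for n in range(4)]
threads += [threading.Thread(target=reader) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The cache is either empty for the key or agrees with the database.
consistent = ("k" not in cache) or (cache["k"] == db["k"])
```

The trade-off is throughput: every read and write of the key now waits on the same lock, which is why this is reserved for cases where strict consistency is mandatory.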
Java High-Performance Architecture