How to Ensure Cache‑Database Consistency: Strategies and Best Practices
This article explains why introducing a cache improves read performance as traffic grows, compares different cache‑database consistency approaches, analyzes their pros and cons in concurrent scenarios, and recommends using the "update database then delete cache" pattern with message queues or change‑log subscriptions to maintain data integrity.
Introducing Cache to Improve Performance
If your business is at an early stage with very little traffic, both read and write requests can go directly to the database. As traffic grows, reading from the database on every request becomes a performance bottleneck, so you place a cache in front of the database to serve reads.
How to Increase Cache Utilization?
To maximize cache utilization, we usually keep only the most recently accessed "hot data" in the cache.
Write requests still write only to the database.
Read requests check the cache first; on a cache miss, they read from the database and rebuild the cache entry.
All data written to the cache is given an expiration time.
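The three rules above can be sketched as follows. This is a minimal illustration, not a production implementation: `TTLCache` is a hypothetical in-memory stand-in for a real cache server, `db` is a plain dictionary standing in for the database, and the 60-second default expiration is an arbitrary assumption.

```python
import time


class TTLCache:
    """Hypothetical in-memory stand-in for a cache server."""

    def __init__(self):
        self._store = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired entries count as a miss
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, time.monotonic() + ttl_seconds)


db = {"user:1": "alice"}  # stand-in for the database
cache = TTLCache()


def read(key, ttl_seconds=60):
    value = cache.get(key)
    if value is not None:
        return value                        # cache hit
    value = db.get(key)                     # cache miss: read the database
    if value is not None:
        cache.set(key, value, ttl_seconds)  # rebuild the cache with a TTL
    return value


def write(key, value):
    db[key] = value                         # writes go to the database only
```

Note that after `write("user:1", "bob")`, a call to `read("user:1")` can keep returning the old cached value until the entry expires, which is the consistency problem the rest of the article addresses.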
How to Ensure Cache‑Database Consistency?
There are four common ways to update cache and database:
Update cache first, then update database.
Update database first, then update cache.
Delete cache first, then update database.
Update database first, then delete cache.
Update Cache, Then Update DB
This approach is generally not considered because if the cache update succeeds but the database update fails, the cache and database become completely inconsistent and the problem is hard to detect.
Update DB, Then Update Cache
If the database update succeeds but the cache update fails, the database holds the latest value while the cache still contains the old value; only after the cache expires can the correct value be read from the database. This scenario is also usually avoided.
Concurrency‑Induced Consistency Issues
When two concurrent requests A and B update the same data, their steps can interleave in an unexpected order. For example: thread A updates the database, thread B updates the database, thread B updates the cache, and then thread A updates the cache. The database ends up holding B's value while the cache holds A's older value.
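The interleaving can be replayed deterministically. The sketch below uses plain dictionaries as stand-ins for the database and the cache, and the values 1 and 2 are assumed purely for illustration.

```python
db = {"x": 0}
cache = {"x": 0}

# Thread A wants to write 1, thread B wants to write 2.
steps = [
    ("A", "db", 1),
    ("B", "db", 2),
    ("B", "cache", 2),
    ("A", "cache", 1),  # A's cache update arrives last
]

for thread, target, value in steps:
    if target == "db":
        db["x"] = value
    else:
        cache["x"] = value

print(db["x"], cache["x"])  # prints "2 1": the cache holds A's older value
```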
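The "delete cache, then update DB" approach has a similar problem when a read runs concurrently with a write. Thread A deletes the cache; thread B reads the cache, misses, and reads the old value from the database; thread A then writes the new value to the database; finally thread B writes the old value back to the cache. The cache now holds stale data until it expires.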
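One interleaving that produces this inconsistency can be replayed step by step. As before, this is only a sketch: the dictionaries stand in for the database and the cache, and the values are assumed for illustration.

```python
db = {"x": 1}
cache = {"x": 1}

cache.pop("x", None)        # 1. A (writer) deletes the cache entry
b_seen = cache.get("x")     # 2. B (reader) checks the cache and misses
if b_seen is None:
    b_seen = db["x"]        # 3. B reads the old value 1 from the database
db["x"] = 2                 # 4. A writes the new value 2 to the database
cache["x"] = b_seen         # 5. B rebuilds the cache with the old value

print(db["x"], cache["x"])  # prints "2 1": the cache is stale
```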
Delayed Double Delete Strategy
The simplest solution is the delayed double‑delete strategy: (1) delete the cache, (2) write to the database, (3) sleep for a short period (e.g., 1 second), then delete the cache again to remove any stale data that may have been written during the interval.
The delay should be longer than the replication lag in a master‑slave setup and longer than the time it takes for a read‑then‑write‑cache operation.
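A minimal sketch of the delayed double delete follows. It swaps the blocking sleep described above for a `threading.Timer`, so the writer is not held up during the delay; that substitution, the function name `delayed_double_delete`, and the dictionary stand-ins for the cache and database are all assumptions made for illustration.

```python
import threading

db = {}
cache = {}


def delayed_double_delete(key, value, delay_seconds=1.0):
    cache.pop(key, None)  # (1) delete the cache entry
    db[key] = value       # (2) write to the database
    # (3) delete the entry again after the delay, removing any stale
    #     value that a concurrent reader wrote back in the meantime
    timer = threading.Timer(delay_seconds, cache.pop, args=(key, None))
    timer.daemon = True
    timer.start()
    return timer
```

Returning the timer lets a caller wait for the second delete with `timer.join()` when that is useful, for example in tests. If the second delete itself fails, the stale entry survives until it expires, so this approach reduces the window of inconsistency rather than eliminating it.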
When Using Master‑Slave Replication
In a MySQL read‑write split architecture, the master‑slave delay can also cause inconsistency. A typical sequence is: thread A updates the master, deletes the cache; thread B reads the cache (miss), reads the stale slave value, writes it back to the cache, resulting in an outdated cache entry.
To mitigate this, either use the delayed double‑delete with a delay that exceeds the replication lag, or force cache‑filling queries to read from the master.
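The second option, reading from the master when filling the cache, might look like the sketch below. The `master` and `replica` dictionaries are stand-ins for the two database nodes, with the replica deliberately lagging; the `fill_from_master` flag is a hypothetical parameter added only to show both behaviors side by side.

```python
master = {"x": 2}
replica = {"x": 1}  # lagging replica: has not yet applied the latest write
cache = {}


def read(key, fill_from_master=True):
    if key in cache:
        return cache[key]
    # On a miss, choose which node supplies the value used to fill the cache.
    source = master if fill_from_master else replica
    value = source.get(key)
    if value is not None:
        cache[key] = value
    return value
```

The trade-off is that every cache miss now adds read load to the master, so this is usually reserved for the cache-filling path rather than applied to all reads.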
Update DB, Then Delete Cache
This approach can still fail if the cache deletion fails after the database update, leaving stale data in the cache. A common remedy is to use a message queue to retry the cache deletion.
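The retry idea can be sketched with an in-process queue. Everything here is a stand-in: `queue.Queue` takes the place of a real message queue, `FlakyCache` is a hypothetical cache whose delete fails a fixed number of times, and the `max_attempts` limit is an assumed policy.

```python
import queue


class FlakyCache:
    """Hypothetical cache stand-in whose delete fails a fixed number of times."""

    def __init__(self, failures):
        self.data = {}
        self.failures = failures

    def delete(self, key):
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("cache unavailable")
        self.data.pop(key, None)


db = {}
cache = FlakyCache(failures=2)
retry_queue = queue.Queue()  # stand-in for a real message queue


def update(key, value):
    db[key] = value              # update the database first
    try:
        cache.delete(key)        # then delete the cache entry
    except ConnectionError:
        retry_queue.put(key)     # on failure, hand the delete to the queue


def drain_retries(max_attempts=5):
    """Consumer side: retry queued deletes, re-queueing on failure."""
    attempts = {}
    while not retry_queue.empty():
        key = retry_queue.get()
        try:
            cache.delete(key)
        except ConnectionError:
            attempts[key] = attempts.get(key, 0) + 1
            if attempts[key] < max_attempts:
                retry_queue.put(key)
```

A real message queue would add persistence and delivery guarantees, so a pending delete survives a crash of the writing process; the in-process queue above does not.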
Another solution is to subscribe to the database change log (e.g., MySQL binlog) and delete the corresponding cache entry when a change is detected.
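The consumer side of a change-log subscription could be sketched as below. The event format (a dictionary with `type`, `table`, and `pk` fields) and the `"<table>:<primary key>"` cache key scheme are assumptions for illustration; a real binlog client delivers its own event types, and this sketch does not model any particular library.

```python
cache = {"user:1": "alice", "user:2": "bob"}


def cache_key(event):
    # Hypothetical key scheme: "<table>:<primary key>"
    return f"{event['table']}:{event['pk']}"


def handle_change_events(events):
    """Delete the cache entry for every row reported as changed."""
    for event in events:
        if event["type"] in ("update", "delete"):
            cache.pop(cache_key(event), None)


# A single simulated change event for row 1 of the "user" table.
events = [{"type": "update", "table": "user", "pk": 1}]
handle_change_events(events)
```

Because the application no longer deletes the cache itself, the write path only touches the database, and cache invalidation is driven entirely by what the database actually committed.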
Summary
After introducing a cache, you must address consistency between cache and database. Options include "update database + update cache" and "update database + delete cache". The latter, combined with message queues or change‑log subscriptions, provides a reliable way to keep data consistent in high‑concurrency environments.
While "update database + delete cache" can still suffer from replication lag in read‑write split setups, using a delayed double‑delete or forcing reads from the master can mitigate the issue.
For low‑concurrency data (e.g., personal orders), occasional inconsistency is tolerable; setting an expiration time and periodic refresh may suffice.
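If strict consistency is required, use a distributed read‑write lock so that writes are serialized while reads can still run concurrently with one another. This comes at the cost of some of the performance the cache was introduced to provide.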
