
How to Ensure Data Consistency Between Database and Cache in High‑Concurrency Scenarios

This article examines the common data‑consistency problems that arise when updating both a database and a cache under high concurrency, evaluates four typical write‑order strategies, and presents the most reliable solution—writing to the database first and then safely invalidating the cache using retry, scheduled tasks, MQ, or binlog listeners.

Su San Talks Tech

Introduction

Database‑cache double‑write consistency is a language‑agnostic issue that becomes especially severe in high‑concurrency environments. The probability of encountering this problem in interviews or real projects is high, so it is essential to understand the typical solutions, their pitfalls, and the optimal approach.

Common Solutions

Four basic strategies are frequently used to keep the cache and database synchronized:

Write cache first, then write database.

Write database first, then write cache.

Delete cache first, then write database.

Write database first, then delete cache.

Write Cache First, Then Database

This approach seems straightforward: update the cache during the write operation. However, if the cache is written successfully but the subsequent database write fails (e.g., network outage), the cache holds "dirty" data that does not exist in the database, leading to severe inconsistency.
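A minimal sketch of this flawed ordering, using hypothetical `cache` and `db` helpers for illustration, shows exactly where the failure leaves dirty data:

```java
// Flawed order: the cache is updated before the database.
// `cache` and `db` are hypothetical helpers used only for illustration.
public void updateUser(User user) {
    cache.set("user:" + user.getId(), user); // step 1: cache holds the new value
    db.updateUser(user);                     // step 2: if this throws (network
                                             // outage, timeout), the cache keeps
                                             // "dirty" data the database never saw
}
```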

Therefore, writing the cache before the database is generally unsuitable for production.

Write Database First, Then Cache

Writing the database first avoids the "dirty cache" problem, but it introduces new challenges.

Write Cache Failure

In low-concurrency scenarios, the cache write can share a transaction with the database write, so a cache failure simply rolls the database update back. In high-concurrency systems, however, database and cache writes are remote operations, and combining them in one transaction can hold locks long enough to cause deadlocks, so the two writes are usually kept separate. Consequently, if the database write succeeds but the cache write fails, the cache remains stale.

High‑Concurrency Issue

Consider two concurrent write requests (a and b) for the same record. Request a writes the database, then experiences a delay before writing the cache. Request b writes the database and cache successfully first. When request a finally writes the cache, it overwrites the newer value with the older one, causing inconsistency.
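The vulnerable path can be sketched with the same hypothetical helpers; the comments trace the interleaving of requests a and b (v1 is the older value, v2 the newer one):

```java
// Write database first, then write cache: safe in isolation, unsafe under
// concurrency. `cache` and `db` are hypothetical helpers for illustration.
public void updateUser(User user) {
    db.updateUser(user);    // a writes v1 here, then stalls;
                            // meanwhile b writes v2 and caches v2
    cache.set("user:" + user.getId(), user);
                            // a resumes and caches v1, overwriting the newer
                            // v2: the cache and database now disagree
}
```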

Resource Waste

Writing the cache after every database update can be wasteful when the cached value requires expensive computation, consuming CPU and memory resources unnecessarily, especially in write‑heavy scenarios.

Delete Cache First, Then Write Database

In this pattern, the cache is removed before the database update. While it can work, it still suffers from race conditions similar to the previous approach.

High‑Concurrency Issue

When a delete‑cache request (d) and a read‑cache request (c) occur simultaneously, the read may fetch stale data from the database and repopulate the cache before the delete completes, leaving the cache outdated.
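The losing read path is the ordinary cache-aside lookup; the comments mark where it races with the delete-first writer d (hypothetical `cache` and `db` helpers again):

```java
// Standard cache-aside read, annotated with the race against writer d.
public User getUser(long id) {
    String key = "user:" + id;
    User user = cache.get(key);   // miss: d has just deleted the key
    if (user == null) {
        user = db.findUser(id);   // reads the OLD row (d has not written yet)
        cache.set(key, user);     // repopulates the cache with stale data;
    }                             // d's database write lands afterwards
    return user;
}
```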

Cache Double Delete

To mitigate the race, the cache is deleted twice: once before the database write and once after, with a short delay (e.g., 500 ms) before the second deletion. The delay gives any concurrent read-then-repopulate operations time to finish, so the second deletion removes whatever stale entry they may have written.
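A self-contained sketch of delayed double delete; the 500 ms delay comes from the text, while the key format and the `Cache`/`UserDao` interfaces are illustrative assumptions:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class DelayedDoubleDelete {
    // Hypothetical collaborators; any cache client / DAO would do here.
    interface Cache { void delete(String key); }
    interface UserDao { void update(long id, String name); }

    private final Cache cache;
    private final UserDao dao;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    DelayedDoubleDelete(Cache cache, UserDao dao) {
        this.cache = cache;
        this.dao = dao;
    }

    public void updateUser(long id, String name) {
        String key = "user:" + id;
        cache.delete(key);       // first delete, before the database write
        dao.update(id, name);    // write the database
        // Second delete after a short delay (500 ms here, per the text),
        // evicting any stale entry a concurrent read may have repopulated.
        scheduler.schedule(() -> cache.delete(key), 500, TimeUnit.MILLISECONDS);
    }
}
```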

Write Database First, Then Delete Cache

This approach is widely recommended because it avoids most inconsistency scenarios. After the database write, the cache is deleted. If a read occurs before the deletion, it may return stale data, but the subsequent delete removes that stale entry. The remaining risk arises when a cache entry has already expired: a read can fetch the old value from the database just before a concurrent write commits, then repopulate the cache after the delete has run. This requires the read's cache write to finish later than the writer's database-write-plus-delete, which is rare because reads are normally much faster than writes.
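The recommended order is then a two-liner on the write path (hypothetical helpers again); reads repopulate the cache lazily via the cache-aside lookup shown earlier:

```java
// Write database first, then invalidate the cache.
public void updateUser(User user) {
    db.updateUser(user);                  // 1. commit the new value
    cache.delete("user:" + user.getId()); // 2. drop the cached copy; the next
                                          //    read reloads it from the database
}
```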

Adopting the "write database first, then delete cache" strategy is recommended: although it cannot guarantee 100% consistency, its failure probability is the lowest of the four approaches.

What If Cache Deletion Fails?

If the cache deletion fails, a retry mechanism is required. A typical pattern is to retry up to three times synchronously; if all attempts fail, record the failure for later processing. For high‑throughput services, asynchronous retries are preferred, using background threads, thread pools, retry tables, message queues, or binlog listeners.
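A minimal sketch of the bounded synchronous retry; the three-attempt limit comes from the text, while `cache` and `recordFailure` are hypothetical helpers:

```java
// Retry the cache deletion up to three times; on persistent failure,
// hand the key off for asynchronous processing (retry table, MQ, etc.).
public void deleteCacheWithRetry(String key) {
    for (int attempt = 1; attempt <= 3; attempt++) {
        try {
            cache.delete(key);
            return;                  // deleted successfully, stop retrying
        } catch (Exception e) {
            if (attempt == 3) {
                recordFailure(key);  // persist the failure for later retries
            }
        }
    }
}
```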

Scheduled Tasks

Asynchronous retries can be handled by scheduled tasks that periodically attempt to delete the cache. The task reads a retry table (which stores the number of attempts) and retries up to five times, marking the record as failed if all attempts fail. In high‑concurrency environments, a distributed scheduler like elastic‑job is suggested for sharding and efficient processing.
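One possible shape for that task; the retry table columns (key, retry count, status), the five-attempt limit from the text, and the `retryDao` API are assumptions for illustration. In a distributed deployment the same loop would run as a sharded elastic-job:

```java
// Periodically re-attempt failed cache deletions recorded in a retry table.
public void retryFailedDeletes() {
    for (RetryRecord r : retryDao.findPending()) {       // rows awaiting retry
        try {
            cache.delete(r.getKey());
            retryDao.markSucceeded(r.getId());
        } catch (Exception e) {
            if (r.getRetryCount() + 1 >= 5) {
                retryDao.markFailed(r.getId());          // give up after 5 tries
            } else {
                retryDao.incrementRetryCount(r.getId()); // try again next run
            }
        }
    }
}
```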

Message Queue (MQ) Approach

When a cache‑deletion failure occurs, an MQ message can be produced. A consumer retries the deletion up to five times; on persistent failure, the message is sent to a dead‑letter queue. RocketMQ is recommended because it natively supports retry and dead‑letter mechanisms, as well as delayed and ordered messages.
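A sketch of such a consumer using RocketMQ's push-consumer API; the group name, topic, and the cache call are illustrative assumptions, while `RECONSUME_LATER` and `setMaxReconsumeTimes` are the broker-side retry and dead-letter hooks the text relies on:

```java
import java.nio.charset.StandardCharsets;

import org.apache.rocketmq.client.consumer.DefaultMQPushConsumer;
import org.apache.rocketmq.client.consumer.listener.ConsumeConcurrentlyStatus;
import org.apache.rocketmq.client.consumer.listener.MessageListenerConcurrently;
import org.apache.rocketmq.common.message.MessageExt;

public class CacheDeleteConsumer {

    public static void main(String[] args) throws Exception {
        DefaultMQPushConsumer consumer = new DefaultMQPushConsumer("cache-delete-group");
        consumer.setNamesrvAddr("127.0.0.1:9876");   // illustrative address
        consumer.subscribe("CACHE_DELETE", "*");     // hypothetical topic name
        consumer.setMaxReconsumeTimes(5);            // dead-letter after 5 retries
        consumer.registerMessageListener((MessageListenerConcurrently) (msgs, ctx) -> {
            for (MessageExt msg : msgs) {
                String key = new String(msg.getBody(), StandardCharsets.UTF_8);
                try {
                    deleteCache(key);
                } catch (Exception e) {
                    // RECONSUME_LATER asks the broker to redeliver; once the
                    // retry limit is hit, the message moves to the dead-letter
                    // queue for manual or automated follow-up.
                    return ConsumeConcurrentlyStatus.RECONSUME_LATER;
                }
            }
            return ConsumeConcurrentlyStatus.CONSUME_SUCCESS;
        });
        consumer.start();
    }

    // Hypothetical cache client call; any Redis client would do here.
    private static void deleteCache(String key) { /* e.g., jedis.del(key) */ }
}
```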

Binlog Listening

Another elegant solution is to listen to MySQL binlog events (e.g., using canal). After the database write, the binlog subscriber receives the change and deletes the corresponding cache entry. If the deletion fails, the same retry strategies (retry table, MQ) can be applied. This decouples cache invalidation from the business logic entirely.
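A condensed sketch of a canal client that pulls binlog batches and invalidates the cache; the destination name, table filter, and batch size are illustrative, and the row parsing is omitted for brevity:

```java
import java.net.InetSocketAddress;

import com.alibaba.otter.canal.client.CanalConnector;
import com.alibaba.otter.canal.client.CanalConnectors;
import com.alibaba.otter.canal.protocol.CanalEntry;
import com.alibaba.otter.canal.protocol.Message;

public class BinlogCacheInvalidator {
    public static void main(String[] args) {
        // Connect to a canal server that is subscribed to the MySQL binlog.
        CanalConnector connector = CanalConnectors.newSingleConnector(
                new InetSocketAddress("127.0.0.1", 11111), "example", "", "");
        connector.connect();
        connector.subscribe("mydb\\.user");   // hypothetical schema.table filter
        while (true) {
            Message message = connector.getWithoutAck(100);  // pull a batch
            for (CanalEntry.Entry entry : message.getEntries()) {
                if (entry.getEntryType() != CanalEntry.EntryType.ROWDATA) continue;
                // Parse the row change and delete the matching cache entry;
                // on failure, fall back to the retry table / MQ strategies above.
                // (Row parsing omitted for brevity.)
            }
            connector.ack(message.getId());   // acknowledge the processed batch
        }
    }
}
```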

Tags: High Concurrency, Retry, MQ, Consistency, Cache Invalidation
Written by

Su San Talks Tech

Su San, former staff at several leading tech companies, is a top creator on Juejin and a premium creator on CSDN, and runs the free coding practice site www.susan.net.cn.
