How to Keep Cache and Database Consistent: Invalidate First, Serialize Access
This article explains why a cache and a database can become inconsistent in distributed systems, why the "invalidate the cache, then write the DB" order is preferred, and how to serialize access per key by modifying the service and DB connection pools, without sacrificing availability or load balance.
1 Requirement Origin
The previous article "Cache Architecture Details" sparked discussion and concluded that the safest order is to invalidate the cache before modifying the database because cache and DB operations are not atomic.
2 Why Data Can Become Inconsistent
In a distributed environment multiple services may read and write the same key concurrently. If a write request (A) invalidates the cache and then fails to update the DB, the cache is empty and the DB still holds the old data; the next read simply reloads the old value, so this case costs only a cache miss. The inconsistency comes from interleaving: a read request (B) may find the cache empty, fetch the old value from the DB (because A's DB write has not completed yet), and write that value back into the cache. Once A's write lands, the DB holds the new value while the cache keeps serving the stale one.
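The interleaving described above can be replayed step by step. The following is a minimal sketch, assuming plain dictionaries as stand-ins for the cache and the database; the key name and the values are hypothetical and chosen only for illustration.

```python
# Hypothetical in-memory stand-ins for the cache and the database.
cache = {}
db = {"user:1": "old"}
key = "user:1"

# Step 1: writer A invalidates the cache entry.
cache.pop(key, None)

# Step 2: reader B misses the cache and reads the DB
# before A's DB write has completed.
value_seen_by_b = cache.get(key)
if value_seen_by_b is None:
    value_seen_by_b = db[key]  # still the old value

# Step 3: writer A finishes its DB write.
db[key] = "new"

# Step 4: reader B writes the value it fetched back into the cache.
cache[key] = value_seen_by_b

# The cache now disagrees with the DB.
print(cache[key], db[key])
```

Each step is correct on its own; only the order in which A and B interleave produces the stale cache entry.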
3 Inconsistency Optimization Ideas
To guarantee that a later read never returns stale data, the article explores serializing operations on the same data.
Write flow:
Invalidate cache.
Write to DB.
Read flow:
Read cache; if hit, return.
If miss, read DB.
Write DB result back to cache.
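The two flows above can be sketched as a small class. This is an illustration under stated assumptions, not the article's implementation: the class name CacheAside is hypothetical, and both the cache and the DB are modeled as dictionaries.

```python
class CacheAside:
    """Sketch of the write and read flows; db is a dict standing in for the DB."""

    def __init__(self, db):
        self.db = db
        self.cache = {}

    def write(self, key, value):
        # Write flow, step 1: invalidate the cache entry.
        self.cache.pop(key, None)
        # Write flow, step 2: write to the DB.
        self.db[key] = value

    def read(self, key):
        # Read flow, step 1: return on a cache hit.
        if key in self.cache:
            return self.cache[key]
        # Read flow, step 2: on a miss, read the DB.
        value = self.db.get(key)
        # Read flow, step 3: write the DB result back to the cache.
        if value is not None:
            self.cache[key] = value
        return value
```

Nothing in this sketch prevents a concurrent read from running between the two steps of write, which is exactly the gap that serialization is meant to close.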
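The Q&A explains why a simple task queue is not enough: with multiple service instances, multiple worker threads per instance, or multiple DB connections, operations on the same key can still run concurrently and interleave, so serialization is not guaranteed.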
Key insight: only per‑key serialization is needed, not global request serialization.
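One way to picture per-key serialization inside a single process is to hash each key to one of several single-threaded workers. This is a sketch, assuming integer keys; the class KeyedSerialExecutor and its methods are hypothetical names, and the article itself achieves the effect through connection pools rather than through this executor.

```python
from concurrent.futures import ThreadPoolExecutor


class KeyedSerialExecutor:
    """Runs tasks for the same key one at a time, in submission order."""

    def __init__(self, shards=4):
        # One single-threaded executor per shard: tasks queued on a shard
        # run sequentially, while different shards run in parallel.
        self._workers = [ThreadPoolExecutor(max_workers=1) for _ in range(shards)]

    def submit(self, key, fn, *args):
        # The same key always maps to the same worker.
        worker = self._workers[key % len(self._workers)]
        return worker.submit(fn, *args)

    def shutdown(self):
        for worker in self._workers:
            worker.shutdown(wait=True)
```

Requests for different keys still run concurrently, so throughput is not reduced to that of a single global queue.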
4 Can We Ensure the Same Data Hits the Same Service?
The service connection pool can be modified so that a connection is obtained with CPool.GetServiceConnection(longId), where longId is the data key (e.g., user-id) and the pool picks the connection by key modulo. All requests for the same key are then routed to the same service instance.
Similarly, modify the DB connection pool to CPool.GetDBConnection(longId) so that the same key always uses the same DB connection, guaranteeing sequential execution at the DB level.
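A key-based pool of this kind can be sketched as follows. The source only names CPool.GetServiceConnection(longId) and CPool.GetDBConnection(longId); the Python class KeyedPool, its method name, and the use of strings as connections are assumptions made for illustration.

```python
class KeyedPool:
    """Sketch of a pool that picks a connection by key modulo."""

    def __init__(self, connections):
        if not connections:
            raise ValueError("pool needs at least one connection")
        self._connections = list(connections)

    def get_connection(self, long_id):
        # The same long_id always yields the same connection,
        # so requests for one key share one connection.
        return self._connections[long_id % len(self._connections)]


# Usage: strings stand in for real service or DB connections.
pool = KeyedPool(["conn-0", "conn-1", "conn-2"])
print(pool.get_connection(7))  # conn-1, because 7 % 3 == 1
```

The same selection rule applies to both pools: one instance would hold service connections, another DB connections.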
5 Summary
Change the service connection pool to select connections by key modulo, ensuring that reads and writes for the same data are handled by the same backend service.
Change the DB connection pool to select connections by key modulo, ensuring that reads and writes for the same data are serialized at the database level.
6 Open Issues
Will key‑based modulo routing affect service availability? No – unhealthy connections are filtered out by the pool.
Will it disturb load balancing? No – with uniformly distributed keys, the modulo selection remains balanced.
How to handle master‑slave replication where reads may go to a replica? This remains an open problem for a future article.
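The answers on availability and load balance can be checked with a small experiment. This is a sketch under assumptions: the function pick and the connection names are hypothetical, keys are taken to be uniformly distributed integers, and health is modeled as a simple set of names.

```python
from collections import Counter


def pick(connections, healthy, long_id):
    """Choose a connection by key modulo among the healthy ones only."""
    candidates = [c for c in connections if c in healthy]
    if not candidates:
        raise RuntimeError("no healthy connection available")
    return candidates[long_id % len(candidates)]


connections = ["c0", "c1", "c2", "c3"]
all_up = set(connections)

# Load balance: uniformly distributed keys spread evenly over connections.
counts = Counter(pick(connections, all_up, i) for i in range(1000))

# Availability: after c2 fails, its keys are served by the remaining ones.
after_failure = all_up - {"c2"}
counts_after = Counter(pick(connections, after_failure, i) for i in range(999))

print(counts)
print(counts_after)
```

One caveat of this particular sketch: when the healthy set changes, the modulo is taken over a shorter list, so some keys move to a different connection at that moment.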
dbaplus Community