Mastering Cache: When, How, and Pitfalls for Backend Developers
This article explains why caching is essential for backend services, outlines common cache problems such as penetration, concurrency, and avalanche, and compares cache-aside, read/write‑through, and write‑back patterns with practical guidance on choosing and updating caches.
Before Starting
Before adding a cache, you must ensure the groundwork is solid; otherwise, a poorly configured cache can degrade performance instead of improving it.
Why Use Cache?
Database access, especially to relational stores, is often the performance bottleneck: a read that goes to disk can take on the order of tens of milliseconds, while a memory‑based cache typically answers about two orders of magnitude faster. Keeping all data in memory is wasteful, though. Access is usually skewed toward a small set of hot keys, so caching the hottest data (roughly 20% by the 80/20 rule of thumb) yields the best cost‑performance trade‑off.
Cache hit rate, the fraction of reads served from the cache (hits divided by hits plus misses), is the key metric: the higher it is, the fewer requests reach the database and the greater the performance gain.
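To see why hit rate matters so much, a back-of-the-envelope calculation helps. The sketch below is illustrative only: the 0.5 ms cache latency and 20 ms database latency are assumed numbers, not measurements, and the function name is hypothetical.

```python
def expected_latency_ms(hit_rate: float, cache_ms: float = 0.5, db_ms: float = 20.0) -> float:
    """Average read latency, assuming a miss costs a cache lookup plus a database read."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    hit_cost = cache_ms
    miss_cost = cache_ms + db_ms
    return hit_rate * hit_cost + (1.0 - hit_rate) * miss_cost


if __name__ == "__main__":
    for rate in (0.5, 0.8, 0.95, 0.99):
        print(f"hit rate {rate:.0%}: {expected_latency_ms(rate):.2f} ms on average")
```

Under these assumed latencies, raising the hit rate from 80% to 95% cuts the average read from about 4.5 ms to about 1.5 ms, which is why the misses, not the hits, dominate the result.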
Adding Cache
When using ORM tools, consider the following common cache pitfalls.
Cache Penetration
Penetration occurs when requests target keys that are absent both in cache and database, potentially allowing malicious traffic to hammer the database.
Cache empty objects : Store a short‑lived placeholder for non‑existent keys to prevent repeated DB hits.
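Cache prediction : Keep a hash set or Bloom filter of existing keys and check it before querying. A Bloom filter never reports an existing key as absent, but it can return false positives, so a positive answer still goes through the normal cache and DB lookup.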
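The two defenses above can be combined, as in the following minimal sketch. Everything here is an assumption made for illustration: the in-memory dicts stand in for a real cache and database, the Bloom filter is a toy implementation, and the TTL values are arbitrary.

```python
import hashlib
import time


class BloomFilter:
    """Toy Bloom filter: may give false positives, never false negatives."""

    def __init__(self, size_bits: int = 1 << 16, num_hashes: int = 4):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, key: str):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))


MISSING = object()            # placeholder cached for keys the database lacks
NEGATIVE_TTL_SECONDS = 30     # assumed short lifetime for the placeholder
POSITIVE_TTL_SECONDS = 300    # assumed lifetime for real values

database = {"user:1": "alice"}   # stand-in for the real database
cache = {}                       # key -> (value, expires_at)
stats = {"db_reads": 0}          # counts how often the database is queried

known_keys = BloomFilter()
for existing_key in database:
    known_keys.add(existing_key)


def get(key: str):
    if not known_keys.might_contain(key):
        return None                      # definitely absent: skip cache and DB
    entry = cache.get(key)
    if entry is not None and entry[1] > time.monotonic():
        return None if entry[0] is MISSING else entry[0]
    stats["db_reads"] += 1
    value = database.get(key, MISSING)
    ttl = NEGATIVE_TTL_SECONDS if value is MISSING else POSITIVE_TTL_SECONDS
    cache[key] = (value, time.monotonic() + ttl)
    return None if value is MISSING else value
```

The placeholder is kept short-lived on purpose: if the key is created later, readers see it once the negative entry expires. The filter also has to be updated whenever new keys are written, which this sketch leaves out.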
Cache Concurrency
Concurrent requests for a missing key can cause multiple threads to load the same data from the DB and set it into the cache, leading to redundant work.
Cache pre‑warming : Proactively load hot data into the cache based on access patterns.
Cache locking : Acquire a lock before loading data and setting the cache, ensuring only one thread performs the load.
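Cache locking can be sketched with a per-key lock and a second cache check after the lock is acquired. This is a single-process illustration with hypothetical names (`load_from_db`, `stats`); a service running on several machines would need a distributed lock instead, for example one built on the Redis `SET` command with the `NX` and `EX` options.

```python
import threading
import time
from collections import defaultdict

cache = {}
stats = {"loads": 0}

_key_locks = defaultdict(threading.Lock)   # one lock per key (never cleaned up here)
_key_locks_guard = threading.Lock()        # protects the lock table itself


def load_from_db(key: str) -> str:
    """Stand-in for a slow database query."""
    time.sleep(0.05)
    stats["loads"] += 1
    return f"value-for-{key}"


def get(key: str) -> str:
    value = cache.get(key)
    if value is not None:
        return value                       # fast path: no locking on a hit

    with _key_locks_guard:
        lock = _key_locks[key]

    with lock:
        # Re-check: another thread may have filled the cache while we waited.
        value = cache.get(key)
        if value is None:
            value = load_from_db(key)
            cache[key] = value
        return value
```

Only the first thread to take the lock queries the database; the others block briefly and then read the value it stored. Locking per key rather than globally keeps loads for unrelated keys from waiting on each other.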
Cache Avalanche
An avalanche happens when the cache becomes unavailable, causing all traffic to fall back to the database.
Build a highly available cache cluster : Deploy Redis or Memcached in HA mode.
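Rate limiting : Limit or short‑circuit the traffic that falls through to the database during cache failures, for example with a fault‑tolerance library such as Netflix Hystrix (now in maintenance mode) or Resilience4j.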
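The idea behind protecting the backend can be shown without any particular library. The sketch below is not how Hystrix works internally; it is a simplified concurrency cap, with hypothetical names throughout, that rejects database fallbacks once too many are already in flight.

```python
import threading


class CacheUnavailable(Exception):
    """Raised by the cache client when the cache cannot be reached."""


class Overloaded(Exception):
    """Raised when a fallback is rejected to protect the database."""


class FallbackLimiter:
    """Allows at most `max_concurrent` database fallbacks at the same time."""

    def __init__(self, max_concurrent: int):
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def call(self, fn, *args):
        # Fail fast instead of queueing: waiting callers would only pile up.
        if not self._slots.acquire(blocking=False):
            raise Overloaded("too many database fallbacks in flight")
        try:
            return fn(*args)
        finally:
            self._slots.release()


def read(key, cache_get, db_get, limiter: FallbackLimiter):
    try:
        return cache_get(key)
    except CacheUnavailable:
        # Cache is down: go to the database, but only within the cap.
        return limiter.call(db_get, key)
```

A caller that receives `Overloaded` can return a default value or an error page. Rejecting some requests quickly is usually preferable to letting the full load reach the database and take it down as well.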
Updating Cache
The following sections describe common cache‑update strategies, originally presented by CoolShell’s “左耳朵耗子”.
Cache‑Aside Pattern
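In this pattern the application manages the cache itself. On a read, it checks the cache first; on a miss it loads the value from the database and stores it in the cache. On a write, it updates the database first, then invalidates the cache entry. Invalidating before the database update is riskier: a read that arrives between the invalidation and the database write misses the cache, loads the old value from the database, and puts it back into the cache, where it stays stale until the next invalidation or expiry.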
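A minimal sketch of cache-aside follows, with plain dicts standing in for the database and the cache (both are assumptions for illustration, as are the function names).

```python
database = {"user:1": "alice"}   # stand-in for the real database
cache = {}                       # stand-in for Redis or Memcached


def read(key):
    value = cache.get(key)
    if value is None:
        # Miss: the application loads from the database and fills the cache.
        value = database.get(key)
        if value is not None:
            cache[key] = value
    return value


def write(key, value):
    database[key] = value        # 1. update the database first
    cache.pop(key, None)         # 2. then invalidate; the next read reloads
```

Invalidating instead of overwriting the cached value keeps two concurrent writers from leaving the cache with the older of their two values.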
Read/Write‑Through Pattern
Read‑Through : On a cache miss, the cache itself fetches data from the DB and stores it, relieving the application from loading data.
Write‑Through : On a write, if the key is in the cache, the application updates the cache and the cache synchronously propagates the change to the DB. If the key is not in the cache, the write goes directly to the DB.
This pattern simplifies application logic and can improve efficiency, but the cache implementation becomes more complex.
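The following sketch shows the shape of such a cache. The class name and the `load` and `store` callables are hypothetical; a real product would hide the database access behind its own configuration rather than take callables.

```python
class ThroughCache:
    """Cache that talks to the database itself, so callers never do."""

    def __init__(self, load, store):
        self._load = load        # callable: key -> value, reads the database
        self._store = store      # callable: (key, value) -> None, writes the database
        self._data = {}

    def get(self, key):
        # Read-through: on a miss the cache loads and keeps the value.
        if key not in self._data:
            self._data[key] = self._load(key)
        return self._data[key]

    def put(self, key, value):
        # Write-through: refresh the cached copy on a hit, then write the
        # database synchronously. On a miss only the database is written.
        if key in self._data:
            self._data[key] = value
        self._store(key, value)


database = {"a": 1}
through = ThroughCache(load=database.__getitem__, store=database.__setitem__)
```

The application only calls `get` and `put`; where the data lives is the cache's concern, which is the simplification this pattern buys.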
Write‑Back Pattern
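Write‑back is the most complex: a write only updates the cache and marks the entry as dirty, and dirty entries are synchronized to persistent storage later, typically in batches or on eviction. Writes are fast, but data that has not yet been flushed is lost if the cache fails, so the pattern is suitable only when strict consistency is not required.
Read flow: if the key is cached, return it; if not, pick a cache slot to reuse, flush that slot to the store first if it is marked dirty, then load the data into it and mark it clean.
Write flow: if cached, update the entry and mark it dirty; if not cached, pick a slot to reuse, flush it if it is dirty, load the data, apply the update, and mark the entry dirty.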
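The read and write flows above can be sketched as a small fixed-capacity cache. This is an illustration under assumptions: the backing store is a plain dict, eviction is least-recently-used, and the class and method names are made up for the example.

```python
from collections import OrderedDict


class WriteBackCache:
    def __init__(self, store, capacity: int = 2):
        self._store = store               # dict-like persistent store
        self._capacity = capacity
        self._entries = OrderedDict()     # key -> [value, dirty], oldest first

    def _make_room(self):
        # Evict least recently used entries, flushing them first if dirty.
        while len(self._entries) >= self._capacity:
            old_key, (old_value, dirty) = self._entries.popitem(last=False)
            if dirty:
                self._store[old_key] = old_value

    def _load(self, key):
        self._make_room()
        entry = [self._store.get(key), False]   # freshly loaded, so clean
        self._entries[key] = entry
        return entry

    def read(self, key):
        entry = self._entries.get(key)
        if entry is None:
            entry = self._load(key)
        self._entries.move_to_end(key)
        return entry[0]

    def write(self, key, value):
        entry = self._entries.get(key)
        if entry is None:
            entry = self._load(key)
        entry[0], entry[1] = value, True        # update in cache, mark dirty
        self._entries.move_to_end(key)

    def flush(self):
        # Write every dirty entry to the store and mark it clean.
        for key, entry in self._entries.items():
            if entry[1]:
                self._store[key] = entry[0]
                entry[1] = False
```

Until `flush` runs or an entry is evicted, the store holds the old value. That window is where both the speed and the risk of this pattern come from.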
Architects' Tech Alliance
Sharing project experiences, insights into cutting-edge architectures, focusing on cloud computing, microservices, big data, hyper-convergence, storage, data protection, artificial intelligence, industry practices and solutions.
