
Mastering Cache Layers: From HTTP to Distributed Systems

This article provides a comprehensive guide to caching technologies, covering HTTP caching, CDN caching, load‑balancer caching, in‑process caching, and distributed caching, while explaining strategies, algorithms, and common pitfalls such as cache avalanche, penetration, and breakdown.

dbaplus Community

Oct 30, 2019


Introduction

Caching is a common answer to performance problems, but a superficial "just add a cache" mindset can be misleading: each layer of the request chain caches different data, with different trade‑offs. This guide walks through caching techniques along that chain, from the browser to distributed cache clusters.

1. HTTP Cache

When a browser requests a resource, the HTTP response can be cached to reduce load on the application server. Two main HTTP caching mechanisms are:

Forced cache – The client stores the response and reuses it until it expires, as determined by the Expires header or by Cache‑Control: max‑age (max‑age takes precedence when both are present). While the cached entry is still fresh, the browser serves it without contacting the server.

Conditional cache – The client sends validators such as Last‑Modified/If‑Modified‑Since or ETag/If‑None‑Match. The server replies with 304 Not Modified when the resource has not changed, allowing the browser to reuse its local copy.

Validator Details

Last‑Modified / If‑Modified‑Since – The server sends the resource's last modification timestamp in Last‑Modified. On subsequent requests the client echoes it in If‑Modified‑Since; the server compares it with the resource's current modification time and returns either the full resource (200) if it has changed or a 304 status if it has not.

ETag / If‑None‑Match – The server assigns each version of a resource an identifier, often a hash of its content. The client stores the ETag and sends it as If‑None‑Match on later requests. A mismatch triggers a full response (200), while a match results in a 304.
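The ETag exchange above can be sketched as a small server-side handler. This is a minimal illustration, not a full HTTP implementation: the names make_etag and handle_get are hypothetical, deriving the ETag from a SHA-256 of the body is only one possible choice, and real If‑None‑Match headers may also carry a list of tags or "*", which this sketch ignores.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive an ETag from the body (assumption: a content hash is acceptable)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def handle_get(request_headers: dict, body: bytes, max_age: int = 60):
    """Return (status, headers, body) for a GET, honoring If-None-Match.

    max_age feeds the forced-cache lifetime; the ETag enables revalidation
    once that lifetime has elapsed.
    """
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": f"max-age={max_age}"}
    if request_headers.get("If-None-Match") == etag:
        # The client's copy is still current: send no body.
        return 304, headers, b""
    return 200, headers, body


# First request has no validator; the second replays the ETag it received.
status, headers, payload = handle_get({}, b"hello")
revalidated = handle_get({"If-None-Match": headers["ETag"]}, b"hello")
```

A 304 response carries only headers, so revalidation costs a round trip but not a retransmission of the body.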

2. CDN Cache

CDNs sit between the client and the origin server, caching static assets close to users. The request flow is:

Client resolves the domain via DNS.

DNS directs the request to a CDN edge node.

The edge node serves cached content if available; otherwise it fetches from the origin and caches the response.

This reduces latency and offloads traffic from the origin server, and the same HTTP caching directives (max‑age, ETag, etc.) apply at the CDN level.
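The edge-node step in the flow above amounts to a read-through cache with a freshness lifetime. The sketch below assumes a hypothetical EdgeNode class, a fetch_origin callable standing in for the request to the origin, and a single max_age for every path; a real CDN would take the lifetime from each response's caching headers.

```python
import time
from typing import Callable, Dict, Tuple


class EdgeNode:
    """Toy edge cache: serve fresh copies locally, otherwise go to the origin."""

    def __init__(self, fetch_origin: Callable[[str], bytes], max_age: float,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch_origin = fetch_origin   # hypothetical origin client
        self._max_age = max_age             # seconds a copy stays fresh
        self._clock = clock                 # injectable for testing
        self._store: Dict[str, Tuple[float, bytes]] = {}
        self.origin_hits = 0                # how often the origin was contacted

    def get(self, path: str) -> bytes:
        now = self._clock()
        entry = self._store.get(path)
        if entry is not None and entry[0] > now:
            return entry[1]                 # fresh copy at the edge
        # Missing or stale: fetch from the origin and cache the response.
        self.origin_hits += 1
        body = self._fetch_origin(path)
        self._store[path] = (now + self._max_age, body)
        return body


edge = EdgeNode(lambda path: ("body of " + path).encode(), max_age=10)
edge.get("/static/app.css")   # goes to the origin
edge.get("/static/app.css")   # served from the edge
```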

3. Load‑Balancer Cache

Load balancers (e.g., Nginx) can also cache responses. If a cached entry exists, the balancer returns it directly; otherwise it forwards the request to the application server and may store the result for future hits. A separate cache‑refresh service can periodically update the balancer’s cache to keep data consistent.

4. In‑Process Cache

Within the application process, caches such as Ehcache, Guava Cache, or Caffeine store hot data in the JVM heap. This gives the fastest access, but capacity is bounded by the heap and each instance holds its own copy. Common eviction policies include FIFO (first in, first out), LRU (least recently used), and LFU (least frequently used). To keep those copies consistent across multiple instances, two approaches are typical:

Message‑queue notifications that broadcast cache updates to other services.

Periodic timers that pull fresh data from the database for non‑real‑time‑critical data.
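To make the LRU policy and the notification approach concrete, here is a minimal sketch in Python rather than a JVM library. LRUCache and on_update_message are hypothetical names, and the message format (a dict with a "key" field) is an assumption; no real message-queue client is involved.

```python
from collections import OrderedDict


class LRUCache:
    """Bounded in-process cache that evicts the least recently used entry."""

    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._data = OrderedDict()   # oldest entry first, newest last

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self._capacity:
            self._data.popitem(last=False)  # drop the least recently used

    def invalidate(self, key):
        self._data.pop(key, None)


def on_update_message(cache: LRUCache, message: dict) -> None:
    """Hypothetical queue callback: another instance changed message['key'],
    so drop the local copy and let the next read reload it."""
    cache.invalidate(message["key"])
```

Invalidating on notification, rather than pushing the new value, keeps the message small and leaves the database as the single source of truth.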

5. Distributed Cache

Distributed caches (e.g., Redis clusters) run as independent services, providing larger capacity and sharing data across multiple application nodes. Key concepts include:

Cache sharding via a proxy (e.g., Twemproxy) to route keys to the correct node.

Master/Slave replication for high availability; when a master fails, one of its slaves can be promoted to master.

Persistence snapshots for crash recovery.

Data placement algorithms:

Hash modulo – Hashes the key and takes the result modulo the number of nodes. It is simple, but changing the node count remaps most keys.

Consistent hashing – Maps both keys and nodes onto a ring, minimizing data movement when nodes are added or removed.

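Range‑based hashing – Assigns contiguous key ranges to nodes. Like consistent hashing, it limits data movement to the affected ranges, but placement follows explicit intervals rather than positions on a hash ring.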
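The difference between hash modulo and consistent hashing shows up when a node is added. The sketch below is illustrative only: HashRing, modulo_node, and moved_keys are hypothetical names, MD5 is used purely as a stable hash, and the 100 virtual nodes per server is an arbitrary assumption.

```python
import bisect
import hashlib


def stable_hash(value: str) -> int:
    """Process-independent hash (Python's built-in hash() is salted per run)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


def modulo_node(key: str, nodes: list) -> str:
    """Hash modulo placement."""
    return nodes[stable_hash(key) % len(nodes)]


class HashRing:
    """Consistent hashing with virtual nodes."""

    def __init__(self, nodes, vnodes: int = 100):
        # Each server contributes `vnodes` points on the ring.
        self._points = sorted(
            (stable_hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._hashes = [h for h, _ in self._points]

    def node_for(self, key: str) -> str:
        # First point clockwise from the key's hash, wrapping at the end.
        idx = bisect.bisect_right(self._hashes, stable_hash(key)) % len(self._hashes)
        return self._points[idx][1]


def moved_keys(keys, before, after):
    """Keys whose owner differs between two placement functions."""
    return [k for k in keys if before(k) != after(k)]


keys = [f"user:{i}" for i in range(1000)]
old_nodes = ["node-0", "node-1", "node-2", "node-3"]
new_nodes = old_nodes + ["node-4"]

old_ring, new_ring = HashRing(old_nodes), HashRing(new_nodes)
ring_moved = moved_keys(keys, old_ring.node_for, new_ring.node_for)
mod_moved = moved_keys(
    keys,
    lambda k: modulo_node(k, old_nodes),
    lambda k: modulo_node(k, new_nodes),
)
print(len(ring_moved), "keys moved on the ring;", len(mod_moved), "with modulo")
```

On the ring, every key that moves lands on the newly added node, so existing nodes never exchange data with each other. With modulo placement, going from four to five nodes changes the owner of most keys.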

6. Cache Reliability Risks

Three major risks must be mitigated:

Cache avalanche – Simultaneous expiration of many keys overwhelms the database. Mitigation: stagger TTLs, use mutexes during refresh, and enable failover mechanisms.

Cache penetration – Repeated queries for non‑existent keys hit the database. Mitigation: cache empty results or use a Bloom filter to filter impossible keys.

Cache breakdown – A hot key expires and is accessed by many concurrent requests. Mitigation: protect the refresh with a mutex so only one request repopulates the cache.
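The three mitigations can be combined in one read path. The following is a single-process sketch under stated assumptions: GuardedCache is a hypothetical class, load_from_db is a stand-in that returns None for keys that do not exist, and the TTL values are arbitrary. A Bloom filter is not shown, and a deployment with several application nodes would need a distributed lock instead of threading.Lock.

```python
import random
import threading
import time

_MISSING = object()  # sentinel cached for keys absent from the database


class GuardedCache:
    def __init__(self, load_from_db, base_ttl=300.0, jitter=60.0, negative_ttl=30.0):
        self._load = load_from_db        # hypothetical loader; None means "no such key"
        self._base_ttl = base_ttl
        self._jitter = jitter
        self._negative_ttl = negative_ttl
        self._store = {}                 # key -> (expires_at, value)
        self._locks = {}                 # key -> lock (never pruned in this sketch)
        self._locks_guard = threading.Lock()

    def _lock_for(self, key):
        with self._locks_guard:
            return self._locks.setdefault(key, threading.Lock())

    def _fresh(self, key):
        entry = self._store.get(key)
        if entry is not None and entry[0] > time.monotonic():
            return entry
        return None

    def get(self, key):
        entry = self._fresh(key)
        if entry is None:
            # Breakdown: only one thread per key reloads from the database.
            with self._lock_for(key):
                entry = self._fresh(key)  # another thread may have filled it
                if entry is None:
                    value = self._load(key)
                    if value is None:
                        # Penetration: remember the miss for a short time.
                        ttl, value = self._negative_ttl, _MISSING
                    else:
                        # Avalanche: randomize the TTL so keys expire apart.
                        ttl = self._base_ttl + random.uniform(0, self._jitter)
                    entry = (time.monotonic() + ttl, value)
                    self._store[key] = entry
        return None if entry[1] is _MISSING else entry[1]
```

The second freshness check inside the lock is what keeps waiting threads from repeating the database query once the first thread has repopulated the entry.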

Conclusion

Effective cache design follows a five‑layer strategy: HTTP cache, CDN cache, load‑balancer cache, in‑process cache, and distributed cache. The first two layers handle static content, while the latter three address dynamic data. Understanding eviction policies, sharding algorithms, and reliability safeguards is essential for building high‑performance, resilient systems.
