Mastering Cache Strategies: From CDN to Local Memory for High‑Performance Systems

This comprehensive guide explains cache fundamentals, covering CDN, reverse‑proxy, distributed, and local caches, their design considerations, common pitfalls like consistency, high availability, cache avalanche and penetration, and provides practical architecture examples and mitigation techniques for robust high‑traffic systems.

dbaplus Community

Feb 4, 2017

1. Cache Overview

Caching is a critical component of distributed systems. It addresses the performance problems caused by high concurrency and hot-data access by keeping data in faster storage, close to the application or the user. The three basic approaches are:

Read and write data on faster storage devices.

Place cached data near the application.

Place cached data near the user.

2. CDN Cache

CDN caches static resources (HTML, scripts, images, videos) at edge locations to reduce latency and network congestion, especially across ISPs.

Principle: cache servers are distributed across regions, and global load balancing directs each user request to the nearest healthy cache server.

Before CDN deployment, a request passes through three network nodes (client → ISP → application data center) in six steps. After deployment, it needs only two nodes and two steps, which removes one node and four steps and shortens response time.

Advantages: local caching for faster access, mirroring, remote acceleration, bandwidth optimization, and DDoS resistance.

Disadvantages: caching dynamic resources is harder because the content changes in real time, so consistency must be balanced against freshness.

Common solutions are to set expiration times, version resources, and refresh cached copies asynchronously.
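Versioning and expiration times can work together: if the URL of a static resource changes whenever its content changes, the edge can cache it for a long time without serving stale copies. The following is a minimal sketch of that idea, not a feature of any particular CDN. The function names and the `?v=` query parameter are assumptions made for illustration, and some CDN configurations ignore query strings, in which case the hash would go into the file name instead.

```python
import hashlib


def versioned_url(path: str, content: bytes, digest_len: int = 8) -> str:
    """Append a short content hash so that changed content gets a new URL."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    return f"{path}?v={digest}"


def cache_headers(max_age_seconds: int) -> dict:
    """Build an expiration header for a response that caches may store."""
    return {"Cache-Control": f"public, max-age={max_age_seconds}"}


# Hypothetical usage: the asset path and its content are made up.
url = versioned_url("/static/app.js", b"console.log('v1');")
headers = cache_headers(86400)  # one day
```

Because the hash depends only on the content, redeploying an unchanged file keeps the same URL and the cached copy stays valid.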

CDN request/response flow before and after deployment

3. Reverse Proxy Cache

Reverse proxies (e.g., Varnish, Nginx, Squid) sit in front of web servers, caching static resources and forwarding dynamic requests, thereby reducing backend load.

Cache control relies on HTTP headers:

Last-Modified

Expires

Cache-Control

Pragma (e.g., no‑cache)
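As a rough illustration of how a cache might read these headers, the sketch below computes how long a response may be served without contacting the origin. It is a simplification under stated assumptions, not how Varnish, Nginx, or Squid implement it: `freshness_lifetime` is a hypothetical helper, only `max-age`, `no-cache`, `no-store`, `Expires`, and `Pragma` are considered, and `Last-Modified` is left out because it is used for revalidation rather than for computing a lifetime.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers: dict, response_time: datetime) -> timedelta:
    """Return how long a response may be reused; zero means revalidate first."""
    cache_control = headers.get("Cache-Control", "").lower()
    directives = [d.strip() for d in cache_control.split(",") if d.strip()]

    if "no-store" in directives or "no-cache" in directives:
        return timedelta(0)
    # Pragma is only consulted when Cache-Control is absent.
    if not directives and headers.get("Pragma", "").lower() == "no-cache":
        return timedelta(0)

    # max-age takes precedence over Expires.
    for directive in directives:
        if directive.startswith("max-age="):
            try:
                return timedelta(seconds=max(0, int(directive.split("=", 1)[1])))
            except ValueError:
                return timedelta(0)

    expires = headers.get("Expires")
    if expires:
        try:
            expires_at = parsedate_to_datetime(expires)
        except (TypeError, ValueError):
            return timedelta(0)  # an unparseable date is treated as already expired
        return max(expires_at - response_time, timedelta(0))

    return timedelta(0)


# Hypothetical usage with a timezone-aware receive time.
received = datetime(2015, 10, 21, 7, 0, tzinfo=timezone.utc)
lifetime = freshness_lifetime({"Cache-Control": "public, max-age=600"}, received)
```

Returning zero whenever nothing explicit is present is a conservative choice made for this sketch; real proxies may apply heuristics instead.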

Typical acceleration flow: DNS round‑robin directs clients to a proxy; if the proxy has the resource, it returns it directly; otherwise it fetches from a neighbor proxy or the origin server, caches the result, and serves the client.

4. Distributed Cache

Distributed caches (e.g., Memcached, Redis) store hot data from databases to alleviate DB pressure.

Memcached

High‑performance, in‑memory key‑value store using a hash table. Features include:

Memory-only storage. On 32-bit systems each process is limited to roughly 2 GB (64-bit builds can use more); scale by adding processes or servers.

O(1) lookup by key through the hash table.

Simple text protocol (telnet usable).

Libevent‑based networking for high concurrency.

LRU eviction; no persistence (data lost on restart).

Distributed behavior achieved by client‑side hashing.
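Since the servers do not coordinate with each other, the client decides which server owns a key. One common client-side scheme is a consistent hash ring, sketched below. This is an illustrative implementation rather than the algorithm of any specific Memcached client; the class name, the server names, and the choice of SHA-256 with 100 virtual nodes per server are all assumptions.

```python
import bisect
import hashlib


class HashRing:
    """Maps keys to servers using a ring of virtual nodes."""

    def __init__(self, servers, replicas: int = 100):
        ring = []
        for server in servers:
            # Several virtual nodes per server smooth out the key distribution.
            for i in range(replicas):
                ring.append((self._hash(f"{server}#{i}"), server))
        ring.sort()
        self._points = [point for point, _ in ring]
        self._servers = [server for _, server in ring]

    @staticmethod
    def _hash(value: str) -> int:
        return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")

    def server_for(self, key: str) -> str:
        if not self._points:
            raise ValueError("ring has no servers")
        # First virtual node clockwise from the key's position, wrapping around.
        index = bisect.bisect_right(self._points, self._hash(key)) % len(self._points)
        return self._servers[index]


# Hypothetical server names.
ring = HashRing(["cache-a", "cache-b", "cache-c"])
owner = ring.server_for("user:42")
```

With a ring, adding a server moves only the keys that now fall to the new server, whereas a plain `hash(key) % server_count` scheme remaps most keys whenever the server count changes.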

Workflow: check cache → return if hit; otherwise query DB, return result, and write back to cache. Consistent hashing or modulo‑based algorithms distribute keys across servers.
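The read workflow above can be sketched in a few lines. A plain dictionary stands in for the Memcached client here, and `load_from_db` is a hypothetical callback for the database query; both are assumptions made to keep the example self-contained.

```python
def get_with_cache(key, cache: dict, load_from_db):
    """Check the cache first; on a miss, query the DB and write the result back."""
    if key in cache:
        return cache[key]          # cache hit
    value = load_from_db(key)      # cache miss: go to the database
    if value is not None:
        cache[key] = value         # write back so the next read is a hit
    return value


# Hypothetical usage with an in-memory "database".
database = {"user:1": "Ada"}
cache = {}
first = get_with_cache("user:1", cache, database.get)   # served from the DB
second = get_with_cache("user:1", cache, database.get)  # served from the cache
```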

Redis

Open‑source, in‑memory data store supporting multiple data structures (strings, hashes, lists, sets, sorted sets) and features such as replication, Lua scripting, LRU eviction, transactions, persistence, Sentinel, and Cluster for high availability.

Common commands: SET, GET, INCR, MGET, HSET, HGETALL, LPUSH, LRANGE, SADD, ZADD etc.

Typical use cases: string cache (similar to Memcached), hash for user objects, list for timelines, set for deduplication, sorted set for time‑ordered feeds.

Redis hash storing user profile
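To make two of these use cases concrete, the helpers below store a user profile in a hash and a time-ordered feed in a sorted set. This is a sketch under assumptions: `client` is expected to be a redis-py `redis.Redis` instance created with `decode_responses=True`, and the `user:<id>` and `feed:<id>` key names are invented for illustration.

```python
import time


def save_user_profile(client, user_id, profile: dict) -> None:
    """Store each profile field as a field of one Redis hash (HSET)."""
    client.hset(f"user:{user_id}", mapping=profile)


def load_user_profile(client, user_id) -> dict:
    """Read all fields of the profile hash (HGETALL)."""
    return client.hgetall(f"user:{user_id}")


def add_to_feed(client, user_id, post_id, timestamp=None) -> None:
    """Add a post to a sorted set scored by time (ZADD)."""
    score = time.time() if timestamp is None else timestamp
    client.zadd(f"feed:{user_id}", {post_id: score})


def latest_posts(client, user_id, count: int = 10) -> list:
    """Return the newest post ids first (ZREVRANGE)."""
    return client.zrevrange(f"feed:{user_id}", 0, count - 1)
```

Keeping the profile in a hash lets the application update a single field without rewriting the whole object, which is the main reason to prefer it over a serialized string.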

High-availability options include keepalived-based master-slave failover. Twitter's Twemproxy, a proxy that supports both Memcached and Redis, adds connection pooling and consistent-hash sharding in front of the cache servers.

5. Local Cache

Application‑level cache residing on the same host, using either disk or memory. Disk cache reduces network I/O for less latency‑critical data; memory cache offers the fastest access for frequently used objects.

6. Cache Architecture Example

Typical multi‑layer cache hierarchy:

CDN – caches static assets (HTML, CSS, JS, images).

Reverse Proxy – separates static and dynamic content, caching static resources.

Distributed Cache – stores hot database data.

Local Cache – caches application‑specific dictionaries and objects.

Request flow:

Browser → CDN (if hit, return).

Otherwise → Reverse Proxy (if hit, return).

If miss, forward to application server.

Application checks Local Cache; if hit, return.

If miss, check Distributed Cache; if hit, store in Local Cache and return.

If still miss, read from DB, populate Distributed and Local caches.
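The application-side part of this flow (local cache, then distributed cache, then DB) can be sketched as follows. Dictionaries stand in for both cache tiers and `load_from_db` is a hypothetical callback; a real system would also apply TTLs and size limits, which are omitted here.

```python
def read_through(key, local_cache: dict, distributed_cache: dict, load_from_db):
    """Look up a key in the local cache, then the distributed cache, then the DB."""
    if key in local_cache:
        return local_cache[key]

    if key in distributed_cache:
        value = distributed_cache[key]
        local_cache[key] = value          # promote to the faster tier
        return value

    value = load_from_db(key)
    if value is not None:
        distributed_cache[key] = value    # populate both tiers on the way back
        local_cache[key] = value
    return value


# Hypothetical usage.
local, shared = {}, {"sku:7": "widget"}
item = read_through("sku:7", local, shared, lambda key: None)
```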

7. Common Cache Issues

Data Consistency

Inconsistency can arise from the order of writes (cache first and then DB, or DB first and then cache) or from asynchronous refreshes. Solutions include writing to the persistent store first and rolling that write back if the cache update fails, read-through fallback to the DB, and setting appropriate expiration policies.
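The "persistent store first, roll back on cache failure" approach might look like the sketch below. It is deliberately simplified: a dictionary stands in for the database, restoring the previous value stands in for a transaction rollback, and `cache_set` is a hypothetical callback that raises when the cache write fails.

```python
_MISSING = object()  # marks "key did not exist before the update"


def update_value(key, new_value, db: dict, cache_set) -> None:
    """Write the DB first, then the cache; undo the DB write if the cache fails."""
    previous = db.get(key, _MISSING)
    db[key] = new_value
    try:
        cache_set(key, new_value)
    except Exception:
        # Roll back so the DB and the cache do not diverge.
        if previous is _MISSING:
            del db[key]
        else:
            db[key] = previous
        raise


# Hypothetical usage with a dictionary as the cache.
db, cache = {"price:1": 10}, {}
update_value("price:1", 12, db, cache.__setitem__)
```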

High Availability

High availability is achieved through distributed deployment and replication: consistent hashing spreads keys and load across nodes, while asynchronous replication provides redundant copies.

Cache Avalanche

When a large number of cache entries expire at the same time, the resulting misses flood the DB. Mitigation: staggered TTLs, load-shedding, rate limiting, and multi-level caches.
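Staggering TTLs is often done by adding a small random offset to a base expiration time, so entries written together do not expire together. The helper below is a minimal sketch; the function name and the default 10% spread are assumptions, not recommended values.

```python
import random


def jittered_ttl(base_seconds: int, spread: float = 0.1, rng=random) -> int:
    """Return the base TTL shifted by a random offset of up to +/- spread."""
    offset = rng.uniform(-spread, spread) * base_seconds
    return max(1, int(base_seconds + offset))


# Hypothetical usage: a 5-minute base TTL spread over roughly 270 to 330 seconds.
ttl = jittered_ttl(300)
```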

Cache Penetration

Repeated queries for non‑existent keys overload the DB. Mitigation: cache empty results temporarily and use Bloom filters to pre‑filter impossible keys.
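Both mitigations can be combined in the read path, as in the sketch below. The Bloom filter here is a small illustrative implementation rather than a production library, and the sizes, the `lookup` helper, and the dictionary cache are assumptions. A real cache would store empty results with a short TTL, which a plain dictionary cannot express.

```python
import hashlib


class BloomFilter:
    """Probabilistic set: no false negatives, occasional false positives."""

    def __init__(self, size_bits: int = 8192, num_hashes: int = 4):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, key: str):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))


def lookup(key, bloom: BloomFilter, cache: dict, load_from_db):
    """Reject impossible keys early and cache empty results for the rest."""
    if not bloom.might_contain(key):
        return None                # key was never added, so skip cache and DB
    if key in cache:
        return cache[key]          # may be a cached empty result (None)
    value = load_from_db(key)
    cache[key] = value             # cache None as well to absorb repeated misses
    return value
```

The filter has to be populated with every valid key when data is written, otherwise legitimate keys would be rejected.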

Cache Snowball (Cascade) Failure

A failure in one layer can push its load onto the next and cascade through the system. Combining the techniques above (proper TTL planning, capacity sizing, and layered caching) helps prevent such cascading failures.


