Designing Effective Multi‑Level Cache Architecture for Microservices

This article explains how to build a multi‑level caching system for microservice applications, covering client‑side HTTP caching, CDN and Nginx static‑resource caching, in‑process and distributed Redis caches, consistency challenges, and practical guidelines for when such a design is beneficial.

Architect

In microservice environments, caching is one of the most direct ways to improve performance. The article opens with a typical read‑heavy scenario: the primary data lives in MySQL, while over 90% of reads are served from an in‑memory store such as Redis, which avoids the disk I/O latency of a database query.
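
To see why the hit ratio matters, the expected read latency can be written as a weighted average of the cache and database latencies. The sketch below uses the 90% figure from the scenario above; the 1 ms cache latency and 20 ms database latency are illustrative assumptions, not measurements.

```python
def expected_read_latency_ms(hit_ratio: float, cache_ms: float, db_ms: float) -> float:
    """Average read latency when a fraction hit_ratio of reads is served by the cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * db_ms


# Assumed latencies, for illustration only.
CACHE_MS = 1.0
DB_MS = 20.0

with_cache = expected_read_latency_ms(0.90, CACHE_MS, DB_MS)
without_cache = expected_read_latency_ms(0.0, CACHE_MS, DB_MS)
print(f"with cache: {with_cache:.1f} ms, without cache: {without_cache:.1f} ms")
```

Under these assumed numbers the average read drops from 20 ms to roughly 2.9 ms, and most of what remains comes from the 10% of reads that still reach the database.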

Client‑Side Cache

Browsers cache static assets (images, CSS, JS, fonts) according to HTTP response headers such as Expires. For example, Baidu sets the Expires value of its logo image to 2031‑02‑08 09:26:31; until then the browser can serve the file from its local cache without contacting the server, which cuts bandwidth considerably for high‑traffic web apps.
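
As a rough illustration of the Expires mechanism, the sketch below formats an absolute expiry timestamp as an HTTP date and checks whether a cached copy is still fresh. The function names, the 90‑day lifetime and the fixed timestamps are assumptions made for the example; a real browser applies more rules than this.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def expires_header(now: datetime, ttl: timedelta) -> str:
    """Format an absolute expiry time as an HTTP date (always in GMT)."""
    return format_datetime(now + ttl, usegmt=True)


def is_fresh(expires_value: str, now: datetime) -> bool:
    """A cached copy is fresh while the current time is before the Expires timestamp."""
    return now < parsedate_to_datetime(expires_value)


# Hypothetical response time; a server would use the current UTC time.
issued_at = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
header = expires_header(issued_at, timedelta(days=90))
print("Expires:", header)

fresh_after_30_days = is_fresh(header, issued_at + timedelta(days=30))
fresh_after_91_days = is_fresh(header, issued_at + timedelta(days=91))
print(fresh_after_30_days, fresh_after_91_days)
```

Because the header carries an absolute time, the result depends on the client's clock being roughly correct.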

Application‑Layer Cache

Beyond the browser, static resources are cached at the application layer by CDNs and reverse proxies such as Nginx. A CDN replicates content to edge nodes and uses intelligent DNS to route each user to a nearby one, so a Shanghai user requesting banner.jpg is directed to a nearby CDN server, which either serves its cached copy or fetches the file from the origin and caches it for future requests.
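
The serve‑or‑fetch behaviour of an edge node can be modelled in a few lines. This is a toy sketch: the EdgeNode class, the origin function and the node name are hypothetical, and DNS routing, expiry and eviction are left out.

```python
from typing import Callable, Dict


class EdgeNode:
    """Toy model of one CDN edge node: serve from cache, else pull from the origin."""

    def __init__(self, name: str, fetch_from_origin: Callable[[str], bytes]) -> None:
        self.name = name
        self._fetch_from_origin = fetch_from_origin
        self._cache: Dict[str, bytes] = {}
        self.origin_fetches = 0

    def get(self, path: str) -> bytes:
        if path in self._cache:
            return self._cache[path]      # cache hit: no origin traffic
        self.origin_fetches += 1          # cache miss: go back to the origin
        body = self._fetch_from_origin(path)
        self._cache[path] = body          # keep a copy for later requests
        return body


def origin(path: str) -> bytes:
    # Hypothetical origin server; real content would arrive over HTTP.
    return f"content of {path}".encode()


shanghai_edge = EdgeNode("shanghai", origin)
first = shanghai_edge.get("/banner.jpg")    # fetched from the origin
second = shanghai_edge.get("/banner.jpg")   # served from the edge cache
print(shanghai_edge.origin_fetches)
```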

CDN providers (e.g., Alibaba Cloud, Tencent Cloud) also allow custom response headers. The article contrasts Expires (an absolute timestamp) with Cache‑Control (a relative max‑age in seconds) and shows how to set a one‑hour cache period (max‑age=3600) in Alibaba Cloud’s console.
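
The difference between the two headers shows up when a cache works out how long a response may be reused. The sketch below is a simplified model: it looks only at max‑age and Expires, and lets max‑age win when both are present, as HTTP caches are required to do. The function name and the sample header values are assumptions for the example.

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Mapping, Optional


def freshness_lifetime(headers: Mapping[str, str], response_time: datetime) -> Optional[timedelta]:
    """Return how long a response may be reused, or None if neither header is set.

    Header names are matched exactly as spelled here; a real cache is case-insensitive.
    """
    match = re.search(r"max-age=(\d+)", headers.get("Cache-Control", ""))
    if match:
        # Relative lifetime: unaffected by clock differences between hosts.
        return timedelta(seconds=int(match.group(1)))
    if "Expires" in headers:
        # Absolute timestamp: the lifetime depends on when the response arrived.
        return parsedate_to_datetime(headers["Expires"]) - response_time
    return None


now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)  # hypothetical response time
one_hour = {"Cache-Control": "max-age=3600"}
both = {
    "Cache-Control": "max-age=3600",
    "Expires": format_datetime(now + timedelta(days=1), usegmt=True),
}
expires_only = {"Expires": format_datetime(now + timedelta(hours=2), usegmt=True)}

for sample in (one_hour, both, expires_only):
    print(freshness_lifetime(sample, now))
```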

The Nginx configuration example demonstrates enabling static‑resource caching and setting different cache lifetimes based on response codes:

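# Set cache directory (proxy_cache_path belongs in the http context)
# The keys_zone name must match the proxy_cache directive used below.
proxy_cache_path d:/nginx-cache levels=1:2 keys_zone=xmall-cache:100m inactive=7d max_size=20g;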

# Backend server pool
upstream xmall {
    server 192.168.31.181 weight=5 max_fails=1 fail_timeout=3s;
    server 192.168.31.182 weight=2;
    server 192.168.31.183 weight=1;
    server 192.168.31.184 weight=2;
}

server {
    listen 80;
    location ~* \.(gif|jpg|css|png|js|woff|html)(.*){
        proxy_pass http://xmall;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_cache xmall-cache;
        proxy_cache_valid 200 302 24h;
        proxy_cache_valid 301 5d;
        proxy_cache_valid any 5m;
        expires 90d;
    }
    location / {
        proxy_pass http://xmall;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}

After adding this configuration, Nginx stores newly requested static files locally; subsequent requests within the cache period are served directly from the Nginx cache, bypassing the backend.
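
The three proxy_cache_valid lines in the configuration above amount to a lookup from response status to cache lifetime. The sketch below models only that lookup, as an illustration rather than a description of how Nginx implements it; the names are hypothetical.

```python
from datetime import timedelta

# Mirrors the proxy_cache_valid lines in the configuration above.
CACHE_VALID = {
    200: timedelta(hours=24),
    302: timedelta(hours=24),
    301: timedelta(days=5),
}
ANY_OTHER_STATUS = timedelta(minutes=5)   # the "any" rule


def cache_ttl_for(status: int) -> timedelta:
    """Pick the cache lifetime for an upstream response status."""
    return CACHE_VALID.get(status, ANY_OTHER_STATUS)


for status in (200, 301, 302, 404):
    print(status, cache_ttl_for(status))
```

Note that under the "any" rule even error responses such as 404 are cached for five minutes, which protects the backend from repeated requests for missing files but delays the appearance of newly published ones.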

Service‑Layer Cache

Service‑side caching is split into in‑process caches (e.g., EhCache or Caffeine in Java) and distributed caches (e.g., Redis). In‑process caches keep frequently accessed data in the JVM heap of each instance, while Redis provides a shared in‑memory store that all service instances reach over the network.

The article warns against naïvely adding only a Redis layer. In a flash‑sale scenario, relying solely on Redis can overload the cluster when traffic spikes, leading to failures. A combined approach checks EhCache first; on a miss it falls back to Redis, and only then to the database. A value found in Redis is copied into EhCache, and a value loaded from the database is written to both Redis and EhCache, so the next request hits the fast in‑process cache.
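
The read path described above can be sketched as follows. Plain dictionaries stand in for EhCache and Redis, and the TwoLevelCache class, the key format and the price value are hypothetical; expiry, eviction and protection against concurrent misses are deliberately left out.

```python
from typing import Callable, Dict, Optional


class TwoLevelCache:
    """Read path: local (in-process) cache -> shared cache -> database loader."""

    def __init__(self, shared: Dict[str, str], load_from_db: Callable[[str], Optional[str]]) -> None:
        self.local: Dict[str, str] = {}   # stand-in for EhCache or Caffeine
        self.shared = shared              # stand-in for Redis, shared by all instances
        self._load_from_db = load_from_db
        self.db_reads = 0

    def get(self, key: str) -> Optional[str]:
        if key in self.local:
            return self.local[key]
        if key in self.shared:
            value = self.shared[key]
            self.local[key] = value       # promote to the faster tier
            return value
        self.db_reads += 1
        value = self._load_from_db(key)
        if value is not None:
            self.shared[key] = value      # fill both tiers on a database read
            self.local[key] = value
        return value


database = {"sku:1001": "19.90"}          # hypothetical product price table
redis_stand_in: Dict[str, str] = {}
instance_a = TwoLevelCache(redis_stand_in, database.get)
instance_b = TwoLevelCache(redis_stand_in, database.get)

instance_a.get("sku:1001")   # miss everywhere: loads from the database
instance_a.get("sku:1001")   # served from instance A's local cache
instance_b.get("sku:1001")   # local miss on B, hit in the shared cache
print(instance_a.db_reads, instance_b.db_reads)
```

In this sketch only the first request reaches the database; the second instance is served by the shared tier and then keeps its own local copy.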

Cache Consistency

Introducing multiple cache layers creates consistency challenges. When a product price changes, the article proposes broadcasting an invalidation message through RocketMQ. Every service instance receives the message, deletes its stale entry, and repopulates its cache with the updated data, which keeps the caches coherent.
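
The invalidation flow can be sketched with an in‑process topic standing in for RocketMQ. Nothing here uses the RocketMQ client API: InvalidationTopic and ServiceInstance are hypothetical classes, delivery is synchronous, and only the local caches are modelled.

```python
from typing import Callable, Dict, List


class InvalidationTopic:
    """In-process stand-in for a broadcast message topic."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[str], None]] = []

    def subscribe(self, handler: Callable[[str], None]) -> None:
        self._subscribers.append(handler)

    def publish(self, key: str) -> None:
        for handler in self._subscribers:   # every instance receives every message
            handler(key)


class ServiceInstance:
    def __init__(self, database: Dict[str, str], topic: InvalidationTopic) -> None:
        self._database = database
        self.local_cache: Dict[str, str] = {}
        topic.subscribe(self.on_invalidate)

    def get(self, key: str) -> str:
        if key not in self.local_cache:
            self.local_cache[key] = self._database[key]
        return self.local_cache[key]

    def on_invalidate(self, key: str) -> None:
        self.local_cache.pop(key, None)     # drop the stale entry
        if key in self._database:
            self.get(key)                   # reload the current value


database = {"sku:1001": "19.90"}            # hypothetical product price
topic = InvalidationTopic()
instances = [ServiceInstance(database, topic) for _ in range(3)]
for instance in instances:
    instance.get("sku:1001")                # warm every local cache

database["sku:1001"] = "17.50"              # the price change is written first
topic.publish("sku:1001")                   # then the invalidation is broadcast
prices = [instance.get("sku:1001") for instance in instances]
print(prices)
```

With a real message broker delivery is asynchronous, so instances can briefly serve the old price between the database write and the arrival of the message.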

When to Adopt Multi‑Level Caching

The author outlines three scenarios suitable for multi‑level caching:

Stable data (e.g., postal codes) where read traffic dominates and data rarely changes.

Extreme burst traffic (e.g., Spring Festival ticketing, Double‑11 flash sales) where a single cache tier could become a bottleneck.

Use‑cases tolerating eventual consistency (e.g., personal profile updates) where temporary divergence is acceptable.

If an application’s concurrency is modest and a single Redis cluster can meet performance needs for the next one to two years, the extra complexity of multi‑level caching may be unnecessary.

Conclusion

The article concludes that a well‑designed cache hierarchy, from browser Expires headers through CDN and Nginx static caching to in‑process and distributed Redis caches, can dramatically improve microservice performance. Designers must still weigh the added operational complexity and consistency overhead against the characteristics of their workload.

