Designing Multi‑Level Cache Architecture for Microservice Systems

This article explains how to design an effective multi‑level cache architecture for microservice systems, covering client‑side static resource caching, application‑layer CDN and Nginx caching, service‑layer in‑process and distributed caches, consistency challenges, and practical scenarios for using each layer.

Top Architect

This tutorial introduces multi‑level caching in a microservice environment, focusing on three layers: client cache, application‑layer cache, and service‑layer cache.

Multi‑Level Cache Design in Microservices

Caching is often the most direct way to improve read performance. The article uses a typical e‑commerce scenario in which read‑heavy traffic is served from an in‑memory store (Redis) while writes go to MySQL, which illustrates the need for read‑write separation and a cache‑first read path.

Client Cache

At the browser level, static assets such as images, CSS, JS, and fonts are cached according to HTTP response headers such as Expires (and, in HTTP/1.1, Cache-Control: max-age, which takes precedence when both are present). The example shows how Baidu sets a far‑future expiration date for its logo, so the browser serves the image from its local disk cache without contacting the server until that date passes.
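To make the header mechanics concrete, here is a minimal sketch of how a server might compute far‑future caching headers for a static asset. The function name `static_cache_headers` and the 90‑day lifetime are assumptions for illustration; they do not come from the article or from Baidu's actual setup.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime
from typing import Dict, Optional


def static_cache_headers(max_age_days: int, now: Optional[datetime] = None) -> Dict[str, str]:
    """Build far-future caching headers for a static asset (illustrative helper)."""
    now = now or datetime.now(timezone.utc)
    max_age_seconds = max_age_days * 24 * 60 * 60
    expires_at = now + timedelta(seconds=max_age_seconds)
    return {
        # Absolute expiry date in the HTTP date format, as described in the text.
        "Expires": format_datetime(expires_at, usegmt=True),
        # Relative lifetime; HTTP/1.1 clients prefer this over Expires.
        "Cache-Control": f"public, max-age={max_age_seconds}",
    }


if __name__ == "__main__":
    for name, value in static_cache_headers(90).items():
        print(f"{name}: {value}")
```

Sending both headers keeps older HTTP/1.0 caches working while letting modern browsers rely on the relative lifetime, which does not depend on the client and server clocks agreeing.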

Application‑Layer Cache

Static resources are also cached at the CDN and Nginx levels. A CDN distributes content to edge nodes close to users, reducing latency. The article explains the principle of “smart DNS”, which resolves the site's domain name to the CDN node nearest the requesting user.

For smaller‑scale applications, Nginx can provide static‑resource caching without a full CDN. The following configuration enables Nginx to cache files matching common extensions, set cache zones, define expiration times, and forward uncached requests to the upstream application servers.

```nginx
# Cache storage settings.
# levels=1:2 stores cache files (static resources such as CSS and JS) in a two-level directory hierarchy.
# keys_zone names the shared memory zone (babytun-cache) and reserves 100 MB for cache keys and metadata.
# inactive=7d removes a cached file that has not been accessed for 7 days.
# max_size=20g caps the cache directory at 20 GB; once exceeded, the least recently used files are removed.
proxy_cache_path d:/nginx-cache levels=1:2 keys_zone=babytun-cache:100m inactive=7d max_size=20g;
upstream xmall {
    server 192.168.31.181 weight=5 max_fails=1 fail_timeout=3s;
    server 192.168.31.182 weight=2;
    server 192.168.31.183 weight=1;
    server 192.168.31.184 weight=2;
}
server {
    listen 80;
    # Static resources: serve from the Nginx cache, fetching from upstream on a miss.
    location ~* \.(gif|jpg|css|png|js|woff|html)(.*) {
        proxy_pass http://xmall;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        # Must match the keys_zone name declared in proxy_cache_path.
        proxy_cache babytun-cache;
        proxy_cache_valid 200 302 24h;
        proxy_cache_valid 301 5d;
        proxy_cache_valid any 5m;
        # Browser-side lifetime sent to the client.
        expires 90d;
    }
    # Everything else is proxied to the application servers without caching.
    location / {
        proxy_pass http://xmall;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```
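To see what `levels=1:2` means on disk, the sketch below reproduces how Nginx derives a cache file path: the file name is the MD5 hex digest of the cache key, the first directory is the digest's last character, and the second is the two characters before it. The helper name `cache_file_path` is hypothetical, and the sample key assumes the default `proxy_cache_key` of `$scheme$proxy_host$request_uri`, since the configuration does not set one.

```python
import hashlib
from pathlib import PurePosixPath
from typing import Sequence


def cache_file_path(cache_root: str, cache_key: str, levels: Sequence[int] = (1, 2)) -> str:
    """Return the on-disk path Nginx would use for a cache key (illustrative)."""
    digest = hashlib.md5(cache_key.encode("utf-8")).hexdigest()
    parts = []
    end = len(digest)
    # Each level takes its characters from the tail of the digest, right to left.
    for width in levels:
        parts.append(digest[end - width:end])
        end -= width
    return str(PurePosixPath(cache_root, *parts, digest))


if __name__ == "__main__":
    # Assumed key for a request to /static/logo.png proxied to the "xmall" upstream.
    print(cache_file_path("d:/nginx-cache", "httpxmall/static/logo.png"))
```

Spreading files across subdirectories this way keeps any single directory from holding a very large number of entries.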

Service‑Layer Cache

Backend services need caching for API responses and data objects. Two main approaches are in‑process caches (e.g., EhCache, Caffeine) and distributed caches (e.g., Redis). In‑process caches store data in the JVM heap for ultra‑low latency, while Redis provides a shared, scalable cache across instances.
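The lookup order across the two service‑layer caches can be sketched as follows. This is a simplified illustration, not the article's implementation: a plain dict stands in for Redis, the class name `TwoLevelCache` and the 5‑second local TTL are assumptions, and a real JVM service would more likely use Caffeine or EhCache for the local tier.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TwoLevelCache:
    """Read-through cache: in-process tier first, shared tier second, loader last."""

    def __init__(self, remote: Dict[str, Any], loader: Callable[[str], Any],
                 local_ttl_seconds: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._local: Dict[str, Tuple[float, Any]] = {}  # key -> (expiry time, value)
        self._remote = remote    # stand-in for a shared store such as Redis
        self._loader = loader    # stand-in for the database query
        self._ttl = local_ttl_seconds
        self._clock = clock

    def get(self, key: str) -> Any:
        entry = self._local.get(key)
        if entry is not None and entry[0] > self._clock():
            return entry[1]                    # level 1 hit: no network call
        if key in self._remote:
            value = self._remote[key]          # level 2 hit: shared across instances
        else:
            value = self._loader(key)          # miss on both tiers: load from source
            self._remote[key] = value
        self._local[key] = (self._clock() + self._ttl, value)
        return value

    def invalidate_local(self, key: str) -> None:
        """Drop the in-process copy so the next read goes to the shared tier."""
        self._local.pop(key, None)
```

A short local TTL bounds how long an instance can serve a stale value even if it never receives an invalidation, at the cost of more reads reaching the shared tier.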

When multiple cache layers are used, consistency becomes a challenge. The article suggests using a message queue (RocketMQ) to broadcast data‑change events: after a product price is updated, the service pushes a message to RocketMQ, which notifies other service instances and the Redis cluster to invalidate stale entries and write the new data.
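The event flow described above can be sketched with an in‑memory publish/subscribe object standing in for RocketMQ. Everything named here (`InMemoryTopic`, `ServiceInstance`, `update_price`, the message fields) is hypothetical; a real deployment would use the RocketMQ client, presumably with broadcast consumption so that every service instance receives each event.

```python
from typing import Callable, Dict, List

Message = Dict[str, object]


class InMemoryTopic:
    """Hypothetical stand-in for a message-queue topic delivered to all subscribers."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[Message], None]] = []

    def subscribe(self, handler: Callable[[Message], None]) -> None:
        self._subscribers.append(handler)

    def publish(self, message: Message) -> None:
        for handler in list(self._subscribers):
            handler(message)


class ServiceInstance:
    """One service process holding its own in-process cache."""

    def __init__(self, name: str, topic: InMemoryTopic) -> None:
        self.name = name
        self.local_cache: Dict[str, object] = {}
        topic.subscribe(self._on_change)

    def _on_change(self, message: Message) -> None:
        # Evict the stale local copy; the next read reloads from the shared cache.
        self.local_cache.pop(str(message["key"]), None)


def make_shared_cache_updater(shared_cache: Dict[str, object]) -> Callable[[Message], None]:
    """Subscriber that writes the new value into the shared (Redis-like) cache."""
    def handle(message: Message) -> None:
        shared_cache[str(message["key"])] = message["price"]
    return handle


def update_price(database: Dict[str, object], topic: InMemoryTopic,
                 key: str, new_price: int) -> None:
    database[key] = new_price                          # 1. persist the change
    topic.publish({"key": key, "price": new_price})    # 2. announce it to all caches
```

Writing to the database before publishing means a subscriber that reloads on the event never sees the old value; the in‑memory topic delivers synchronously, whereas a real queue introduces a short window of staleness.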

Three typical scenarios where multi‑level caching is beneficial are:

Stable data (e.g., postal codes, historical records) that rarely changes.

Extreme burst traffic (e.g., flash sales, ticket booking) where a warm in‑process cache can absorb spikes before hitting Redis.

Data that can tolerate temporary inconsistency (e.g., user profile updates), where eventual consistency such as a next‑day (T+1) refresh is acceptable.

If the application’s read‑write ratio is modest, a single Redis layer may suffice; otherwise, designers should weigh the added complexity against performance gains.

Conclusion

The article summarizes the end‑to‑end cache design for microservice applications, from browser Expires headers through CDN and Nginx static‑resource caching to in‑process and distributed caches at the service layer, highlighting trade‑offs and best‑practice scenarios.
