Backend Development · 13 min read

Mastering Cache Strategies: From Local to Multi‑Level for High‑Performance Systems

This article shares a decade of practical experience with caching, covering local page and object caches, distributed solutions like Redis and Memcached, pagination caching techniques, multi‑level cache architectures, refresh mechanisms, and lessons learned from real‑world performance optimizations.

macrozheng

A senior architect once said that "Nginx + business-logic layer + database + cache layer + message queue" fits most scenarios. Over ten years of practice, the author moved from local caches through distributed caches to multi-level caches, and distilled the key insights below.

01 Local Cache

Page‑level cache

In 2010 the author used OSCache in JSP pages. Example:

<code><cache:cache key="foobar" scope="session">
    some jsp content
</cache:cache></code>

The snippet caches the enclosed content under the key "foobar" with session scope, so other pages can share it. Page-level caching was common in monolithic JSP apps; with front-end/back-end separation it is used far less on the server, though the same idea remains popular on the front end.

Object cache

Inspired by a 2011 article about Ehcache handling millions of requests on a single server, the author applied Ehcache to cache order status, avoiding repeated calls to the payment service. Example:

<code>cache.put(orderId, orderStatus);
// subsequent queries read from cache</code>

Object cache offers finer granularity than page cache, ideal for rarely changing data such as global configs or completed orders.
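As a minimal sketch of this pattern, the following stands in `ConcurrentHashMap` for Ehcache; the class name and the `paymentServiceLookup` callback are illustrative, not the author's code:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Minimal object cache for order status. ConcurrentHashMap is a hypothetical
// stand-in for Ehcache; the lookup callback stands in for the payment service.
public class OrderStatusCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    // Returns the cached status, or loads it from the payment service on a miss.
    public String getStatus(String orderId, Function<String, String> paymentServiceLookup) {
        return cache.computeIfAbsent(orderId, paymentServiceLookup);
    }
}
```

Because a completed order never changes status, such entries can be cached indefinitely and the payment service is hit at most once per order.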

Refresh strategies

In 2018 a custom configuration center used Guava for local caching. Update mechanisms include:

Client‑side scheduled tasks pulling data from the config center.

Push notifications via RocketMQ when data changes.

Zookeeper watch: admin writes full data to Zookeeper; web services watch nodes and update local cache on change.

WebSocket: admin pushes full data on connection, then incremental updates.

HTTP long‑polling using Servlet 3.0 async to notify of changes.
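The first strategy, a client-side scheduled pull, can be sketched as follows. `PollingConfigCache` and `ConfigFetcher` are hypothetical names, with a plain `ScheduledExecutorService` standing in for whatever scheduler the real config client uses:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Scheduled-pull refresh: a background task periodically fetches the full
// configuration from the config center and swaps in the new snapshot.
public class PollingConfigCache {
    public interface ConfigFetcher { String fetchAll(); } // stand-in for the RPC/HTTP client

    private final AtomicReference<String> snapshot = new AtomicReference<>("");
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

    public void start(ConfigFetcher fetcher, long periodMillis) {
        snapshot.set(fetcher.fetchAll()); // initial load before serving reads
        scheduler.scheduleAtFixedRate(() -> snapshot.set(fetcher.fetchAll()),
                periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }

    public String current() { return snapshot.get(); }

    public void stop() { scheduler.shutdownNow(); }
}
```

The trade-off versus the push-based options is staleness: a pulled snapshot can lag by up to one polling interval, which is why change-driven pushes (MQ, Zookeeper watches, WebSocket, long-polling) were explored.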

02 Distributed Cache

Memcached and Redis are the most common choices. Two case studies illustrate challenges:

Controlling object size and read strategy

In a lottery live-score service (2013), large cached objects (300 to 500 KB) caused frequent young-GC pauses. Converting verbose JSON objects, like the one below, into compact positional arrays cut entry size from roughly 300 KB to about 80 KB, reducing GC frequency and speeding up responses.

<code>[{ "playId":"2399", "guestTeamName":"小牛", "hostTeamName":"湖人", "europe":"123" }]</code>
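The compaction itself can be sketched as follows: drop the field names, which repeat in every element, and keep only the values in a fixed order agreed between producer and consumer. The class and field-order contract here are illustrative assumptions:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Payload compaction: the verbose form repeats field names ("playId",
// "guestTeamName", ...) in every element; the compact form keeps only the
// values, in an order both sides agree on (a hypothetical contract).
public class ScorePayloadCompactor {
    static final List<String> FIELDS = List.of("playId", "guestTeamName", "hostTeamName", "europe");

    public static List<Object> compact(Map<String, Object> match) {
        List<Object> row = new ArrayList<>();
        for (String f : FIELDS) row.add(match.get(f)); // values only, fixed order
        return row;
    }
}
```

With hundreds of matches per cached entry, removing the repeated keys accounts for most of the ~300 KB to ~80 KB reduction described above.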

Further optimization split cache into full data and incremental data, delivering updates via WebSocket.

Pagination list cache

Two approaches:

Cache the entire page list using a key composed of page number and size.

Cache individual items. Workflow:

<code>-- 1. Query the IDs for the current page
select id from blogs order by id desc limit 0,10;
-- 2. Batch-get cached items by ID; collect the IDs that miss
-- 3. Fetch the missing rows from the database
select * from blogs where id in (noHitId1,noHitId2);
-- 4. Store the missing items in the cache
-- 5. Assemble the page in ID order and return it</code>

Caching whole pages works best when data changes infrequently (e.g., rankings); caching individual items fits typical pagination, since updating one item invalidates a single cache entry rather than every page that contains it.
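The per-item workflow above can be sketched in Java; a plain map stands in for Redis MGET/MSET, and `loadFromDb` is a hypothetical data-access callback:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Per-item pagination cache: batch-get cached items for the page of IDs,
// load only the misses from the database, backfill the cache, and assemble
// the result in page order. The map is a stand-in for Redis.
public class PagedItemCache {
    private final Map<Long, String> cache = new HashMap<>();

    public List<String> getPage(List<Long> pageIds, Function<List<Long>, Map<Long, String>> loadFromDb) {
        List<Long> misses = new ArrayList<>();
        for (Long id : pageIds) if (!cache.containsKey(id)) misses.add(id); // collect cache misses
        if (!misses.isEmpty()) cache.putAll(loadFromDb.apply(misses));      // backfill from DB
        List<String> result = new ArrayList<>();
        for (Long id : pageIds) result.add(cache.get(id));                  // assemble in page order
        return result;
    }
}
```

On a warm cache the database sees only the cheap ID query, never the item rows.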

03 Multi‑Level Cache

Why use multiple cache layers? Local cache is ultra‑fast but limited in capacity; distributed cache scales but can saturate bandwidth under high concurrency. The principle: “The closer the cache to the user, the more efficient.”

In a 2018 e‑commerce app, a two‑level cache was built: Guava lazy‑loading for local cache and Redis for distributed cache. The refresh flow:

On startup, if local cache is empty, read from Redis; if Redis miss, fetch via RPC and populate both caches.

Subsequent requests read directly from local cache.

Guava’s refresh thread periodically syncs data from the service to both caches.
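The read path of this flow can be sketched as follows; plain maps stand in for Guava's local cache and for Redis, and `rpcLoader` is a hypothetical callback to the backing service:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Two-level read path: check the local cache first, fall back to the
// distributed cache, and call the remote service only on a double miss,
// populating both layers on the way back.
public class TwoLevelCache {
    private final Map<String, String> local = new HashMap<>();       // stand-in for Guava
    private final Map<String, String> distributed = new HashMap<>(); // stand-in for Redis

    public String get(String key, Supplier<String> rpcLoader) {
        String v = local.get(key);
        if (v != null) return v;          // level-1 hit: fastest path
        v = distributed.get(key);
        if (v == null) {
            v = rpcLoader.get();          // double miss: call the service
            distributed.put(key, v);
        }
        local.put(key, v);                // warm the local layer
        return v;
    }
}
```

Each layer shields the one behind it: most reads never leave the process, and most of the remainder never reach the service.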

Two issues surfaced: servers held inconsistent data because each loaded lazily at different times, and an undersized refresh thread pool let reload tasks back up.

Solutions:

Combine lazy loading with a message‑driven push to update caches when configuration changes.

Increase the LoadingCache thread‑pool size and add monitoring/alerts for thread saturation.
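The first fix can be sketched as follows: keep lazy loading, but subscribe to change messages and evict the local entry so every server reloads fresh data on its next read. The class name is illustrative, and `onChangeMessage` stands in for a RocketMQ consumer callback:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Lazy loading combined with message-driven invalidation: reads populate the
// local cache on demand; a change message evicts the stale entry so the next
// read reloads from the source of truth.
public class InvalidatingLocalCache {
    private final Map<String, String> local = new HashMap<>();

    public String get(String key, Function<String, String> loader) {
        return local.computeIfAbsent(key, loader); // lazy load on miss
    }

    // Invoked by the MQ consumer when a "config changed" message arrives.
    public void onChangeMessage(String key) {
        local.remove(key); // next get() fetches the fresh value
    }
}
```

Eviction rather than in-place update keeps the message small (just a key) and lets each server reload at its own pace, converging all replicas shortly after a change.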

Overall, mastering cache—from local to multi‑level—significantly improves system throughput, reduces latency, and eases pressure on distributed caches.

Cache is a crucial technique; deep understanding from theory to practice is one of the most rewarding pursuits for engineers.

Tags: distributed systems, performance optimization, backend development, caching, multi-level cache
Written by macrozheng

Dedicated to Java tech sharing and dissecting top open-source projects. Topics include Spring Boot, Spring Cloud, Docker, Kubernetes and more. Author’s GitHub project “mall” has 50K+ stars.
