
In-Memory Caching with Guava LoadingCache: Design, Algorithms, and Best Practices

This article explains the principles and practical implementation of in‑memory caching with Guava's LoadingCache, covering initialization parameters, put and loading strategies, eviction policies, classic algorithms such as LRU, LFU, and FIFO, and tips for avoiding memory issues and monitoring cache performance.

Top Architect

In the previous post we discussed buffering; this article introduces its "twin brother" – caching – and explains why it is one of the most widely used optimization techniques in software.

Caching is used to bridge the speed gap between fast components (CPU, memory) and slower ones (disk, remote services). It can dramatically speed up page loads and relieve pressure on databases.

In typical applications caches fall into in‑process (heap) caches and out‑of‑process caches. The focus here is on in‑process heap caches, especially Guava's LoadingCache; other Java options include the JCache standard (JSR‑107) and Caffeine.

Guava LoadingCache

Guava provides a convenient LoadingCache for heap caching. It supports size limits, concurrency settings, and automatic loading via a CacheLoader.

<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>31.1-jre</version>
</dependency>

Typical configuration parameters:

maximumSize: maximum number of entries before eviction begins.

initialCapacity: initial hash‑table bucket count (default 16).

concurrencyLevel: number of segments for concurrent writes (default 4).

Cache operations can be performed manually with put or automatically via a CacheLoader that lazily loads missing entries.

public static void main(String[] args) throws Exception {
    LoadingCache<String, String> lc = CacheBuilder
            .newBuilder()
            .maximumSize(1000)
            .build(new CacheLoader<String, String>() {
                @Override
                public String load(String key) throws Exception {
                    return slowMethod(key);
                }
            });
    System.out.println(lc.get("key")); // first call loads; subsequent calls hit the cache
}

static String slowMethod(String key) throws Exception {
    Thread.sleep(1000); // simulate an expensive lookup (database, remote service, ...)
    return key + ".result";
}

Elements can be removed explicitly with invalidate(key), or you can register a removal listener on the builder (before calling build) to be notified of every removal:

.removalListener(notification -> System.out.println(notification.getKey() + " removed: " + notification.getCause()))

Eviction Strategies

When the cache reaches its capacity, one of three strategies is applied:

Size‑based (LRU): once maximumSize is reached, Guava evicts the entries least recently used (approximately, on a per‑segment basis).

Time‑based: expireAfterWrite(duration) discards entries a fixed time after they were written; expireAfterAccess(duration) measures from the last read or write.

JVM GC‑based: use weak or soft references (weakKeys(), weakValues(), softValues()) so that the GC can reclaim entries when memory runs low.

Interview tip: when both weakKeys() and weakValues() are set, an entry becomes eligible for collection as soon as there is no strong reference to either its key or its value.
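Time‑based expiry can be illustrated without Guava by storing a write timestamp per entry. The following TimedCache is a hypothetical sketch that mimics expireAfterWrite semantics, discarding stale entries lazily on read, much as Guava does:

```java
import java.util.HashMap;
import java.util.Map;

public class TimedCache<K, V> {
    private static class Entry<V> {
        final V value;
        final long writtenAt;
        Entry(V value, long writtenAt) { this.value = value; this.writtenAt = writtenAt; }
    }

    private final Map<K, Entry<V>> map = new HashMap<>();
    private final long ttlMillis;

    public TimedCache(long ttlMillis) { this.ttlMillis = ttlMillis; }

    public void put(K key, V value) {
        map.put(key, new Entry<>(value, System.currentTimeMillis()));
    }

    // Returns null when the entry is absent or older than the TTL (expireAfterWrite semantics).
    public V get(K key) {
        Entry<V> e = map.get(key);
        if (e == null) return null;
        if (System.currentTimeMillis() - e.writtenAt > ttlMillis) {
            map.remove(key); // discard the stale entry lazily, on access
            return null;
        }
        return e.value;
    }
}
```

A production cache would also need thread safety and background cleanup; this sketch only shows the expiry check itself.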

Common Cache Algorithms

Three classic eviction algorithms are frequently used:

FIFO – first‑in‑first‑out.

LRU – least‑recently‑used (most common).

LFU – least‑frequently‑used (removes entries with the lowest access count).
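LFU is easy to describe but rarely shown; here is a minimal, non‑thread‑safe sketch that keeps an access count per key and evicts the lowest count with a linear scan (a real implementation would use a frequency‑ordered structure instead):

```java
import java.util.HashMap;
import java.util.Map;

public class LFUCache<K, V> {
    private final int capacity;
    private final Map<K, V> values = new HashMap<>();
    private final Map<K, Integer> counts = new HashMap<>();

    public LFUCache(int capacity) { this.capacity = capacity; }

    public V get(K key) {
        if (!values.containsKey(key)) return null;
        counts.merge(key, 1, Integer::sum); // bump access frequency
        return values.get(key);
    }

    public void put(K key, V value) {
        if (!values.containsKey(key) && values.size() >= capacity) {
            // Evict the least-frequently-used key (linear scan; fine for a sketch).
            K victim = null;
            int min = Integer.MAX_VALUE;
            for (Map.Entry<K, Integer> e : counts.entrySet()) {
                if (e.getValue() < min) { min = e.getValue(); victim = e.getKey(); }
            }
            values.remove(victim);
            counts.remove(victim);
        }
        values.put(key, value);
        counts.merge(key, 1, Integer::sum);
    }
}
```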

Simple LRU with LinkedHashMap

Java's LinkedHashMap can implement LRU by enabling access order and overriding removeEldestEntry:

public class LRU<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;
    public LRU(int capacity) {
        super(16, 0.75f, true); // accessOrder = true: iteration order follows access recency
        this.capacity = capacity;
    }
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}

This implementation is simple and not thread‑safe, but illustrates the core idea of LRU caching.
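A quick demo makes the access‑order behavior visible. The LRU class is repeated here (with generics) so the snippet compiles standalone:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LRUDemo {
    static class LRU<K, V> extends LinkedHashMap<K, V> {
        private final int capacity;
        LRU(int capacity) {
            super(16, 0.75f, true); // accessOrder = true: get() moves entries to the tail
            this.capacity = capacity;
        }
        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > capacity;
        }
    }

    public static void main(String[] args) {
        LRU<String, Integer> lru = new LRU<>(2);
        lru.put("a", 1);
        lru.put("b", 2);
        lru.get("a");       // touch "a", so "b" becomes the eldest entry
        lru.put("c", 3);    // exceeds capacity: evicts "b", not "a"
        System.out.println(lru.keySet()); // [a, c]
    }
}
```

Without the get("a") call, "a" would have been the eldest entry and evicted instead.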

Performance Optimizations

Beyond application‑level caches, operating systems maintain a "cached" memory region (the page cache) that stores recently accessed file data. Techniques such as read‑ahead (readahead) pre‑load data into this region, reducing read latency for sequential access patterns.

An oversized cache increases GC pressure and can degrade performance; size the cache so that it stays manageable while still achieving a useful hit rate (often above 50%).

When to Use a Cache

Data exhibits hot spots and is read far more often than written.

Downstream services have limited capacity or high latency.

Introducing the cache does not compromise correctness or add unmanageable complexity.

Monitoring

Guava’s recordStats() and Caffeine’s built‑in metrics allow you to track hit/miss rates, load times, and eviction counts. Aim for a hit rate above 50%; below 10% indicates the cache may be unnecessary.
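Guava exposes these numbers through recordStats() on the builder and cache.stats() at runtime. The bookkeeping behind a hit‑rate metric can be sketched in plain Java; this hypothetical CountingCache wrapper is an illustration of what the stats track, not the Guava API:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class CountingCache<K, V> {
    private final Map<K, V> map = new HashMap<>();
    private final Function<K, V> loader;
    private long hits, misses;

    public CountingCache(Function<K, V> loader) { this.loader = loader; }

    public V get(K key) {
        V v = map.get(key);
        if (v != null) { hits++; return v; }  // counted like CacheStats.hitCount()
        misses++;                             // counted like CacheStats.missCount()
        v = loader.apply(key);
        map.put(key, v);
        return v;
    }

    // Equivalent of CacheStats.hitRate(): hits / (hits + misses).
    public double hitRate() {
        long total = hits + misses;
        return total == 0 ? 1.0 : (double) hits / total;
    }
}
```

Tracking this ratio over time tells you whether the cache earns its memory: a steadily low hit rate is a signal to shrink or remove it.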

Conclusion

The article covered Guava LoadingCache design, common pitfalls leading to memory issues, three classic eviction algorithms, a minimal LRU implementation, and broader system‑level caching concepts such as OS page cache and read‑ahead. Proper monitoring and right‑sized caches can significantly improve application performance.

Java · Performance · Caching · Guava · LRU · LoadingCache
Written by Top Architect
Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, plus architecture adjustments using internet technologies. We welcome idea‑driven, sharing‑oriented architects to exchange and learn together.
