Backend Development · 36 min read

Deep Dive into Caffeine Cache: High‑Performance Design, W‑TinyLFU Algorithm, and Implementation Details

This article explains Caffeine, a high‑performance local cache for Java: it compares Caffeine with Guava Cache, details the W‑TinyLFU eviction policy, the asynchronous read/write buffers, and timer‑wheel expiration, and walks through the relevant source code to illustrate the design and its optimization techniques.

Top Architect

Caffeine is a high‑performance, low‑memory‑overhead local cache for Java that builds on and optimizes the concepts of Guava Cache. It is promoted as the "modern cache king" and is the default caching implementation in Spring 5, replacing Guava Cache.

The article begins with a brief comparison between Caffeine and Guava Cache, noting that Spring 5 has switched to Caffeine due to its superior benchmark results for both read and write workloads.

To use Caffeine, developers simply replace the Guava Cache classes (Cache, LoadingCache, CacheLoader, CacheBuilder) with their Caffeine equivalents. An example of creating a cache is provided:

private static LoadingCache<String, String> cache = Caffeine.newBuilder()
    // maximum size
    .maximumSize(256L)
    // initial capacity
    .initialCapacity(1)
    // expire after access
    .expireAfterAccess(2, TimeUnit.DAYS)
    // expire after write
    .expireAfterWrite(2, TimeUnit.HOURS)
    // refresh after write
    .refreshAfterWrite(1, TimeUnit.HOURS)
    .recordStats()
    .writer(new CacheWriter<Object, Object>() {
        @Override
        public void write(@NonNull Object key, @NonNull Object value) {
            log.info("key={}, CacheWriter write", key);
        }
        @Override
        public void delete(@NonNull Object key, @Nullable Object value, @NonNull RemovalCause cause) {
            log.info("key={}, cause={}, CacheWriter delete", key, cause);
        }
    })
    .build(new CacheLoader<String, String>() {
        @Override
        public String load(@NonNull String key) throws Exception {
            return "value_" + key;
        }
        @Override
        public String reload(@NonNull String key, @NonNull String oldValue) throws Exception {
            return "value_" + key;
        }
    });

The core of Caffeine’s performance lies in its W‑TinyLFU eviction policy. W‑TinyLFU combines a small admission window (an LRU‑ordered segment) with a segmented LRU (SLRU) main space that is split into protected and probation regions. New entries first land in the window; when the window overflows, the evicted entry becomes a candidate for the probation segment and competes with that segment’s victim based on a frequency sketch.
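
As a rough mental model, the flow of a new entry can be sketched as below. This is a toy illustration, not Caffeine’s code: the 1% window ratio matches Caffeine’s starting default before adaptive resizing, but real admission into probation also consults the frequency sketch (the admit method shown later).

```java
import java.util.ArrayDeque;

// Toy model of the W-TinyLFU layout: a small window absorbs new entries,
// and its overflow becomes candidates for the probation segment of the
// SLRU main space.
class WindowTinyLfuLayout {
    final ArrayDeque<String> window = new ArrayDeque<>();
    final ArrayDeque<String> probation = new ArrayDeque<>();
    final int windowMax;

    WindowTinyLfuLayout(int capacity) {
        // Caffeine starts the admission window at roughly 1% of capacity.
        this.windowMax = Math.max(1, capacity / 100);
    }

    void add(String key) {
        window.addLast(key);
        if (window.size() > windowMax) {
            // Window overflow: the oldest entry becomes a candidate that
            // must win the frequency contest to stay in probation.
            probation.addLast(window.removeFirst());
        }
    }
}
```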

The frequency sketch implements a space‑efficient Count‑Min Sketch that packs sixteen 4‑bit counters into each 64‑bit long of its table. Incrementing a key’s frequency hashes the key with four seeds, bumps four 4‑bit counters (each saturating at 15), and periodically halves all counters to age out old accesses. The relevant code is:

public void increment(@NonNull E e) {
    if (isNotInitialized()) return;
    int hash = spread(e.hashCode());
    // Select which block of four 4-bit counters this key uses inside a slot.
    int start = (hash & 3) << 2;
    // Four seeded hashes pick four (possibly distinct) table slots.
    int index0 = indexOf(hash, 0);
    int index1 = indexOf(hash, 1);
    int index2 = indexOf(hash, 2);
    int index3 = indexOf(hash, 3);
    boolean added = incrementAt(index0, start);
    added |= incrementAt(index1, start + 1);
    added |= incrementAt(index2, start + 2);
    added |= incrementAt(index3, start + 3);
    // After sampleSize successful increments, halve all counters (reset)
    // so stale popularity decays over time.
    if (added && (++size == sampleSize)) {
        reset();
    }
}

private boolean incrementAt(int i, int j) {
    // Each 4-bit counter lives at bit offset j * 4 within the 64-bit slot.
    int offset = j << 2;
    long mask = (0xfL << offset);
    if ((table[i] & mask) != mask) {  // not yet saturated at 15
        table[i] += (1L << offset);
        return true;
    }
    return false;
}

private int indexOf(int item, int i) {
    long hash = SEED[i] * item;
    hash += hash >>> 32;
    return ((int) hash) & tableMask;
}
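
Reading an estimate (the frequency method, not shown above) takes the minimum of the same four counters, which bounds the over‑counting caused by collisions. A self‑contained simplification of both paths might look like the following; the seed constants and the spread mixer are illustrative stand‑ins, and the periodic halving (reset) is omitted for brevity:

```java
// Minimal, self-contained sketch of the 4-bit Count-Min structure
// (simplified from Caffeine's FrequencySketch; aging/reset omitted).
class MiniFrequencySketch {
    static final long[] SEED = {
        0xc3a5c85c97cb3127L, 0xb492b66fbe98f273L,
        0x9ae16a3b2f90404fL, 0xcbf29ce484222325L};

    final long[] table;   // each long packs sixteen 4-bit counters
    final int tableMask;

    MiniFrequencySketch(int slots) {   // slots must be a power of two
        table = new long[slots];
        tableMask = slots - 1;
    }

    static int spread(int x) {         // simple avalanching mixer
        x = ((x >>> 16) ^ x) * 0x45d9f3b;
        x = ((x >>> 16) ^ x) * 0x45d9f3b;
        return (x >>> 16) ^ x;
    }

    int indexOf(int item, int i) {
        long hash = SEED[i] * item;
        hash += hash >>> 32;
        return ((int) hash) & tableMask;
    }

    // Write path: bump four counters, each saturating at 15.
    void increment(Object e) {
        int hash = spread(e.hashCode());
        int start = (hash & 3) << 2;
        for (int i = 0; i < 4; i++) {
            int offset = (start + i) << 2;
            long mask = 0xfL << offset;
            int index = indexOf(hash, i);
            if ((table[index] & mask) != mask) {
                table[index] += 1L << offset;
            }
        }
    }

    // Read path: the estimate is the minimum of the same four counters.
    int frequency(Object e) {
        int hash = spread(e.hashCode());
        int start = (hash & 3) << 2;
        int freq = Integer.MAX_VALUE;
        for (int i = 0; i < 4; i++) {
            long slot = table[indexOf(hash, i)];
            freq = Math.min(freq, (int) ((slot >>> ((start + i) << 2)) & 0xfL));
        }
        return freq;
    }
}
```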

When the cache exceeds its maximum size, the evictEntries method first evicts entries from the window, then from the main space. A candidate promoted from the window competes with the LRU victim of the probation segment in the admit method, which compares their estimated frequencies; a small random acceptance for warm candidates prevents an attacker from blocking all admissions by artificially inflating a victim’s frequency.

boolean admit(K candidateKey, K victimKey) {
    int victimFreq = frequencySketch().frequency(victimKey);
    int candidateFreq = frequencySketch().frequency(candidateKey);
    if (candidateFreq > victimFreq) {
        return true;   // clearly more popular than the victim: admit
    }
    if (candidateFreq <= 5) {
        return false;  // cold candidate: reject outright
    }
    // Warm candidate that tied or lost narrowly: admit ~1/128 of the time
    // so an artificially "hot" victim cannot block all new entries.
    int random = ThreadLocalRandom.current().nextInt();
    return ((random & 127) == 0);
}

Caffeine dynamically adjusts the size of the admission window using a hill‑climbing algorithm (climb) that monitors hit‑rate changes and expands or shrinks the window to adapt to workload characteristics (e.g., OLTP vs. OLAP).
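
The core idea can be illustrated with a toy direction‑flipping climber. This is not Caffeine’s climb implementation, which also samples over fixed periods, scales and decays its step size, and rebalances the protected/probation regions; the names and the fixed step here are illustrative.

```java
// Toy hill-climber: after each sample period, keep moving the admission
// window in the direction that improved the hit rate, and reverse course
// when the last move made things worse.
class WindowClimber {
    double previousHitRate = 0.0;
    int step = 1;        // +1 grows the window, -1 shrinks it
    int windowSize;

    WindowClimber(int initialWindow) {
        this.windowSize = initialWindow;
    }

    void onSample(double hitRate) {
        if (hitRate < previousHitRate) {
            step = -step;            // last adjustment hurt: reverse
        }
        windowSize = Math.max(1, windowSize + step);
        previousHitRate = hitRate;
    }
}
```

A recency‑biased workload keeps rewarding growth of the window, while a frequency‑biased one pushes it back toward a small window, which is how Caffeine adapts between OLTP‑like and OLAP‑like traffic.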

To minimize lock contention during reads and writes, Caffeine employs striped, non‑blocking buffers. Reads are recorded in a BoundedBuffer backed by per‑thread RingBuffer instances, allowing many threads to record events without contention while a single consumer drains the buffers asynchronously. Writes use an MPSC (multiple‑producer, single‑consumer) queue (MpscGrowableArrayQueue) that guarantees no loss of write operations.
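
The crucial design point is that the read buffer is allowed to be lossy: dropping a read event only costs the policy a popularity sample, so producers never block. A simplified single‑ring sketch of that idea (Caffeine’s BoundedBuffer additionally stripes many such rings by thread to spread contention):

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReferenceArray;
import java.util.function.Consumer;

// Minimal lossy ring buffer in the spirit of Caffeine's read buffer:
// producers record events with one CAS and simply drop the event when
// the buffer is full or the CAS race is lost.
class LossyReadBuffer<E> {
    static final int SIZE = 16;                 // power of two
    static final int MASK = SIZE - 1;
    final AtomicReferenceArray<E> buffer = new AtomicReferenceArray<>(SIZE);
    final AtomicLong tail = new AtomicLong();   // producers claim slots here
    volatile long head;                         // single consumer advances this

    boolean offer(E e) {
        long h = head;
        long t = tail.get();
        if (t - h >= SIZE) {
            return false;                       // full: drop the read event
        }
        if (tail.compareAndSet(t, t + 1)) {     // claim a slot with one CAS
            buffer.lazySet((int) (t & MASK), e);
            return true;
        }
        return false;                           // lost the race: also drop
    }

    void drainTo(Consumer<E> consumer) {        // called by a single consumer
        long t = tail.get();
        for (long i = head; i < t; i++) {
            int slot = (int) (i & MASK);
            E e = buffer.get(slot);
            if (e != null) {
                consumer.accept(e);
                buffer.lazySet(slot, null);
            }
        }
        head = t;
    }
}
```

Writes, by contrast, must never be dropped, which is why they go through the growable MPSC queue instead of a lossy ring.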

For entries with custom expiration times, Caffeine activates a hierarchical timer wheel (TimerWheel). Each entry is placed into a bucket based on its absolute expiration time, and the wheel advances in ticks, efficiently expiring entries without scanning the entire cache.
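
The bucket math behind a hierarchical wheel can be sketched as follows. The level spans (seconds/minutes/hours) and 64 buckets per level are simplified choices, not Caffeine’s exact constants; Caffeine uses power‑of‑two spans so the division below becomes a bit shift.

```java
// Illustrative bucket selection for a hierarchical timer wheel: near-term
// deadlines land on a fine-grained level, far-off ones on coarser levels,
// cascading down as the wheel turns.
class ToyTimerWheel {
    static final long[] SPAN_MILLIS = {1_000L, 60_000L, 3_600_000L};
    static final int BUCKETS = 64;

    // Pick the finest level whose range still covers this delay.
    static int level(long delayMillis) {
        for (int i = 0; i < SPAN_MILLIS.length; i++) {
            if (delayMillis < SPAN_MILLIS[i] * BUCKETS) {
                return i;
            }
        }
        return SPAN_MILLIS.length - 1;   // clamp to the top level
    }

    // Bucket index derived from the absolute expiration time.
    static int bucket(long expireAtMillis, int level) {
        return (int) ((expireAtMillis / SPAN_MILLIS[level]) % BUCKETS);
    }
}
```

Expiring a tick then only requires visiting the buckets the wheel hands pass over, never the whole cache.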

Additional optimizations include optional weak/soft references, elimination of false sharing, and support for asynchronous loading via CompletableFuture. The article concludes that Caffeine’s combination of W‑TinyLFU, high‑throughput buffers, and timer‑wheel expiration delivers near‑optimal hit rates with low memory overhead, making it a superior choice for modern Java applications.
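
As a sketch of the asynchronous API (using the same loader shape as the earlier example; the class name here is illustrative), buildAsync returns an AsyncLoadingCache whose values are CompletableFutures, so a cache miss loads on an executor without blocking the caller:

```java
import com.github.benmanes.caffeine.cache.AsyncLoadingCache;
import com.github.benmanes.caffeine.cache.Caffeine;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

class AsyncCacheExample {
    // On a miss the loader runs on the cache's executor
    // (ForkJoinPool.commonPool() by default).
    static final AsyncLoadingCache<String, String> asyncCache =
        Caffeine.newBuilder()
            .maximumSize(256L)
            .expireAfterWrite(2, TimeUnit.HOURS)
            .buildAsync(key -> "value_" + key);

    static CompletableFuture<String> lookup(String key) {
        return asyncCache.get(key);   // never blocks the calling thread
    }
}
```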

Written by Top Architect

Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, plus architecture adjustments using internet technologies. We welcome idea‑driven, sharing‑oriented architects to exchange and learn together.