Deep Dive into Caffeine Cache: High‑Performance Design and Source‑Code Analysis
This article explains Caffeine, a high‑performance Java local cache that supersedes Guava Cache, by detailing its design principles such as the W‑TinyLFU eviction algorithm, FrequencySketch implementation, adaptive window sizing, asynchronous read/write buffers, and timer‑wheel expiration, accompanied by extensive code examples.
Caffeine is a high‑performance, near‑optimal local cache for Java, often described as the modern successor to Guava Cache. It offers higher hit rates, lower memory usage, and is the default cache implementation in Spring 5.
Comparison with Guava Cache – Caffeine retains Guava's core concepts (Cache, LoadingCache, CacheBuilder), making migration straightforward, while delivering better benchmark results for both read and write workloads.
Using Caffeine – A typical cache is created with a fluent builder, specifying maximum size, initial capacity, expiration policies, refresh intervals, and a custom CacheWriter for write callbacks. Example:
private static LoadingCache<String, String> cache = Caffeine.newBuilder()
        // maximum number of entries
        .maximumSize(256L)
        // initial capacity
        .initialCapacity(1)
        // expire after access (read or write)
        .expireAfterAccess(2, TimeUnit.DAYS)
        // expire after write
        .expireAfterWrite(2, TimeUnit.HOURS)
        // asynchronous refresh after write
        .refreshAfterWrite(1, TimeUnit.HOURS)
        // record statistics
        .recordStats()
        // cache write notification callback
        .writer(new CacheWriter<Object, Object>() {
            @Override
            public void write(@NonNull Object key, @NonNull Object value) {
                log.info("key={}, CacheWriter write", key);
            }

            @Override
            public void delete(@NonNull Object key, @Nullable Object value, @NonNull RemovalCause cause) {
                log.info("key={}, cause={}, CacheWriter delete", key, cause);
            }
        })
        // create a LoadingCache via CacheLoader
        .build(new CacheLoader<String, String>() {
            @Override
            public String load(@NonNull String key) throws Exception {
                return "value_" + key;
            }

            @Override
            public String reload(@NonNull String key, @NonNull String oldValue) throws Exception {
                return "value_" + key;
            }
        });

High‑Performance Design
The core of Caffeine’s performance lies in its W‑TinyLFU eviction policy, which combines a small admission window with a segmented LRU (SLRU) main cache. Traditional LRU suffers from cache pollution, while LFU incurs high update overhead; TinyLFU mitigates these issues using a Count‑Min Sketch to approximate frequencies.
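To make the Count‑Min Sketch idea concrete before looking at Caffeine's compact implementation, here is a toy sketch written for this article (the class and its constants are invented for illustration; Caffeine's real FrequencySketch, shown below, packs counters far more tightly):

```java
// Toy Count-Min Sketch: several rows of counters, one hash per row.
// Each increment bumps one counter per row; the estimate is the minimum
// across rows, so hash collisions can only over-count, never under-count.
final class SimpleCountMinSketch {
    private final int[][] table;  // rows x width counters
    private final int[] seeds;    // per-row mixing constants (arbitrary)

    SimpleCountMinSketch(int rows, int width) {
        table = new int[rows][width];
        seeds = new int[rows];
        for (int i = 0; i < rows; i++) {
            seeds[i] = 0x9E3779B9 * (i + 1);
        }
    }

    private int index(Object key, int row) {
        int h = key.hashCode() * seeds[row];
        return (h & 0x7fffffff) % table[row].length;
    }

    void increment(Object key) {
        for (int row = 0; row < table.length; row++) {
            table[row][index(key, row)]++;
        }
    }

    int frequency(Object key) {
        int min = Integer.MAX_VALUE;
        for (int row = 0; row < table.length; row++) {
            min = Math.min(min, table[row][index(key, row)]);
        }
        return min;
    }
}
```

The memory cost is fixed by the table dimensions regardless of how many distinct keys are seen, which is exactly why TinyLFU can afford to track frequency for every access.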
FrequencySketch packs sixteen 4‑bit counters into each long of its table and tracks every entry with four counters selected by four hash functions, so frequency updates cost very little memory. The increment method hashes the key, computes four table indices, and increments the corresponding counters, saturating at 15. The frequency method returns the minimum counter value across the four indices.
static final long[] SEED = {0xc3a5c85c97cb3127L, 0xb492b66fbe98f273L, 0x9ae16a3b2f90404fL, 0xcbf29ce484222325L};
static final long RESET_MASK = 0x7777777777777777L;
static final long ONE_MASK = 0x1111111111111111L;
int sampleSize;
int tableMask;
long[] table;
int size;
public void increment(@NonNull E e) {
if (isNotInitialized()) return;
int hash = spread(e.hashCode());
int start = (hash & 3) << 2;
int index0 = indexOf(hash, 0);
int index1 = indexOf(hash, 1);
int index2 = indexOf(hash, 2);
int index3 = indexOf(hash, 3);
boolean added = incrementAt(index0, start);
added |= incrementAt(index1, start + 1);
added |= incrementAt(index2, start + 2);
added |= incrementAt(index3, start + 3);
if (added && (++size == sampleSize)) {
reset();
}
}
private boolean incrementAt(int i, int j) {
int offset = j << 2;
long mask = (0xfL << offset);
if ((table[i] & mask) != mask) {
table[i] += (1L << offset);
return true;
}
return false;
}
public int frequency(@NonNull E e) {
if (isNotInitialized()) return 0;
int hash = spread(e.hashCode());
int start = (hash & 3) << 2;
int frequency = Integer.MAX_VALUE;
for (int i = 0; i < 4; i++) {
int index = indexOf(hash, i);
int count = (int) ((table[index] >>> ((start + i) << 2)) & 0xfL);
frequency = Math.min(frequency, count);
}
return frequency;
}
private int indexOf(int item, int i) {
long hash = SEED[i] * item;
hash += hash >>> 32;
return (int) hash & tableMask;
}
static int spread(int x) {
x = ((x >>> 16) ^ x) * 0x45d9f3b;
x = ((x >>> 16) ^ x) * 0x45d9f3b;
return (x >>> 16) ^ x;
}

To keep the cache "fresh", Caffeine halves all counters once the number of recorded increments (size) reaches the sampleSize threshold, implemented in the reset method.
void reset() {
int count = 0;
for (int i = 0; i < table.length; i++) {
count += Long.bitCount(table[i] & ONE_MASK);
table[i] = (table[i] >>> 1) & RESET_MASK;
}
size = (size >>> 1) - (count >>> 2);
}

The admission window (Window Cache) holds newly added entries before they are promoted to the main SLRU cache. When the window overflows, its oldest entry becomes a candidate for the probation segment, and a "PK" contest between that candidate and the probation's victim, decided by their estimated frequencies, determines which entry stays.
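As a mental model for the segmented LRU itself (a hypothetical class written for this article, not Caffeine's implementation), the probation/protected flow can be sketched with two access‑ordered maps:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy segmented LRU: probation holds entries seen once; a hit promotes an
// entry to protected; protected overflow demotes its LRU entry back to
// probation, where it must compete again before it can be evicted.
final class SimpleSlru<K, V> {
    private final int protectedCap;
    private final LinkedHashMap<K, V> probation = new LinkedHashMap<>(16, 0.75f, true);
    private final LinkedHashMap<K, V> protectedSeg = new LinkedHashMap<>(16, 0.75f, true);

    SimpleSlru(int protectedCap) { this.protectedCap = protectedCap; }

    void put(K key, V value) { probation.put(key, value); }

    V get(K key) {
        V v = protectedSeg.get(key);
        if (v != null) return v;                 // already protected; access order updated
        v = probation.remove(key);
        if (v == null) return null;
        protectedSeg.put(key, v);                // hit in probation -> promote
        if (protectedSeg.size() > protectedCap) {
            // demote the protected segment's least-recently-used entry
            Map.Entry<K, V> lru = protectedSeg.entrySet().iterator().next();
            protectedSeg.remove(lru.getKey());
            probation.put(lru.getKey(), lru.getValue());
        }
        return v;
    }

    boolean isProtected(K key) { return protectedSeg.containsKey(key); }
}
```

The split protects proven‑hot entries from bursts of one‑hit traffic: a scan can only churn the probation segment.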
Eviction is performed in evictEntries, which first evicts from the window and then from the main cache via the admit method, which compares candidate and victim frequencies and adds a small random factor to defend against hash‑collision attacks that artificially inflate a candidate's frequency.
boolean admit(K candidateKey, K victimKey) {
int victimFreq = frequencySketch().frequency(victimKey);
int candidateFreq = frequencySketch().frequency(candidateKey);
if (candidateFreq > victimFreq) return true;
if (candidateFreq <= 5) return false;
int random = ThreadLocalRandom.current().nextInt();
return ((random & 127) == 0);
}

Caffeine continuously adapts the size of the admission window using a hill‑climbing algorithm (climb). It monitors hit‑rate changes, adjusts a step size, and either increases or decreases the window proportion to maintain an optimal recency/frequency balance.
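Caffeine's actual logic lives in BoundedLocalCache's climb routine and is considerably more involved; the following is only a schematic of the hill‑climbing loop, with invented field names and constants:

```java
// Schematic hill climber: keep moving the window boundary in the current
// direction while the hit rate improves; reverse direction with a smaller
// step once it degrades, so the window size converges near the optimum.
final class WindowClimber {
    double windowPercent = 0.01;  // share of capacity given to the admission window
    double step = 0.0625;         // signed step; the sign is the climb direction
    double previousHitRate = 0.0;

    void climb(double hitRate) {
        if (hitRate < previousHitRate) {
            step = -step / 2;     // overshot the optimum: reverse and decay the step
        }
        windowPercent = Math.min(1.0, Math.max(0.0, windowPercent + step));
        previousHitRate = hitRate;
    }
}
```

A recency‑biased workload pushes the window larger; a frequency‑biased one shrinks it, which is how one cache adapts to both patterns.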
Asynchronous Read/Write Buffers – To avoid locking on every cache operation, Caffeine records read and write events in lock‑free striped buffers (ReadBuffer and WriteBuffer). Reads are stored in a BoundedBuffer composed of striped RingBuffers, allowing high concurrency with occasional loss of buffer entries, which is acceptable for access statistics. Writes use an MPSC (multi‑producer, single‑consumer) queue to guarantee that every write is eventually replayed against the eviction policy.
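The division of labor can be modeled with a toy class (hypothetical, written for this article; the JDK queues below are merely stand‑ins for Caffeine's lock‑free internals): a bounded, lossy buffer for read events and a never‑dropping queue for write events, both drained by a single maintenance consumer.

```java
import java.util.Queue;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Toy model of Caffeine's buffering strategy (illustrative only).
final class BufferedPolicy {
    // Read events are best-effort: when the bounded buffer is full, drop them.
    // (ArrayBlockingQueue is lock-based; Caffeine uses lock-free ring buffers.)
    private final Queue<String> readBuffer = new ArrayBlockingQueue<>(4);
    // Write events must all be applied: use an unbounded queue, never drop.
    private final Queue<String> writeBuffer = new ConcurrentLinkedQueue<>();

    boolean recordRead(String key) {
        return readBuffer.offer(key);   // false => event dropped; policy degrades only slightly
    }

    void recordWrite(String key) {
        writeBuffer.add(key);           // always accepted
    }

    // Single consumer drains both buffers and replays them against the policy.
    int drain() {
        int drained = 0;
        while (readBuffer.poll() != null) drained++;
        while (writeBuffer.poll() != null) drained++;
        return drained;
    }
}
```

Dropping a read event only costs the sketch one frequency increment, whereas dropping a write would desynchronize the policy from the hash table, which is why the two paths have different guarantees.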
public int offer(E e) {
long head = readCounter;
long tail = relaxedWriteCounter();
long size = (tail - head);
if (size >= SPACED_SIZE) return Buffer.FULL;
if (casWriteCounter(tail, tail + OFFSET)) {
int index = (int) (tail & SPACED_MASK);
buffer.lazySet(index, e);
return Buffer.SUCCESS;
}
return Buffer.FAILED;
}

Timer Wheel – When custom expiration times are required (via .expireAfter), Caffeine employs a hierarchical timing wheel. Each entry is placed into a bucket based on its absolute expiration time; the wheel advances in ticks, efficiently expiring entries without scanning the entire cache.
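Before reading the real code, a stripped‑down version of the bucket computation may help. The spans and shifts below are invented powers of two for this example, not Caffeine's actual constants:

```java
// Two-level wheel with invented constants: level 0 ticks every ~1.07s
// (2^30 ns), level 1 every ~68.7s (2^36 ns); 64 buckets per level.
// An entry lands on the coarsest level whose span still resolves it.
final class TinyWheel {
    static final long[] SPANS = {1L << 30, 1L << 36};
    static final int[] SHIFT = {30, 36};
    static final int BUCKETS = 64;

    // Choose {level, bucket} for an entry expiring at 'time' (nanos),
    // given the wheel's current clock 'nanos'.
    static int[] findBucket(long nanos, long time) {
        long duration = time - nanos;
        for (int i = 0; i < SPANS.length - 1; i++) {
            if (duration < SPANS[i + 1]) {
                long ticks = time >>> SHIFT[i];
                return new int[] {i, (int) (ticks & (BUCKETS - 1))};
            }
        }
        int last = SPANS.length - 1;
        return new int[] {last, (int) ((time >>> SHIFT[last]) & (BUCKETS - 1))};
    }
}
```

Because bucket selection is shifts and masks over the expiration timestamp, scheduling and cascading both run in O(1) per entry; Caffeine's real findBucket applies the same idea across several levels.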
public void schedule(@NonNull Node<K, V> node) {
    Node<K, V> sentinel = findBucket(node.getVariableTime());
    link(sentinel, node);
}

Node<K, V> findBucket(long time) {
    long duration = time - nanos;
    for (int i = 0; i < wheel.length - 1; i++) {
        if (duration < SPANS[i + 1]) {
            long ticks = time >>> SHIFT[i];
            int index = (int) (ticks & (wheel[i].length - 1));
            return wheel[i][index];
        }
    }
    return wheel[wheel.length - 1][0];
}

void link(Node<K, V> sentinel, Node<K, V> node) {
    node.setPreviousInVariableOrder(sentinel.getPreviousInVariableOrder());
    node.setNextInVariableOrder(sentinel);
    sentinel.getPreviousInVariableOrder().setNextInVariableOrder(node);
    sentinel.setPreviousInVariableOrder(node);
}

In summary, Caffeine achieves near‑optimal cache performance through the combination of W‑TinyLFU eviction, compact frequency sketching, adaptive window sizing, lock‑free read/write buffers, and a hierarchical timer wheel, all while maintaining a small memory footprint.
Architecture Digest