Deep Dive into Caffeine Cache: High‑Performance Design and Source‑Code Analysis
This article explains Caffeine, a high‑performance Java local cache that supersedes Guava Cache, by detailing its design principles such as the W‑TinyLFU eviction algorithm, FrequencySketch implementation, adaptive window sizing, asynchronous read/write buffers, and timer‑wheel expiration, accompanied by extensive code examples.
Caffeine is a high‑performance, near‑optimal local cache for Java, often described as the modern successor to Guava Cache. It offers higher hit rates, lower memory usage, and is the default cache implementation in Spring 5.
Comparison with Guava Cache – Caffeine retains Guava's core concepts (Cache, LoadingCache, CacheBuilder), making migration straightforward, while delivering better benchmark results for both read and write workloads.
Using Caffeine – A typical cache is created with a fluent builder, specifying maximum size, initial capacity, expiration policies, refresh intervals, and a custom CacheWriter for write callbacks. Example:
private static LoadingCache<String, String> cache = Caffeine.newBuilder()
        // maximum number of entries
        .maximumSize(256L)
        // initial capacity
        .initialCapacity(1)
        // expire after access (read or write)
        .expireAfterAccess(2, TimeUnit.DAYS)
        // expire after write
        .expireAfterWrite(2, TimeUnit.HOURS)
        // asynchronous refresh after write
        .refreshAfterWrite(1, TimeUnit.HOURS)
        // record statistics
        .recordStats()
        // cache write notification callback
        .writer(new CacheWriter<Object, Object>() {
            @Override
            public void write(@NonNull Object key, @NonNull Object value) {
                log.info("key={}, CacheWriter write", key);
            }

            @Override
            public void delete(@NonNull Object key, @Nullable Object value, @NonNull RemovalCause cause) {
                log.info("key={}, cause={}, CacheWriter delete", key, cause);
            }
        })
        // create a LoadingCache via CacheLoader
        .build(new CacheLoader<String, String>() {
            @Override
            public String load(@NonNull String key) throws Exception {
                return "value_" + key;
            }

            @Override
            public String reload(@NonNull String key, @NonNull String oldValue) throws Exception {
                return "value_" + key;
            }
        });

High‑Performance Design
The core of Caffeine’s performance lies in its W‑TinyLFU eviction policy, which combines a small admission window with a segmented LRU (SLRU) main cache. Traditional LRU suffers from cache pollution, while LFU incurs high update overhead; TinyLFU mitigates these issues using a Count‑Min Sketch to approximate frequencies.
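To make the Count‑Min Sketch idea concrete before looking at Caffeine's compact implementation, here is a toy sketch written for this article (the class and its constants are invented for illustration; Caffeine's real FrequencySketch, shown below, packs counters far more tightly):

```java
// Toy Count-Min Sketch: several rows of counters, one hash per row.
// Each increment bumps one counter per row; the estimate is the minimum
// across rows, so hash collisions can only over-count, never under-count.
final class SimpleCountMinSketch {
    private final int[][] table;  // rows x width counters
    private final int[] seeds;    // per-row mixing constants (arbitrary)

    SimpleCountMinSketch(int rows, int width) {
        table = new int[rows][width];
        seeds = new int[rows];
        for (int i = 0; i < rows; i++) {
            seeds[i] = 0x9E3779B9 * (i + 1);
        }
    }

    private int index(Object key, int row) {
        int h = key.hashCode() * seeds[row];
        return (h & 0x7fffffff) % table[row].length;
    }

    void increment(Object key) {
        for (int row = 0; row < table.length; row++) {
            table[row][index(key, row)]++;
        }
    }

    int frequency(Object key) {
        int min = Integer.MAX_VALUE;
        for (int row = 0; row < table.length; row++) {
            min = Math.min(min, table[row][index(key, row)]);
        }
        return min;
    }
}
```

The memory cost is fixed by the table dimensions regardless of how many distinct keys are seen, which is exactly why TinyLFU can afford to track frequency for every access.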
FrequencySketch packs sixteen 4‑bit counters into each long of its table and tracks every entry with four counters selected by four hash functions, so frequency updates cost very little memory. The increment method hashes the key, computes four table indices, and increments the corresponding counters, saturating at 15. The frequency method returns the minimum counter value across the four indices.
static final long[] SEED = {0xc3a5c85c97cb3127L, 0xb492b66fbe98f273L, 0x9ae16a3b2f90404fL, 0xcbf29ce484222325L};
static final long RESET_MASK = 0x7777777777777777L;
static final long ONE_MASK = 0x1111111111111111L;
int sampleSize;
int tableMask;
long[] table;
int size;
public void increment(@NonNull E e) {
if (isNotInitialized()) return;
int hash = spread(e.hashCode());
int start = (hash & 3) << 2;
int index0 = indexOf(hash, 0);
int index1 = indexOf(hash, 1);
int index2 = indexOf(hash, 2);
int index3 = indexOf(hash, 3);
boolean added = incrementAt(index0, start);
added |= incrementAt(index1, start + 1);
added |= incrementAt(index2, start + 2);
added |= incrementAt(index3, start + 3);
if (added && (++size == sampleSize)) {
reset();
}
}
private boolean incrementAt(int i, int j) {
int offset = j << 2;
long mask = (0xfL << offset);
if ((table[i] & mask) != mask) {
table[i] += (1L << offset);
return true;
}
return false;
}
public int frequency(@NonNull E e) {
if (isNotInitialized()) return 0;
int hash = spread(e.hashCode());
int start = (hash & 3) << 2;
int frequency = Integer.MAX_VALUE;
for (int i = 0; i < 4; i++) {
int index = indexOf(hash, i);
int count = (int) ((table[index] >>> ((start + i) << 2)) & 0xfL);
frequency = Math.min(frequency, count);
}
return frequency;
}
private int indexOf(int item, int i) {
long hash = SEED[i] * item;
hash += hash >>> 32;
return (int) hash & tableMask;
}
static int spread(int x) {
x = ((x >>> 16) ^ x) * 0x45d9f3b;
x = ((x >>> 16) ^ x) * 0x45d9f3b;
return (x >>> 16) ^ x;
}

To keep the cache "fresh", Caffeine halves all counters once the number of recorded increments (size) reaches the sampleSize threshold, implemented in the reset method.
void reset() {
int count = 0;
for (int i = 0; i < table.length; i++) {
count += Long.bitCount(table[i] & ONE_MASK);
table[i] = (table[i] >>> 1) & RESET_MASK;
}
size = (size >>> 1) - (count >>> 2);
}

The admission window (Window Cache) holds newly added entries before they are promoted to the main SLRU cache. When the window overflows, its oldest entry becomes a candidate for the probation segment, and a "PK" contest between that candidate and the probation's victim, decided by their estimated frequencies, determines which entry stays.
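As a mental model for the segmented LRU itself (a hypothetical class written for this article, not Caffeine's implementation), the probation/protected flow can be sketched with two access‑ordered maps:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy segmented LRU: probation holds entries seen once; a hit promotes an
// entry to protected; protected overflow demotes its LRU entry back to
// probation, where it must compete again before it can be evicted.
final class SimpleSlru<K, V> {
    private final int protectedCap;
    private final LinkedHashMap<K, V> probation = new LinkedHashMap<>(16, 0.75f, true);
    private final LinkedHashMap<K, V> protectedSeg = new LinkedHashMap<>(16, 0.75f, true);

    SimpleSlru(int protectedCap) { this.protectedCap = protectedCap; }

    void put(K key, V value) { probation.put(key, value); }

    V get(K key) {
        V v = protectedSeg.get(key);
        if (v != null) return v;                 // already protected; access order updated
        v = probation.remove(key);
        if (v == null) return null;
        protectedSeg.put(key, v);                // hit in probation -> promote
        if (protectedSeg.size() > protectedCap) {
            // demote the protected segment's least-recently-used entry
            Map.Entry<K, V> lru = protectedSeg.entrySet().iterator().next();
            protectedSeg.remove(lru.getKey());
            probation.put(lru.getKey(), lru.getValue());
        }
        return v;
    }

    boolean isProtected(K key) { return protectedSeg.containsKey(key); }
}
```

The split protects proven‑hot entries from bursts of one‑hit traffic: a scan can only churn the probation segment.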
Eviction is performed in evictEntries, which first evicts from the window and then from the main cache via the admit method, which compares candidate and victim frequencies and adds a small random factor to defend against hash‑collision attacks that artificially inflate a candidate's frequency.
boolean admit(K candidateKey, K victimKey) {
int victimFreq = frequencySketch().frequency(victimKey);
int candidateFreq = frequencySketch().frequency(candidateKey);
if (candidateFreq > victimFreq) return true;
if (candidateFreq <= 5) return false;
int random = ThreadLocalRandom.current().nextInt();
return ((random & 127) == 0);
}

Caffeine continuously adapts the size of the admission window using a hill‑climbing algorithm (climb). It monitors hit‑rate changes, adjusts a step size, and either increases or decreases the window proportion to maintain an optimal recency/frequency balance.
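Caffeine's actual logic lives in BoundedLocalCache's climb routine and is considerably more involved; the following is only a schematic of the hill‑climbing loop, with invented field names and constants:

```java
// Schematic hill climber: keep moving the window boundary in the current
// direction while the hit rate improves; reverse direction with a smaller
// step once it degrades, so the window size converges near the optimum.
final class WindowClimber {
    double windowPercent = 0.01;  // share of capacity given to the admission window
    double step = 0.0625;         // signed step; the sign is the climb direction
    double previousHitRate = 0.0;

    void climb(double hitRate) {
        if (hitRate < previousHitRate) {
            step = -step / 2;     // overshot the optimum: reverse and decay the step
        }
        windowPercent = Math.min(1.0, Math.max(0.0, windowPercent + step));
        previousHitRate = hitRate;
    }
}
```

A recency‑biased workload pushes the window larger; a frequency‑biased one shrinks it, which is how one cache adapts to both patterns.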
Asynchronous Read/Write Buffers – To avoid locking on every cache operation, Caffeine records read and write events in lock‑free striped buffers (ReadBuffer and WriteBuffer). Reads are stored in a BoundedBuffer composed of striped RingBuffers, allowing high concurrency with occasional loss of buffer entries, which is acceptable for access statistics. Writes use an MPSC (multi‑producer, single‑consumer) queue to guarantee that every write is eventually replayed against the eviction policy.
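The division of labor can be modeled with a toy class (hypothetical, written for this article; the JDK queues below are merely stand‑ins for Caffeine's lock‑free internals): a bounded, lossy buffer for read events and a never‑dropping queue for write events, both drained by a single maintenance consumer.

```java
import java.util.Queue;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Toy model of Caffeine's buffering strategy (illustrative only).
final class BufferedPolicy {
    // Read events are best-effort: when the bounded buffer is full, drop them.
    // (ArrayBlockingQueue is lock-based; Caffeine uses lock-free ring buffers.)
    private final Queue<String> readBuffer = new ArrayBlockingQueue<>(4);
    // Write events must all be applied: use an unbounded queue, never drop.
    private final Queue<String> writeBuffer = new ConcurrentLinkedQueue<>();

    boolean recordRead(String key) {
        return readBuffer.offer(key);   // false => event dropped; policy degrades only slightly
    }

    void recordWrite(String key) {
        writeBuffer.add(key);           // always accepted
    }

    // Single consumer drains both buffers and replays them against the policy.
    int drain() {
        int drained = 0;
        while (readBuffer.poll() != null) drained++;
        while (writeBuffer.poll() != null) drained++;
        return drained;
    }
}
```

Dropping a read event only costs the sketch one frequency increment, whereas dropping a write would desynchronize the policy from the hash table, which is why the two paths have different guarantees.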
public int offer(E e) {
long head = readCounter;
long tail = relaxedWriteCounter();
long size = (tail - head);
if (size >= SPACED_SIZE) return Buffer.FULL;
if (casWriteCounter(tail, tail + OFFSET)) {
int index = (int) (tail & SPACED_MASK);
buffer.lazySet(index, e);
return Buffer.SUCCESS;
}
return Buffer.FAILED;
}

Timer Wheel – When custom expiration times are required (via .expireAfter), Caffeine employs a hierarchical timing wheel. Each entry is placed into a bucket based on its absolute expiration time; the wheel advances in ticks, efficiently expiring entries without scanning the entire cache.
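Before reading the real code, a stripped‑down version of the bucket computation may help. The spans and shifts below are invented powers of two for this example, not Caffeine's actual constants:

```java
// Two-level wheel with invented constants: level 0 ticks every ~1.07s
// (2^30 ns), level 1 every ~68.7s (2^36 ns); 64 buckets per level.
// An entry lands on the coarsest level whose span still resolves it.
final class TinyWheel {
    static final long[] SPANS = {1L << 30, 1L << 36};
    static final int[] SHIFT = {30, 36};
    static final int BUCKETS = 64;

    // Choose {level, bucket} for an entry expiring at 'time' (nanos),
    // given the wheel's current clock 'nanos'.
    static int[] findBucket(long nanos, long time) {
        long duration = time - nanos;
        for (int i = 0; i < SPANS.length - 1; i++) {
            if (duration < SPANS[i + 1]) {
                long ticks = time >>> SHIFT[i];
                return new int[] {i, (int) (ticks & (BUCKETS - 1))};
            }
        }
        int last = SPANS.length - 1;
        return new int[] {last, (int) ((time >>> SHIFT[last]) & (BUCKETS - 1))};
    }
}
```

Because bucket selection is shifts and masks over the expiration timestamp, scheduling and cascading both run in O(1) per entry; Caffeine's real findBucket applies the same idea across several levels.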
public void schedule(@NonNull Node<K, V> node) {
    Node<K, V> sentinel = findBucket(node.getVariableTime());
    link(sentinel, node);
}

Node<K, V> findBucket(long time) {
    long duration = time - nanos;
    for (int i = 0; i < wheel.length - 1; i++) {
        if (duration < SPANS[i + 1]) {
            long ticks = time >>> SHIFT[i];
            int index = (int) (ticks & (wheel[i].length - 1));
            return wheel[i][index];
        }
    }
    return wheel[wheel.length - 1][0];
}

void link(Node<K, V> sentinel, Node<K, V> node) {
    node.setPreviousInVariableOrder(sentinel.getPreviousInVariableOrder());
    node.setNextInVariableOrder(sentinel);
    sentinel.getPreviousInVariableOrder().setNextInVariableOrder(node);
    sentinel.setPreviousInVariableOrder(node);
}

In summary, Caffeine achieves near‑optimal cache performance through the combination of W‑TinyLFU eviction, compact frequency sketching, adaptive window sizing, lock‑free read/write buffers, and a hierarchical timer wheel, all while maintaining a small memory footprint.
Architecture Digest