Inside Caffeine Cache: TinyLFU, Count‑Min Sketch, MPSC Queues and Multi‑Threaded Eviction
This article walks through the inner workings of Caffeine's fixed‑size cache, explaining its TinyLFU eviction policy, Count‑Min Sketch frequency tracking, MPSC write buffers, multi‑threaded design patterns, and the core Java classes that coordinate window, probation and protected segments for high‑performance caching.
The article adopts a "total‑detail‑total" structure to introduce Caffeine's fixed‑size cache eviction strategy. It first explains the implementation principles, then dives into source code details, covering Count‑Min Sketch, memory barriers, cache‑line padding, MPSC multi‑threaded design, high‑performance cache design ideas, and coordination between threads.
Caffeine Cache Overview
The cache stores data in a ConcurrentHashMap and creates three regions—window, probation, and protected—each backed by an LRU deque. When the hit rate changes, the window and protected region sizes adjust automatically. Eviction uses TinyLFU, which records access frequencies with a Count‑Min Sketch that offers ~93.75% accuracy while using minimal memory.
Read and write operations add tasks to ReadBuffer and WriteBuffer respectively. Both buffers follow an MPSC (multiple‑producer, single‑consumer) pattern. Maintenance methods drain these buffers under a lock, following a WAL‑style approach: log the operation first, then execute it asynchronously.
Constructor and Class Hierarchy
The source analysis starts with the constructor to illustrate the important data structures and fields created during cache initialization. The key classes are BoundedLocalManualCache (bounded cache) and UnboundedLocalManualCache (unbounded cache). The isBounded() method determines which implementation is used based on size, weight, expiration, and other policies.
public final class Caffeine<K, V> {
static final int UNSET_INT = -1;
public Cache<K1, V1> build() {
requireWeightWithWeigher();
requireNonLoadingCache();
return isBounded()
? new BoundedLocalCache.BoundedLocalManualCache<>(self)
: new UnboundedLocalCache.UnboundedLocalManualCache<>(self);
}
boolean isBounded() {
return (maximumSize != UNSET_INT) || (maximumWeight != UNSET_INT)
|| (expireAfterAccessNanos != UNSET_INT) || (expireAfterWriteNanos != UNSET_INT)
|| (expiry != null) || (keyStrength != null) || (valueStrength != null);
}
}When a bounded cache is created, the constructor of BoundedLocalCache initializes many fields, including write‑buffer limits, eviction lock, weigher, executor, and the three deques.
protected BoundedLocalCache(Caffeine<K, V> builder, @Nullable CacheLoader<? super K, V> loader, boolean async) {
this.isAsync = async;
this.cacheLoader = loader;
executor = builder.getExecutor();
isWeighted = builder.isWeighted();
evictionLock = new ReentrantLock();
weigher = builder.getWeigher(isAsync);
drainBuffersTask = new PerformCleanupTask(this);
nodeFactory = NodeFactory.newFactory(builder, isAsync);
evictionListener = builder.getEvictionListener(isAsync);
data = new ConcurrentHashMap<>(builder.getInitialCapacity());
readBuffer = evicts() || collectKeys() || collectValues() || expiresAfterAccess()
? new BoundedBuffer<>() : Buffer.disabled();
accessPolicy = (evicts() || expiresAfterAccess()) ? this::onAccess : e -> {};
writeBuffer = new MpscGrowableArrayQueue<>(WRITE_BUFFER_MIN, WRITE_BUFFER_MAX);
if (evicts()) {
setMaximumSize(builder.getMaximum());
}
}setMaximumSize
This method calculates the window, protected, and main region capacities based on percentages (99% main, 80% protected) and updates the internal counters used for dynamic size adjustments.
void setMaximumSize(long maximum) {
long max = Math.min(maximum, MAXIMUM_CAPACITY);
long window = max - (long) (PERCENT_MAIN * max);
long mainProtected = (long) (PERCENT_MAIN_PROTECTED * (max - window));
setMaximum(max);
setWindowMaximum(window);
setMainProtectedMaximum(mainProtected);
setHitsInSample(0);
setMissesInSample(0);
setStepSize(-HILL_CLIMBER_STEP_PERCENT * max);
if ((frequencySketch() != null) && !isWeighted() && (weightedSize() >= (max >>> 1))) {
frequencySketch().ensureCapacity(max);
}
}Class Naming Convention (SSMS)
Caffeine uses a compact abbreviation scheme for implementation classes. For a cache with strong keys, strong values, no weight, and a maximum size, the class name becomes SSMS. The pattern is S|W S|I [L] [S] [MW|MS] [A] [W] [R], where each letter represents a configuration option.
Example diagram:
FrequencySketch
The FrequencySketch implements a Count‑Min Sketch. It stores a long[] table where each 4‑bit slice acts as a counter (max value 15). The table size is the nearest power‑of‑two to the cache's maximum capacity. Hash functions spread keys uniformly, and the minimum of four counters represents the element's frequency, achieving a 93.75% hit‑rate while using little memory.
final class FrequencySketch<E> {
static final long RESET_MASK = 0x7777777777777777L;
static final long ONE_MASK = 0x1111111111111111L;
int sampleSize;
int blockMask;
long[] table;
int size;
public FrequencySketch() {}
public void ensureCapacity(long maximumSize) {
int maximum = (int) Math.min(maximumSize, Integer.MAX_VALUE >>> 1);
if (table != null && table.length >= maximum) return;
table = new long[Math.max(Caffeine.ceilingPowerOfTwo(maximum), 8)];
sampleSize = (maximumSize == 0) ? 10 : (int) (10 * maximumSize);
blockMask = (table.length >>> 3) - 1;
if (sampleSize <= 0) sampleSize = Integer.MAX_VALUE;
size = 0;
}
public int frequency(E e) {
if (isNotInitialized()) return 0;
int[] count = new int[4];
int blockHash = spread(e.hashCode());
int counterHash = rehash(blockHash);
int block = (blockHash & blockMask) << 3;
for (int i = 0; i < 4; i++) {
int h = counterHash >>> (i << 3);
int index = (h >>> 1) & 15;
int offset = h & 1;
count[i] = (int) ((table[block + offset + (i << 1)] >>> (index << 2)) & 0xfL);
}
return Math.min(Math.min(count[0], count[1]), Math.min(count[2], count[3]));
}
public void increment(E e) {
if (isNotInitialized()) return;
int[] index = new int[8];
int blockHash = spread(e.hashCode());
int counterHash = rehash(blockHash);
int block = (blockHash & blockMask) << 3;
for (int i = 0; i < 4; i++) {
int h = counterHash >>> (i << 3);
index[i] = (h >>> 1) & 15;
index[i + 4] = block + (h & 1) + (i << 1);
}
boolean added = incrementAt(index[4], index[0])
| incrementAt(index[5], index[1])
| incrementAt(index[6], index[2])
| incrementAt(index[7], index[3]);
if (added && (++size == sampleSize)) reset();
}
// ... reset, spread, rehash omitted for brevity ...
}put(K key, V value)
The put method first inserts the entry into the ConcurrentHashMap. If the key already exists, the method acquires a lock on the existing node, updates weight, expiration times, and possibly creates an AddTask or UpdateTask that is placed into the WriteBuffer. After a successful insertion, afterWrite schedules maintenance.
public @Nullable V put(K key, V value) {
return put(key, value, expiry(), false);
}
V put(K key, V value, Expiry<K, V> expiry, boolean onlyIfAbsent) {
requireNonNull(key);
requireNonNull(value);
Node<K, V> node = null;
long now = expirationTicker().read();
int newWeight = weigher.weigh(key, value);
Object lookupKey = nodeFactory.newLookupKey(key);
for (int attempts = 1; ; attempts++) {
Node<K, V> prior = data.get(lookupKey);
if (prior == null) {
if (node == null) {
node = nodeFactory.newNode(key, keyReferenceQueue(), value, valueReferenceQueue(), newWeight, now);
setVariableTime(node, expireAfterCreate(key, value, expiry, now));
}
prior = data.putIfAbsent(node.getKeyReference(), node);
if (prior == null) {
afterWrite(new AddTask(node, newWeight));
return null;
}
// fall‑through to handle existing key
}
if (!prior.isAlive()) {
// spin‑wait for removal
if ((attempts & MAX_PUT_SPIN_WAIT_ATTEMPTS) != 0) {
Thread.onSpinWait();
continue;
}
data.computeIfPresent(lookupKey, (k, n) -> { requireIsAlive(key, n); return n; });
continue;
}
synchronized (prior) {
if (!prior.isAlive()) continue;
V oldValue = prior.getValue();
int oldWeight = prior.getWeight();
long varTime;
boolean expired = false;
boolean mayUpdate = true;
if (oldValue == null) {
varTime = expireAfterCreate(key, value, expiry, now);
notifyEviction(key, null, RemovalCause.COLLECTED);
} else if (hasExpired(prior, now)) {
expired = true;
varTime = expireAfterCreate(key, value, expiry, now);
notifyEviction(key, oldValue, RemovalCause.EXPIRED);
} else if (onlyIfAbsent) {
mayUpdate = false;
varTime = expireAfterRead(prior, key, value, expiry, now);
} else {
varTime = expireAfterUpdate(prior, key, value, expiry, now);
}
if (mayUpdate) {
prior.setValue(value, valueReferenceQueue());
prior.setWeight(newWeight);
setWriteTime(prior, now);
discardRefresh(prior.getKeyReference());
}
setVariableTime(prior, varTime);
setAccessTime(prior, now);
// notify listeners based on the path taken
if (expired) {
notifyRemoval(key, oldValue, RemovalCause.EXPIRED);
} else if (oldValue == null) {
notifyRemoval(key, null, RemovalCause.COLLECTED);
} else if (mayUpdate) {
notifyOnReplace(key, oldValue, value);
}
int weightedDiff = mayUpdate ? (newWeight - oldWeight) : 0;
if (oldValue == null || weightedDiff != 0 || expired) {
afterWrite(new UpdateTask(prior, weightedDiff));
} else if (!onlyIfAbsent && exceedsTolerance) {
afterWrite(new UpdateTask(prior, weightedDiff));
} else {
if (mayUpdate) setWriteTime(prior, now);
afterRead(prior, now, false);
}
return expired ? null : oldValue;
}
}
}WriteBuffer (MPSC GrowableArrayQueue)
The write buffer is a multi‑producer, single‑consumer queue. Producers add AddTask or UpdateTask objects without locking, using CAS and a spin‑wait loop. The consumer thread drains the queue under the evictionLock and executes each task.
class MpscGrowableArrayQueue<E> extends MpscChunkedArrayQueue<E> {
MpscGrowableArrayQueue(int initialCapacity, int maxCapacity) {
super(initialCapacity, maxCapacity);
}
}
abstract class BaseMpscLinkedArrayQueue<E> extends AbstractQueue<E> {
private static final Object JUMP = new Object();
public boolean offer(final E e) {
if (e == null) throw new NullPointerException();
while (true) {
long producerLimit = lvProducerLimit();
long pIndex = lvProducerIndex(this);
if ((pIndex & 1) == 1) continue; // resize in progress
long mask = this.producerMask;
E[] buffer = this.producerBuffer;
if (producerLimit <= pIndex) {
int result = offerSlowPath(mask, pIndex, producerLimit);
switch (result) {
case 0: break;
case 1: continue;
case 2: return false;
case 3: resize(mask, buffer, pIndex, e); return true;
}
}
if (casProducerIndex(this, pIndex, pIndex + 2)) {
long offset = modifiedCalcElementOffset(pIndex, mask);
soElement(buffer, offset, e);
return true;
}
}
}
// ... offerSlowPath, resize, and other methods omitted for brevity ...
}afterWrite and Scheduling
After a task is placed into the write buffer, afterWrite tries to enqueue the task up to 100 times. If the buffer is full, it calls scheduleDrainBuffers, which acquires evictionLock and submits a PerformCleanupTask to the executor (by default the common ForkJoinPool). If enqueuing still fails, the method falls back to a synchronous call to maintenance(task) while holding the lock.
void afterWrite(Runnable task) {
for (int i = 0; i < WRITE_BUFFER_RETRIES; i++) {
if (writeBuffer.offer(task)) {
scheduleAfterWrite();
return;
}
scheduleDrainBuffers();
Thread.onSpinWait();
}
// Fallback: execute synchronously under lock
lock();
try {
maintenance(task);
} finally {
evictionLock.unlock();
}
rescheduleCleanUpIfIncomplete();
}
void scheduleAfterWrite() {
int drainStatus = drainStatusOpaque();
while (true) {
switch (drainStatus) {
case IDLE:
casDrainStatus(IDLE, REQUIRED);
scheduleDrainBuffers();
return;
case REQUIRED:
scheduleDrainBuffers();
return;
case PROCESSING_TO_IDLE:
if (casDrainStatus(PROCESSING_TO_IDLE, PROCESSING_TO_REQUIRED)) return;
drainStatus = drainStatusAcquire();
continue;
case PROCESSING_TO_REQUIRED:
return;
default:
throw new IllegalStateException("Invalid drain status: " + drainStatus);
}
}
}PerformCleanupTask
The task holds a weak reference to the cache and, when run, calls cache.performCleanUp(null). The cleanup method acquires evictionLock and invokes maintenance, which processes read/write buffers, reference queues, expiration, eviction, and the hill‑climber size‑adjustment algorithm.
static final class PerformCleanupTask extends ForkJoinTask<Void> implements Runnable {
private final WeakReference<BoundedLocalCache<?, ?>> reference;
PerformCleanupTask(BoundedLocalCache<?, ?> cache) {
reference = new WeakReference<>(cache);
}
public boolean exec() {
try { run(); } catch (Throwable t) { logger.log(Level.ERROR, "Exception in maintenance task", t); }
return false; // allow re‑submission
}
public void run() {
BoundedLocalCache<?, ?> cache = reference.get();
if (cache != null) cache.performCleanUp(null);
}
}UpdateTask
UpdateTaskupdates the node after a write, handling expiration policies, eviction, and moving the node between the window, probation, and protected deques. It also updates the global weighted size and may trigger a size‑based eviction if the total weight exceeds the maximum.
final class UpdateTask implements Runnable {
final int weightDifference;
final Node<K, V> node;
UpdateTask(Node<K, V> node, int weightDifference) {
this.node = node;
this.weightDifference = weightDifference;
}
@GuardedBy("evictionLock")
public void run() {
if (expiresAfterWrite()) reorder(writeOrderDeque(), node);
else if (expiresVariable()) timerWheel().reschedule(node);
if (evicts()) {
int oldPolicyWeight = node.getPolicyWeight();
node.setPolicyWeight(oldPolicyWeight + weightDifference);
if (node.inWindow()) {
setWindowWeightedSize(windowWeightedSize() + weightDifference);
if (node.getPolicyWeight() > maximum()) {
evictEntry(node, RemovalCause.SIZE, expirationTicker().read());
} else if (node.getPolicyWeight() <= windowMaximum()) {
onAccess(node);
} else if (accessOrderWindowDeque().contains(node)) {
accessOrderWindowDeque().moveToFront(node);
}
} else if (node.inMainProbation()) {
if (node.getPolicyWeight() <= maximum()) onAccess(node);
else evictEntry(node, RemovalCause.SIZE, expirationTicker().read());
} else if (node.inMainProtected()) {
setMainProtectedWeightedSize(mainProtectedWeightedSize() + weightDifference);
if (node.getPolicyWeight() <= maximum()) onAccess(node);
else evictEntry(node, RemovalCause.SIZE, expirationTicker().read());
}
setWeightedSize(weightedSize() + weightDifference);
if (weightedSize() > MAXIMUM_CAPACITY) evictEntries();
} else if (expiresAfterAccess()) {
onAccess(node);
}
}
}onAccess
When a node is accessed, its frequency is incremented in the FrequencySketch. Depending on which region the node belongs to, it is moved to the tail of the appropriate deque or promoted from probation to protected.
@GuardedBy("evictionLock")
void onAccess(Node<K, V> node) {
if (evicts()) {
K key = node.getKey();
if (key == null) return;
frequencySketch().increment(key);
if (node.inWindow()) reorder(accessOrderWindowDeque(), node);
else if (node.inMainProbation()) reorderProbation(node);
else reorder(accessOrderProtectedDeque(), node);
setHitsInSample(hitsInSample() + 1);
} else if (expiresAfterAccess()) {
reorder(accessOrderWindowDeque(), node);
}
if (expiresVariable()) timerWheel().reschedule(node);
}Conclusion
The article demonstrates how Caffeine combines several advanced techniques—TinyLFU with a Count‑Min Sketch, lock‑free MPSC queues, cache‑line padding to avoid false sharing, and a multi‑region LRU hierarchy—to achieve high‑throughput, low‑latency local caching. Understanding these components provides a solid theoretical basis for selecting or implementing a local cache in Java applications.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
JD Cloud Developers
JD Cloud Developers (Developer of JD Technology) is a JD Technology Group platform offering technical sharing and communication for AI, cloud computing, IoT and related developers. It publishes JD product technical information, industry content, and tech event news. Embrace technology and partner with developers to envision the future.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
