Understanding Java LongAdder and Its Underlying Striped64 Implementation
This article explains why LongAdder was introduced to overcome AtomicLong's CAS contention bottleneck, explores the design of Striped64 and its cells, and walks through key source code snippets of LongAdder, DoubleAdder, and the longAccumulate method, highlighting concurrency mechanisms and performance considerations.
In a previous article we introduced AtomicLong ; this piece builds on that foundation to explain why LongAdder was created to address the contention bottleneck of AtomicLong 's CAS‑based operations in high‑concurrency scenarios.
LongAdder extends Striped64 and implements Serializable . Striped64 itself extends Number , providing a 64‑bit accumulator that distributes updates across an array of Cell objects (the cells field) while also maintaining a fallback base value.
The three key transient fields in Striped64 are:
cells : an array whose size is a power of two, holding per‑thread counters.
base : the primary counter used when there is little contention.
cellsBusy : a spin‑lock (implemented with CAS) that protects initialization and resizing of the cells array.
Each Cell mirrors the design of AtomicLong : it holds a volatile long value , uses Unsafe to perform CAS updates, and is annotated with @sun.misc.Contended to avoid false sharing.
The core of the implementation is the longAccumulate method, which distributes updates based on a thread‑local hash probe. A simplified version of the method is shown below:
final void longAccumulate(long x, LongBinaryOperator fn, boolean wasUncontended) {
int h = getProbe();
if (h == 0) {
ThreadLocalRandom.current();
h = getProbe();
wasUncontended = true;
}
boolean collide = false;
for (;;) {
Cell[] as; Cell a; int n; long v;
if ((as = cells) != null && (n = as.length) > 0) {
if ((a = as[(n - 1) & h]) == null) {
if (cellsBusy == 0 && casCellsBusy()) {
try {
if (cells == as && as[(n - 1) & h] == null) {
as[(n - 1) & h] = new Cell(x);
break;
}
} finally { cellsBusy = 0; }
}
collide = false;
} else if (!wasUncontended) {
wasUncontended = true;
} else if (a.cas(v = a.value, fn == null ? v + x : fn.applyAsLong(v, x))) {
break;
} else if (n >= NCPU || cells != as) {
collide = false;
} else if (!collide) {
collide = true;
} else if (cellsBusy == 0 && casCellsBusy()) {
try {
if (cells == as) {
Cell[] rs = new Cell[n << 1];
System.arraycopy(as, 0, rs, 0, n);
cells = rs;
}
} finally { cellsBusy = 0; }
collide = false;
continue;
}
h = advanceProbe(h);
} else if (cellsBusy == 0 && casCellsBusy()) {
try { cells = new Cell[]{ new Cell(x) }; }
finally { cellsBusy = 0; }
break;
} else if (casBase(v = base, fn == null ? v + x : fn.applyAsLong(v, x))) {
break;
}
}
}The method first tries to update the base value; if contention is detected, it initializes or expands the cells array and retries using CAS on the appropriate Cell . The logic mirrors that of ConcurrentHashMap but relies on lightweight CAS instead of heavyweight synchronized locks.
The companion doubleAccumulate method follows the same pattern, converting between double and its raw long bits with Double.doubleToRawLongBits and Double.longBitsToDouble when performing the CAS update.
When no contention exists, LongAdder simply updates the base . Under high contention, updates are spread across the cells array, and the final sum is obtained by adding the values of all cells plus the base. This design yields higher throughput at the cost of using more memory and providing only eventual consistency, as noted in the article’s concluding blockquote.
Overall, the article walks through the motivation, class hierarchy, key fields, and the intricate longAccumulate algorithm that enables LongAdder to achieve superior performance for concurrent counting and statistic‑gathering workloads.
Full-Stack Internet Architecture
Introducing full-stack Internet architecture technologies centered on Java
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.