Why LongAdder Beats AtomicLong in High Contention: Deep Dive into Java’s Concurrent Counter
This article explains how java.util.concurrent.atomic.LongAdder provides a more efficient counting mechanism than AtomicLong under high contention, discusses its trade‑offs in space and precision, presents JMH benchmark results, and walks through the internal implementation details such as cells, CAS loops, and hash‑based distribution.
Introduction
In single‑machine concurrent scenarios we often need to count or accumulate values safely. The usual approaches are synchronized blocks or atomic classes like AtomicLong. The class java.util.concurrent.atomic.LongAdder offers a more efficient solution for these cases.
LongAdder achieves higher throughput under contention at the cost of extra memory and a weaker read guarantee: sum() is not an atomic snapshot, so a total read while other threads are still updating may not be exact.
Using LongAdder
Counting is performed with the add() method and the accumulated result is obtained with sum():
LongAdder longAdder = new LongAdder();
longAdder.add(10);
longAdder.add(5);
System.out.println(longAdder.sum()); // prints "15"
Performance Comparison: AtomicLong vs LongAdder
The JMH benchmark below compares the throughput of LongAdder and AtomicLong using ten concurrent threads:
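import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Threads;
/**
 * Benchmark                          Mode  Cnt          Score          Error  Units
 * MyBenchmark.concurrentAtomicLong  thrpt    5   44616374.867 ±  9345004.297  ops/s
 * MyBenchmark.concurrentLongAdder   thrpt    5  234633786.245 ± 49509534.208  ops/s
 */
public class MyBenchmark {
    // Shared by all benchmark threads so that every update contends on the same counter.
    static LongAdder adder = new LongAdder();
    static AtomicLong atomicLong = new AtomicLong();
    @Benchmark @Threads(10)
    public void concurrentLongAdder() {
        adder.add(1);
    }
    @Benchmark @Threads(10)
    public void concurrentAtomicLong() {
        atomicLong.incrementAndGet();
    }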
}
The result shows that, in this ten-thread run, LongAdder's write throughput is roughly five times that of AtomicLong (about 235 million versus 45 million ops/s).
Problems with AtomicLong
AtomicLong updates its value using sun.misc.Unsafe.getAndAddLong(), which internally performs a compare‑and‑swap (CAS) loop. As the number of competing threads grows, the probability of CAS retries increases, consuming more CPU resources and eventually becoming a bottleneck. In extreme contention, CAS can be slower than a simple synchronized lock.
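To make the retry cost visible, here is a minimal sketch of a CAS loop written by hand against the public AtomicLong API. It is not the JDK's internal code: the class CasRetryDemo and its methods addWithCas and run are hypothetical names chosen for illustration. The sketch counts how many CAS attempts were needed in total, which under contention is typically larger than the number of successful updates.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration class; it mimics a CAS loop, it is not the JDK implementation.
class CasRetryDemo {
    /** Adds delta with an explicit CAS loop and returns how many attempts were needed. */
    static int addWithCas(AtomicLong counter, long delta) {
        int attempts = 0;
        for (;;) {
            attempts++;
            long current = counter.get();
            // Succeeds only if no other thread changed the value since the read above.
            if (counter.compareAndSet(current, current + delta)) {
                return attempts;
            }
            // Lost the race: loop again with a fresh read.
        }
    }

    /** Runs the loop from several threads; returns {final value, total CAS attempts}. */
    static long[] run(int threads, int perThread) {
        AtomicLong counter = new AtomicLong();
        AtomicLong totalAttempts = new AtomicLong();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                long local = 0;
                for (int i = 0; i < perThread; i++) {
                    local += addWithCas(counter, 1);
                }
                totalAttempts.addAndGet(local);
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new long[] { counter.get(), totalAttempts.get() };
    }

    public static void main(String[] args) {
        long[] r = run(8, 100_000);
        System.out.println("final value = " + r[0] + ", CAS attempts = " + r[1]);
    }
}
```

The final value is always threads × perThread; the attempt count varies from run to run, and the gap between the two numbers is the work wasted on retries.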
Optimization Idea of LongAdder
LongAdder reduces contention by splitting the accumulated value into multiple Cell objects. Each thread hashes to a specific cell, thus spreading the updates. Reading the total requires summing the base value and all cell values.
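The idea can be sketched in a few lines. The following is a deliberately simplified striped counter, not LongAdder itself: it assumes a fixed number of stripes, uses AtomicLongArray instead of padded Cell objects, and derives the slot from the thread's identity hash rather than a per-thread probe. The class name StripedCounterSketch and its methods are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicLongArray;

// Simplified sketch: fixed stripe count, no lazy initialization, no resizing, no padding.
class StripedCounterSketch {
    private final AtomicLong base = new AtomicLong();
    private final AtomicLongArray cells;
    private final int mask;

    StripedCounterSketch(int stripes) {
        // Round up to a power of two so that (length - 1) can be used as a bit mask.
        int n = 1;
        while (n < stripes) n <<= 1;
        cells = new AtomicLongArray(n);
        mask = n - 1;
    }

    void add(long x) {
        long b = base.get();
        // Uncontended case: a single CAS on base is enough.
        if (base.compareAndSet(b, b + x)) return;
        // Contended case: spread the update to a slot chosen from the thread's hash.
        int h = System.identityHashCode(Thread.currentThread());
        h ^= (h >>> 16);
        cells.addAndGet(h & mask, x);
    }

    /** Total = base plus every cell; exact only when no update is in flight. */
    long sum() {
        long s = base.get();
        for (int i = 0; i < cells.length(); i++) s += cells.get(i);
        return s;
    }

    /** Adds 1 from several threads and returns the total read after all have finished. */
    static long demo(int threads, int perThread) {
        StripedCounterSketch counter = new StripedCounterSketch(8);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) counter.add(1);
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.sum();
    }

    public static void main(String[] args) {
        System.out.println(demo(8, 100_000));
    }
}
```

Every update lands in exactly one place (base or one cell), so the total read after all writers have joined equals the number of increments.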
Implementation Details
Key questions when reading the source:
What members does the class have and what are their roles?
What is the overall flow of the add operation?
How is the target cell for each thread chosen?
What mechanisms ensure thread safety?
When does the cell array expand?
Member Variables
volatile long base – the value used when there is no contention (similar to AtomicLong's value).
volatile Cell[] cells – lazily initialized array of cells; its length is always a power of two, and it grows by doubling until the length reaches or exceeds the number of CPUs.
volatile int cellsBusy – a simple lock (0 = free, 1 = held) used to serialize initialization and resizing of the cells array, as well as the creation of new cells.
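A 0/1 flag acquired by CAS behaves like a try-lock: a thread either takes it immediately or moves on without blocking. The sketch below shows that pattern guarding a lazy initialization. It is an illustration under assumptions, using AtomicInteger in place of the Unsafe-based field; the class BusyFlagSketch and its members are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a 0/1 CAS flag used as a non-blocking try-lock.
class BusyFlagSketch {
    private final AtomicInteger busy = new AtomicInteger(); // 0 = free, 1 = held
    private volatile long[] table;                          // created lazily

    /**
     * Returns true only if this call created the table. Returns false when the
     * table already exists or when another thread currently holds the flag.
     */
    boolean tryInit(int size) {
        if (table != null) return false;
        // Cheap volatile read first, then the CAS that actually takes the flag.
        if (busy.get() == 0 && busy.compareAndSet(0, 1)) {
            try {
                if (table == null) {          // re-check while holding the flag
                    table = new long[size];
                    return true;
                }
            } finally {
                busy.set(0);                  // release
            }
        }
        return false;
    }

    int tableLength() {
        long[] t = table;
        return t == null ? 0 : t.length;
    }
}
```

A caller that gets false does not wait; it is free to try something else, which is how the longAccumulate listing later in this article behaves when cellsBusy is already taken.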
Cell Class
@sun.misc.Contended static final class Cell {
    volatile long value;
    Cell(long x) { value = x; }
    final boolean cas(long cmp, long val) {
        return UNSAFE.compareAndSwapLong(this, valueOffset, cmp, val);
    }
}
The @Contended annotation pads each Cell so that two cells do not end up on the same cache line, which prevents false sharing between them.
longAccumulate Method (in Striped64)
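The core logic of LongAdder resides in the parent class Striped64. The method repeatedly attempts to update the thread's cell using CAS; on failure it rehashes the thread to another cell and, after repeated collisions, may double the cells array. While the array has not been initialized and another thread holds the cellsBusy lock, it falls back to a CAS on base.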
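Two small steps in that loop are easier to follow in isolation: mapping a thread's probe to a slot, and rehashing the probe after a failed CAS. The sketch below assumes a power-of-two table length, as the cells array has. The xorshift constants (13, 17, 5) are the ones I recall from Striped64.advanceProbe in JDK 8, so treat them as an assumption; the real method also writes the new probe back into the Thread object, which this sketch omits. ProbeIndexSketch, indexFor and advance are hypothetical names.

```java
// Hypothetical helper class illustrating slot selection and probe rehashing.
class ProbeIndexSketch {
    /** Maps a probe to a slot; for a power-of-two length this is a cheap modulo. */
    static int indexFor(int probe, int length) {
        return (length - 1) & probe;
    }

    /** One xorshift step; a non-zero probe never becomes zero. */
    static int advance(int probe) {
        probe ^= probe << 13;
        probe ^= probe >>> 17;
        probe ^= probe << 5;
        return probe;
    }

    public static void main(String[] args) {
        int probe = 13;
        for (int round = 0; round < 4; round++) {
            System.out.println("probe=" + probe + " -> slot " + indexFor(probe, 8));
            probe = advance(probe); // simulate a CAS failure followed by a rehash
        }
    }
}
```

Because the mask keeps only the low bits, doubling the table exposes one more bit of the same probe, so threads that collided in a small table may separate after an expansion without any change to their probe.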
final void longAccumulate(long x, LongBinaryOperator fn, boolean wasUncontended) {
    int h;
    if ((h = getProbe()) == 0) {
        ThreadLocalRandom.current(); // forces initialization of this thread's probe
        h = getProbe();
        wasUncontended = true;
    }
    boolean collide = false; // true if the last slot was non-empty and its CAS failed
    for (;;) {
        Cell[] as; Cell a; int n; long v;
        if ((as = cells) != null && (n = as.length) > 0) {
            if ((a = as[(n - 1) & h]) == null) {
                // try to create a new Cell
                ...
            } else if (!wasUncontended) {
                wasUncontended = true; // the caller's CAS already failed; rehash first
            } else if (a.cas(v = a.value, (fn == null) ? v + x : fn.applyAsLong(v, x))) {
                break;
            } else if (n >= NCPU || cells != as) {
                collide = false; // at maximum size, or the array is stale: do not expand
            } else if (!collide) {
                collide = true;
            } else if (cellsBusy == 0 && casCellsBusy()) {
                // expand cells array
                ...
                collide = false;
                continue;
            }
            h = advanceProbe(h); // pick a different cell for the next attempt
        } else if (cellsBusy == 0 && casCellsBusy()) {
            // initialize cells array
            ...
            continue;
        } else if (casBase(v = base, (fn == null) ? v + x : fn.applyAsLong(v, x))) {
            break;
        }
    }
}
cellsBusy CAS
final boolean casCellsBusy() {
    return UNSAFE.compareAndSwapInt(this, CELLSBUSY, 0, 1);
}
add() and sum() Methods
public void add(long x) {
    // fast path: while cells is null, try a CAS on base;
    // otherwise try a CAS on the cell selected by this thread's probe;
    // only if that fails, delegate to longAccumulate(x, null, uncontended)
}
public long sum() {
    Cell[] as = cells;
    long sum = base;
    if (as != null) {
        for (int i = 0; i < as.length; ++i) {
            Cell a = as[i];
            if (a != null) sum += a.value;
        }
    }
    return sum;
}
Both base and Cell.value are volatile, guaranteeing visibility across threads.
Limitations of LongAdder
It has no incrementAndGet() or any other method that returns the updated total: the value is spread over base and several cells, so no single CAS can both update it and read it atomically.
sum() may return a value that never existed at any single point in time, because concurrent updates can interleave with the summation.
Read‑heavy workloads suffer because reading requires aggregating many cells, whereas LongAdder is optimized for write‑heavy scenarios.
Despite these drawbacks, LongAdder is intentionally designed for high‑contention counting where occasional inexact reads are acceptable.
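As a rough guide to choosing between the two classes, the sketch below pairs them by need: an AtomicLong where each caller must receive the updated value (here, unique ids), and a LongAdder where only an aggregate is read later. The scenario and the names CounterChoiceSketch, handleRequest, requestCount and demo are hypothetical, not taken from the article's source.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical scenario: ids must be unique per call, the request count is only reported later.
class CounterChoiceSketch {
    private final AtomicLong nextId = new AtomicLong();  // each caller needs the returned value
    private final LongAdder requests = new LongAdder();  // write-heavy, read occasionally

    long handleRequest() {
        requests.increment();
        return nextId.incrementAndGet();
    }

    long requestCount() {
        return requests.sum();
    }

    /** Returns {number of distinct ids, request count read after all threads finished}. */
    static long[] demo(int threads, int perThread) {
        CounterChoiceSketch s = new CounterChoiceSketch();
        Set<Long> ids = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) ids.add(s.handleRequest());
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new long[] { ids.size(), s.requestCount() };
    }

    public static void main(String[] args) {
        long[] r = demo(8, 50_000);
        System.out.println("distinct ids = " + r[0] + ", requests = " + r[1]);
    }
}
```

Once every writer has finished, sum() is exact; the inexactness discussed above only concerns reads that overlap with updates.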
