
Why False Sharing Slows Your Java Programs and How to Eliminate It

False sharing occurs when multiple threads modify variables that reside on the same CPU cache line, causing unnecessary cache-coherency traffic. This article covers cache line basics, the CPU cache hierarchy, and the MESI protocol, then presents Java remedies, including manual padding and the @sun.misc.Contended annotation with its JVM flag, to prevent the resulting performance degradation.

JavaEdge

What Is False Sharing

On most modern CPUs the cache is organized in cache lines, typically 64 bytes each. When several threads running on different cores write to different variables that happen to reside on the same cache line, each write invalidates that line in the other cores' caches, causing extra cache-coherency traffic even though the threads never touch the same variable. This phenomenon is called false sharing.
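
As a minimal sketch of this situation, the hypothetical class below (SharedCounters is not part of the original article) declares two long fields that a typical JVM is likely to place next to each other, so they will usually share one cache line. Each thread writes only its own field, so the program is logically free of contention and the results are always correct, yet every write may invalidate the other core's copy of the line.

```java
class SharedCounters {
    // Two independent counters; assumed to be adjacent in memory on a typical JVM.
    volatile long a;
    volatile long b;

    // Runs two threads, each incrementing only its own counter.
    static SharedCounters run(long iterations) {
        SharedCounters c = new SharedCounters();
        Thread t1 = new Thread(() -> { for (long i = 0; i < iterations; i++) c.a++; });
        Thread t2 = new Thread(() -> { for (long i = 0; i < iterations; i++) c.b++; });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return c;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        SharedCounters c = run(10_000_000L);
        System.out.println("a = " + c.a + ", b = " + c.b
                + ", duration = " + (System.nanoTime() - start) + " ns");
    }
}
```

The increments are safe without atomics only because each field has exactly one writer; the slowdown, if any, comes purely from the memory layout.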

Cache Line

A cache line is the smallest unit of data transferred between the cache and main memory. Accessing any address loads the entire line that contains it (typically 64 bytes) into the cache, so subsequent accesses to other bytes within the same line are cheap.
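
The arithmetic behind "which line does this address belong to" can be sketched as follows. The class name, the 64-byte line size, and the base address 0x1000 are assumptions chosen for illustration; real Java arrays also carry an object header, so element 0 is generally not line-aligned.

```java
class CacheLineMath {
    // Assumed line size; 64 bytes is common on x86-64 but not universal.
    static final int LINE_SIZE = 64;

    // Index of the cache line that contains the given byte address.
    static long lineIndex(long address) {
        return address / LINE_SIZE;
    }

    // True when both addresses fall into the same cache line.
    static boolean sameLine(long a, long b) {
        return lineIndex(a) == lineIndex(b);
    }

    public static void main(String[] args) {
        long base = 0x1000; // hypothetical line-aligned start of a long[] payload
        for (int i = 0; i < 10; i++) {
            long addr = base + i * 8L; // each long occupies 8 bytes
            System.out.println("element " + i + " at 0x" + Long.toHexString(addr)
                    + " -> line " + lineIndex(addr));
        }
    }
}
```

Under these assumptions eight consecutive longs share one line, which is why adjacent long counters are a classic source of false sharing.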

CPU Cache Hierarchy

Most CPUs have three levels of cache: L1 (smallest and fastest, private to a core), L2 (larger, still core‑private), and L3 (largest, shared by all cores on a socket). Data not found in any cache level is fetched from main memory, which is much slower.

CPU cache hierarchy illustration

Cache Associativity

Most caches are N‑way set‑associative. The cache is divided into sets, each containing N cache lines. An address maps to a particular set, and can occupy any line within that set. For example, a 2‑way set‑associative cache has two lines per set.

2‑Way set associative cache diagram
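
The address-to-set mapping can be sketched with simple integer arithmetic. The class below is a simplified model under stated assumptions: the cache size, line size, and class name are illustrative, and real hardware usually indexes on physical addresses and may hash the index bits.

```java
class SetAssociativeMapping {
    final int lineSize;
    final int ways;
    final int numSets;

    SetAssociativeMapping(int cacheBytes, int lineSize, int ways) {
        this.lineSize = lineSize;
        this.ways = ways;
        // Total lines = cacheBytes / lineSize; each set holds 'ways' of them.
        this.numSets = cacheBytes / (lineSize * ways);
    }

    // Set that an address maps to: line number modulo the number of sets.
    int setIndex(long address) {
        return (int) ((address / lineSize) % numSets);
    }

    // Tag that distinguishes lines mapping to the same set.
    long tag(long address) {
        return (address / lineSize) / numSets;
    }

    public static void main(String[] args) {
        // Hypothetical 32 KiB, 2-way cache with 64-byte lines: 256 sets.
        SetAssociativeMapping cache = new SetAssociativeMapping(32 * 1024, 64, 2);
        long[] addresses = {0L, 16_384L, 32_768L};
        for (long a : addresses) {
            System.out.println("address " + a + " -> set " + cache.setIndex(a)
                    + ", tag " + cache.tag(a));
        }
    }
}
```

In this model all three sample addresses map to set 0 with different tags, so a 2-way cache can hold only two of them at a time and the third access evicts one.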

MESI Protocol

To keep caches coherent, modern CPUs implement the MESI protocol. Each cache line can be in one of four states, represented by two bits:

M (Modified): the line is dirty (it differs from main memory) and is present only in this cache.

E (Exclusive): the line is clean and is present only in this cache.

S (Shared): the line is clean and may be present in multiple caches.

I (Invalid): the line holds no valid data and must be fetched again before use.

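When one core writes to a line in the S state, its copy transitions to M and the copies held by other cores become I. Any other core that touches the line afterwards must re-fetch it, which is exactly the cost false sharing incurs when threads write to unrelated variables on the same line.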

MESI state transition example
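
The transitions described above can be sketched as a toy state machine that tracks one cache line across several cores. This is a deliberately simplified model (MesiSketch and its methods are hypothetical names; bus messages, write-backs, and timing are not modelled), not a description of any particular CPU.

```java
import java.util.Arrays;

class MesiSketch {
    enum State { MODIFIED, EXCLUSIVE, SHARED, INVALID }

    // state[i] is the state of the same cache line in core i's cache.
    final State[] state;

    MesiSketch(int cores) {
        state = new State[cores];
        Arrays.fill(state, State.INVALID);
    }

    // Core reads the line: a miss fetches it; other holders drop to SHARED.
    void read(int core) {
        if (state[core] != State.INVALID) {
            return; // hit, no state change
        }
        boolean othersHold = false;
        for (int i = 0; i < state.length; i++) {
            if (i != core && state[i] != State.INVALID) {
                othersHold = true;
                state[i] = State.SHARED;
            }
        }
        state[core] = othersHold ? State.SHARED : State.EXCLUSIVE;
    }

    // Core writes the line: every other copy is invalidated.
    void write(int core) {
        for (int i = 0; i < state.length; i++) {
            if (i != core) {
                state[i] = State.INVALID;
            }
        }
        state[core] = State.MODIFIED;
    }

    public static void main(String[] args) {
        MesiSketch line = new MesiSketch(2);
        line.read(0);  // core 0: EXCLUSIVE, core 1: INVALID
        line.read(1);  // both SHARED
        line.write(0); // core 0: MODIFIED, core 1: INVALID
        line.write(1); // core 0: INVALID, core 1: MODIFIED
        System.out.println(Arrays.toString(line.state)); // prints [INVALID, MODIFIED]
    }
}
```

Two cores alternately writing to the same line keep flipping it between MODIFIED and INVALID, which is the ping-pong pattern false sharing produces.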

How to Eliminate False Sharing

The simplest remedy is to pad each frequently updated variable so that it occupies its own cache line. This trades a small amount of memory for reduced cache‑coherency traffic.

Traditional Java Solutions (pre‑Java 8)

public final class FalseSharing implements Runnable {
    public static final int NUM_THREADS = 4;
    public static final long ITERATIONS = 500L * 1000L * 1000L;
    private final int arrayIndex;
    private static VolatileLong[] longs = new VolatileLong[NUM_THREADS];

    static {
        for (int i = 0; i < longs.length; i++) {
            longs[i] = new VolatileLong();
        }
    }

    public FalseSharing(final int arrayIndex) {
        this.arrayIndex = arrayIndex;
    }

    public static void main(final String[] args) throws Exception {
        long start = System.nanoTime();
        runTest();
        System.out.println("duration = " + (System.nanoTime() - start));
    }

    private static void runTest() throws InterruptedException {
        Thread[] threads = new Thread[NUM_THREADS];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(new FalseSharing(i));
        }
        for (Thread t : threads) t.start();
        for (Thread t : threads) t.join();
    }

    @Override
    public void run() {
        long i = ITERATIONS + 1;
        while (0 != --i) {
            // Each thread writes only its own element; VolatileLong exposes the field directly.
            longs[arrayIndex].value = i;
        }
    }

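    // Six long padding fields stretch each instance to roughly one 64-byte
    // cache line, assuming a typical 12-16 byte object header.
    public static final class VolatileLong {
        public volatile long value = 0L;
        public long p1, p2, p3, p4, p5, p6; // padding; remove to observe false sharing
    }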
}

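Simple padding of this kind is not guaranteed to work. From JDK 7 onward the JVM may reorder or eliminate unused fields, which can bring false sharing back. Common workarounds are to reference the padding fields somewhere so that they cannot be optimised away, or to spread the padding across a class hierarchy.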
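
One way to make manual padding more robust against field reordering is to spread it across a class hierarchy, a pattern similar to the one used by the LMAX Disruptor. The sketch below relies on typical HotSpot behaviour of laying out superclass fields before subclass fields, which is an implementation detail rather than a language guarantee. All class names here are hypothetical.

```java
// Padding placed before the hot field.
class LhsPadding {
    protected long p1, p2, p3, p4, p5, p6, p7;
}

// The single hot field sits between the two padding layers.
class PaddedValue extends LhsPadding {
    protected volatile long value;
}

// Padding placed after the hot field.
class RhsPadding extends PaddedValue {
    protected long p9, p10, p11, p12, p13, p14, p15;
}

final class PaddedLong extends RhsPadding {
    long get() { return value; }
    void set(long v) { value = v; }
}

class PaddedLongDemo {
    // Each thread writes only its own PaddedLong slot.
    static PaddedLong[] run(int threads, long iterations) {
        PaddedLong[] slots = new PaddedLong[threads];
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            slots[t] = new PaddedLong();
            final PaddedLong slot = slots[t];
            workers[t] = new Thread(() -> {
                for (long i = 1; i <= iterations; i++) {
                    slot.set(i);
                }
            });
        }
        for (Thread w : workers) {
            w.start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return slots;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        PaddedLong[] slots = run(4, 10_000_000L);
        System.out.println("duration = " + (System.nanoTime() - start)
                + " ns, last value = " + slots[0].get());
    }
}
```

With 56 bytes of padding on each side, the hot field should not share a 64-byte line with another instance's hot field regardless of where the object starts, at the cost of well over 100 bytes per instance.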

Java 8 Built‑in Solution

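Java 8 introduced the @sun.misc.Contended annotation. Annotating a class or field tells the JVM to insert padding automatically so that the contended data sits on its own cache line. For application classes the annotation only takes effect when the JVM is started with -XX:-RestrictContended. In JDK 9 and later the annotation moved to the internal package jdk.internal.vm.annotation and is not intended for application code.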

@sun.misc.Contended
public static final class VolatileLong {
    public volatile long value = 0L;
    // no manual padding fields needed; the JVM inserts the padding
}