How Go 1.24’s New Spinning Mutex Boosts Performance by Up to 70%
The article explains the background of the Go mutex performance proposal, details the new spinning flag added to the mutex state, walks through fast‑path, spinning, and sleep phases of lock acquisition, presents benchmark results showing up to 70% speed‑up, and provides references for further reading.
Background
Rhys Hiltner proposed a mutex performance improvement in 2024 [1]; the optimization has been merged into the upcoming Go 1.24 release and can increase performance by up to 70% in highly contended scenarios.
In the ChanContended benchmark the author observed that increasing GOMAXPROCS caused mutex performance to degrade sharply. On an Intel i7‑13700H (linux/amd64):
With 4 threads the process throughput is half of the single‑threaded case.
With 8 threads the throughput halves again.
With 12 threads the throughput halves once more.
At GOMAXPROCS=20, 200 channel operations take 44 µs on average, meaning unlock2 runs roughly every 220 ns and wakes a sleeping thread each time. Over a 1.78 s wall-clock interval, the 20 threads together burn 27.74 s of CPU time spinning inside lock2 calls.
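The degradation above can be reproduced in miniature. The sketch below is not the runtime's ChanContended benchmark; it is an illustrative workload of the same shape (the function name and parameters are ours): many goroutines hammering one buffered channel, so every send and receive contends on the channel's internal runtime mutex.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// contendChan has `workers` goroutines alternate sends and receives on
// one shared buffered channel. Every channel operation takes the
// channel's internal runtime lock, so throughput degrades as the
// number of competing threads grows.
func contendChan(workers, opsPerWorker int) time.Duration {
	ch := make(chan int, 64)
	var wg sync.WaitGroup
	start := time.Now()
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < opsPerWorker; i++ {
				ch <- i // contends on the channel's lock
				<-ch
			}
		}()
	}
	wg.Wait()
	return time.Since(start)
}

func main() {
	// Timings vary by machine; the point is the trend as workers grow.
	for _, n := range []int{1, 4, 8} {
		fmt.Printf("workers=%d took %v\n", n, contendChan(n, 10000))
	}
}
```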
New Proposal: Add Spinning State
Analysis shows that the current lock2 implementation allows threads to sleep in theory, but in practice all threads spin, causing slower lock hand‑off and high CPU consumption. The author therefore submitted the design "Proposal: Improve scalability of runtime.lock2" [2].
Core Optimizations
The mutex state word now includes a new flag, mutexSpinning:
const (
	mutexLocked   = 0x001
	mutexSleeping = 0x002
	mutexSpinning = 0x100
	...
)
The spinning bit indicates that a thread is "awake and actively trying to acquire the lock". Threads compete for the spinning state but do not block while attempting to set the flag.
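To make the non-blocking claim concrete, here is a user-space sketch of claiming a spinning bit with a single CAS. The bit values mirror the constants above, but trySetSpinning and its uint32 state word are illustrative, not runtime code.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Illustrative state bits, mirroring the runtime's constants.
const (
	locked   uint32 = 0x001
	sleeping uint32 = 0x002
	spinning uint32 = 0x100
)

// trySetSpinning attempts to claim the single spinner slot with one
// CAS. It never loops or blocks: losing the race simply means some
// other thread is already the designated spinner.
func trySetSpinning(state *uint32) bool {
	v := atomic.LoadUint32(state)
	if v&spinning != 0 {
		return false // someone is already spinning
	}
	return atomic.CompareAndSwapUint32(state, v, v|spinning)
}

func main() {
	var state uint32 = locked
	fmt.Println(trySetSpinning(&state)) // true: slot was free
	fmt.Println(trySetSpinning(&state)) // false: slot already taken
}
```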
Background on sync.Mutex internals is covered in an earlier article: https://pub.huizhou92.com/go-source-code-sync-mutex-3082a25ef092 [3]
Mutex Lock Acquisition Analysis
1. Fast Path: Attempt to Acquire the Lock
// https://github.com/golang/go/blob/adc9c455873fef97c5759e4811f0d9c8217fe27b/src/runtime/lock_spinbit.go#L160
k8 := key8(&l.key)
v8 := atomic.Xchg8(k8, mutexLocked)
if v8&mutexLocked == 0 {
	if v8&mutexSleeping != 0 {
		atomic.Or8(k8, mutexSleeping)
	}
	return
}
The fast path behaves similarly to previous versions: if the lock is free, it returns immediately; this is the ideal, uncontended case.
2. Spinning Wait Phase
// https://github.com/golang/go/blob/adc9c455873fef97c5759e4811f0d9c8217fe27b/src/runtime/lock_spinbit.go#L208
if !weSpin && v&mutexSpinning == 0 && atomic.Casuintptr(&l.key, v, v|mutexSpinning) {
	v |= mutexSpinning
	weSpin = true
}
if weSpin || atTail || mutexPreferLowLatency(l) {
	if i < spin {
		procyield(mutexActiveSpinSize) // active spin
	} else if i < spin+mutexPassiveSpinCount {
		osyield() // passive spin
	}
}
If the fast path fails, execution enters the spinning phase.
The mutexSpinning flag ensures that only one thread spins on the lock at a time.
Active spin (procyield) keeps the CPU busy for very short waits, while passive spin (osyield) yields the CPU for longer waits, balancing latency and CPU usage.
Light contention uses active spin for low latency; heavy contention quickly switches to passive spin to avoid wasting CPU cycles.
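The two spin flavors can be imitated in user space. spinWait below is an illustrative sketch in which runtime.Gosched stands in for osyield, and a short counted loop stands in for procyield (whose PAUSE-instruction spin has no exported Go equivalent); the constants and function name are ours.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

const (
	activeSpins  = 4 // rounds of on-CPU busy waiting (runtime: procyield)
	passiveSpins = 2 // rounds of thread yielding (runtime: osyield)
)

// spinWait models the two spin phases: brief busy loops first, betting
// the holder releases within nanoseconds, then OS-level yields for
// slightly longer holds. It returns true if the lock (state == 0) was
// observed free, false once the spin budget is exhausted, at which
// point a real lock would move to the sleep phase.
func spinWait(state *uint32) bool {
	for i := 0; ; i++ {
		if atomic.LoadUint32(state) == 0 {
			return true
		}
		switch {
		case i < activeSpins:
			for j := 0; j < 30; j++ {
				// active spin: stay on this CPU
			}
		case i < activeSpins+passiveSpins:
			runtime.Gosched() // passive spin: give the CPU away
		default:
			return false // budget exhausted; sleep next
		}
	}
}

func main() {
	var free uint32 // 0 = unlocked
	fmt.Println(spinWait(&free)) // true: observed free immediately

	var held uint32 = 1
	fmt.Println(spinWait(&held)) // false: spin budget exhausted
}
```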
3. Sleep Wait Phase
// https://github.com/golang/go/blob/adc9c455873fef97c5759e4811f0d9c8217fe27b/src/runtime/lock_spinbit.go#L231
// Store the current head of the list of sleeping Ms in our gp.m.mWaitList.next field
gp.m.mWaitList.next = mutexWaitListHead(v)
// Pack a (partial) pointer to this M with the current lock state bits
next := (uintptr(unsafe.Pointer(gp.m)) &^ mutexMMask) | v&mutexMMask | mutexSleeping
if weSpin {
	next = next &^ mutexSpinning
}
if atomic.Casuintptr(&l.key, v, next) {
	weSpin = false
	semasleep(-1)
	atTail = gp.m.mWaitList.next == 0
	i = 0
}
If spinning fails, the thread adds itself to the wait list and sleeps via semasleep; it is woken through the semaphore when the lock holder releases the lock.
The runtime uses a spinbit design: when a thread is in the "awake‑and‑spinning" state, other threads are not woken, reducing contention and unnecessary context switches.
Results
goos: linux
goarch: amd64
pkg: runtime
cpu: 13th Gen Intel(R) Core(TM) i7-13700H
                │     old     │               new               │
                │   sec/op    │    sec/op      vs base          │
ChanContended     3.147µ ± 0%    3.703µ ± 0%   +17.65% (p=0.000 n=10)
... (omitted intermediate rows) ...
geomean           17.60µ         12.46µ        -29.22%
Although performance may drop slightly under low contention, the changes deliver significant gains under heavy contention, averaging about a 29% (geomean) improvement.
The mutex modification does not affect the API; it becomes active automatically with Go 1.24. The feature can be toggled with GOEXPERIMENT=spinbitmutex, which is enabled by default.
References
[1] Improvement proposal: https://github.com/golang/go/issues/68578
[2] Proposal: Improve scalability of runtime.lock2 – https://github.com/golang/proposal/blob/master/design/68578-mutex-spinbit.md
[3] https://pub.huizhou92.com/go-source-code-sync-mutex-3082a25ef092
[4] mutexSpinning – https://github.com/golang/go/blob/608acff8479640b00c85371d91280b64f5ec9594/src/runtime/lock_spinbit.go#L60
[5] semasleep – https://github.com/golang/go/blob/fd050b3c6d0294b6d72adb014ec14b3e6bf4ad60/src/runtime/lock_sema_tristate.go#L106
[6] https://github.com/golang/go/issues/68578#issuecomment-2256792628
