How Linux Guarantees Atomic Operations on ARM: Inside the LDREX/STREX Mechanism
This article examines the Linux kernel’s implementation of atomic variables on ARM architectures, detailing how the LDREX and STREX instructions provide atomicity in both UP and SMP systems, analyzing source code from arch/arm/include/asm/atomic.h, and illustrating various concurrency scenarios with diagrams and step‑by‑step explanations.
Background and Motivation
The author revisits Linux’s concurrency control mechanisms to understand how the kernel implements atomic variables on ARM processors. A solid grasp of these low‑level primitives is essential for writing correct drivers and kernel code.
Location and Structure of the Source
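The relevant implementation resides in arch/arm/include/asm/atomic.h. The file contains two major branches:
ARMv6 and newer – uses the exclusive monitor mechanism with the LDREX and STREX instructions.
Pre‑ARMv6 – relies on disabling local CPU interrupts to achieve atomicity. This is sufficient because the kernel does not support SMP on these cores, so an interrupt is the only thing that can break into a read‑modify‑write sequence.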
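The pre‑ARMv6 branch can be sketched in user space as follows. In the kernel this path wraps a plain `v->counter += i` in `raw_local_irq_save()` / `raw_local_irq_restore()`; the sketch below replaces those with mock functions, because interrupts cannot be masked from an ordinary program. The names `my_atomic_t`, `mock_irq_save`, `mock_irq_restore`, `mock_irqs_enabled` and `my_atomic_add_up` are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's atomic_t. */
typedef struct { int counter; } my_atomic_t;

/* Mock of the CPU's interrupt-enable state: 1 = enabled, 0 = masked. */
static int mock_irqs_enabled = 1;

/* Mock of raw_local_irq_save(): remember the old state, then mask interrupts. */
static unsigned long mock_irq_save(void)
{
    unsigned long flags = (unsigned long)mock_irqs_enabled;
    mock_irqs_enabled = 0;
    return flags;
}

/* Mock of raw_local_irq_restore(): put back whatever state was saved. */
static void mock_irq_restore(unsigned long flags)
{
    mock_irqs_enabled = (int)flags;
}

/* Uniprocessor-style atomic add: the update runs with interrupts masked. */
static void my_atomic_add_up(int i, my_atomic_t *v)
{
    unsigned long flags = mock_irq_save();
    assert(mock_irqs_enabled == 0);  /* no handler on this CPU can run here */
    v->counter += i;
    mock_irq_restore(flags);
}
```

Saving and restoring the previous state, rather than unconditionally re‑enabling interrupts, keeps the function safe to call from code that already runs with interrupts masked.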
Exclusive Monitor Mechanism (ARMv6+)
ARMv6 introduced exclusive monitors (a local monitor per CPU and a global monitor for shareable memory) that let a pair of instructions perform an atomic read‑modify‑write sequence. LDREX loads a value and tags the address as exclusively accessed; STREX attempts to store the new value and succeeds only if that tag is still valid, reporting the outcome in a status register.
Detailed Analysis of atomic_add
static inline void atomic_add(int i, atomic_t *v)
{
    unsigned long tmp;
    int result;

    __asm__ __volatile__("@ atomic_add\n"
"1:  ldrex   %0, [%3]\n"
"    add     %0, %0, %4\n"
"    strex   %1, %0, [%3]\n"
"    teq     %1, #0\n"
"    bne     1b"
    : "=&r" (result), "=&r" (tmp), "+Qo" (v->counter)
    : "r" (&v->counter), "Ir" (i)
    : "cc");
}

Explanation of each step:
LDREX %0, [%3] – loads the current counter value into result and sets the exclusive monitor for the address &v->counter.
ADD %0, %0, %4 – adds the increment i to the loaded value.
STREX %1, %0, [%3] – attempts to store the new value held in result; the status flag (0 = success, non‑zero = failure) is written to tmp.
TEQ %1, #0 – tests whether the store succeeded.
BNE 1b – if the store failed, the sequence repeats from the LDREX.
The loop guarantees that the addition is performed atomically even if other CPUs or interrupts intervene.
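The same load, modify, conditional store, retry shape can be tried out in portable C11. The sketch below is not the kernel's code: it uses `atomic_compare_exchange_weak_explicit` from `<stdatomic.h>`, which is allowed to fail spuriously much like STREX, and which compilers targeting ARMv7 typically lower to an ldrex/strex pair. The names `my_atomic_t`, `my_atomic_add`, `worker`, `run_demo`, `NTHREADS` and `NITERS` are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define NTHREADS 4
#define NITERS   100000

/* Hypothetical user-space stand-in for the kernel's atomic_t. */
typedef struct { _Atomic int counter; } my_atomic_t;

/* Retry loop with the same shape as the ldrex/strex sequence:
 * read the old value, compute the new one, try to publish it,
 * and start over if the counter changed in the meantime. */
static void my_atomic_add(int i, my_atomic_t *v)
{
    int old = atomic_load_explicit(&v->counter, memory_order_relaxed);

    /* On failure the call reloads 'old' with the current value,
     * so 'old + i' is recomputed on every pass. */
    while (!atomic_compare_exchange_weak_explicit(&v->counter, &old, old + i,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed))
        ;
}

static void *worker(void *arg)
{
    my_atomic_t *v = arg;

    for (int n = 0; n < NITERS; n++)
        my_atomic_add(1, v);
    return NULL;
}

/* Runs NTHREADS workers against one counter and returns the final value. */
static int run_demo(void)
{
    my_atomic_t v;
    pthread_t t[NTHREADS];

    atomic_init(&v.counter, 0);
    for (int k = 0; k < NTHREADS; k++) {
        int rc = pthread_create(&t[k], NULL, worker, &v);
        assert(rc == 0);
    }
    for (int k = 0; k < NTHREADS; k++)
        pthread_join(t[k], NULL);
    return atomic_load(&v.counter);
}
```

Calling `run_demo()` from a `main` function and building with `-pthread` should yield `NTHREADS * NITERS`, because no increment is lost. Relaxed ordering is used here because the sketch only demonstrates atomicity of the update, not ordering against other memory accesses.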
Concurrency Scenarios
The article illustrates three typical cases with diagrams.
1. UP system or SMP system where the variable is not shared between CPUs
Only one CPU accesses the variable, so only the local monitor matters. If an interrupt arrives between the LDREX and the STREX and its handler also runs an LDREX/STREX sequence on the same variable, the handler's operation succeeds while the pre‑empted operation fails its STREX and retries.
2. SMP system with shared variable
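Both CPUs coordinate through the global monitor. Each CPU's LDREX tags the address for that CPU; whichever CPU executes its STREX first succeeds, and that store clears the other CPU's tag, so the other CPU's STREX fails and it retries from LDREX.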
3. Nested interrupt on the same CPU
When an interrupt on the same CPU accesses the same atomic variable, the handler's LDREX re‑arms the local monitor and its STREX succeeds and clears it. The interrupted code's STREX then finds the monitor cleared, fails, and the original operation retries.
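The interleavings in these scenarios can be traced with a toy model. The sketch below is a deliberate simplification, not a description of real hardware: it tracks one shared word and one "exclusive" flag per CPU, and ignores details such as reservation granule size, shareability attributes and implementation‑defined monitor behaviour. All names (`struct model`, `model_ldrex`, `model_strex`, `scenario_interrupt`, `scenario_smp`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define NCPUS 2

/* Toy model: one shared word plus one "exclusive" flag per CPU. */
struct model {
    int  mem;               /* the shared counter */
    bool exclusive[NCPUS];  /* true while that CPU's monitor is armed */
};

/* Load the word and arm the calling CPU's monitor. */
static int model_ldrex(struct model *m, int cpu)
{
    m->exclusive[cpu] = true;
    return m->mem;
}

/* Returns 0 on success and 1 on failure, like strex's status register. */
static int model_strex(struct model *m, int cpu, int value)
{
    if (!m->exclusive[cpu])
        return 1;
    m->mem = value;
    for (int c = 0; c < NCPUS; c++)
        m->exclusive[c] = false;  /* a successful store disarms every monitor */
    return 0;
}

/* An interrupt handler on CPU0 runs between the thread's ldrex and strex. */
static int scenario_interrupt(int *retries)
{
    struct model m = { .mem = 0 };
    int old = model_ldrex(&m, 0);            /* thread: ldrex */

    int irq_old = model_ldrex(&m, 0);        /* handler: full sequence */
    int irq_status = model_strex(&m, 0, irq_old + 1);
    assert(irq_status == 0);

    *retries = 0;
    while (model_strex(&m, 0, old + 1) != 0) {  /* thread: strex, then retry */
        (*retries)++;
        old = model_ldrex(&m, 0);
    }
    return m.mem;
}

/* Both CPUs execute ldrex; CPU1's store lands first. */
static int scenario_smp(int *retries)
{
    struct model m = { .mem = 0 };
    int old0 = model_ldrex(&m, 0);
    int old1 = model_ldrex(&m, 1);

    int status1 = model_strex(&m, 1, old1 + 1);  /* CPU1 succeeds */
    assert(status1 == 0);

    *retries = 0;
    while (model_strex(&m, 0, old0 + 1) != 0) {  /* CPU0 fails once, retries */
        (*retries)++;
        old0 = model_ldrex(&m, 0);
    }
    return m.mem;
}
```

In both scenarios the counter ends at 2 after exactly one retry: the losing store is rejected instead of overwriting the winner's update with a stale value.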
Key Takeaways
The analysis shows that the ARM kernel’s atomic primitives, built on LDREX / STREX, reliably provide atomicity even in the presence of interrupts and across multiple CPUs. Consequently, Linux kernel code that relies on these primitives can be trusted for correct synchronization on ARM platforms.
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.
