Why the volatile Keyword Matters: Preventing Compiler Optimizations in Multithreaded C Code
This article explains the purpose of C's volatile keyword, how it influences compiler optimizations, its role in multithreaded scenarios, the distinction between visibility and atomicity, and why memory barriers or locks are preferred for proper synchronization.
volatile and the compiler
When learning C you encounter the seemingly mysterious volatile keyword. To illustrate its effect, consider the following code:
int busy = 1;
void wait() {
    while (busy) {
        ;
    }
}
Compiled with -O2 optimization, the generated assembly looks like:
wait:
mov eax, DWORD PTR busy[rip]
.L2:
test eax, eax
jne .L2
ret
busy:
.long 1
The value of busy is loaded into eax once, before the loop, and the loop condition then tests the register rather than the memory location of busy. This optimization is valid as long as no other thread modifies busy.
If another thread can change busy, the optimized loop never sees the update:
int busy = 1;
// thread A
void wait() { while (busy) { ; } }
// thread B
void signal() { busy = 0; }
Because the loop only tests the value cached in eax, the change made by signal is invisible to it.
Marking the variable as volatile forces the compiler to reload it from memory each iteration, producing assembly that repeatedly reads the variable:
wait:
.L2:
mov eax, DWORD PTR busy[rip]
test eax, eax
jne .L2
ret
busy:
.long 1
Now each iteration reloads busy from memory instead of reusing the register, so the loop can observe a store made by another thread.
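For reference, C source that would lead to this kind of assembly might look like the minimal sketch below. The names wait_until_idle and signal_idle are hypothetical, chosen here only to avoid clashing with the standard wait and signal functions; the single change that matters is the volatile qualifier on busy.

```c
/* Sketch: the busy flag, now declared volatile. */
volatile int busy = 1;

/* Spins until busy becomes 0. Because busy is volatile, the compiler
 * must emit a fresh load on every iteration instead of reusing a
 * value it already holds in a register. */
void wait_until_idle(void) {
    while (busy) {
        ; /* spin */
    }
}

/* Clears the flag; the compiler may not drop or merge this store. */
void signal_idle(void) {
    busy = 0;
}
```

The qualifier constrains only the code the compiler generates. It says nothing about atomicity or about how the hardware orders this store relative to other memory operations.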
volatile and multithreading
While volatile guarantees that a variable’s latest value is read, it does **not** provide atomicity. Consider a complex structure:
struct data { int a; int b; int c; /* ... */ };
volatile struct data foo;
void thread1() { foo.a = 1; foo.b = 2; foo.c = 3; }
void thread2() { int a = foo.a; int b = foo.b; int c = foo.c; }
Using volatile forces every field access to go to memory, but it does not prevent race conditions when multiple fields are updated concurrently: thread 2 can read a new foo.a together with an old foo.b or foo.c. Proper synchronization (e.g., mutexes) is required, and a mutex already provides the visibility guarantees that volatile offers, making volatile unnecessary in such cases.
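A mutex-protected version of the structure could be sketched as follows, assuming POSIX threads are available. The helper names write_foo and read_foo and the lock name foo_lock are hypothetical, introduced only for illustration.

```c
#include <pthread.h>

struct data { int a; int b; int c; };

/* No volatile here: the mutex both serializes the accesses and
 * makes the writer's stores visible to the next thread that locks it. */
static struct data foo;
static pthread_mutex_t foo_lock = PTHREAD_MUTEX_INITIALIZER;

/* Writer: updates all three fields as one unit. */
void write_foo(int a, int b, int c) {
    pthread_mutex_lock(&foo_lock);
    foo.a = a;
    foo.b = b;
    foo.c = c;
    pthread_mutex_unlock(&foo_lock);
}

/* Reader: copies a consistent snapshot while holding the lock. */
struct data read_foo(void) {
    pthread_mutex_lock(&foo_lock);
    struct data snapshot = foo;
    pthread_mutex_unlock(&foo_lock);
    return snapshot;
}
```

Because the reader copies the whole structure under the lock, it never sees a mix of old and new fields, which is exactly what volatile alone could not guarantee.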
volatile and memory ordering
For simple variables like volatile int busy = 0;, one might think a busy‑wait loop in thread A can reliably detect changes made by thread B. However, modern CPUs employ caches and may reorder memory operations, leading to surprising results.
Example of reordering:
// Thread 1        // Thread 2
X = 10;            if (!busy)
busy = 0;              Y = X;
If the write to busy becomes visible to thread 2 before the write to X, thread 2 may observe busy == 0 but still read the old value of X. volatile cannot prevent such reordering; a memory barrier (or stronger synchronization primitives) is needed.
Memory barriers are special CPU instructions that constrain the ordering of memory operations before and after the barrier, eliminating the reordering problem while also providing the visibility guarantees of volatile. Consequently, in multithreaded code you almost never need volatile; use locks or explicit memory barriers instead.
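In C11 and later, such barriers are usually requested through <stdatomic.h> rather than written as raw CPU instructions. The following is a minimal sketch of the X and busy example under that assumption; the function names producer and consumer are hypothetical, and release/acquire ordering is one reasonable choice here, not the only one.

```c
#include <stdatomic.h>

int X = 0;            /* ordinary, non-atomic data */
int Y = 0;
atomic_int busy = 1;  /* the flag is atomic instead of volatile */

/* Thread 1: write X, then clear the flag with release ordering, so
 * the write to X cannot become visible after the write to busy. */
void producer(void) {
    X = 10;
    atomic_store_explicit(&busy, 0, memory_order_release);
}

/* Thread 2: returns 1 if it saw the flag cleared and copied X,
 * 0 otherwise. The acquire load pairs with the release store. */
int consumer(void) {
    if (atomic_load_explicit(&busy, memory_order_acquire) == 0) {
        Y = X;  /* sees X == 10 once busy == 0 has been observed */
        return 1;
    }
    return 0;
}
```

If consumer observes busy == 0, the release/acquire pair guarantees it also observes X == 10. The compiler inserts whatever barrier instructions the target CPU needs to honor that.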
Key takeaways
volatile only prevents the compiler from optimizing away reads/writes to a variable; it does not make accesses atomic, nor does it stop CPU reordering. For proper synchronization in multithreaded programs, prefer mutexes, atomic operations, or explicit memory barriers.