Why Does __dispatch_barrier_waiter_redirect_or_wake Crash on iOS 14‑16? A Deep GCD Investigation
The article analyses a recurring crash in an iOS driver app caused by the __dispatch_barrier_waiter_redirect_or_wake function in libdispatch, explains the EXC_BREAKPOINT/SIGTRAP mechanism, compares libdispatch versions, reveals a premature queue release due to non‑atomic reference‑count handling, and proposes replacing GCD‑based barriers with pthread_rwlock_t to eliminate the bug.
Background
The driver side of an iOS app crashes repeatedly in _dispatch_barrier_waiter_redirect_or_wake. The crash stack shows EXC_BREAKPOINT (SIGTRAP) originating from libdispatch.dylib on iOS 14.0 – 16.2 devices, specifically at instruction offset +256.
Root‑Cause Investigation
EXC_BREAKPOINT (SIGTRAP) is raised when Grand Central Dispatch (GCD) detects an unrecoverable internal error and deliberately traps. The stack points to __dispatch_barrier_waiter_redirect_or_wake + 256:
0 libdispatch.dylib __dispatch_barrier_waiter_redirect_or_wake + 256
1 libdispatch.dylib __dispatch_lane_invoke + 764
2 libdispatch.dylib __dispatch_workloop_worker_thread + 648
3 libsystem_pthread.dylib __pthread_wqthread + 288
Disassembly of the surrounding instructions shows an atomic decrement (ldaddl) on a reference‑count field:
ldr w8, [x21, #0x8] // load refcnt
mov w9, #0x7fffffff // global refcnt sentinel
cmp w8, w9
b.eq normal_exit // skip decrement if global
add x8, x21, #0x8 // address of refcnt field
mov w9, #-0x2 // decrement by 2
ldaddl w9, w8, [x8] // atomic subtract
cmp w8, #0x1
b.gt normal_exit // if previous value >1, normal exit
// otherwise crash due to underflow
The analysis indicates that x21 (the queue object dq) has likely already been deallocated. In that case ldaddl operates on stale memory, the previous value it returns is 1 or lower, and the underflow check traps.
Version Comparison
Reference‑count read/write became fully atomic starting with libdispatch‑1462.0.4 (released in 2023 for iOS 17). Earlier versions such as libdispatch‑1271.100.5 (used in iOS 14‑16) performed non‑atomic reads, allowing a race where the refcnt could be observed as zero and the queue freed prematurely.
GitHub source: https://github.com/apple-oss-distributions/libdispatch/tags?after=libdispatch-1173.100.2
Hypothesis
Many business‑side classes implement read‑write locks using GCD barriers (e.g., dispatch_barrier_sync, queue.sync(flags: .barrier)). The non‑atomic reference‑count read in older libdispatch can cause the queue’s refcnt to reach zero under concurrent access, leading to early deallocation and the observed crash.
Solution
The most reliable fix is to replace GCD‑based barrier locks with POSIX read‑write locks (pthread_rwlock_t). Such a lock does not depend on the lifetime of a dispatch queue object. Conceptually, a read‑write lock combines three components:
Mutex: protects internal state.
Read Counter: tracks the number of readers.
Condition Variable: wakes waiting readers or writers.
After migrating the safety classes to pthread_rwlock_t, crash frequency dropped dramatically; the issue disappeared in newer releases and only occasional occurrences remained in legacy builds.
Conclusion
The crash on iOS 14‑16 stems from a non‑atomic reference‑count operation inside GCD’s barrier handling, causing premature queue release. libdispatch ships with the OS, so an app cannot update it on its own: devices running Apple’s newer libdispatch (iOS 16.3/17) are not affected, and on older systems replacing GCD barriers with pthread_rwlock_t avoids the faulty code path. The analysis is based on source‑level inspection and runtime observations; official Apple documentation does not describe this bug.
