Investigating a Clang -Oz Optimization Bug that Triggers Memory Bloat in a Video Component
The article describes how enabling the aggressive size‑optimisation flag -Oz in Clang caused a video component to create numerous GLFramebuffer and CVPixelBuffer objects, leading to OOM crashes, and explains the underlying ARC and Machine Outliner interactions that exposed a compiler bug which was later fixed in LLVM.
The author, a member of ByteDance's AppHealth team, discovered a compiler defect while using the Clang optimisation flag -Oz to reduce binary size for a video component. Enabling this flag caused the component to allocate many GLFramebuffer objects, each holding a 2 MB CVPixelBuffer, resulting in rapid memory growth and OOM crashes.
Investigation revealed that the buffers were not being reused because the call to -[GLFramebuffer unlock] was delayed. The framebuffers were owned by a SampleData object, and the unlock was triggered from -[SampleData dealloc]. The SampleData objects, however, were accumulating in an autorelease pool and were released only after the export task finished.
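The effect of the delayed unlock on memory can be illustrated with a small toy model. This is only a sketch in Python, not the component's code: FramebufferPool, acquire, and peak_memory_mb are hypothetical names, and the only figure taken from the article is the 2 MB buffer size.

```python
from dataclasses import dataclass

BUFFER_MB = 2  # per-buffer size reported in the article


@dataclass
class FramebufferPool:
    """Hypothetical stand-in for the component's framebuffer cache."""
    free: int = 0       # buffers that have been unlocked and can be reused
    allocated: int = 0  # total number of buffers created so far

    def acquire(self) -> None:
        if self.free > 0:
            self.free -= 1        # reuse an unlocked buffer
        else:
            self.allocated += 1   # nothing free: create another 2 MB buffer

    def unlock(self) -> None:
        self.free += 1


def peak_memory_mb(frames: int, unlock_immediately: bool) -> int:
    """Memory held by the pool after processing `frames` frames."""
    pool = FramebufferPool()
    pending = 0  # buffers whose owner is still waiting in an autorelease pool
    for _ in range(frames):
        pool.acquire()
        if unlock_immediately:
            pool.unlock()         # owner deallocated right away
        else:
            pending += 1          # owner lives until the pool drains
    for _ in range(pending):      # drain happens only when the task ends
        pool.unlock()
    return pool.allocated * BUFFER_MB
```

With prompt unlocking, one buffer is recycled for every frame. With deferred unlocking, every frame creates a new buffer, so memory grows linearly with the number of frames processed before the pool drains.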
To pinpoint the release timing, the team added a custom tracker class compiled with -fno-objc-arc to disable ARC, overrode its autorelease method, and set a breakpoint there. By swapping the superclass of SampleData to this tracker, they observed that the autorelease occurred inside -[CompileReaderUnit processSampleData:].
Wrapping the call to [self videoReaderOutput] in an explicit @autoreleasepool eliminated the memory spike. This confirmed that the returned object was being autoreleased: the ARC optimisation that normally elides the autorelease was no longer taking effect in the -Oz build.
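Why an explicit pool scope helps can be shown with a toy model of nested autorelease pools. This is an illustrative Python sketch under stated assumptions, not the Objective-C runtime: AutoreleasePoolStack and peak_live_objects are hypothetical names, and the model only counts how many autoreleased objects are alive at once.

```python
from contextlib import contextmanager


class AutoreleasePoolStack:
    """Toy model: autoreleased objects live until their enclosing pool drains."""

    def __init__(self):
        self._pools = [[]]  # outermost pool, drained only when the task ends
        self.live = 0       # autoreleased objects currently alive
        self.peak = 0       # highest value `live` has reached

    def autorelease(self, obj):
        self._pools[-1].append(obj)  # object joins the innermost pool
        self.live += 1
        self.peak = max(self.peak, self.live)

    @contextmanager
    def pool(self):
        """Rough analogue of an @autoreleasepool { ... } block."""
        self._pools.append([])
        try:
            yield
        finally:
            drained = self._pools.pop()
            self.live -= len(drained)  # everything in the scope is released


def peak_live_objects(frames, scoped):
    stack = AutoreleasePoolStack()
    for i in range(frames):
        if scoped:
            with stack.pool():  # drain after every frame
                stack.autorelease(f"sample-{i}")
        else:
            stack.autorelease(f"sample-{i}")  # piles up in the outer pool
    return stack.peak
```

In this model the scoped variant never holds more than one sample at a time, while the unscoped variant holds all of them until the outer pool drains. That matches the behaviour described above once the per-call pool was added.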
The root cause lies in LLVM's Machine Outliner, which extracts small, repeated instruction sequences into separate functions to reduce code size. When the outliner extracts the ARC-related calls (objc_msgSend and objc_retainAutoreleasedReturnValue), the special marker instruction mov x29, x29, which signals to the runtime that the returned object will be retained immediately, is lost, preventing the ARC optimisation from firing.
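The handshake can be sketched as a toy model in which the callee looks at the instruction located at its return address and skips the autorelease only if it finds the marker. This is a simplified Python illustration, not the runtime's implementation. The instruction lists are hypothetical, and the exact shape of the outlined code (including the name _OUTLINED_FUNCTION_0) is an assumption made for the example.

```python
MARKER = "mov x29, x29"  # marker instruction named in the article


def caller_accepts_optimized_return(code, return_index):
    """Callee-side check: is the instruction at the return address the marker?"""
    return return_index < len(code) and code[return_index] == MARKER


def is_autoreleased(code, call="bl _objc_msgSend"):
    """True if the returned object ends up in an autorelease pool."""
    return_index = code.index(call) + 1  # instruction the call returns to
    return not caller_accepts_optimized_return(code, return_index)


# Hypothetical caller before outlining: the marker directly follows the call.
inline_caller = [
    "bl _objc_msgSend",
    "mov x29, x29",
    "bl _objc_retainAutoreleasedReturnValue",
]

# Hypothetical caller after outlining: the marker is no longer at the
# return address, so the callee falls back to a real autorelease.
outlined_caller = [
    "bl _objc_msgSend",
    "bl _OUTLINED_FUNCTION_0",
]
```

Here is_autoreleased(inline_caller) is False and is_autoreleased(outlined_caller) is True. The check depends on the marker sitting exactly at the return address, so any transformation that moves or drops it silently disables the fast path.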
Consequently, the object is placed in an autorelease pool, leading to the observed memory bloat. This behaviour is a compiler bug that was recently fixed in LLVM. Similar issues can appear in Swift when ARC optimisations such as -enable-copy-propagation are disabled.
References include LLVM documentation on ARC, the Machine Outliner source, related talks, the bug‑fix commit, and WWDC 21 Session 10216.
ByteDance Terminal Technology