
How Sanitizer Interceptors Detect Memory Bugs in Linux C++ Programs

This article explains the principles behind Google’s sanitizer tools, especially AddressSanitizer, covering symbol interposition, the interceptor mechanism, and how these techniques replace libc functions to reliably locate memory and thread errors in C++ applications on Linux.

ByteDance SYS Tech

Background

For C++ developers, buffer overflows, dangling pointers, data races, deadlocks, and similar errors cause unexpected behavior and affect safety and stability. Quickly locating these issues is difficult. Google’s open‑source sanitizer tools help locate memory and thread errors efficiently, and are widely used in ByteDance’s core services.

Sanitizer Overview

The sanitizers are a set of dynamic analysis tools that have been integrated into Clang and GCC since Clang 3.1 and GCC 4.8. They detect memory errors (AddressSanitizer), memory leaks (LeakSanitizer), data races (ThreadSanitizer), undefined behavior (UndefinedBehaviorSanitizer), and reads of uninitialized memory (MemorySanitizer). Each sanitizer consists of compile‑time instrumentation and a run‑time library.

ASan Example

ASan instruments every memory access with a check against shadow memory and places poisoned red zones around stack and global variables. Its run‑time library replaces the allocation functions (malloc, free, operator new, operator delete) with its own allocator, which holds freed blocks in a quarantine so that later accesses to them can still be detected. The run‑time library also intercepts many libc functions, such as memcpy, memmove, strcpy, strcat, and pthread_create.
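The shadow-memory check can be sketched in a few lines. This is a simplified model, not ASan's generated code: it assumes the default x86-64 Linux mapping (8 application bytes per shadow byte, shadow offset 0x7fff8000), and the names shadow_address and access_is_bad are hypothetical helpers introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed x86-64 Linux defaults: 8 application bytes map to 1 shadow byte,
// and the shadow region starts at a fixed offset.
constexpr unsigned kShadowScale = 3;
constexpr std::uintptr_t kShadowOffset = 0x7fff8000;

// Address of the shadow byte that describes the 8-byte granule containing addr.
std::uintptr_t shadow_address(std::uintptr_t addr) {
  return (addr >> kShadowScale) + kShadowOffset;
}

// Shadow byte encoding:
//   0         all 8 bytes of the granule are addressable
//   1..7      only the first k bytes are addressable
//   negative  the whole granule is poisoned (red zone, freed memory, ...)
// Returns true when an access of `size` bytes (1..8, inside one granule)
// starting at addr touches a byte that is not addressable.
bool access_is_bad(std::int8_t shadow_value, std::uintptr_t addr,
                   std::size_t size) {
  assert(size >= 1 && size <= 8);
  if (shadow_value == 0) return false;
  const int last_byte = static_cast<int>(addr & 7) + static_cast<int>(size) - 1;
  return last_byte >= shadow_value;
}
```

With a shadow value of 4, a 4-byte access at the start of the granule is accepted, while the same access two bytes later is rejected because it reaches past the fourth byte. A negative shadow value rejects every access, which is how red zones and quarantined blocks are reported.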

Symbol Interposition

Before discussing sanitizer interceptors, we review symbol interposition. If the application, or a shared library loaded via LD_PRELOAD, defines a function with the same name as a libc symbol (e.g., malloc), the dynamic linker binds calls to the first definition it finds in its search order. That order is breadth‑first: the executable, then its DT_NEEDED libraries, then their dependencies. Libraries listed in LD_PRELOAD are searched right after the executable and before all other libraries, which is what allows them to replace libc symbols.
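A minimal sketch of interposition from the executable itself, assuming a dynamically linked program on Linux with glibc. It defines its own time(), so calls bind to this definition instead of libc's. The frozen value, the counter, and self_check are hypothetical and exist only for the demonstration.

```cpp
#include <cassert>
#include <ctime>

namespace {
constexpr std::time_t kFrozenTime = 1000000000;  // arbitrary illustrative value
int g_time_calls = 0;                            // how often our definition ran
}  // namespace

// Same name and signature as libc's time(). The executable comes first in the
// symbol search order, so calls to time() resolve to this definition.
extern "C" std::time_t time(std::time_t* out) noexcept {
  ++g_time_calls;
  if (out != nullptr) *out = kFrozenTime;
  return kFrozenTime;
}

// Checks that a call to time() really reached the definition above.
void self_check() {
  const int before = g_time_calls;
  const std::time_t now = std::time(nullptr);
  assert(now == kFrozenTime);
  assert(g_time_calls == before + 1);
  (void)before;
  (void)now;
}
```

Calling self_check() from main exercises the replacement. Placing the same definition in a shared library and loading it with LD_PRELOAD would replace time() for an unmodified program for the same reason: the preloaded library is searched before libc.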

Example ELF symbol‑binding output and a diagram illustrate the search order.

Figure: Symbol binding order diagram

Sanitizer Interceptor Implementation

The interceptor mechanism is demonstrated by the test file interception_linux_test.cpp in LLVM's compiler‑rt. The test replaces isdigit with a custom implementation that increments a counter, then verifies the replacement using the macros INTERCEPTOR, INTERCEPT_FUNCTION, DECLARE_REAL, and REAL.

Key steps performed by the INTERCEPTOR macro:

Declare a function‑pointer real_isdigit in the __interception namespace.

Define a weak alias isdigit that points to __interceptor_isdigit.

Implement the custom logic in __interceptor_isdigit.
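The three steps can be written out by hand as a rough sketch. This is not the macro's actual expansion. To avoid reserved identifiers it uses the hypothetical names interception_sketch and interceptor_rand where the real macro uses __interception and an __interceptor_ prefix, and it wraps rand rather than isdigit, on the assumption that some compilers treat isdigit as a builtin and may inline calls to it. It relies on the GCC/Clang weak and alias attributes on Linux.

```cpp
#include <cassert>

namespace interception_sketch {
// Step 1: pointer that will later hold the address of the original function.
// It stays null here; filling it in is the job of the lookup step.
int (*real_rand)() = nullptr;
int rand_calls = 0;  // hypothetical counter standing in for "custom logic"
}  // namespace interception_sketch

// Step 3: the wrapper that carries the custom logic.
extern "C" int interceptor_rand() noexcept {
  ++interception_sketch::rand_calls;
  return 7;  // arbitrary fixed value so the effect is observable
}

// Step 2: the public name is a weak alias of the wrapper, so callers of
// rand() end up in interceptor_rand().
extern "C" __attribute__((weak, alias("interceptor_rand"))) int rand() noexcept;

// Checks that a call through the public name reaches the wrapper.
void self_check() {
  const int before = interception_sketch::rand_calls;
  const int value = rand();
  assert(value == 7);
  assert(interception_sketch::rand_calls == before + 1);
  (void)before;
  (void)value;
}
```

Because the alias is weak, a strong definition of rand elsewhere in the program would take its place at link time without an error.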

The INTERCEPT_FUNCTION macro calls __interception::InterceptFunction, which uses dlsym(RTLD_NEXT, name) to obtain the address of the original symbol and stores it in the real‑function pointer. If the RTLD_NEXT lookup fails, it falls back to RTLD_DEFAULT. Because the alias is weak, it does not cause multiple‑definition errors when the sanitizer runtime is statically linked.
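The lookup described above might look roughly like the following. It is a sketch under assumptions, not the compiler‑rt source: get_func_addr, intercept_getpid, and the getpid wrapper are hypothetical names, the check against the wrapper's own address is a precaution added here so the fallback cannot hand the wrapper back to itself, and the program is assumed to be dynamically linked against a glibc that provides dlsym (older versions need -ldl).

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // RTLD_NEXT is a GNU extension
#endif
#include <cassert>
#include <cstdint>
#include <dlfcn.h>
#include <unistd.h>

namespace interception_sketch {

// Finds the next definition of `name` after the calling object. If that
// fails, falls back to the default search order, but rejects the result when
// it is the wrapper itself.
void* get_func_addr(const char* name, std::uintptr_t wrapper_addr) {
  assert(name != nullptr);
  void* addr = dlsym(RTLD_NEXT, name);
  if (addr == nullptr) {
    addr = dlsym(RTLD_DEFAULT, name);
    if (reinterpret_cast<std::uintptr_t>(addr) == wrapper_addr) addr = nullptr;
  }
  return addr;
}

pid_t (*real_getpid)() = nullptr;  // plays the role of the real-function pointer
int getpid_calls = 0;              // hypothetical counter for the demonstration

// Resolves libc's getpid and stores it in real_getpid.
bool intercept_getpid() {
  void* addr =
      get_func_addr("getpid", reinterpret_cast<std::uintptr_t>(&::getpid));
  real_getpid = reinterpret_cast<pid_t (*)()>(addr);
  return real_getpid != nullptr;
}

}  // namespace interception_sketch

// The wrapper: counts the call, then forwards to the original function.
extern "C" pid_t getpid() noexcept {
  using namespace interception_sketch;
  ++getpid_calls;
  if (real_getpid == nullptr && !intercept_getpid()) return -1;  // lookup failed
  return real_getpid();
}
```

After intercept_getpid() succeeds, calling getpid() increments the counter and returns the real process ID, while calling interception_sketch::real_getpid() directly bypasses the wrapper in the same way that REAL(isdigit) bypasses the isdigit interceptor.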

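Finally, DECLARE_REAL declares the real‑function pointer so that other code can refer to it, and REAL(isdigit) names that pointer, so REAL(isdigit)(c) calls the original implementation without going through the interceptor.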

Summary

ASan relies on the interceptor mechanism to replace allocation functions such as malloc and free, enabling detection of heap‑use‑after‑free, double‑free, and other memory errors. Understanding these internals helps developers use sanitizers more effectively to locate and fix hard‑to‑debug issues.
