Detecting and Fixing iOS Memory Leaks with Object‑Graph Scanning
This article explains why iOS memory leaks become critical as apps grow, introduces five representative leak models, details a production‑ready object‑graph scanning solution with custom data structures and a non‑recursive DFS algorithm, and evaluates its performance impact and mitigation strategies.
Introduction
iOS memory leaks are often overlooked, but as business logic expands they cause UI stutters, battery drain, and OOM crashes. The DeWu APM team built a production‑grade leak‑detection solution based on object‑relationship scanning to pinpoint leaking objects precisely.
Memory‑Leak Background
A memory leak occurs when allocated memory cannot be released. Individual leaks may seem harmless, but accumulated leaks exhaust memory, leading to high usage, UI lag, increased power consumption, and ultimately an out‑of‑memory crash.
Common Questions
Why monitor leaks in production? Leaks can occur on any device and in any build, so a non‑intrusive monitor that automatically covers all business scenarios is needed.
Is full‑sample activation required? No, a limited sample set is sufficient.
Why does precise location matter? Accurate pinpointing reduces the code area developers must inspect, shortening fix cycles.
Leak Models
The team identified five typical leak patterns, illustrated with diagrams (images omitted for brevity).
Model 1‑3: Cyclic References
Pages A‑E form cycles that can be discovered by scanning from the page node, allowing direct identification of the leaking objects.
Model 4‑5: System‑Object References
The leaked objects are referenced by system objects, which makes direct cycle detection impossible, but analyzing the reference chain still reveals the exact leak location (C in Model 4, A in Model 5).
Technical Solution
Leak Definition
An action that, when repeated indefinitely, eventually triggers an OOM is considered a leak. The solution tracks such actions and the objects they retain.
Detection Timing
Leak checks run when a page is popped from the navigation stack.
Scanning Strategy
The monitor builds a directed graph where each page is the root vertex and edges represent object references. Strong references are kept; weak references are filtered out (Objective‑C via runtime, Swift via reflection when possible).
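The edge-filtering step can be sketched in C as follows. This is only an illustration under stated assumptions: the `Ref` record, the `RefKind` values, and `keep_edges` are hypothetical names, and the "unknown" case stands in for references whose strength the scanner cannot determine (as the article describes for Swift).

```c
#include <stddef.h>

/* Hypothetical reference record produced by an object scanner. */
typedef enum { REF_STRONG, REF_WEAK, REF_UNKNOWN } RefKind;

typedef struct {
    int from;      /* index of the owning object */
    int to;        /* index of the referenced object */
    RefKind kind;  /* strength as reported by the scanner */
} Ref;

/* Copy the references that should become graph edges into `out`
 * (which must have room for n entries) and return how many were kept.
 * Weak references are always dropped. References of unknown strength
 * are kept only when `keep_unknown` is non-zero, so that a later
 * confirmation step can decide whether they really leak. */
size_t keep_edges(const Ref *refs, size_t n, int keep_unknown, Ref *out) {
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (refs[i].kind == REF_WEAK) continue;
        if (refs[i].kind == REF_UNKNOWN && !keep_unknown) continue;
        out[kept++] = refs[i];
    }
    return kept;
}
```

Keeping unknown-strength edges is a trade-off: it avoids missing cycles, at the cost of candidates that must be confirmed before they are reported.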
Key Cases
Objective‑C: runtime provides reference type (strong/weak) enabling edge filtering.
Swift: lack of strong‑weak distinction requires confirming a leak before assuming a strong‑reference cycle.
Data Structures & Algorithm
The graph is sparse, so it is stored as a cross‑linked list (also called an orthogonal list), which allows traversal in O(n+e) time for n vertices and e arcs.
typedef struct EdgeNode {          // arc node
    int tailvex;                   // tail vertex index
    int headvex;                   // head vertex index
    struct EdgeNode *headlink;     // next arc with same head
    struct EdgeNode *taillink;     // next arc with same tail
    EdgeType info;                 // optional weight/info
} EdgeNode;
typedef struct VertexNode {        // vertex node
    int index;                     // vertex index
    VertexType data;               // payload
    EdgeNode *firstout;            // first outgoing arc
    EdgeNode *firstin;             // first incoming arc
} VertexNode;
typedef struct OLGraph {           // cross‑linked list graph
    VertexNode vertices[MaxVex];
    int vexnum;                    // number of vertices
    int arcnum;                    // number of arcs
} OLGraph;
The core algorithm performs a non‑recursive depth‑first search using a stack to detect cycles and break them, achieving optimal O(n+e) complexity.
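// Excerpt: S, visited[], nodeCache[], cycle, count, and addObject()
// are defined elsewhere in the scanner.
while (!is_stack_empty(&S)) {
    int index = Top(&S);
    // Use the cached arc for this vertex if one is set (presumably a saved
    // resume position); otherwise start at its first outgoing arc.
    EdgeNode *node = nodeCache[index] ? nodeCache[index] : g->vertices[index].firstout;
    EdgeNode *edgeCycle[MaxVex];   // arcs collected along the current path
    while (node) {
        // `visit` is a per-arc flag that the EdgeNode definition above does not show.
        if (node->visit == 1) {
            if (cycle) { break; }
        } else {
            addObject(node, edgeCycle);
        }
        count++;
        if (visited[node->headvex] != 1) {
            // Head vertex not yet marked visited: push it and descend.
            Push(&S, node->headvex);
            node = g->vertices[node->headvex].firstout;
        } else {
            // Head vertex already visited: try the next arc with the same tail.
            node = node->taillink;
        }
    }
    // No arcs left on this path: pop the top vertex and mark it visited.
    int j = -1; Pop(&S, &j); visited[j] = 1;
}
Confirming a Leak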
If the reference‑graph contains a cycle whose objects are never released, that cycle marks the leak location.
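The cycle search can be sketched end to end in C. This is a minimal illustration, not the DeWu implementation: `Arc`, `Graph`, `add_arc`, and `find_cycle` are hypothetical names, the arc structure is a simplified cross-linked list, and vertices carry a three-state mark (unvisited, on the current path, finished) so that an arc pointing back to a vertex on the current path identifies a cycle.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_VEX 64  /* illustrative capacity, not a value from the article */

typedef struct Arc {
    int tail, head;
    struct Arc *next_out;  /* next arc leaving the same tail */
    struct Arc *next_in;   /* next arc entering the same head */
} Arc;

typedef struct {
    Arc *first_out[MAX_VEX];
    Arc *first_in[MAX_VEX];
    int vexnum;
} Graph;

void graph_init(Graph *g, int vexnum) {
    memset(g, 0, sizeof *g);
    g->vexnum = vexnum > MAX_VEX ? MAX_VEX : vexnum;
}

/* Thread a new arc into both the out-list of `tail` and the in-list of `head`.
 * Returns 0 on success, -1 on a bad index or allocation failure. */
int add_arc(Graph *g, int tail, int head) {
    if (tail < 0 || tail >= g->vexnum || head < 0 || head >= g->vexnum) return -1;
    Arc *a = malloc(sizeof *a);
    if (!a) return -1;
    a->tail = tail;
    a->head = head;
    a->next_out = g->first_out[tail];
    g->first_out[tail] = a;
    a->next_in = g->first_in[head];
    g->first_in[head] = a;
    return 0;
}

/* Iterative DFS from `root`. On the first arc that points back to a vertex
 * on the current path, copy that path segment into `cycle` and return its
 * length. Returns 0 when no cycle is reachable from `root`. */
int find_cycle(const Graph *g, int root, int cycle[MAX_VEX]) {
    int stack[MAX_VEX];                   /* current DFS path */
    Arc *next[MAX_VEX];                   /* next unexplored arc per vertex */
    unsigned char state[MAX_VEX] = {0};   /* 0 unvisited, 1 on path, 2 finished */
    int top = 0;

    if (root < 0 || root >= g->vexnum) return 0;
    state[root] = 1;
    next[root] = g->first_out[root];
    stack[top++] = root;

    while (top > 0) {
        int v = stack[top - 1];
        Arc *a = next[v];
        if (!a) {                 /* all arcs explored: finish v and backtrack */
            state[v] = 2;
            top--;
            continue;
        }
        next[v] = a->next_out;    /* each arc is examined exactly once */
        int w = a->head;
        if (state[w] == 1) {      /* back arc: w is on the current path */
            int start = top - 1;
            while (stack[start] != w) start--;
            int len = 0;
            for (int i = start; i < top; i++) cycle[len++] = stack[i];
            return len;
        }
        if (state[w] == 0) {      /* unvisited: descend */
            state[w] = 1;
            next[w] = g->first_out[w];
            stack[top++] = w;
        }
    }
    return 0;
}

void graph_free(Graph *g) {
    for (int v = 0; v < g->vexnum; v++) {
        Arc *a = g->first_out[v];
        while (a) { Arc *n = a->next_out; free(a); a = n; }
        g->first_out[v] = g->first_in[v] = NULL;
    }
}
```

Because each vertex is pushed at most once and each arc is examined at most once, the search stays within O(n+e). With vertex 0 as the page and arcs 0→1, 1→2, 2→1, `find_cycle` returns the two-vertex cycle 1, 2; whether those objects are actually never released still has to be confirmed separately.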
Limitations
Singleton objects forming cycles do not increase memory usage and should be filtered out.
Model 2 and Model 4 are not covered when only partial scans are performed.
Optimization Strategies
By comparing successive scans, the system can ignore unchanged cycles and focus on newly introduced leaks. Reporting thresholds (sample size, device class, per‑page scan count, max scan duration) reduce overhead.
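One way to compare cycles across scans is to reduce each cycle to a rotation-independent key and report only keys not seen before. The sketch below assumes each object in a cycle is identified by a stable integer id (for example a class identifier, since addresses change between scans) and that ids are unique within one cycle. `CycleKey`, `KnownCycles`, and `report_if_new` are hypothetical names, and the capacities are arbitrary.

```c
#include <string.h>

#define MAX_CYCLE_LEN 16   /* illustrative limits */
#define MAX_KNOWN 128

typedef struct {
    int len;
    int ids[MAX_CYCLE_LEN];  /* rotated so the smallest id comes first */
} CycleKey;

typedef struct {
    CycleKey keys[MAX_KNOWN];
    int count;
} KnownCycles;

/* Build a key that is identical for every rotation of the same cycle.
 * Returns 0 on success, -1 if the cycle is empty or too long. */
int cycle_key(const int *cycle, int len, CycleKey *out) {
    if (len <= 0 || len > MAX_CYCLE_LEN) return -1;
    int min_pos = 0;
    for (int i = 1; i < len; i++)
        if (cycle[i] < cycle[min_pos]) min_pos = i;
    memset(out, 0, sizeof *out);
    out->len = len;
    for (int i = 0; i < len; i++)
        out->ids[i] = cycle[(min_pos + i) % len];
    return 0;
}

/* Returns 1 if the cycle is new (and records it), 0 if an earlier scan
 * already produced it, -1 if it cannot be keyed or stored. */
int report_if_new(KnownCycles *known, const int *cycle, int len) {
    CycleKey key;
    if (cycle_key(cycle, len, &key) != 0) return -1;
    for (int i = 0; i < known->count; i++) {
        if (known->keys[i].len == key.len &&
            memcmp(known->keys[i].ids, key.ids, (size_t)key.len * sizeof(int)) == 0)
            return 0;
    }
    if (known->count == MAX_KNOWN) return -1;
    known->keys[known->count++] = key;
    return 1;
}
```

A linear search is enough for a sketch; a production monitor would more likely hash the key.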
Performance Evaluation
Scan Duration
Most pages on DeWu finish scanning within 1 second (8 of the 12 sampled durations listed below, in seconds); the slowest take nearly 4 seconds.
0.573888
0.556977
1.401240
0.791259
1.368312
3.868116
0.841532
0.607546
0.911085
0.831673
0.441535
3.720887
Memory Overhead
The monitor temporarily retains the scanned objects, extending their lifetime by the scan duration, which correlates with device performance.
CPU Impact
Heavy memory reads cause brief CPU spikes. Enabling the scan prolongs the spike by a few seconds while the graph is built and traversed.
Special Cases
Audio/video pages may continue playback after exit because the page object is held during scanning, leading to audible artifacts proportional to scan time.
Mitigation
Do not enable full‑sample scanning on low‑end devices.
Limit the number of scans per page.
Cap the maximum scan duration.
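These three limits can be combined into a small gating policy. The following C sketch is an assumption about how such a gate might look, not the monitor's actual configuration: `ScanPolicy`, `DeviceClass`, `should_scan`, and `should_abort` are hypothetical names, and the thresholds are left to the caller.

```c
#include <stdbool.h>

typedef enum { DEVICE_LOW_END, DEVICE_MID, DEVICE_HIGH_END } DeviceClass;

typedef struct {
    bool allow_low_end;       /* whether low-end devices are scanned at all */
    int max_scans_per_page;   /* per-page scan budget */
    double max_scan_seconds;  /* hard cap on a single scan's duration */
} ScanPolicy;

/* Decide before a scan starts whether it is allowed to run. */
bool should_scan(const ScanPolicy *p, DeviceClass dev, int scans_done_for_page) {
    if (dev == DEVICE_LOW_END && !p->allow_low_end) return false;
    return scans_done_for_page < p->max_scans_per_page;
}

/* Checked periodically during a scan; true means stop and release
 * the scanned objects. */
bool should_abort(const ScanPolicy *p, double elapsed_seconds) {
    return elapsed_seconds >= p->max_scan_seconds;
}
```

Checking `should_abort` inside the traversal loop bounds how long the page object stays retained, which is what limits the audio/video side effect described above.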
Conclusion
Scanning object reference cycles and comparing before/after graphs effectively detects all leak models except the exhaustive Model 2, providing precise leak locations with acceptable performance overhead when configured with sensible thresholds.
DeWu Technology
