How Baidu’s Dex Line‑Number Optimization Trims About 8% of Dex Size from Android APKs
This article examines Android Dex DebugInfo line‑number optimization, detailing the structure of DebugInfo, existing solutions, Baidu’s mapping‑based approach, R8 and Alipay techniques, and the resulting APK size reduction, while also describing the end‑to‑end pipeline for line‑number retracing in production.
Background
In the previous article we introduced the basic ideas for Android package size optimization. This follow‑up focuses on Dex line‑number optimization, aiming to reduce the size of the DebugInfo section while preserving traceability of original source lines.
DebugInfo Structure
DebugInfo is the bytecode information used for debugging, containing source file names, line numbers, local variables, and extended debug data. Reducing the line‑number information directly shrinks the DebugInfo region and thus the overall Dex file size.
2.1 Dex DebugInfo Layout
In a Dex file, DebugInfo resides in the data section as a series of debug_info_item entries. Each debug_info_item corresponds one‑to‑one with a method and consists of a header (start line, parameter count, parameter names) followed by a list of debug_event records that encode PC offsets and line offsets.
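To make the event encoding concrete, here is a minimal decoder sketch for the most common event kind, the "special opcode", which advances both the PC and the line in a single byte. The constants come from the Dalvik executable format; the class and method names are hypothetical, and the other event kinds (explicit PC or line advances, local-variable events) are left out for brevity.

```java
import java.util.ArrayList;
import java.util.List;

class SpecialOpcodeDecoder {
    // Constants defined by the Dalvik executable format.
    static final int DBG_END_SEQUENCE = 0x00;
    static final int DBG_FIRST_SPECIAL = 0x0a;
    static final int DBG_LINE_BASE = -4;
    static final int DBG_LINE_RANGE = 15;

    /** Returns {pc, line} pairs emitted by a stream that contains only special opcodes. */
    static List<int[]> decode(int startLine, int[] opcodes) {
        List<int[]> positions = new ArrayList<>();
        int pc = 0;
        int line = startLine;
        for (int opcode : opcodes) {
            if (opcode == DBG_END_SEQUENCE) break;
            if (opcode < DBG_FIRST_SPECIAL || opcode > 0xff) {
                throw new IllegalArgumentException("only special opcodes are handled in this sketch");
            }
            int adjusted = opcode - DBG_FIRST_SPECIAL;
            line += DBG_LINE_BASE + (adjusted % DBG_LINE_RANGE); // lineDelta
            pc += adjusted / DBG_LINE_RANGE;                     // pcDelta
            positions.add(new int[] {pc, line});
        }
        return positions;
    }

    public static void main(String[] args) {
        // 0x0e encodes pcDelta 0, lineDelta 0; 0x1e encodes pcDelta 1, lineDelta 1.
        for (int[] p : decode(10, new int[] {0x0e, 0x1e, 0x1e, 0x00})) {
            System.out.println("pc=" + p[0] + " line=" + p[1]);
        }
    }
}
```

Each special opcode packs a small line delta and a PC delta into one byte, which is why the number and values of these events dominate the size of a debug_info_item.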
2.2 DebugInfo Usage Scenarios
DebugInfo is primarily used for breakpoint debugging and stack trace reconstruction (including crash, ANR, and memory analysis). When an exception occurs, the VM resolves the method’s PC to a line number via the DebugInfo, which is then displayed in the stack trace.
    // Simplified flow of converting native stack trace to StackTraceElement[]
    static jobjectArray Throwable_nativeGetStackTrace(JNIEnv* env, jclass, jobject javaStackState) {
      // ...
      return Thread::InternalStackTraceToStackTraceElementArray(soa, javaStackState);
    }
The GetLineNumForPc implementation iterates over the debug_info_item to find the matching line number. Our optimization sets pcDelta to 1, which slightly increases iteration length but has negligible performance impact.
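The lookup itself can be sketched as a linear scan over decoded positions. This is an illustration of the idea rather than the VM's actual code: the class name is hypothetical, and the input is assumed to be {pc, line} pairs already decoded from a debug_info_item in increasing PC order.

```java
class LineLookup {
    /**
     * positions: {pc, line} pairs in increasing pc order, as decoded from a debug_info_item.
     * Returns the line of the last entry whose pc is <= targetPc, or -1 when none matches.
     */
    static int lineForPc(int[][] positions, int targetPc) {
        int line = -1;
        for (int[] entry : positions) {
            if (entry[0] > targetPc) break; // passed the target, keep the previous line
            line = entry[1];
        }
        return line;
    }

    public static void main(String[] args) {
        int[][] positions = {{0, 20}, {3, 21}, {7, 25}};
        System.out.println(lineForPc(positions, 4));     // 21
        System.out.println(lineForPc(new int[0][], 4));  // -1, as when DebugInfo is absent
    }
}
```

With one entry per PC value the scan visits more entries before reaching the target, which is the extra iteration cost mentioned above.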
Existing Solutions
3.1 Extreme Optimization
Removing all DebugInfo eliminates line numbers in stack traces (they appear as -1). This is acceptable only for highly stable apps where debugging is unnecessary.
3.2 Mapping‑Based Optimization
Instead of removing DebugInfo, we keep it but change the relationship from one‑to‑one (method → debug_info_item) to many‑to‑one, allowing multiple methods to share the same debug_info_item. This reduces the number of items and thus the size. A mapping file records the original line numbers for later retracing.
Equality of two debug_info_item objects is defined as:
    public boolean equals(DebugInfoItem other) {
        return this.startLine == other.startLine &&
               this.parameters.equals(other.parameters) &&
               this.events.equals(other.events);
    }
Equality of individual debug_event objects is:
    public boolean equals(DebugEvent other) {
        return this.type == other.type && this.value == other.value;
    }
To enable reuse, we must keep startLine, the number of debug_event entries, and each event’s type, lineDelta, and pcDelta identical across methods.
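Once items compare equal, sharing them is a matter of interning. The sketch below is one way to do that, under simplifying assumptions: the class name is hypothetical, each item is reduced to its startLine plus a list of event opcodes, and parameter names are omitted.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DebugInfoPool {
    // Canonical key (startLine followed by event opcodes) -> index of the shared item.
    private final Map<List<Integer>, Integer> indexByKey = new HashMap<>();
    private final List<List<Integer>> items = new ArrayList<>();

    /** Returns the pool index for an item, adding it only if no equal item exists. */
    int intern(int startLine, List<Integer> eventOpcodes) {
        List<Integer> key = new ArrayList<>();
        key.add(startLine);
        key.addAll(eventOpcodes);
        Integer existing = indexByKey.get(key);
        if (existing != null) return existing;
        items.add(key);
        indexByKey.put(key, items.size() - 1);
        return items.size() - 1;
    }

    int size() {
        return items.size();
    }

    public static void main(String[] args) {
        DebugInfoPool pool = new DebugInfoPool();
        int a = pool.intern(1, List.of(0x0e, 0x1e));
        int b = pool.intern(1, List.of(0x0e, 0x1e));       // equal to a, so it is shared
        int c = pool.intern(1, List.of(0x0e, 0x1e, 0x1e)); // one more event, so a new item
        System.out.println(a + " " + b + " " + c + " size=" + pool.size());
    }
}
```

Two methods end up pointing at the same item only when every field matches, so the optimization's job is to make as many methods as possible produce identical keys.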
3.3 Overloaded Method Line‑Number Overlap Issue
When two overloaded methods cover the same line range after mapping, the retracing step cannot tell which overload a stack‑trace line belongs to, because a stack frame carries only the method name, not its signature.
    // Example of overlapping mapping
    com.example.MethodOverloadSample.test(): 1 → 21
    com.example.MethodOverloadSample.test(String): 1 → 34
    // Stack trace only shows method name
    at com.example.MethodOverloadSample.test(MethodOverloadSample.java:2)
Alipay Line‑Number Optimization
Solution 1
Extract all DebugInfo into a separate debugInfo.dex file, removing it from the main APK.
At crash time, hook Throwable to capture the instruction‑level line number and upload it.
The performance platform uses the uploaded debugInfo.dex to translate the instruction line back to the original source line.
This approach works only for Throwable‑based crashes and requires handling different JVM versions.
Solution 2
Keep a limited number of debug_info_item entries and treat their line numbers as instruction offsets (both lineDelta and pcDelta are set to 1). This makes every debug_event a special opcode, allowing full reuse of the item.
R8 Line‑Number Optimization
R8 preserves line numbers with -keepattributes LineNumberTable and then applies two modifications:
startLine: default 1; for overloaded methods the next method’s startLine becomes the previous method’s endLine + 1.
lineDelta: forced to 1.
Because debug_event count and pcDelta remain uncontrolled, reuse is limited.
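The startLine rule for overloads can be illustrated with a small sketch. It follows the rule as described above and is not taken from R8's source; the class name and the input shape (signature mapped to the number of lines the method needs) are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class OverloadStartLines {
    /**
     * lineSpans maps each overload's signature to how many line numbers it needs.
     * Returns the start line assigned to each overload, in iteration order.
     */
    static Map<String, Integer> assign(Map<String, Integer> lineSpans) {
        Map<String, Integer> startLines = new LinkedHashMap<>();
        int next = 1; // default startLine
        for (Map.Entry<String, Integer> e : lineSpans.entrySet()) {
            startLines.put(e.getKey(), next);
            next += e.getValue(); // equals this method's endLine + 1
        }
        return startLines;
    }

    public static void main(String[] args) {
        Map<String, Integer> spans = new LinkedHashMap<>();
        spans.put("test()", 5);
        spans.put("test(String)", 3);
        System.out.println(assign(spans)); // {test()=1, test(String)=6}
    }
}
```

Because the ranges of same-named methods never overlap, a line number in a stack frame identifies exactly one overload.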
Baidu App Implementation
Baidu’s solution controls four variables to maximize reuse:
startLine: default 100 000 to avoid overlap in hot‑fix or plugin scenarios.
debug_event count: set to the number of bytecode instructions in the method.
pcDelta: 0 for the first special opcode, 1 for every subsequent one.
lineDelta: identical to pcDelta, so the line number reported is actually the instruction offset from startLine.
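A minimal sketch of how such a shareable event stream could be generated is shown below. It is an illustration, not Baidu's implementation: the class and method names are hypothetical, the opcode constants come from the Dalvik executable format, and each instruction is assumed to occupy one PC unit.

```java
import java.util.Arrays;

class SharedDebugInfoBuilder {
    static final int START_LINE = 100_000; // default startLine from the article
    static final int DBG_END_SEQUENCE = 0x00;
    static final int DBG_FIRST_SPECIAL = 0x0a;
    static final int DBG_LINE_BASE = -4;
    static final int DBG_LINE_RANGE = 15;

    /** Encodes a (lineDelta, pcDelta) pair as a single special opcode. */
    static int specialOpcode(int lineDelta, int pcDelta) {
        return (lineDelta - DBG_LINE_BASE) + pcDelta * DBG_LINE_RANGE + DBG_FIRST_SPECIAL;
    }

    /** Builds the event stream for a method with the given number of instructions. */
    static int[] buildEvents(int instructionCount) {
        int[] events = new int[instructionCount + 1];
        for (int i = 0; i < instructionCount; i++) {
            // First event: both deltas 0. Every later event: both deltas 1.
            events[i] = (i == 0) ? specialOpcode(0, 0) : specialOpcode(1, 1);
        }
        events[instructionCount] = DBG_END_SEQUENCE;
        return events;
    }

    /** Virtual line reported for a given pc under this scheme. */
    static int virtualLine(int pc) {
        return START_LINE + pc;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(buildEvents(3)));               // [14, 30, 30, 0]
        System.out.println(Arrays.equals(buildEvents(3), buildEvents(3))); // true
        System.out.println(virtualLine(2));                                // 100002
    }
}
```

The stream depends only on the instruction count, so any two methods of the same length produce byte-identical items and can share one.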
The generated mapping file has the format:
    com.baidu.searchbox.Application:
        void onCreate(android.os.Bundle):
            [1000-1050] -> 20
            [1051-2000] -> 22
        void onCreate():
            [3000-3020] -> 30
            [3021-3033] -> 31
The Gradle plugin processes the Dex files before the packageApplication task, feeding the optimized Dex back into the build pipeline.
Performance Impact
Before the optimization, the Baidu App APK was 123.58 MiB, of which Dex accounted for 37.42 MiB. After line‑number optimization the APK shrank to 120.54 MiB, a saving of 3.04 MiB: about 8% of the Dex size, or roughly 2.5% of the whole APK. Allocating line numbers at class level could save a further ~400 KiB.
Server‑Side Line‑Number Retracing
After deployment, the app reports virtual line numbers. The performance platform receives these numbers together with the mapping file and restores the original source lines.
The pipeline consists of:
Upload of the mapping file during release (manual or CI‑driven).
Parsing service that stores the mapping in a multi‑level cache (in‑memory, Redis, and a persistent table).
Streaming computation that, for each crash event, looks up the virtual line number in the cache and outputs the real line number for downstream analytics.
The cache uses a W‑TinyLFU eviction policy and expires entries that have not been accessed for a configurable number of days, ensuring hot mapping data stays in memory for millisecond‑level lookup.
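The lookup step of the pipeline can be sketched as a range search over parsed mapping lines. This is an illustration under stated assumptions, not the platform's implementation: the class name is hypothetical, the range syntax follows the "[start-end] -> line" format of the mapping file, and a real service would hold one such table per class and method behind the cache.

```java
import java.util.Map;
import java.util.TreeMap;

class VirtualLineRetracer {
    // Range start -> {range end, original line}.
    private final TreeMap<Integer, int[]> ranges = new TreeMap<>();

    /** Parses one mapping line of the form "[1000-1050] -> 20". */
    void addRange(String mappingLine) {
        String[] parts = mappingLine.trim().split("->");
        String range = parts[0].trim(); // "[1000-1050]"
        String[] bounds = range.substring(1, range.length() - 1).split("-");
        int start = Integer.parseInt(bounds[0].trim());
        int end = Integer.parseInt(bounds[1].trim());
        int originalLine = Integer.parseInt(parts[1].trim());
        ranges.put(start, new int[] {end, originalLine});
    }

    /** Returns the original line for a virtual line, or -1 if no range covers it. */
    int retrace(int virtualLine) {
        Map.Entry<Integer, int[]> e = ranges.floorEntry(virtualLine);
        if (e == null || virtualLine > e.getValue()[0]) return -1;
        return e.getValue()[1];
    }

    public static void main(String[] args) {
        VirtualLineRetracer retracer = new VirtualLineRetracer();
        retracer.addRange("[1000-1050] -> 20");
        retracer.addRange("[1051-2000] -> 22");
        System.out.println(retracer.retrace(1049)); // 20
        System.out.println(retracer.retrace(1500)); // 22
        System.out.println(retracer.retrace(2500)); // -1
    }
}
```

A sorted map keeps each lookup logarithmic in the number of ranges, which suits a streaming job that resolves one frame at a time.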
Conclusion
This article presented the anatomy of Dex DebugInfo, surveyed existing removal and mapping strategies, detailed Baidu’s line‑number optimization and its integration into the build system, quantified the APK size savings, and described the full server‑side retracing pipeline that turns virtual line numbers back into actionable source locations.