How Baidu’s Dex Line‑Number Optimization Shrinks Android APKs by 8%

This article examines Android Dex DebugInfo line‑number optimization, detailing the structure of DebugInfo, existing solutions, Baidu’s mapping‑based approach, R8 and Alipay techniques, and the resulting APK size reduction, while also describing the end‑to‑end pipeline for line‑number retracing in production.

Baidu Geek Talk

Background

In the previous article we introduced the basic ideas for Android package size optimization. This follow‑up focuses on Dex line‑number optimization, aiming to reduce the size of the DebugInfo section while preserving traceability of original source lines.

DebugInfo Structure

DebugInfo is the bytecode information used for debugging, containing source file names, line numbers, local variables, and extended debug data. Reducing the line‑number information directly shrinks the DebugInfo region and thus the overall Dex file size.

2.1 Dex DebugInfo Layout

In a Dex file, DebugInfo resides in the data section as a series of debug_info_item entries. Each debug_info_item corresponds one‑to‑one with a method and consists of a header (start line, parameter count, parameter names) followed by a list of debug_event records that encode PC offsets and line offsets.
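The event stream described above is a small state machine. The following is a minimal sketch of decoding it into (pc, line) rows, assuming the opcode constants from the Dex format specification. The class and method names (DebugInfoDecoder, decode, Position) are hypothetical, and the events that describe local variables or source files are rejected rather than handled, so this is an illustration rather than a full parser.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: decode the event stream of one debug_info_item into (pc, line) rows. */
final class DebugInfoDecoder {
    // Constants as defined by the Dex format specification.
    static final int DBG_END_SEQUENCE = 0x00;
    static final int DBG_ADVANCE_PC = 0x01;
    static final int DBG_ADVANCE_LINE = 0x02;
    static final int DBG_FIRST_SPECIAL = 0x0a;
    static final int DBG_LINE_BASE = -4;
    static final int DBG_LINE_RANGE = 15;

    /** One row of the position table: code address and source line. */
    record Position(int pc, int line) {}

    private final byte[] data;
    private int cursor;

    DebugInfoDecoder(byte[] eventStream) {
        this.data = eventStream;
    }

    /** startLine comes from the item header; eventStream is everything after the header. */
    static List<Position> decode(int startLine, byte[] eventStream) {
        return new DebugInfoDecoder(eventStream).run(startLine);
    }

    private List<Position> run(int startLine) {
        List<Position> table = new ArrayList<>();
        int pc = 0;
        int line = startLine;
        while (cursor < data.length) {
            int opcode = data[cursor++] & 0xff;
            if (opcode == DBG_END_SEQUENCE) {
                break;
            } else if (opcode == DBG_ADVANCE_PC) {
                pc += readLeb128(false);
            } else if (opcode == DBG_ADVANCE_LINE) {
                line += readLeb128(true);
            } else if (opcode >= DBG_FIRST_SPECIAL) {
                // A special opcode advances both registers and emits one row.
                int adjusted = opcode - DBG_FIRST_SPECIAL;
                line += DBG_LINE_BASE + (adjusted % DBG_LINE_RANGE);
                pc += adjusted / DBG_LINE_RANGE;
                table.add(new Position(pc, line));
            } else {
                // Local-variable, prologue/epilogue and set-file events are out of scope here.
                throw new UnsupportedOperationException("opcode 0x" + Integer.toHexString(opcode));
            }
        }
        return table;
    }

    /** Reads a ULEB128 value, or an SLEB128 value when signed is true. */
    private int readLeb128(boolean signed) {
        int result = 0;
        int shift = 0;
        int b;
        do {
            b = data[cursor++] & 0xff;
            result |= (b & 0x7f) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        if (signed && shift < 32 && (b & 0x40) != 0) {
            result |= -1 << shift; // sign-extend
        }
        return result;
    }
}
```

For example, with a start line of 10, the bytes 0x0e, 0x1e, 0x00 decode to the rows (pc 0, line 10) and (pc 1, line 11).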

Dex DebugInfo mapping visualization

2.2 DebugInfo Usage Scenarios

DebugInfo is primarily used for breakpoint debugging and stack trace reconstruction (including crash, ANR, and memory analysis). When an exception occurs, the VM resolves the method’s PC to a line number via the DebugInfo, which is then displayed in the stack trace.

// Simplified flow of converting native stack trace to StackTraceElement[]
static jobjectArray Throwable_nativeGetStackTrace(JNIEnv* env, jclass, jobject javaStackState) {
    // ... (null check elided; `soa` is the ScopedFastNativeObjectAccess created from env)
    return Thread::InternalStackTraceToStackTraceElementArray(soa, javaStackState);
}

ART's GetLineNumFromPC walks the method's debug_info_item event by event until it passes the target PC, then returns the last line number it saw. The optimization described later sets pcDelta to 1 for every event, which lengthens this walk slightly but has a negligible performance impact.
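To make the cost of that lookup concrete, here is a minimal sketch of a PC-to-line scan over an already decoded position table. It is not ART's code: the class name LineLookup and the parallel-array layout are assumptions made for illustration.

```java
/** Sketch: resolve a PC to a line by scanning a decoded position table. */
final class LineLookup {
    /**
     * addresses[i] and lines[i] form one row; rows are ordered by address.
     * Returns the line of the last row whose address is <= pc, or -1 if there is none.
     */
    static int lineForPc(int[] addresses, int[] lines, int pc) {
        int line = -1;
        for (int i = 0; i < addresses.length; i++) {
            if (addresses[i] > pc) {
                break; // passed the target PC, keep the previous line
            }
            line = lines[i];
        }
        return line;
    }
}
```

When every event advances the PC by exactly 1, the table holds one row per address, so resolving a PC visits about that many rows. The scan stays linear in the method length, which is why the extra cost is small.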

Existing Solutions

3.1 Extreme Optimization

Removing all DebugInfo eliminates line numbers in stack traces (they appear as -1). This is acceptable only for highly stable apps where debugging is unnecessary.

3.2 Mapping‑Based Optimization

Instead of removing DebugInfo, we keep it but change the relationship from one‑to‑one (method → debug_info_item) to many‑to‑one, allowing multiple methods to share the same debug_info_item. This reduces the number of items and thus the size. A mapping file records the original line numbers for later retracing.

Equality of two debug_info_item objects is defined as:

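@Override
public boolean equals(Object obj) {
    if (this == obj) return true;
    if (!(obj instanceof DebugInfoItem)) return false;
    DebugInfoItem other = (DebugInfoItem) obj;
    // Items are interchangeable only if the header and the whole event list match.
    return startLine == other.startLine
            && parameters.equals(other.parameters)
            && events.equals(other.events);
}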

Equality of individual debug_event objects is:

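@Override
public boolean equals(Object obj) {
    if (!(obj instanceof DebugEvent)) return false;
    DebugEvent other = (DebugEvent) obj;
    return type == other.type && value == other.value; // hashCode() must be kept consistent; omitted here
}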

To enable reuse, we must keep startLine, the number of debug_event entries, and each event’s type, lineDelta, and pcDelta identical across methods.
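Once items compare by content, sharing them is a matter of interning. The sketch below shows one way this could look, assuming a simplified item made of a start line, parameter names, and raw event bytes. DebugInfoPool, Item, and intern are hypothetical names, not part of any Dex tooling.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch: let methods with identical debug info share a single pooled item. */
final class DebugInfoPool {
    /** Simplified stand-in for a debug_info_item; records compare by content. */
    record Item(int startLine, List<String> parameterNames, List<Integer> events) {}

    private final Map<Item, Integer> indexByContent = new HashMap<>();
    private final List<Item> items = new ArrayList<>();

    /** Returns the index of an equal item already in the pool, adding the item if it is new. */
    int intern(Item item) {
        Integer existing = indexByContent.get(item);
        if (existing != null) {
            return existing;
        }
        int index = items.size();
        items.add(item);
        indexByContent.put(item, index);
        return index;
    }

    /** Number of distinct items that would have to be written to the Dex file. */
    int size() {
        return items.size();
    }
}
```

Each method would then store the returned index (in a real Dex file, an offset) instead of owning its own item, so the pool size rather than the method count determines how much DebugInfo is written.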

3.3 Overloaded Method Line‑Number Overlap Issue

When two overloaded methods are assigned overlapping line ranges after mapping, retracing cannot tell which overload a stack-trace line belongs to, because a stack frame records only the method name, not its signature.

// Example of overlapping mapping
com.example.MethodOverloadSample.test():          1 → 21
com.example.MethodOverloadSample.test(String):  1 → 34
// Stack trace only shows method name
at com.example.MethodOverloadSample.test(MethodOverloadSample.java:2)

Alipay Line‑Number Optimization

Solution 1

Extract all DebugInfo into a separate debugInfo.dex file, removing it from the main APK.

At crash time, hook Throwable to capture the instruction‑level line number and upload it.

The performance platform uses the uploaded debugInfo.dex to translate the instruction line back to the original source line.

This approach works only for Throwable‑based crashes and requires adapting the hook to different Android runtime versions.

Solution 2

Keep a limited number of debug_info_item entries and treat their line numbers as instruction offsets (both lineDelta and pcDelta are set to 1). This makes every debug_event a special opcode, allowing full reuse of the item.
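A special opcode packs a line delta and a PC delta into one byte. The sketch below applies the encoding formula from the Dex format specification; the class name SpecialOpcode and the range checks are illustrative additions.

```java
/** Sketch: encode (lineDelta, pcDelta) as a single Dex special opcode. */
final class SpecialOpcode {
    // Constants as defined by the Dex format specification.
    static final int DBG_FIRST_SPECIAL = 0x0a;
    static final int DBG_LINE_BASE = -4;
    static final int DBG_LINE_RANGE = 15;

    static int encode(int lineDelta, int pcDelta) {
        boolean lineOk = lineDelta >= DBG_LINE_BASE && lineDelta < DBG_LINE_BASE + DBG_LINE_RANGE;
        int opcode = (lineDelta - DBG_LINE_BASE) + DBG_LINE_RANGE * pcDelta + DBG_FIRST_SPECIAL;
        if (!lineOk || pcDelta < 0 || opcode > 0xff) {
            throw new IllegalArgumentException(
                    "deltas do not fit in one special opcode: line=" + lineDelta + ", pc=" + pcDelta);
        }
        return opcode;
    }
}
```

With both deltas set to 1, encode(1, 1) yields 0x1e, so every event in the item is the same single byte. That uniformity is what makes an item reusable by any method with the same number of events.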

R8 Line‑Number Optimization

R8 preserves line numbers with -keepattributes LineNumberTable and then applies two modifications:

startLine: default 1; for overloaded methods the next method’s startLine becomes the previous method’s endLine + 1.

lineDelta: forced to 1.

Because debug_event count and pcDelta remain uncontrolled, reuse is limited.
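The start-line rule for overloads can be sketched as a simple running allocation. This illustrates the rule as stated above and is not R8's implementation; OverloadLineAllocator and its input (the number of lines each overload needs, assumed to be known) are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: give same-named methods consecutive, non-overlapping line ranges. */
final class OverloadLineAllocator {
    /**
     * lineCounts maps each overload's signature to the number of lines it occupies,
     * in the order the overloads should be laid out (use an ordered map).
     * The first overload starts at 1; each later one starts at the previous end line + 1.
     */
    static Map<String, Integer> assignStartLines(Map<String, Integer> lineCounts) {
        Map<String, Integer> startLines = new LinkedHashMap<>();
        int next = 1;
        for (Map.Entry<String, Integer> entry : lineCounts.entrySet()) {
            startLines.put(entry.getKey(), next);
            // end line = start + count - 1, so the next start is start + count
            next += entry.getValue();
        }
        return startLines;
    }
}
```

If test() needs 5 lines and test(String) needs 9, they receive start lines 1 and 6. A reported line then identifies exactly one overload even though the stack frame carries only the method name.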

R8 line‑number mapping

Baidu App Implementation

Baidu’s solution controls four variables to maximize reuse:

startLine: default 100,000 to avoid overlap in hot‑fix or plugin scenarios.

debug_event count: set to the number of bytecode instructions in the method.

pcDelta: first special opcode is 0, subsequent ones are 1.

lineDelta: identical to pcDelta, so the line number reported is actually the instruction offset.
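Under these four rules, the i-th event of a method reports the line startLine + i. The sketch below derives those virtual lines and groups consecutive instructions that came from the same source line into ranges that a retracing step would need to keep. It is an illustration only: VirtualLineMapper and MappingRange are hypothetical names, and the per-instruction original line array is assumed to have been read from the unoptimized DebugInfo.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: derive virtual line ranges from per-instruction original line numbers. */
final class VirtualLineMapper {
    /** Virtual lines virtualStart..virtualEnd (inclusive) all came from originalLine. */
    record MappingRange(int virtualStart, int virtualEnd, int originalLine) {}

    /**
     * originalLines[i] is the source line of the i-th instruction.
     * The first event has both deltas 0 and later ones have both deltas 1,
     * so instruction i is reported as virtual line startLine + i.
     */
    static List<MappingRange> build(int startLine, int[] originalLines) {
        List<MappingRange> ranges = new ArrayList<>();
        int i = 0;
        while (i < originalLines.length) {
            int j = i;
            // Extend the run while the next instruction has the same original line.
            while (j + 1 < originalLines.length && originalLines[j + 1] == originalLines[i]) {
                j++;
            }
            ranges.add(new MappingRange(startLine + i, startLine + j, originalLines[i]));
            i = j + 1;
        }
        return ranges;
    }
}
```

For a method whose five instructions come from source lines 20, 20, 20, 22, 22 and whose start line is 1000, this yields the ranges 1000 to 1002 for line 20 and 1003 to 1004 for line 22.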

The generated mapping file has the format:

com.baidu.searchbox.Application:
    void onCreate(android.os.Bundle):
        [1000-1050] -> 20
        [1051-2000] -> 22
    void onCreate():
        [3000-3020] -> 30
        [3021-3033] -> 31
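A retracing tool has to read this format back and resolve a reported line. The sketch below shows one possible parser and lookup, assuming exactly the layout shown above (class names at column 0, indented method signatures, and range lines). MappingRetracer and its methods are hypothetical names, and the lookup relies on ranges not overlapping within a class.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: parse the mapping format above and map a virtual line back to a source line. */
final class MappingRetracer {
    record Entry(String method, int start, int end, int originalLine) {}

    private static final Pattern RANGE = Pattern.compile("\\[(\\d+)-(\\d+)\\]\\s*->\\s*(\\d+)");

    private final Map<String, List<Entry>> entriesByClass = new HashMap<>();

    static MappingRetracer parse(String mappingText) {
        MappingRetracer retracer = new MappingRetracer();
        String currentClass = null;
        String currentMethod = null;
        for (String raw : mappingText.split("\n")) {
            String line = raw.strip();
            if (line.isEmpty()) {
                continue;
            }
            Matcher m = RANGE.matcher(line);
            if (m.matches()) {
                if (currentClass == null || currentMethod == null) {
                    throw new IllegalArgumentException("range outside a method: " + line);
                }
                retracer.entriesByClass
                        .computeIfAbsent(currentClass, k -> new ArrayList<>())
                        .add(new Entry(currentMethod, Integer.parseInt(m.group(1)),
                                Integer.parseInt(m.group(2)), Integer.parseInt(m.group(3))));
            } else if (line.endsWith(":")) {
                String name = line.substring(0, line.length() - 1);
                if (Character.isWhitespace(raw.charAt(0))) {
                    currentMethod = name; // indented header: method signature
                } else {
                    currentClass = name;  // header at column 0: class name
                    currentMethod = null;
                }
            }
        }
        return retracer;
    }

    /** Returns the original source line, or -1 if no range of the class covers virtualLine. */
    int originalLine(String className, int virtualLine) {
        for (Entry e : entriesByClass.getOrDefault(className, List.of())) {
            if (e.start() <= virtualLine && virtualLine <= e.end()) {
                return e.originalLine();
            }
        }
        return -1;
    }
}
```

Given the sample mapping above, looking up virtual line 1051 in com.baidu.searchbox.Application returns 22, and virtual line 3020 returns 30.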

The Gradle plugin processes the Dex files before the packageApplication task, feeding the optimized Dex back into the build pipeline.

Performance Impact

Before optimization, the Baidu app's APK was 123.58 MiB (Dex: 37.42 MiB). After line‑number optimization it shrank to 120.54 MiB, a saving of 3.04 MiB, which is about 8% of the Dex size and about 2.5% of the whole APK. Further class‑level line‑number allocation could save roughly another 400 KiB.

Server‑Side Line‑Number Retracing

After deployment, the app reports virtual line numbers. The performance platform receives these numbers and uses the mapping file uploaded for that release to restore the original source lines.

The pipeline consists of:

Upload of the mapping file during release (manual or CI‑driven).

Parsing service that stores the mapping in a multi‑level cache (in‑memory, Redis, and a persistent table).

Streaming computation that, for each crash event, looks up the virtual line number in the cache and outputs the real line number for downstream analytics.

End‑to‑end line‑number optimization flow

The cache uses a W‑TinyLFU eviction policy and expires entries that have not been accessed for a configurable number of days, ensuring hot mapping data stays in memory for millisecond‑level lookup.
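The read path through such a multi-level cache can be sketched as follows. This is an assumption-heavy illustration: the in-memory tier uses a plain access-ordered LRU map as a stand-in for W‑TinyLFU, the Redis and table tiers are modeled as injected functions, and TieredMappingCache and its string keys are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

/** Sketch: look up mapping data in memory first, then in slower tiers. */
final class TieredMappingCache {
    private final Map<String, String> memory;
    private final Function<String, Optional<String>> remoteCache;     // stand-in for Redis
    private final Function<String, Optional<String>> persistentStore; // stand-in for the table

    TieredMappingCache(int capacity,
                       Function<String, Optional<String>> remoteCache,
                       Function<String, Optional<String>> persistentStore) {
        // Access-ordered LinkedHashMap gives simple LRU eviction (not W-TinyLFU).
        this.memory = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > capacity;
            }
        };
        this.remoteCache = remoteCache;
        this.persistentStore = persistentStore;
    }

    /** Returns the cached value, consulting slower tiers and promoting the result on a miss. */
    Optional<String> lookup(String key) {
        String hit = memory.get(key);
        if (hit != null) {
            return Optional.of(hit);
        }
        Optional<String> found = remoteCache.apply(key);
        if (found.isEmpty()) {
            found = persistentStore.apply(key);
        }
        found.ifPresent(value -> memory.put(key, value));
        return found;
    }
}
```

Promoting a value into memory on the first miss means repeated crashes from the same release are resolved without touching the slower tiers again. This sketch is not thread-safe, so a streaming job would need synchronization or a concurrent cache library.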

Conclusion

This article presented the anatomy of Dex DebugInfo, surveyed existing removal and mapping strategies, detailed Baidu’s line‑number optimization and its integration into the build system, quantified the APK size savings, and described the full server‑side retracing pipeline that turns virtual line numbers back into actionable source locations.
