
How Baidu Boosted Android App Startup Using Perfetto and Auto-Instrumentation

This article details Baidu's comprehensive approach to improving Android app launch performance by evaluating existing tracing tools, selecting Perfetto, developing a Gradle-based automatic instrumentation plugin, handling trace collection and analysis with Trace Processor, and implementing automated detection of regressions, lock contention, and method-level CPU and wall‑time degradations.

Baidu Geek Talk

Background

App launch on low-end Android devices often suffers from stalls and black screens. Used as shipped, the existing tools (TraceView, CPU Profiler, Systrace, Perfetto) either add too much overhead or lack the flexibility needed for systematic analysis, which prompted the team to build a custom solution.

Tool Selection

Perfetto was chosen because it provides low‑overhead tracing, supports both kernel‑space (ftrace) and user‑space data sources, and includes the Trace Processor, which parses trace files into an in‑memory SQLite database that can be queried via SQL or a Python API.

Perfetto Overview

Perfetto consists of three components:

Record: collects traces.

Analyze: the Trace Processor parses traces into a SQLite-backed database.

Visualize: a web UI for flame-graph inspection.

Trace Collection

Traces are collected with the record_android_trace script, which abstracts Android version differences. Example command:

./record_android_trace -c atrace.cfg -n -o trace.html

The configuration file defines buffer size, duration, and the events to capture (e.g., sched/sched_switch, dalvik, specific package names).
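As a rough illustration of such a configuration, a Perfetto text-proto trace config covering the three kinds of settings mentioned above might look like the fragment below. The field names are standard Perfetto TraceConfig fields; the buffer size, duration, and the package name com.example.app are placeholder assumptions, not values from the original article.

```
# Illustrative values only: sizes, duration, and package name are assumptions.
buffers {
  size_kb: 65536
  fill_policy: RING_BUFFER
}
data_sources {
  config {
    name: "linux.ftrace"
    ftrace_config {
      # Kernel scheduling events.
      ftrace_events: "sched/sched_switch"
      # Userspace atrace category for the ART runtime.
      atrace_categories: "dalvik"
      # App whose android.os.Trace sections should be captured.
      atrace_apps: "com.example.app"
    }
  }
}
duration_ms: 10000
```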

Trace Analysis

Trace Processor supports multiple input formats (Perfetto protobuf, Linux ftrace, Android systrace, Chrome JSON, etc.) and exposes a Python API for automated analysis.

Automatic Instrumentation Plugin

A Gradle Transform plugin built with ASM injects android.os.Trace.beginSection and Trace.endSection at the entry and exit of every Java/Kotlin method, guaranteeing full‑method coverage without manual changes.
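The article does not show how the plugin derives the section label it injects. The sketch below is one plausible way to do it, not Baidu's implementation: it converts a JVM internal class name to the dotted form and keeps the label within the 127-character limit that android.os.Trace.beginSection documents. The class and method names (SectionNames, sectionName) are hypothetical.

```java
public class SectionNames {
    // android.os.Trace.beginSection accepts names of at most 127 code units.
    static final int MAX_SECTION_NAME = 127;

    /**
     * Builds a trace label from a JVM internal class name (slash separated)
     * and a method name. Overlong labels keep their tail, on the assumption
     * that the class and method name matter more than the package prefix.
     */
    static String sectionName(String internalClassName, String methodName) {
        String full = internalClassName.replace('/', '.') + "." + methodName;
        if (full.length() <= MAX_SECTION_NAME) {
            return full;
        }
        return full.substring(full.length() - MAX_SECTION_NAME);
    }

    public static void main(String[] args) {
        System.out.println(sectionName("com/example/Test", "testMethod"));
    }
}
```

Truncating at build time is a design choice made here for the sketch: it avoids an IllegalArgumentException at runtime for deeply nested or generated class names.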

Ensuring Paired Calls

To avoid “Did Not Finish” errors, the plugin wraps the original method body in a try … finally block so that Trace.endSection is always executed, even when exceptions occur.

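import android.os.Trace;

public void testMethod() {
    try {
        // Injected at method entry: open a section named after the method.
        Trace.beginSection("com.example.Test.testMethod");
        // method body
    } finally {
        // Injected at method exit: runs on normal return and on exceptions.
        Trace.endSection();
    }
}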
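The pairing guarantee can be checked off-device with a small stand-in for android.os.Trace. The sketch below is an assumption-laden demo rather than the plugin's output: FakeTrace is a hypothetical recorder that tracks open sections, and instrumented() mirrors the wrapped shape shown above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class PairedTraceDemo {
    /** Hypothetical stand-in for android.os.Trace so the demo runs on a desktop JVM. */
    static final class FakeTrace {
        static final Deque<String> open = new ArrayDeque<>();
        static final List<String> log = new ArrayList<>();

        static void beginSection(String name) {
            open.push(name);
            log.add("B:" + name);
        }

        static void endSection() {
            log.add("E:" + open.pop());
        }
    }

    /** Same shape as the instrumented method: body wrapped in try/finally. */
    static void instrumented(boolean fail) {
        try {
            FakeTrace.beginSection("demo.instrumented");
            if (fail) {
                throw new IllegalStateException("simulated failure in method body");
            }
        } finally {
            FakeTrace.endSection();
        }
    }

    public static void main(String[] args) {
        instrumented(false);
        try {
            instrumented(true);
        } catch (IllegalStateException expected) {
            // The exception still propagates; only the section is closed first.
        }
        System.out.println(FakeTrace.log);
        System.out.println("open sections: " + FakeTrace.open.size());
    }
}
```

Every begin entry in the log is followed by its matching end entry, including for the call that throws, which is the property that keeps slices from showing up as unfinished in the trace.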

Instrumenting System Calls

System-class methods (e.g., Object.wait) cannot be transformed directly. Instead, the plugin replaces the bytecode instruction INVOKEVIRTUAL java/lang/Object.wait (JI)V with a call to a custom static wrapper, com/baidu/systrace/SystraceInject.wait (Ljava/lang/Object;JI)V, which adds tracing before delegating to the original implementation.

INVOKEVIRTUAL java/lang/Object.wait (JI)V →
INVOKESTATIC com/baidu/systrace/SystraceInject.wait (Ljava/lang/Object;JI)V
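The swap works because INVOKEVIRTUAL consumes the receiver plus a long and an int from the operand stack, and a static method with descriptor (Ljava/lang/Object;JI)V consumes exactly the same values. The article does not show the wrapper's source; a minimal sketch of what such a wrapper could look like follows. The class name SystraceInjectSketch and the in-memory event recorder are hypothetical stand-ins (a real wrapper would call android.os.Trace), and the section label "Object.wait" is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SystraceInjectSketch {
    // Hypothetical recorder standing in for android.os.Trace on a desktop JVM.
    static final List<String> events = Collections.synchronizedList(new ArrayList<String>());

    static void beginSection(String name) {
        events.add("B:" + name);
    }

    static void endSection() {
        events.add("E");
    }

    /**
     * Static replacement for target.wait(millis, nanos).
     * The former receiver becomes the first argument, matching the
     * descriptor (Ljava/lang/Object;JI)V used by the rewritten call site.
     */
    public static void wait(Object target, long millis, int nanos) throws InterruptedException {
        beginSection("Object.wait");
        try {
            // The calling thread already owns target's monitor, so delegating is legal.
            target.wait(millis, nanos);
        } finally {
            endSection();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Object lock = new Object();
        synchronized (lock) {
            // Equivalent to lock.wait(5, 0) after the bytecode rewrite.
            SystraceInjectSketch.wait(lock, 5L, 0);
        }
        System.out.println(events);
    }
}
```

Because monitors are owned per thread, the wrapper can call wait on the caller's behalf as long as the original call site sat inside the matching synchronized block.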

Performance and Size Optimizations

Full‑method instrumentation adds roughly 10 MB to the APK. A blacklist mechanism excludes trivial getters/setters and other low‑impact methods, reducing both APK size and runtime overhead. Additionally, a custom high‑performance EventBus implementation lowered perceived latency by ~50 ms, demonstrating the importance of targeted optimizations.
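The article does not describe the blacklist rules in detail. One way such a filter could be sketched is shown below, under stated assumptions: classes are excluded by package prefix, and accessor-style methods are skipped only when their body is tiny. The class name InstrumentationFilter, the accessor pattern, and the instruction-count threshold are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class InstrumentationFilter {
    // Assumed heuristic: getX / setX / isX style names count as accessors.
    private static final Pattern ACCESSOR = Pattern.compile("^(get|set|is)[A-Z].*");
    // Assumed cut-off: accessors with more instructions than this are still traced.
    private static final int TRIVIAL_INSTRUCTION_LIMIT = 5;

    private final List<String> excludedPrefixes;

    public InstrumentationFilter(List<String> excludedPrefixes) {
        this.excludedPrefixes = new ArrayList<>(excludedPrefixes);
    }

    /** Returns true when the method should receive begin/end trace calls. */
    public boolean shouldInstrument(String className, String methodName, int instructionCount) {
        for (String prefix : excludedPrefixes) {
            if (className.startsWith(prefix)) {
                return false; // whole package or class is blacklisted
            }
        }
        boolean trivialAccessor = ACCESSOR.matcher(methodName).matches()
                && instructionCount <= TRIVIAL_INSTRUCTION_LIMIT;
        return !trivialAccessor;
    }

    public static void main(String[] args) {
        List<String> prefixes = new ArrayList<>();
        prefixes.add("com.example.generated.");
        InstrumentationFilter filter = new InstrumentationFilter(prefixes);
        System.out.println(filter.shouldInstrument("com.example.User", "getName", 3));
        System.out.println(filter.shouldInstrument("com.example.Startup", "initAll", 200));
    }
}
```

Checking body size as well as the name keeps a "getter" that actually does expensive work visible in the trace.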

Automated Trace Analysis Pipeline

Build baseline and test APKs with the instrumentation plugin.

Run launch tests on real devices and collect Perfetto traces.

Use the Trace Processor Python API to execute SQL scripts that detect launch regressions, lock contention, and method-level CPU and wall-time degradations.

Results are visualized in the Perfetto UI and can be automatically routed to responsible teams, reducing manual analysis from two person‑days to an almost fully automated workflow.
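The article's pipeline runs this comparison as SQL through the Trace Processor Python API. Purely to illustrate the comparison step, the Java sketch below assumes the per-method wall-time totals have already been exported from the baseline and test traces into maps. The class name, the thresholds, and the rule that a method missing from the baseline counts as new work are all assumptions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class LaunchRegressionCheck {
    /**
     * Flags methods whose duration grew by at least minDeltaNs and by at
     * least the factor minRatio relative to the baseline build.
     * Methods absent from the baseline are treated as starting from zero.
     */
    public static List<String> findRegressions(Map<String, Long> baselineNs,
                                               Map<String, Long> testNs,
                                               long minDeltaNs,
                                               double minRatio) {
        List<String> regressed = new ArrayList<>();
        // TreeMap gives a stable, name-sorted report order.
        for (Map.Entry<String, Long> entry : new TreeMap<>(testNs).entrySet()) {
            long base = baselineNs.getOrDefault(entry.getKey(), 0L);
            long delta = entry.getValue() - base;
            boolean ratioExceeded = base == 0L
                    || (double) entry.getValue() / base >= minRatio;
            if (delta >= minDeltaNs && ratioExceeded) {
                regressed.add(entry.getKey());
            }
        }
        return regressed;
    }

    public static void main(String[] args) {
        Map<String, Long> baseline = new HashMap<>();
        baseline.put("com.example.A.init", 10_000_000L);
        baseline.put("com.example.B.load", 20_000_000L);

        Map<String, Long> test = new HashMap<>();
        test.put("com.example.A.init", 10_500_000L);
        test.put("com.example.B.load", 30_000_000L);
        test.put("com.example.C.newWork", 8_000_000L);

        // Hypothetical thresholds: at least 5 ms slower and at least 1.2x.
        System.out.println(findRegressions(baseline, test, 5_000_000L, 1.2));
    }
}
```

Requiring both an absolute and a relative increase is one way to keep run-to-run noise on very short methods out of the report.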

Best Practices and Conclusions

The combination of Perfetto, automatic bytecode instrumentation, and scripted analysis provides a scalable solution for continuous launch-performance monitoring. While the approach introduces some APK size and runtime cost, further refinements, such as selective instrumentation levels and integration of the Perfetto SDK, can mitigate the overhead.
