How Baidu App Accelerated Android Startup: A Deep Dive into Launch Optimization

This article analyzes Baidu App's Android launch performance, explaining the startup process, identifying bottlenecks, and detailing practical optimizations—including task scheduling, KV storage redesign, lock improvements, and low‑level system tweaks—backed by code samples and measurable results.

Baidu Geek Talk

Jul 10, 2023

Startup Process Theory

The Android launch consists of creating the application object, starting the main thread, creating the main Activity, inflating views, laying out the screen, and performing the first draw. After the first draw the system swaps the background window for the main activity, allowing user interaction.

All launch paths (cold start from icon, push‑triggered, browser‑triggered, etc.) share four core phases: process creation, framework loading, home‑page rendering, and pre‑loading. Optimizing every path, not only the icon click, is required for a high‑quality experience.

Key System Processes

Launcher – receives user clicks and notifies AMS.

ActivityManagerService (AMS) – schedules activity launches and manages process lifecycles.

Zygote – forks the app process, pre‑loading the VM and core libraries.

SurfaceFlinger – handles rendering, VSync, and buffer management.

Optimization Implementation

1. Conventional Optimizations

Early‑stage products benefit from simple measures such as delaying non‑essential work, loading resources asynchronously, and removing dead code. Performance tools like Trace and Thor Hook are used to locate hot spots.

2. Mechanism‑Level Optimizations

Task Scheduling

A custom scheduler balances startup speed with business pre‑loading. It consists of three modules:

Device scoring – combines static hardware info and dynamic performance metrics.

Tiered configuration – cloud‑based tables with local fallback.

Tiered dispatch – different dispatch logic for high‑end, mid‑range, and low‑end devices.
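
The three modules can be pictured with a small sketch. Everything named here (`DeviceTiering`, `score`, `tierFor`, `maxPreloadTasks`) is hypothetical, and the weights and thresholds are invented for illustration; the article does not publish the real scoring formula.

```java
import java.util.Map;

/** Hypothetical sketch: score a device, map the score to a tier, pick a per-tier setting. */
final class DeviceTiering {
    enum Tier { LOW, MID, HIGH }

    /** Combines static hardware info with one dynamic metric; weights are illustrative only. */
    static double score(int cpuCores, int ramGb, double recentStartupSeconds) {
        double staticPart = Math.min(cpuCores, 8) / 8.0 * 50 + Math.min(ramGb, 12) / 12.0 * 30;
        // Faster recent startups earn more of the remaining 20 points.
        double dynamicPart = Math.max(0.0, 20 - recentStartupSeconds * 5);
        return staticPart + dynamicPart;
    }

    /** Thresholds are assumptions chosen for the example. */
    static Tier tierFor(double score) {
        if (score >= 70) return Tier.HIGH;
        if (score >= 40) return Tier.MID;
        return Tier.LOW;
    }

    /** Local fallback used when the cloud table has no entry for the tier. */
    static int localDefault(Tier tier) {
        switch (tier) {
            case HIGH: return 10;
            case MID: return 5;
            default: return 2;
        }
    }

    /** Tiered configuration: prefer the cloud table, fall back to the local default. */
    static int maxPreloadTasks(Tier tier, Map<Tier, Integer> cloudTable) {
        Integer fromCloud = cloudTable == null ? null : cloudTable.get(tier);
        return fromCloud != null ? fromCloud : localDefault(tier);
    }
}
```

A dispatcher would then branch on the tier, for example by letting a high-end device pre-load more business tasks during startup than a low-end one.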

Scheduling strategies include personalized task ordering, tiered experience based on device score, scene‑specific scheduling, priority‑aware delayed tasks, and parallel UI rendering for splash screens.
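
As one illustration of priority-aware delayed tasks, the sketch below runs essential work immediately and parks everything else until the first frame has been drawn. `StartupScheduler`, `Task`, and `onFirstFrameDrawn` are hypothetical names; this is a minimal model of the idea, not the scheduler described in the article.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical sketch of priority-aware delayed tasks. */
final class StartupScheduler {
    static final class Task {
        final String name;
        final int priority;      // lower value runs earlier
        final boolean essential; // essential tasks run before the first frame
        final Runnable body;

        Task(String name, int priority, boolean essential, Runnable body) {
            this.name = name;
            this.priority = priority;
            this.essential = essential;
            this.body = body;
        }
    }

    private final PriorityQueue<Task> delayed =
            new PriorityQueue<>(Comparator.comparingInt((Task t) -> t.priority));
    private final List<String> executionLog = new ArrayList<>();

    /** Runs essential tasks at once; parks everything else until the first frame. */
    void submit(Task task) {
        if (task.essential) {
            run(task);
        } else {
            delayed.add(task);
        }
    }

    /** Call once the first frame is drawn: drains parked tasks in priority order. */
    void onFirstFrameDrawn() {
        Task next;
        while ((next = delayed.poll()) != null) {
            run(next);
        }
    }

    private void run(Task task) {
        task.body.run();
        executionLog.add(task.name);
    }

    /** Names of tasks in the order they actually ran. */
    List<String> executionLog() {
        return executionLog;
    }
}
```

In a real app the drain step would likely hand tasks to a background pool rather than run them inline; the sketch keeps everything on one thread so the ordering is easy to see.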

KV Storage Optimization

SharedPreferences (SP) suffers from slow XML read/write, multi‑process conflicts, and thread‑creation overhead. Two complementary solutions are deployed:

UniKV – a drop‑in replacement that implements the SP API but stores data in a binary format with a 4 KB block layout, mmap‑based access, and built‑in disaster‑recovery fields.

System‑level SP tweaks – improve lock handling and I/O buffering for components that cannot adopt UniKV.
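
To make the mmap-based access concrete, here is a minimal sketch using the standard `FileChannel.map` API. The class `MappedBlockFile` and its length-prefixed string encoding are assumptions made for the example; UniKV's actual on-disk format is not reproduced here.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch of mmap-backed storage; not UniKV's real format. */
final class MappedBlockFile {
    static final int BLOCK_SIZE = 4096; // one 4 KB block, as mentioned in the text

    /** Maps the first block of the file for reading and writing. */
    static MappedByteBuffer mapFirstBlock(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // The mapping remains valid after the channel is closed.
            return ch.map(FileChannel.MapMode.READ_WRITE, 0, BLOCK_SIZE);
        }
    }

    /** Writes a length-prefixed UTF-8 string at the buffer's current position. */
    static void putString(MappedByteBuffer buf, String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        buf.putInt(bytes.length);
        buf.put(bytes);
    }

    /** Reads a length-prefixed UTF-8 string from the buffer's current position. */
    static String getString(MappedByteBuffer buf) {
        byte[] bytes = new byte[buf.getInt()];
        buf.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

Reads and writes go through the page cache instead of a parse-the-whole-XML step, which is the main reason an mmap design can avoid the read/write cost attributed to SP above.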

UniKV design highlights:

40‑byte file header containing version, write count, CRC, and data length.

Append‑only data blocks supporting BOOL, INT, FLOAT, DOUBLE, SHORT, LONG, STRING, STRING_ARRAY, BYTE_ARRAY.

Background migration reads old SP files, writes to KV, and switches when complete, using a reserved flag to track progress.
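
The header fields listed above could be packed as follows. The field order, the field widths, and the 16 reserved bytes are assumptions chosen so the total comes to 40 bytes; the article names the fields but not their layout, and `KvHeader` is a hypothetical class.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

/** Hypothetical 40-byte header layout; field order and widths are assumptions. */
final class KvHeader {
    static final int SIZE = 40;

    final int version;
    final long writeCount;
    final long crc;
    final int dataLength;

    KvHeader(int version, long writeCount, long crc, int dataLength) {
        this.version = version;
        this.writeCount = writeCount;
        this.crc = crc;
        this.dataLength = dataLength;
    }

    /** Builds a header whose CRC covers the given data bytes. */
    static KvHeader forData(int version, long writeCount, byte[] data) {
        return new KvHeader(version, writeCount, crcOf(data), data.length);
    }

    /** Layout: version(4) | writeCount(8) | crc(8) | dataLength(4) | reserved(16). */
    byte[] encode() {
        ByteBuffer buf = ByteBuffer.allocate(SIZE); // zero-filled, so reserved bytes are 0
        buf.putInt(version).putLong(writeCount).putLong(crc).putInt(dataLength);
        return buf.array();
    }

    /** Reads the fields back in the same order they were written. */
    static KvHeader decode(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes, 0, SIZE);
        return new KvHeader(buf.getInt(), buf.getLong(), buf.getLong(), buf.getInt());
    }

    /** Recovery check: do the stored length and CRC still match the data? */
    boolean matches(byte[] data) {
        return data.length == dataLength && crcOf(data) == crc;
    }

    private static long crcOf(byte[] data) {
        CRC32 c = new CRC32();
        c.update(data, 0, data.length);
        return c.getValue();
    }
}
```

On open, a store built this way could decode the header, call `matches` on the data region, and fall back to a backup or to the old SP file when the check fails.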

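// Read path as it appears in the stock SharedPreferences implementation:
// every getter takes mLock and waits for the disk load to finish.
private final Object mLock = new Object();
private boolean mLoaded = false;
private void startLoadFromDisk() { ... }
public String getString(String key, @Nullable String defValue) {
    synchronized (mLock) {
        awaitLoadedLocked(); // blocks the caller until loading has completed
        String v = (String) mMap.get(key);
        return v != null ? v : defValue;
    }
}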

Lock Optimization

Excessive synchronized blocks caused ANR‑related stack traces. Replacing heavy locks with lock‑free structures or fine‑grained custom locks reduced read latency from ~118 ms to ~6 ms on a Xiaomi 5 device (≈95 % improvement).
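
One way to take the monitor off the read path is to publish an immutable snapshot once loading finishes, so later reads need no shared lock. The sketch below assumes that approach; `SnapshotPrefs` is a hypothetical class, it covers reads only, and it is not the specific lock design used in Baidu App.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.function.Supplier;

/** Hypothetical sketch: reads without a shared monitor once loading has finished. */
final class SnapshotPrefs {
    private final CountDownLatch loaded = new CountDownLatch(1);
    // Written once by the loader; volatile gives readers a safely published view.
    private volatile Map<String, String> snapshot = Collections.emptyMap();

    /** Schedules the disk load on the given executor, in the spirit of startLoadFromDisk(). */
    SnapshotPrefs(Supplier<Map<String, String>> loader, Executor executor) {
        executor.execute(() -> {
            try {
                snapshot = Collections.unmodifiableMap(new HashMap<>(loader.get()));
            } finally {
                loaded.countDown(); // release readers even if loading failed
            }
        });
    }

    /** Waits for the initial load if needed, then reads from the immutable snapshot. */
    String getString(String key, String defValue) {
        awaitLoaded();
        String v = snapshot.get(key);
        return v != null ? v : defValue;
    }

    private void awaitLoaded() {
        try {
            loaded.await(); // returns immediately once the latch has opened
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for load", e);
        }
    }
}
```

Writers are left out on purpose: supporting them would mean swapping in a new snapshot (copy-on-write) or using a concurrent map, and which one fits depends on the write rate.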

Additional Mechanism Optimizations

Thread policy enforcement – unify thread pools and prohibit manual thread creation and priority changes.

I/O buffering – enlarge buffers to reduce syscalls.

SO loading – defer non‑essential native libraries to background threads.

Binder usage – cache results to avoid unnecessary IPC.

ContentProvider/FileProvider lazy initialization – move heavy providers to separate processes or load on demand.

Image prepareToDraw – trigger GPU upload early to avoid main‑thread stalls.
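
Two items in this list, Binder result caching and lazy provider initialization, share one shape: pay for an expensive call on first use, then reuse the result. A minimal sketch of that shape follows. `Lazy` is a hypothetical helper, and the supplier stands in for whatever IPC call or heavy initializer is being wrapped.

```java
import java.util.function.Supplier;

/** Hypothetical helper: compute an expensive value on first use, then reuse it. */
final class Lazy<T> implements Supplier<T> {
    private final Supplier<T> source;
    // null means "not computed yet", so a source that returns null is re-invoked.
    private volatile T value;

    Lazy(Supplier<T> source) {
        this.source = source;
    }

    @Override
    public T get() {
        T result = value;
        if (result == null) {
            synchronized (this) {
                result = value; // re-check under the lock so only one thread computes
                if (result == null) {
                    result = source.get();
                    value = result;
                }
            }
        }
        return result;
    }
}
```

Caching is only safe for values that stay stable for the life of the process; anything that can change behind the app's back needs an invalidation path.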

Low‑Level Optimizations

Explorations include skipping class verification (VerifyClass), CPU booster techniques, and GC tuning. These changes are high‑risk and high‑reward, so they are applied selectively after cost‑benefit analysis.

Conclusion

Android launch optimization is a multi‑layered challenge involving business logic, system services, and low‑level mechanisms. The systematic approach, which combines conventional tweaks, a custom task scheduler, the UniKV key‑value store, lock refactoring, and broader system optimizations, delivers measurable reductions in launch time and ANR occurrences and improves overall user retention.
