
App Startup Performance Optimization: Techniques and Tools for iOS and Android

Optimizing app launch on iOS and Android prevents user abandonment and boosts conversion. This article covers deep-link handling, H5 splash strategies, On-Demand Resources, custom WKWebView schemes, download-size reduction, compile-time flags and run-time analysis for .so loading, thread-pool tuning, objc_msgSend hooking, actor-based concurrency, and tooling such as Instruments, MetricKit, and Android Profiler.

Amap Tech

Dec 30, 2020

App launch is the first impression for users, and a slow start dramatically reduces conversion rates. Studies show that each additional second of waiting can cut conversion by 7%, and launches longer than 5 seconds cause 19% of users to abandon the app.

Optimizing Deep Link Launch with Universal Links and App Links

Deep-linking improves acquisition and traffic flow. iOS uses Universal Links (available from iOS 9) and Android uses App Links (available from Android 6). Universal Links require user interaction to trigger; they cannot launch automatically. Custom URL schemes show a system prompt on iOS, and Chrome for Android (version 25 and later) no longer launches them directly, so the link has to be wrapped in an intent: URI.

When an H5 splash page is shown during launch, prioritize loading the target feature page first and defer heavy image loading to improve perceived speed.

H5 Launch Page Strategies

On iOS, On‑Demand Resources (ODR) can pre‑download assets after installation so the launch page loads locally. ODR can also store scripts to reduce bundle size and CDN costs. If ODR is not used, WKWebView can be pre‑loaded to cache resources for subsequent launches.

iOS approaches for intercepting network requests in WKWebView include:

Use WKURLSchemeHandler for a custom scheme (iOS 11+).

Run a local server to redirect requests (high cost, slower start).

On Android, the WebView resource-interception callback (WebViewClient.shouldInterceptRequest) can return a WebResourceResponse that sets the MIME type, encoding, and data stream for matched URLs.

Download Size Reduction

Large download sizes also hurt conversion; a download over 200 MB triggers a user confirmation on cellular (4G/5G) networks. Moving code out of the __TEXT segment into a custom segment is reported to reduce download size by about one-third on pre-iPhone X devices. Michael Eisel's blog also suggests using ZippyJSONDecoder for faster JSON parsing and the custom linker zld for faster builds.

Android .so Library Loading Optimization

Compile‑time static analysis

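-ffunction-sections -fdata-sections // put each function and data item in its own section so the linker can drop unused ones
-fvisibility=hidden -fvisibility-inlines-hidden // hide symbols that are not explicitly exported

Combined with linker garbage collection (-Wl,--gc-sections), these flags keep unused code out of the .so and shrink the dynamic symbol table that the loader has to process.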
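To make the effect of the visibility flags concrete, here is a minimal sketch assuming GCC or Clang. The macro `DEMO_EXPORT` and the functions `demo_sum_of_squares` and `demo_unused_helper` are hypothetical names chosen for illustration; they do not come from the original article.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical export macro: when the file is built with -fvisibility=hidden,
 * only functions marked like this stay in the dynamic symbol table. */
#define DEMO_EXPORT __attribute__((visibility("default")))

/* Internal helper: never exported, so it adds no dynamic symbol. */
static long square(long v) { return v * v; }

/* Never called and not exported: with -ffunction-sections it lives in its own
 * section, so -Wl,--gc-sections allows the linker to discard it. */
long demo_unused_helper(long v) { return v + 1; }

/* Public entry point (hypothetical): sum of squares of `count` values. */
DEMO_EXPORT long demo_sum_of_squares(const long *values, size_t count) {
    assert(values != NULL || count == 0);
    long total = 0;
    for (size_t i = 0; i < count; i++) {
        total += square(values[i]);
    }
    return total;
}
```

One way to try it is to build with `cc -fPIC -shared -ffunction-sections -fdata-sections -fvisibility=hidden -Wl,--gc-sections demo.c -o libdemo.so` and list the exported symbols with `nm -D libdemo.so`; only the function marked with the export macro should be expected among the library's own exported symbols.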

Run‑time hook analysis

Android Linker flow: find_library returns a soinfo pointer, then call_constructors invokes init_array. Using frida‑gum, Gaode’s Android team hooks __dl_dlopen, find_library, and constructor calls to measure load times.

static target_func_t android_funcs_22[] = {
    {"__dl_dlopen", 0, (void *)my_dlopen},
    {"__dl_ZL12find_libraryPKciPK12android_dlextinfo", 0, (void *)my_find_library},
    {"__dl_ZN6soinfo16CallConstructorsEv", 0, (void *)my_soinfo_CallConstructors},
    {"__dl_ZN6soinfo9CallArrayEPKcPPFvvEjb", 0, (void *)my_soinfo_CallArray}
};
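The replacement functions in the table above are not shown in the article. As a rough sketch of what such a replacement might do, the code below times a call to the original loader and records the result. It deliberately avoids frida-gum: the original function is passed around as a plain function pointer, and `load_fn_t`, `timed_load`, `g_records`, and `fake_load` are all hypothetical names used only for illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Signature shared by the original loader and its replacement (assumed). */
typedef void *(*load_fn_t)(const char *path, int flags);

typedef struct {
    const char *path;   /* library that was loaded */
    double elapsed_ms;  /* wall time spent inside the original loader */
} load_record_t;

#define MAX_RECORDS 64
static load_record_t g_records[MAX_RECORDS];
static size_t g_record_count = 0;
static load_fn_t g_original_load = NULL; /* filled in when the hook is installed */

static double now_ms(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000.0 + ts.tv_nsec / 1e6;
}

/* Replacement body: time the original call, log it, pass the result through. */
static void *timed_load(const char *path, int flags) {
    assert(g_original_load != NULL);
    double start = now_ms();
    void *handle = g_original_load(path, flags);
    double elapsed = now_ms() - start;
    if (g_record_count < MAX_RECORDS) {
        g_records[g_record_count].path = path;
        g_records[g_record_count].elapsed_ms = elapsed;
        g_record_count++;
    }
    return handle;
}

/* Stand-in for the real loader so the sketch can run without a hooking library. */
static void *fake_load(const char *path, int flags) {
    static int fake_handle;
    (void)path;
    (void)flags;
    return &fake_handle;
}
```

Setting `g_original_load = fake_load` and calling `timed_load("libdemo.so", 0)` appends one record. In a real hook the pointer would instead be the trampoline to the original function returned by the hooking framework.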

Thread‑pool management in libdispatch (GCD) is also examined. When the pool is full, pending blocks are marked and a new thread may be created after a back‑off period.

static void _dispatch_root_queue_poke_slow(dispatch_queue_global_t dq, int n, int floor) {
    int remaining = n; // number of worker threads still to be requested
    bool overcommit = dq->dq_priority & DISPATCH_PRIORITY_FLAG_OVERCOMMIT;
    if (overcommit) {
        os_atomic_add2o(dq, dgq_pending, remaining, relaxed);
    } else {
        if (!os_atomic_cmpxchg2o(dq, dgq_pending, 0, remaining, relaxed)) {
            _dispatch_root_queue_debug("worker thread request still pending for global queue: %p", dq);
            return;
        }
    }
    // ... thread creation logic ...
}

iOS App Loading Details

Before _dyld_start, the kernel forks the process, parses the Mach‑O image, and activates it via exec_activate_image(). Reducing dynamic libraries and +load / constructor functions shortens launch time.

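Measuring main-thread method duration can be done by hooking objc_msgSend. Because objc_msgSend is an assembly-level entry point rather than an Objective-C method, this is a function hook, not method swizzling in the strict sense. The hook records timestamps before and after the original implementation runs, which allows per-method latency analysis.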

ENTRY _objc_msgSend
    MESSENGER_START
    cmp x0, #0 // nil/tagged‑pointer check
    b.le LNilOrTagged
    ldr x13, [x0] // isa
    and x9, x13, #ISA_MASK // class
    // ... rest of assembly ...
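The bookkeeping behind such a hook can be sketched in plain C. This is only the timestamp stack, under the assumption that an assembly trampoline (not shown) calls `before_call` on entry and `after_call` on return; both function names and `call_frame_t` are hypothetical. Timestamps are passed in explicitly so the logic can be checked without a real clock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DEPTH 128

typedef struct {
    const char *selector;  /* name of the method being timed */
    uint64_t start_ns;     /* timestamp captured on entry */
} call_frame_t;

/* One stack is enough when only the main thread is measured; a real hook
 * would need a per-thread stack or a main-thread check. */
static call_frame_t g_stack[MAX_DEPTH];
static size_t g_depth = 0;

/* Entry hook: remember which method started and when. */
static void before_call(const char *selector, uint64_t now_ns) {
    assert(g_depth < MAX_DEPTH);
    g_stack[g_depth].selector = selector;
    g_stack[g_depth].start_ns = now_ns;
    g_depth++;
}

/* Exit hook: pop the matching frame and return the elapsed nanoseconds. */
static uint64_t after_call(uint64_t now_ns, const char **selector_out) {
    assert(g_depth > 0);
    g_depth--;
    if (selector_out != NULL) {
        *selector_out = g_stack[g_depth].selector;
    }
    return now_ns - g_stack[g_depth].start_ns;
}
```

Because calls nest, the durations returned are inclusive: an outer method's time contains the time of every method it calls.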

Thread Scheduling and Task Orchestration

Tasks are categorized and scheduled based on dependencies. Independent tasks run in parallel, while dependent tasks run sequentially. SharedPreferences loading on Android can cause ContextImpl lock contention; merging files or serializing loads are possible mitigations.
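One possible way to turn dependencies into a schedule is to group tasks into waves: everything in a wave can run in parallel, and a wave starts once the previous one has finished. The sketch below is an illustrative planner, not the article's scheduler; `task_graph_t` and `plan_waves` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TASKS 16

/* deps[i][j] == true means task i must wait for task j to finish. */
typedef struct {
    int count;
    bool deps[MAX_TASKS][MAX_TASKS];
} task_graph_t;

/* Assigns each task a wave number. Tasks in the same wave do not depend on
 * each other. Returns the number of waves, or -1 if the graph has a cycle. */
static int plan_waves(const task_graph_t *g, int wave_of[MAX_TASKS]) {
    assert(g != NULL && g->count >= 0 && g->count <= MAX_TASKS);
    int scheduled = 0;
    int wave = 0;
    for (int i = 0; i < g->count; i++) wave_of[i] = -1;

    while (scheduled < g->count) {
        int placed = 0;
        for (int i = 0; i < g->count; i++) {
            if (wave_of[i] != -1) continue;  /* already scheduled */
            bool ready = true;
            for (int j = 0; j < g->count; j++) {
                /* A dependency blocks i unless it finished in an earlier wave. */
                if (g->deps[i][j] && (wave_of[j] == -1 || wave_of[j] == wave)) {
                    ready = false;
                    break;
                }
            }
            if (ready) {
                wave_of[i] = wave;
                placed++;
            }
        }
        if (placed == 0) return -1;  /* nothing could be placed: cycle */
        scheduled += placed;
        wave++;
    }
    return wave;
}
```

For example, with tasks 0 and 1 independent, task 2 depending on 0, and task 3 depending on 0 and 2, the planner yields three waves: {0, 1}, then {2}, then {3}.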

IO‑heavy tasks must balance efficiency, accuracy, and importance. For example, NSData.writeToFile:atomically: calls fsync, which forces a disk write and can block the main thread.
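The cost of a durable write is easiest to see in a POSIX-level sketch. The function below is not Foundation's implementation; it is a hypothetical `write_file` helper that follows the common temp-file, fsync, rename pattern, with a `durable` flag so the fsync step can be skipped for data that can be regenerated. Retry on EINTR is omitted for brevity.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/* Writes `len` bytes to `path` via a temporary file. When `durable` is true,
 * fsync() forces the data to storage before the rename; that is the step
 * that can stall the calling thread. */
static bool write_file(const char *path, const void *data, size_t len, bool durable) {
    assert(path != NULL && (data != NULL || len == 0));
    char tmp[512];
    if (snprintf(tmp, sizeof tmp, "%s.tmp", path) >= (int)sizeof tmp) return false;

    int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return false;

    const char *p = data;
    size_t left = len;
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) { close(fd); unlink(tmp); return false; }
        p += n;
        left -= (size_t)n;
    }
    if (durable && fsync(fd) != 0) { close(fd); unlink(tmp); return false; }
    if (close(fd) != 0) { unlink(tmp); return false; }

    /* rename() replaces the destination atomically on POSIX file systems. */
    if (rename(tmp, path) != 0) { unlink(tmp); return false; }
    return true;
}
```

During launch, one reasonable choice is to reserve `durable = true` for data whose loss matters and to move those writes off the main thread.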

OperationQueue and libdispatch interactions can lead to pending blocks when the thread pool is saturated. Monitoring pending counts and adjusting maxConcurrentOperationCount helps avoid stalls.

Actor Model and Future Concurrency

Actors provide isolated state and process messages serially, which avoids data races on that state. Swift's planned concurrency model introduces actors, async/await, and structured concurrency, allowing safe, efficient parallelism.

Coroutines (e.g., via ucontext) can implement async/await-style control flow before language support arrives. Although makecontext and swapcontext were marked obsolescent and then removed from POSIX.1-2008, a custom lightweight implementation can be modeled on the Linux source.

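getcontext(&uc[1]);                           // initialize the context before modifying it
uc[1].uc_link = &uc[0];                       // context to resume when f returns
uc[1].uc_stack.ss_sp = st1;                   // stack used by the new context
uc[1].uc_stack.ss_size = sizeof st1;
makecontext(&uc[1], (void (*)(void))f, 1, 1); // f will be called with one int argument (1)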
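Expanding the fragment above into something complete, here is a small generator-style coroutine built on ucontext, assuming Linux with glibc, where these functions are still available. The names `demo_start`, `demo_resume`, `demo_yield`, and `counter_body` are hypothetical, and the single global coroutine keeps the sketch short.

```c
#include <assert.h>
#include <ucontext.h>

static ucontext_t g_main_ctx;   /* caller's saved state */
static ucontext_t g_co_ctx;     /* coroutine's saved state */
static char g_co_stack[64 * 1024];
static int g_yielded;           /* value handed from coroutine to caller */
static int g_finished;

/* "yield": save the coroutine's state and switch back to the caller. */
static void demo_yield(int value) {
    g_yielded = value;
    swapcontext(&g_co_ctx, &g_main_ctx);
}

/* Coroutine body: yields 1..limit, then returns to uc_link. */
static void counter_body(int limit) {
    for (int i = 1; i <= limit; i++) {
        demo_yield(i);
    }
    g_finished = 1;
}

/* Prepare the coroutine; nothing runs until the first resume. */
static void demo_start(int limit) {
    int rc = getcontext(&g_co_ctx);
    assert(rc == 0);
    (void)rc;
    g_co_ctx.uc_link = &g_main_ctx;  /* resumed when counter_body returns */
    g_co_ctx.uc_stack.ss_sp = g_co_stack;
    g_co_ctx.uc_stack.ss_size = sizeof g_co_stack;
    g_finished = 0;
    makecontext(&g_co_ctx, (void (*)(void))counter_body, 1, limit);
}

/* "resume": run until the next yield. Returns 1 and stores the value,
 * or returns 0 once the coroutine has finished. */
static int demo_resume(int *out) {
    if (g_finished) return 0;
    swapcontext(&g_main_ctx, &g_co_ctx);
    if (g_finished) return 0;
    *out = g_yielded;
    return 1;
}
```

After `demo_start(3)`, three calls to `demo_resume` produce 1, 2, and 3, and any later call returns 0. An await-like operation could be layered on the same two switches, with the caller acting as the event loop.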

Performance Analysis Tools

iOS Instruments – Time Profiler and Samples provide method-level timing. MetricKit 2.0 adds automatic collection of crash diagnostics, delivered via MXDiagnosticPayload, alongside the signpost metrics reported in MXMetricPayload.

Automation can be achieved through the usbmux protocol (used by usbmuxd) and libimobiledevice to drive Instruments, Appium, or Facebook’s idb for UI automation and performance data collection.

Android Studio’s Profiler offers similar flame‑graph visualizations for method execution time.

Operational Governance Platform

An APM platform integrates daily reports, CI gate checks, automated performance testing (record‑playback, Monkey), and post‑release monitoring. Alerts trigger bug tickets, and dashboards display trends, baselines, and detailed per‑scenario metrics.
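A CI gate of this kind can be as simple as comparing a robust statistic of measured launch times with a baseline. The sketch below is an assumption about how such a check might look, not the platform's actual rule: `launch_gate_passes`, the use of the median, and the tolerance parameter are all illustrative choices.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SAMPLES 256

static int cmp_double(const void *a, const void *b) {
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Median of n launch-time samples in milliseconds. The input is copied so
 * the caller's array keeps its original order. */
static double median_ms(const double *samples, size_t n) {
    assert(samples != NULL && n > 0 && n <= MAX_SAMPLES);
    double sorted[MAX_SAMPLES];
    memcpy(sorted, samples, n * sizeof *samples);
    qsort(sorted, n, sizeof *sorted, cmp_double);
    return (n % 2) ? sorted[n / 2]
                   : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
}

/* The gate passes when the median stays within baseline * (1 + tolerance). */
static bool launch_gate_passes(const double *samples, size_t n,
                               double baseline_ms, double tolerance) {
    return median_ms(samples, n) <= baseline_ms * (1.0 + tolerance);
}
```

Using the median means that a single outlier run, such as one slowed by a cold disk cache, does not fail the build on its own.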

For more details, see the linked article on Gaode’s iOS launch analysis.
