Kuaishou’s APM Platform and Mobile Performance Optimization: Insights from Yang Kai
In a mobile-first world where limited device resources and unstable networks threaten user retention, Kuaishou's performance team built an APM (application performance monitoring) platform and applied systematic memory, startup, and jank optimizations that cut startup time by 40%, reduced package size by 23 MB, and significantly improved key product metrics.
In the mobile‑internet era, constrained device resources and unstable networks make performance optimization essential for both web and mobile products; poor performance leads to user loss, especially for consumer‑facing (ToC) apps. At the GMTC Global Front‑End Conference, Kuaishou’s performance lead Yang Kai discussed the company’s APM platform and its approach to tackling performance challenges.
The APM platform was created in response to user reports of lag, crashes, and overheating, and to data showing that performance strongly affects user activity. It now monitors crashes, out-of-memory (OOM) errors, ANRs (Application Not Responding), jank, startup time, FPS, and package size, and is being extended to power and traffic monitoring. It provides day-level and hour-level alerts and tracks changes per merge request.
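The talk summary does not describe how the per-merge-request tracking works internally. As a minimal sketch of the idea, assuming each merge request produces one build whose package size is recorded, a check could compare consecutive builds against a size budget. The names `MetricSample` and `find_regressions`, the MR identifiers, and the 200 KB budget are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MetricSample:
    merge_request: str     # hypothetical merge-request identifier
    package_size_kb: int   # package size of the build produced by that MR


def find_regressions(samples, max_increase_kb=200):
    """Return (merge_request, delta_kb) for every MR that grew the package
    by more than the allowed budget compared with the previous build."""
    regressions = []
    for prev, cur in zip(samples, samples[1:]):
        delta = cur.package_size_kb - prev.package_size_kb
        if delta > max_increase_kb:
            regressions.append((cur.merge_request, delta))
    return regressions


# Illustrative data only; not figures from the talk.
history = [
    MetricSample("MR-100", 80_000),
    MetricSample("MR-101", 80_050),
    MetricSample("MR-102", 80_900),
    MetricSample("MR-103", 80_880),
]
flagged = find_regressions(history)
print(flagged)
```

The same comparison could be applied to any per-build metric, such as startup time, by swapping the field and the budget.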
Key results include a 40% reduction in startup time, a 23 MB reduction in package size, and awards for the performance team in recognition of the optimizations.
Memory was a major challenge. To address heavy C++ memory usage, the team developed custom monitoring that uses malloc hooks and compiler instrumentation to record live allocations and then applies a mark-and-sweep analysis in a separate process. For Java memory, the team built a snapshot pipeline that dumps, analyzes, and compresses heap images (90% compressed to ≤80 MB), using Shark for analysis and zstd for compression.
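The mark-and-sweep step can be illustrated with a small sketch. It assumes the hook has already produced a table of live allocations, each with the addresses of other live blocks found inside it, plus a set of root addresses (for example, pointers found in globals and thread stacks). The record layout and the names `Allocation` and `find_unreachable` are hypothetical; the source does not describe the actual data format.

```python
from dataclasses import dataclass, field


@dataclass
class Allocation:
    address: int
    size: int
    stack: str                                       # call stack recorded at allocation time (hypothetical format)
    references: list = field(default_factory=list)   # addresses of other live blocks found inside this block


def find_unreachable(allocations, roots):
    """Mark every block reachable from the roots, then return the rest.
    Unmarked blocks are leak candidates: nothing points to them any more."""
    by_addr = {a.address: a for a in allocations}
    marked = set()
    worklist = [r for r in roots if r in by_addr]
    while worklist:                                  # mark phase
        addr = worklist.pop()
        if addr in marked:
            continue
        marked.add(addr)
        for ref in by_addr[addr].references:
            if ref in by_addr and ref not in marked:
                worklist.append(ref)
    # sweep phase: keep only the blocks that were never marked
    return [a for a in allocations if a.address not in marked]


# Illustrative data: 0x3000 and 0x4000 reference each other but no root reaches them.
live = [
    Allocation(0x1000, 64, "init_cache", [0x2000]),
    Allocation(0x2000, 32, "load_config"),
    Allocation(0x3000, 128, "decode_frame", [0x4000]),
    Allocation(0x4000, 256, "decode_frame", [0x3000]),
]
leaks = find_unreachable(live, roots=[0x1000])
leaked_bytes = sum(a.size for a in leaks)
print([hex(a.address) for a in leaks], leaked_bytes)
```

Grouping the leak candidates by their recorded `stack` would then point at the call sites responsible for the most leaked bytes.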
The team follows a five-step methodology (define, analyze, solve, accept, and prevent degradation), using A/B testing, data-driven prioritization, and mechanisms to avoid regressions during iterative development.
For startup optimization, a task framework was introduced to collect per-task timing, with online collection and offline analysis via systrace (Android) and a custom flame-graph tool (iOS). Optimizations included task consolidation, deferring or removing tasks based on user state, addressing lock contention, reducing main-thread CPU load, handling API-induced initializations, proactive dex2oat, and binary/dex reordering. Together, these changes reduced startup time by 40%.
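A task framework of this kind can be sketched in a few lines. The example below is an illustration under stated assumptions, not Kuaishou's implementation: `StartupTask`, `StartupRunner`, the task names, and the `logged_in` user-state key are all hypothetical. Each task is timed individually, and a task whose predicate rejects the current user state is deferred instead of running at launch.

```python
import time
from dataclasses import dataclass
from typing import Callable


def always_needed(user_state: dict) -> bool:
    return True


@dataclass
class StartupTask:
    name: str
    run: Callable[[], None]
    # Predicate on user state; when it returns False the task is deferred.
    needed_at_launch: Callable[[dict], bool] = always_needed


class StartupRunner:
    def __init__(self, tasks):
        self.tasks = list(tasks)
        self.timings_ms = {}   # per-task wall-clock duration in milliseconds
        self.deferred = []     # tasks skipped at launch, to be run later

    def launch(self, user_state):
        for task in self.tasks:
            if task.needed_at_launch(user_state):
                self._run_timed(task)
            else:
                self.deferred.append(task)

    def run_deferred(self):
        pending, self.deferred = self.deferred, []
        for task in pending:
            self._run_timed(task)

    def _run_timed(self, task):
        start = time.perf_counter()
        task.run()
        self.timings_ms[task.name] = (time.perf_counter() - start) * 1000.0


ran = []
runner = StartupRunner([
    StartupTask("init_logging", lambda: ran.append("init_logging")),
    StartupTask("init_player", lambda: ran.append("init_player")),
    # Only needed immediately when the user is already logged in.
    StartupTask("init_upload_sdk", lambda: ran.append("init_upload_sdk"),
                needed_at_launch=lambda state: state.get("logged_in", False)),
])
runner.launch({"logged_in": False})
print(ran, [t.name for t in runner.deferred])
```

In a real app the recorded `timings_ms` would be uploaded for aggregation, so that the slowest tasks across the user base can be prioritized.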
Jank is defined as any main-thread operation exceeding one second. The team analyzes stack traces, CPU usage, lock waits, and system-service delays, and has added richer logging to pinpoint and mitigate the root causes.
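One common way to detect operations that exceed such a threshold is a watchdog thread that samples the stack of the watched thread once an operation has run too long. The sketch below shows that general technique, not the team's tooling: `JankWatchdog` and its methods are hypothetical names, the demo uses a 0.05-second threshold instead of one second to keep it fast, and `sys._current_frames()` is a CPython-specific call.

```python
import sys
import threading
import time
import traceback


class JankWatchdog:
    """Reports operations on the creating thread that run past a threshold."""

    def __init__(self, threshold_s=1.0, poll_s=0.1):
        self.threshold_s = threshold_s
        self.poll_s = poll_s
        self.reports = []
        self._watched_id = threading.get_ident()   # thread that created the watchdog
        self._lock = threading.Lock()
        self._op = None                            # [name, start time, already reported]
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._poll, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def begin(self, name):
        with self._lock:
            self._op = [name, time.monotonic(), False]

    def end(self):
        with self._lock:
            self._op = None

    def _poll(self):
        while not self._stop.wait(self.poll_s):
            with self._lock:
                op = self._op
                if op is None or op[2]:
                    continue
                elapsed = time.monotonic() - op[1]
                if elapsed < self.threshold_s:
                    continue
                op[2] = True                       # report each operation only once
                frame = sys._current_frames().get(self._watched_id)
                stack = "".join(traceback.format_stack(frame)) if frame else ""
                self.reports.append(
                    {"operation": op[0], "elapsed_s": elapsed, "stack": stack})


def slow_operation():
    time.sleep(0.3)   # stands in for blocking work on the main thread


watchdog = JankWatchdog(threshold_s=0.05, poll_s=0.01)
watchdog.start()
watchdog.begin("slow_operation")
slow_operation()
watchdog.end()
watchdog.begin("fast_operation")
watchdog.end()
watchdog.stop()
print([r["operation"] for r in watchdog.reports])
```

The captured stack shows where the thread was at the moment the threshold was crossed, which is the starting point for separating CPU-bound work from lock waits or slow system calls.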
Continuous monitoring and targeted optimizations have led to noticeable improvements in key metrics such as 0‑play, retention, and acquisition cost, especially for new users.
