How Alipay Engineers Built a Fine‑Grained Scheduling System for Mobile Apps
This article details Alipay's evolution from basic thread pools to a sophisticated, multi‑layered scheduling framework that balances UI responsiveness, task prioritization, and resource control across Android devices, while outlining the technical challenges and future directions of fine‑grained scheduling.
Background
Resource allocation has been a core operating‑system concern since the Unix era, and Android, as a Linux‑based OS, inherits it. Processes contain threads to make better use of the CPU, which leads to competition for resources and has driven the development of various schedulers (FIFO, RT, CFS, etc.). Alipay, with its massive user base and complex services, faces intense internal and external resource contention and therefore needs a tailored scheduling solution.
Performance System Evolution
Alipay's performance optimization has progressed through three stages.
Prototype Stage (1.0 Thread Scheduling)
Early versions relied on native thread pools, causing overload and lack of unified control. In version 1.0, a unified thread‑pool service was introduced, offering different thread types based on task priority:
UI‑foreground threads (highest priority)
First‑class urgent tasks for home‑page rendering
Second‑class urgent tasks (cannot tolerate queuing)
Normal background tasks (tolerate queuing)
File‑IO tasks (predictable duration)
Network tasks (e.g., RPC calls)
Key‑based ordering: tasks with the same key execute serially, different keys run concurrently
Thread count is dynamically adjustable per pool.
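The key‑based ordering rule above can be illustrated with a minimal sketch. This is not Alipay's implementation: the class name KeyedSerialExecutor and its methods are hypothetical, and the sketch assumes a plain java.util.concurrent.Executor as the backing pool.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.Executor;

/** Hypothetical sketch: same-key tasks run serially, different keys may run concurrently. */
final class KeyedSerialExecutor {
    private final Executor backend; // shared pool that actually runs the tasks
    // A key is present in this map exactly while one of its tasks is in flight.
    private final Map<String, Queue<Runnable>> pending = new HashMap<>();

    KeyedSerialExecutor(Executor backend) {
        this.backend = backend;
    }

    /** Submits a task; it starts only after earlier tasks with the same key have finished. */
    synchronized void execute(String key, Runnable task) {
        Queue<Runnable> queue = pending.get(key);
        if (queue != null) {
            queue.add(task); // another task with this key is running: wait in line
            return;
        }
        pending.put(key, new ArrayDeque<>());
        dispatch(key, task);
    }

    private void dispatch(String key, Runnable task) {
        backend.execute(() -> {
            try {
                task.run();
            } finally {
                scheduleNext(key); // hand the key over to the next queued task, if any
            }
        });
    }

    private synchronized void scheduleNext(String key) {
        Runnable next = pending.get(key).poll();
        if (next == null) {
            pending.remove(key); // nothing left for this key
        } else {
            dispatch(key, next);
        }
    }
}
```

For the adjustable thread count, the JDK's ThreadPoolExecutor exposes setCorePoolSize and setMaximumPoolSize, which resize a live pool; whether Alipay's service uses them is not stated in the source.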
Legacy Issues of 1.0
As the number of business modules grew, new problems emerged: high‑frequency preloading during cold start caused sustained CPU spikes, and because scheduling granularity remained at the thread level, nothing prevented uncontrolled task submissions.
2.0 Task Scheduling
To gain finer control, a task‑level scheduling layer was added, introducing a task‑coloring mechanism that separates exclusive (e.g., scan‑code) from public tasks, and a "Captain" chain scheduler inspired by Google WorkManager.
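The split between exclusive and public tasks can be sketched as a small router. All names here (TaskColor, ColoringRouter, newExclusivePool) are hypothetical, and raising the Java thread priority is only one assumed way to give an exclusive pool precedence; it is a hint to the platform, not a guarantee.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

/** Hypothetical color tags: one exclusive scene plus the shared default. */
enum TaskColor { SCAN_CODE, PUBLIC }

/** Sends colored tasks to their exclusive executor; everything else uses the common pool. */
final class ColoringRouter {
    private final Map<TaskColor, Executor> exclusive = new EnumMap<>(TaskColor.class);
    private final Executor common;

    ColoringRouter(Executor common) {
        this.common = common;
    }

    void registerExclusive(TaskColor color, Executor executor) {
        exclusive.put(color, executor);
    }

    /** Colors without a registered exclusive executor fall back to the common pool. */
    Executor executorFor(TaskColor color) {
        return exclusive.getOrDefault(color, common);
    }

    void submit(TaskColor color, Runnable task) {
        executorFor(color).execute(task);
    }

    /** Assumed helper: a fixed pool whose threads request the highest Java priority. */
    static ExecutorService newExclusivePool(String name, int threads) {
        ThreadFactory factory = runnable -> {
            Thread thread = new Thread(runnable, name);
            thread.setPriority(Thread.MAX_PRIORITY);
            return thread;
        };
        return Executors.newFixedThreadPool(threads, factory);
    }
}
```

A scan‑code task would then be submitted with router.submit(TaskColor.SCAN_CODE, task), while untagged work keeps flowing through the common pool.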
Key Technologies
Task Coloring: exclusive tasks run in dedicated colored thread pools with higher priority; public tasks remain in the common pool.
Captain Scheduler: builds a family of worker tasks, wraps Runnables as workers, and controls their concurrency and execution order.
Scheduling triggers can be customized (main‑thread load, frame‑rate change, CPU usage).
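As a rough illustration of chained workers with a concurrency cap and a custom trigger, consider the sketch below. It is not the Captain scheduler itself: WorkerChain, its then method, and the polling interval are assumptions made for the example, and the trigger is reduced to a BooleanSupplier that could wrap a main‑thread load, frame‑rate, or CPU check.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Semaphore;
import java.util.function.BooleanSupplier;

/** Hypothetical chain: stages run in order, workers inside a stage share a concurrency cap. */
final class WorkerChain {
    private static final long POLL_MILLIS = 10; // assumed trigger polling interval

    private final List<List<Runnable>> stages = new ArrayList<>();
    private final int maxConcurrency;
    private final BooleanSupplier trigger; // e.g. "CPU usage is below a threshold"

    WorkerChain(int maxConcurrency, BooleanSupplier trigger) {
        this.maxConcurrency = maxConcurrency;
        this.trigger = trigger;
    }

    /** Appends a stage; its workers start only after the previous stage has completed. */
    WorkerChain then(Runnable... workers) {
        stages.add(Arrays.asList(workers));
        return this;
    }

    /** Blocks until every stage has run; returns false if the calling thread was interrupted. */
    boolean run(ExecutorService pool) {
        Semaphore permits = new Semaphore(maxConcurrency);
        try {
            for (List<Runnable> stage : stages) {
                while (!trigger.getAsBoolean()) {
                    Thread.sleep(POLL_MILLIS); // hold the stage until the trigger allows it
                }
                CountDownLatch stageDone = new CountDownLatch(stage.size());
                for (Runnable worker : stage) {
                    permits.acquire(); // at most maxConcurrency workers in flight
                    pool.execute(() -> {
                        try {
                            worker.run();
                        } finally {
                            permits.release();
                            stageDone.countDown();
                        }
                    });
                }
                stageDone.await();
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Because run blocks its caller, it would need to be invoked off the main thread; an event‑driven trigger would avoid the polling loop at the cost of more plumbing.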
3.0 Scheduling Upgrade
Version 3.0 addresses 2.0 shortcomings by using AOP to intercept all Java thread/task creation, reducing integration cost and eliminating missed coloring. A task‑tree is constructed to propagate coloring across dependencies, enabling full‑process monitoring.
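The color propagation idea can be sketched without any AOP tooling by wrapping a Runnable at the point where it is created, which is where an interceptor would presumably hook in. The TaskNode class, its fields, and the "public" default color are hypothetical; the interception mechanism itself is not shown.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical task-tree node: records its parent and inherits the parent's color. */
final class TaskNode {
    private static final AtomicInteger IDS = new AtomicInteger();
    // The node of the task currently running on this thread, if any.
    private static final ThreadLocal<TaskNode> CURRENT = new ThreadLocal<>();

    final int id = IDS.incrementAndGet();
    final TaskNode parent;
    final String color;

    private TaskNode(TaskNode parent, String color) {
        this.parent = parent;
        this.color = color;
    }

    static TaskNode current() {
        return CURRENT.get();
    }

    /**
     * Call on the submitting thread. The new node's parent is whatever task is
     * running there; a null explicitColor means "inherit from the parent".
     */
    static Runnable wrap(Runnable task, String explicitColor) {
        TaskNode parent = CURRENT.get();
        String color = explicitColor != null
                ? explicitColor
                : (parent != null ? parent.color : "public");
        TaskNode node = new TaskNode(parent, color);
        return () -> {
            TaskNode previous = CURRENT.get();
            CURRENT.set(node); // tasks created while this one runs become its children
            try {
                task.run();
            } finally {
                CURRENT.set(previous);
            }
        };
    }
}
```

Walking the parent links from any node yields the chain of tasks that led to it, which is the kind of information per‑scene monitoring would need.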
Key Technologies
AOP Interception: captures all thread and task start points.
Task‑Tree Construction: records parent‑child relationships to propagate color tags.
Monitoring: analyzes execution time, proportion, and frequency per scene.
Timed‑Waiting Control: non‑foreground tasks are put to sleep for a calculated timeout, then awakened when appropriate.
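Timed‑waiting control might look roughly like the gate below, in which a background task waits until the foreground is idle or its timeout expires, whichever comes first. BackgroundGate and its methods are hypothetical, and the timeout is passed in by the caller because the source does not describe how it is calculated.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical gate that parks non-foreground tasks while the foreground is busy. */
final class BackgroundGate {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition released = lock.newCondition();
    private boolean foregroundBusy;

    /** Called by whatever tracks foreground load; going idle wakes all waiters. */
    void setForegroundBusy(boolean busy) {
        lock.lock();
        try {
            foregroundBusy = busy;
            if (!busy) {
                released.signalAll();
            }
        } finally {
            lock.unlock();
        }
    }

    /** Returns true if the foreground was idle or went idle, false on timeout or interrupt. */
    boolean awaitTurn(long timeoutMillis) {
        long nanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        lock.lock();
        try {
            while (foregroundBusy) {
                if (nanos <= 0L) {
                    return false; // timeout expired: stop waiting
                }
                nanos = released.awaitNanos(nanos); // returns the remaining wait time
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            lock.unlock();
        }
    }

    /** Wraps a background task so it waits for its turn, then runs either way. */
    Runnable throttle(Runnable task, long timeoutMillis) {
        return () -> {
            awaitTurn(timeoutMillis);
            task.run();
        };
    }
}
```

In this sketch a task still runs once its timeout expires, so background work is delayed rather than starved.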
Compared with 2.0, integration cost is lower, coloring is no longer missed, and the scheduling scope expands from the framework's thread pools to all Java threads.
Remaining Challenges
Single‑dimensional scheduling capability.
Lack of business‑level task ownership.
Overuse of asynchronous hooks without standards.
Missing control at critical nodes leading to high concurrency.
Unclear dependency relationships.
Future Outlook: Fine‑Grained Scheduling
The next generation will adopt "on‑demand loading" to provide unified control, modular plug‑in scheduling, device‑level performance grading, scene‑aware task chaining, and a centralized decision center that integrates product, operations, and intelligent user‑behavior predictions.
Alipay Experience Technology
Exploring ultimate user experience and best engineering practices
