Evolution of Grouped Concurrency Scheduling and the Self‑Driven Concurrency Model for E‑commerce Backend Services
This article analyzes the challenges of aggregating multiple RPC calls in e‑commerce app backends, explains simple and complex concurrency scenarios, introduces grouped concurrency scheduling, and presents a self‑driven concurrency model that automates dependency handling to improve response latency and maintainability.
1. Problem Background
When opening an app, users see aggregated content such as home pages, product lists, and detail pages. In e‑commerce apps, these information‑aggregation scenarios often require combining data from multiple sources, meaning a single user request may need to aggregate N internal RPC responses. To respond quickly, multiple RPC calls are issued asynchronously, which we refer to as a concurrency scenario.
2. Definition of Complex Concurrency Scenarios
2.1 Simple Concurrency Scenario
A relatively simple aggregation requires N independent RPC results, as illustrated by the first diagram.
2.2 Complex Concurrency Scenario
A more complex but common scenario involves multiple RPC responses with inter‑dependencies (e.g., request D depends on responses A and B, request E depends on C and D), as shown in the second diagram.
3. Evolution of Grouped Concurrency Scheduling Model
3.1 Simple Asynchronous Concurrency Scheduling
To improve server response speed, we can use basic mechanisms like Future to execute independent RPCs concurrently. This works well for simple scenarios, but for complex ones we need a different approach: grouped concurrency scheduling.
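The simple case can be sketched with plain `CompletableFuture`. The RPC stubs below (`fetchProduct`, `fetchPrice`, `fetchStock`) are hypothetical stand-ins for real clients; the point is that three independent calls are issued at once and joined together, so total latency is roughly the slowest call rather than the sum:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SimpleConcurrency {
    // Hypothetical stand-ins for real RPC clients.
    static String fetchProduct() { return "product"; }
    static String fetchPrice()   { return "price"; }
    static String fetchStock()   { return "stock"; }

    public static List<String> aggregate(ExecutorService pool) {
        // Issue the three independent calls concurrently...
        CompletableFuture<String> product = CompletableFuture.supplyAsync(SimpleConcurrency::fetchProduct, pool);
        CompletableFuture<String> price   = CompletableFuture.supplyAsync(SimpleConcurrency::fetchPrice, pool);
        CompletableFuture<String> stock   = CompletableFuture.supplyAsync(SimpleConcurrency::fetchStock, pool);
        // ...then wait once for all of them; latency ~= max of the three, not the sum.
        return List.of(product.join(), price.join(), stock.join());
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        System.out.println(aggregate(pool));
        pool.shutdown();
    }
}
```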
3.2 Grouped Concurrency Scheduling
Grouped scheduling is suited for complex scenarios where RPC queries have dependency relationships. The typical solution includes:
1. Grouping: Divide all RPC queries into groups based on dependencies (no prior dependency = first group, dependent queries = subsequent groups).
2. Scheduling: Schedule each group with primitives such as CompletableFuture or Future, executing queries within a group concurrently while synchronizing between groups (group‑level sync, intra‑group async).
To boost developer efficiency, we can wrap these primitives into a custom grouped concurrency tool, adding fine‑grained timeout control, circuit‑breaker, and degradation mechanisms to reduce governance overhead.
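The two steps above can be sketched against the earlier example (D depends on A and B, E depends on C and D). The call names are illustrative stubs; the grouping logic is the point: group 1 runs A, B, C concurrently behind an `allOf` barrier, and the dependent calls follow group by group:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class GroupedScheduling {
    // Hypothetical RPC stubs; names are illustrative, not a real API.
    static String callA() { return "A"; }
    static String callB() { return "B"; }
    static String callC() { return "C"; }
    static String callD(String a, String b) { return "D(" + a + "," + b + ")"; }
    static String callE(String c, String d) { return "E(" + c + "," + d + ")"; }

    public static String run(ExecutorService pool) {
        // Group 1: A, B, C have no dependencies -- run them concurrently.
        CompletableFuture<String> a = CompletableFuture.supplyAsync(GroupedScheduling::callA, pool);
        CompletableFuture<String> b = CompletableFuture.supplyAsync(GroupedScheduling::callB, pool);
        CompletableFuture<String> c = CompletableFuture.supplyAsync(GroupedScheduling::callC, pool);
        CompletableFuture.allOf(a, b, c).join();   // group-level sync barrier

        // Group 2: D needs the results of A and B.
        String d = callD(a.join(), b.join());
        // Group 3: E needs the results of C and D.
        return callE(c.join(), d);
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        System.out.println(run(pool));   // E(C,D(A,B))
        pool.shutdown();
    }
}
```

Note the cost that motivates the rest of the article: the grouping is hand-written, so any change to the dependency structure means reshuffling these barriers by hand.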
4. Evolution of the Self‑Driven Concurrency Scheduling Model
4.1 A Small Goal of Reducing Latency and Its Implementation
In Q2 2020, the goal was to bring the average latency of core platform interfaces below 90 ms (from ~120 ms). Actions taken included:
1. Analyzing each interface's contribution based on QPS to prioritize optimization.
2. Measuring latency at the code‑line level to understand every millisecond.
3. Adjusting concurrency scheduling: after discovering that strict grouping was not optimal, we extracted long‑running independent RPCs for global asynchronous scheduling.
By the end of Q2, the average latency dropped to 85 ms, surpassing the target.
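The third adjustment above can be illustrated with a minimal sketch (the stubs and timings are invented for illustration): a slow, dependency-free RPC is started at the very beginning of the request, outside any group, and joined only at the end, so its latency overlaps the dependent chain instead of adding to it:

```java
import java.util.concurrent.CompletableFuture;

public class GlobalAsync {
    // Hypothetical stubs: callSlow is a long-running RPC with no dependencies.
    static String callSlow()      { sleep(80); return "slow"; }
    static String callA()         { sleep(20); return "A"; }
    static String callB(String a) { sleep(20); return "B(" + a + ")"; }

    static void sleep(long ms) {
        try { Thread.sleep(ms); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }

    public static String handleRequest() {
        // Kick off the slow, independent call immediately, outside any group.
        CompletableFuture<String> slow = CompletableFuture.supplyAsync(GlobalAsync::callSlow);
        // Meanwhile run the dependent chain A -> B as usual.
        String b = callB(callA());
        // Join the slow call last: its ~80 ms overlaps the ~40 ms chain instead of adding to it.
        return b + "+" + slow.join();
    }

    public static void main(String[] args) {
        System.out.println(handleRequest());   // B(A)+slow
    }
}
```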
4.2 Remaining Questions
After achieving the latency goal, new concerns emerged:
1. Maintenance remains cumbersome because complex concurrency code becomes heavily entangled over time.
2. Latency optimization becomes a repetitive cycle as business evolves, requiring frequent adjustments to grouping logic.
3. Breaking the grouping model for performance and then rebuilding it becomes a recurring trade‑off.
What should be done next for interface latency optimization?
4.3 Rethinking the Problem and the Birth of the Self‑Driven Model
4.3.1 Rethinking
Our development process can be visualized as drawing a graph:
1. Node creation : Adding a new RPC node for a feature (e.g., activity information).
2. Connecting nodes : Grouping nodes that can run concurrently based on dependencies.
3. Graph construction : Orchestrating group execution, data synchronization, and driving subsequent groups.
The overall concept is “points become lines, lines become surfaces”.
4.3.2 Self‑Driven Concurrency Model
From this perspective:
1. Strongly related business logic resides in the “points”.
2. Weakly related repetitive work lies in “connecting” and “graph weaving”.
Thus, could developers focus only on the points while a framework automatically handles the lines and surfaces?
The self‑driven model aims to provide exactly that: the framework automates the “movement” between points, allowing developers to concentrate on node behavior.
Design direction:
1. Development focus on node‑centric APIs.
2. Framework focus on automatically linking any two points to weave the whole graph.
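This design direction can be sketched in miniature (all names here are illustrative, not the real framework): each node declares only its own dependencies and logic (the "point"), and the framework wires the futures together (the "lines and surfaces") so that a node runs as soon as all of its dependencies complete, with no hand-written grouping:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

// Minimal sketch of the self-driven idea, not a real framework API.
public class SelfDrivenGraph {
    private final Map<String, CompletableFuture<Object>> nodes = new ConcurrentHashMap<>();
    private final ExecutorService pool = Executors.newCachedThreadPool();

    // Register a node: its name, dependency names, and logic over the dependency results.
    public void node(String name, List<String> deps, Function<Map<String, Object>, Object> logic) {
        List<CompletableFuture<Object>> depFutures = deps.stream().map(nodes::get).toList();
        // The framework links this node to its upstream nodes automatically:
        // it fires as soon as every dependency completes -- no manual grouping.
        nodes.put(name, CompletableFuture.allOf(depFutures.toArray(new CompletableFuture[0]))
            .thenApplyAsync(v -> {
                Map<String, Object> in = new HashMap<>();
                for (int i = 0; i < deps.size(); i++) in.put(deps.get(i), depFutures.get(i).join());
                return logic.apply(in);
            }, pool));
    }

    public Object result(String name) { return nodes.get(name).join(); }
    public void shutdown() { pool.shutdown(); }

    public static void main(String[] args) {
        SelfDrivenGraph g = new SelfDrivenGraph();
        g.node("A", List.of(), in -> "A");
        g.node("B", List.of(), in -> "B");
        g.node("C", List.of(), in -> "C");
        g.node("D", List.of("A", "B"), in -> "D(" + in.get("A") + "," + in.get("B") + ")");
        g.node("E", List.of("C", "D"), in -> "E(" + in.get("C") + "," + in.get("D") + ")");
        System.out.println(g.result("E"));   // the graph executes itself in dependency order
        g.shutdown();
    }
}
```

Adding a feature then means adding one node declaration; the schedule reorganizes itself, which is exactly the repetitive "connecting and weaving" work the framework is meant to absorb.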
5. Conclusion
This article reflects on the evolution of concurrency scheduling models, describing how a deeper understanding of the problem led to the self‑driven concurrency model, which is part of the “Avengers Alliance” technical ecosystem at Zhuanzhuan.
Author: Chen Qien, Backend Lead for Zhuanzhuan B2C supply side and initiator of the Avengers Alliance solution.
Zhuanzhuan Tech
A platform for Zhuanzhuan R&D and industry peers to learn and exchange technology, regularly sharing frontline experience and cutting‑edge topics. We welcome practical discussions and sharing; contact waterystone with any questions.