Optimizing Rider Withdrawal Payments at Dada Express: Asynchronous Concurrency, Extended Timeouts, Idempotent Retries, and Security Enhancements
This article details how Dada Express improved the efficiency of rider withdrawal payments by redesigning the clearing and disbursement workflow, introducing asynchronous thread‑pool concurrency, extending API timeouts, implementing idempotent retry logic, and adding safeguards to ensure fund security, resulting in a three‑fold reduction in processing time during peak periods.
Background Introduction
As a leading local instant‑delivery platform, Dada Express has millions of active couriers. To comply with the national "Notice on Regulating Payment Innovation Business" and join the state security supervision system, Dada needed to cooperate with qualified banks to standardize fund clearing and settlement processes.
The internal account team designed the withdrawal service. On one peak day, withdrawal orders reached three times the normal volume, and some couriers received their funds only after 20:00, well past the target window of 18:00 to 19:00. After optimization, the same load completed before 18:00.
Process Overview
In the courier withdrawal payment flow, Dada Finance first approves the request, then a "second‑clearing service" interacts with the bank system via a Job, finally completing the salary payment.
Terminology:
Clearing: preparation of data for bank network settlement.
Outflow: transfer of funds from the transaction account to the bank account.
Analogy: clearing is like preparing a payroll sheet, and outflow is the actual payment based on that sheet; outflow cannot occur before clearing is finished.
Current Situation Analysis
During month‑start, month‑end, and pre‑holiday peaks, the total withdrawal process exceeds four hours. This has two consequences:
Courier experience degrades: funds arrive after 19:00, which lowers satisfaction and increases customer‑service load.
Withdrawal volume is growing rapidly, which raises concerns about whether the existing solution can scale.
Log analysis shows:
The manual financial steps are already minimal under compliance requirements, leaving little room for further optimization.
Interaction with the bank system consumes 87.5% of the total processing time, revealing performance bottlenecks in the bank API.
Optimization Ideas
To address the identified issues, Dada implemented the following adjustments:
1. Asynchronous Concurrency
The bank API limits throughput to 18 TPS. Single‑threaded serial requests used only 13.9% of this capacity, or about 2.5 TPS. Switching to a thread pool (core and maximum size both set to 18) lets the system make use of the full API limit.
Load testing with JMeter confirmed that, with an appropriate thread count, throughput approaches this limit.
Thus, the interaction with the bank changed from serial synchronous calls to independent asynchronous jobs executed by the thread‑pool.
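The change from serial calls to a fixed-size pool can be sketched as follows. This is a minimal illustration, not Dada's implementation: call_bank and run_job are hypothetical names, the bank request is a stub, and only the pool size of 18 comes from the article.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

BANK_TPS_LIMIT = 18  # throughput limit stated in the article


def call_bank(order_id):
    """Hypothetical stand-in for one bank API request."""
    return {"order_id": order_id, "status": "ACCEPTED"}


def run_job(order_ids, send=call_bank, workers=BANK_TPS_LIMIT):
    """Submit every order to a fixed-size pool and collect results by order id."""
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(send, oid): oid for oid in order_ids}
        for future in as_completed(futures):
            oid = futures[future]
            try:
                results[oid] = future.result()
            except Exception as exc:
                # One failed order must not abort the rest of the batch.
                results[oid] = {"order_id": oid, "status": "ERROR", "error": repr(exc)}
    return results
```

Note that a pool of this kind bounds the number of in-flight requests, not the request rate. If individual calls return quickly, a separate rate limiter may be needed to stay under the bank's limit.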
2. Extended Timeout
With a 3‑second timeout, the bank API timed out in 10.5% of single‑threaded requests. Extending the timeout to 20 seconds sharply reduces the timeout rate. The job scheduler still runs every 10 minutes, and because the bank calls run in a dedicated pool, at most 18 threads can be blocked waiting on the bank at any time.
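To make the blocking risk concrete, the worst case can be worked out from the figures above. The helper below is a hypothetical back-of-the-envelope calculation, assuming every call runs to the full timeout and each thread starts its next call immediately.

```python
def min_calls_per_window(threads=18, timeout_s=20, window_s=600):
    """Calls completed in one scheduling window if every call hits the timeout."""
    return threads * (window_s // timeout_s)


# 18 threads, 20 s timeout, 10-minute window: 18 * 30 = 540 calls at worst.
worst_case = min_calls_per_window()
```

Even in this degenerate case the pool still works through several hundred orders per window, and no thread outside the dedicated pool is held up.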
3. Idempotent Retry
Because the bank API is not idempotent and its error codes are ambiguous, a clearing or outflow request that times out cannot simply be resent: the original request may already have been processed. Instead, a verification (reverse‑lookup) API is used to achieve logical idempotency. If the verification call also times out, it is retried several times within the same scheduling window.
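A minimal sketch of this verify-instead-of-resend pattern is shown below. All names are hypothetical: submit and query stand in for the bank's payment and reverse-lookup calls, BankTimeout for a timed-out request, and the status strings and max_checks value are assumptions rather than the bank's real interface.

```python
class BankTimeout(Exception):
    """Raised when a bank call does not answer in time (outcome unknown)."""


def pay_once(order_id, submit, query, max_checks=3):
    """Submit a payment once; on timeout, look the order up instead of resending.

    submit(order_id) returns "SUCCESS" or "FAILED".
    query(order_id) returns "SUCCESS", "FAILED", or "NOT_FOUND".
    Either callable may raise BankTimeout.
    """
    try:
        return submit(order_id)
    except BankTimeout:
        pass  # Outcome unknown: the bank may or may not have processed it.

    for _ in range(max_checks):
        try:
            return query(order_id)
        except BankTimeout:
            continue  # Lookup timed out too; try again within this window.

    # Still unknown: leave the order for the next scheduling window.
    return "UNKNOWN"
```

The function never calls submit a second time, so a payment that succeeded at the bank but timed out on the way back cannot be paid twice. What to do with a NOT_FOUND result is left to the caller, since the article does not describe that case.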
4. Fund Security
To prevent duplicate or missed payments under concurrent requests, the following safeguards were added:
Business isolation: separating the clearing and outflow stages.
Amount verification: checking account balances before and after each stage.
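The amount check can be illustrated with a small reconciliation helper. The article only states that balances are checked before and after each stage; the comparison rule below (the balance drop must equal the sum of amounts reported as paid) and the name verify_stage are assumptions made for illustration.

```python
from decimal import Decimal


def verify_stage(balance_before, balance_after, paid_amounts):
    """Return True when the balance drop equals the total paid out in the stage.

    Amounts are passed as strings and handled as Decimal to avoid
    floating-point rounding on currency values.
    """
    expected = sum((Decimal(a) for a in paid_amounts), Decimal("0"))
    return Decimal(balance_before) - Decimal(balance_after) == expected
```

For example, verify_stage("1000.00", "940.50", ["20.00", "39.50"]) holds, while an ending balance of "920.50" for the same payments would fail the check and flag a possible duplicate payment.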
Combining the two jobs could further improve efficiency, but such a redesign would be a major change; therefore, an incremental approach was chosen.
Online Validation
1. Results During Business Peaks
Clearing Job
First clearing took about 14 minutes.
Retry after a timeout took about 2 minutes.
Total clearing time from 12:10 to 12:40 was 30 minutes.
Outflow Job
First outflow took about 16 minutes.
Retry after a timeout took about 2 minutes.
Total outflow time from 14:00 to 14:30 was 30 minutes.
2. Effect Comparison
Summary and Reflection
The optimized withdrawal payment flow shows a clear efficiency gain without major architectural changes, allowing rapid delivery with low development cost; the entire design‑to‑deployment cycle took only one week. At the current throughput, the system can support up to four times the present business volume.
Initially, version 1.0 did not prioritize processing efficiency because the business scale was small; as the volume grew, the architecture needed to evolve in step with the business.
Dada Group Technology
Sharing insights and experiences from Dada Group's R&D department on product refinement and technology advancement, connecting with fellow geeks to exchange ideas and grow together.
