
Adaptive Grouping Method for Improving AB Test Allocation Uniformity in Didi's Experiment Platform

This article introduces Didi's adaptive grouping algorithm, which enhances the uniformity of user allocation in AB experiments by replacing traditional complete randomization with a single-pass method that balances observed metrics across groups, and demonstrates its effectiveness through large‑scale experimental results.


In data‑driven internet companies, AB testing is a crucial tool for making product and algorithm decisions, and Didi's Apollo AB testing platform supports thousands of weekly experiments using random and time‑slice grouping methods.

The standard complete randomization (CR) approach can produce uneven group distributions for key metrics such as GMV, leading to biased experiment analysis; rerandomization (RR) mitigates this by repeatedly running CR until a balance threshold on inter-group metric differences is met, but at the cost of extra computation.
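As a reference point, rerandomization can be sketched in a few lines: draw complete randomizations until the relative spread of group means falls below a threshold. This is an illustrative Python sketch, not Apollo's implementation; the function name, the threshold, and the iteration cap are all assumptions.

```python
import numpy as np

def rerandomize(metric, k, threshold=0.05, max_iter=1000, seed=0):
    """Rerandomization (RR) sketch: repeat complete randomization (CR)
    until the relative spread of group means is under the threshold.
    All names and defaults here are illustrative, not Apollo's API."""
    rng = np.random.default_rng(seed)
    n = len(metric)
    for _ in range(max_iter):
        # complete randomization: a random permutation split into k equal groups
        groups = rng.permutation(n) % k
        means = np.array([metric[groups == g].mean() for g in range(k)])
        rel_diff = (means.max() - means.min()) / means.mean()
        if rel_diff < threshold:
            return groups, rel_diff
    # best effort after max_iter draws
    return groups, rel_diff
```

The extra cost the article mentions is visible here: every failed draw recomputes all group means, and tight thresholds can require many draws.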

Didi introduces an Adaptive grouping algorithm that, in a single pass, records the cumulative sample count and metric distribution for each group, computes balance scores, and decides between direct and indirect allocation to keep metric distributions across groups as similar as possible.

The algorithm proceeds as follows:

1. Shuffle the population.
2. Assign the first 2 × K samples randomly, so that each group starts with at least two members.
3. Initialize the direct and indirect allocation probabilities.
4. For each remaining sample, calculate a balance score (BS) for every group.
5. Perform direct allocation when the BS differences are large; otherwise compute pre-allocation scores using ANOVA-derived means and variances.
6. Update each group's metric distribution after every assignment.
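The single-pass procedure above can be sketched as follows. This is a simplified illustration of the adaptive idea, greedily balancing each group's running metric mean with a hard size cap standing in for the direct/indirect probability machinery; the function name and the capacity rule are assumptions, not Didi's exact algorithm.

```python
import numpy as np

def adaptive_group(metric, k, seed=0):
    """Single-pass adaptive grouping sketch (illustrative): seed each
    group with two random samples, then assign each remaining sample to
    the group whose running mean would move closest to the overall mean,
    while a capacity cap keeps group sizes equal."""
    rng = np.random.default_rng(seed)
    n = len(metric)
    order = rng.permutation(n)              # step 1: shuffle the population
    assign = np.empty(n, dtype=int)
    sums = np.zeros(k)
    counts = np.zeros(k, dtype=int)

    # step 2: first 2*K samples round-robin so every group has two members
    for j, i in enumerate(order[:2 * k]):
        g = j % k
        assign[i] = g
        sums[g] += metric[i]
        counts[g] += 1

    target = metric.mean()
    cap = int(np.ceil(n / k))               # hard size cap (assumption)
    for i in order[2 * k:]:
        # candidate mean of each group if this sample joined it
        cand = (sums + metric[i]) / (counts + 1)
        # balance score: distance of candidate mean from the overall mean
        score = np.abs(cand - target)
        score[counts >= cap] = np.inf       # full groups are ineligible
        g = int(np.argmin(score))
        assign[i] = g
        sums[g] += metric[i]                # update running distribution
        counts[g] += 1
    return assign
```

Because each sample is placed exactly once, the whole allocation costs a single pass over the population, unlike RR's repeated full redraws.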

System design reuses Apollo's offline grouping infrastructure: after an Adaptive experiment is created, a grouping task is stored in the database, fetched by the task manager, and processed by the Adaptive grouping service, which reads experiment metadata, fetches metric data from Hive, performs the algorithm, and writes results to HDFS.

Empirical evaluation on a 10,000‑driver sample shows that Adaptive grouping achieves over 95% of runs with inter‑group metric differences below 0.8%, outperforming CR (differences up to 14%) and RR (differences up to 2.7%).

In summary, the Adaptive grouping capability significantly improves the precision of random‑group experiments, reduces ineffective tests, and shortens experiment cycles, while acknowledging that already‑completed CR experiments cannot be retroactively rebalanced.

Tags: AB testing, algorithm, data-driven, experiment platform, Didi, adaptive grouping, randomization
Written by

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
