How Ctrip’s Non‑User AB Testing Split Algorithm Boosts Experiment Efficiency
This article presents Ctrip’s novel non‑user AB testing split algorithm that combines optimized random sampling, greedy exchange, and graph‑based community detection to achieve balanced metric distribution, reduce user traffic cross‑over, and dramatically improve split efficiency in real‑world hotel marketing experiments.
Background
AB testing is the gold standard for effect evaluation and strategy iteration in the internet industry. Most experiments split traffic by user identifiers, but many business scenarios require non‑user side experiments where the experimental unit is a business entity such as a product, content, or merchant.
1.1 Non‑User AB Experiment Overview
In non‑user experiments the split unit is a business entity such as a hotel or product rather than a user. The experiment must keep the entity groups stable over time while avoiding exposure of the same user to entities from both groups.
1.2 Differences Between Non‑User and User Split
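Non‑user entities are fewer in number, have more stable and concentrated features, and are often strongly correlated with one another. As a result, direct hash‑based splitting leads to imbalanced groups, unstable pre‑splits, and severe user‑level traffic cross‑over, where the same user is exposed to entities from both groups.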
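To make the imbalance risk concrete, the following is a minimal sketch of a hash‑based baseline applied to a small, skewed entity pool. The salt, hotel IDs, and sales figures are hypothetical and are not taken from Ctrip's data. With only 20 entities and two very large ones, the group sales totals depend heavily on where those two happen to land.

```python
import hashlib
from collections import Counter

def hash_split(entity_id: str, salt: str = "exp_001") -> str:
    """Baseline: assign an entity to a group from a salted hash of its ID."""
    digest = hashlib.md5(f"{salt}:{entity_id}".encode("utf-8")).hexdigest()
    return "treatment" if int(digest, 16) % 2 == 0 else "control"

# Hypothetical pool: 2 very large hotels and 18 small ones.
hotels = {f"hotel_{i}": (1000.0 if i < 2 else 10.0) for i in range(20)}
groups = {h: hash_split(h) for h in hotels}

# Total historical sales per group; the hash ignores this metric entirely.
sales = Counter()
for hotel, group in groups.items():
    sales[group] += hotels[hotel]
```

Because the hash never looks at metric values, nothing pulls the two totals toward each other; that is the gap the stratified and greedy steps aim to close.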
Problem Definition
Given N non‑user entities with grouping variables (e.g., category, price) and metric variables (e.g., historical sales, conversion), the goal is to assign each entity to a Treatment Group or a Control Group such that sample size ratios are close to their targets, metric distributions are aligned between groups, and user‑level traffic cross‑over is minimized.
2.1 Balanced Stratified Metrics
For each sub‑layer defined by the cross‑product of grouping variables, ensure that the number of entities and the distribution of each metric variable are similar between groups, minimizing the maximum relative difference.
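One way to read this objective is sketched below. The article does not give the exact formula, so this sketch assumes the relative difference is taken between group means with the control mean as the denominator, and it compares means only rather than full distributions. The field names (`star`, `sales`) and the `"T"`/`"C"` labels are illustrative.

```python
from collections import defaultdict
from statistics import mean

def max_relative_diff(entities, assignment, group_keys, metric_keys):
    """Largest relative gap in metric means between groups over all sub-layers.

    entities:   dict entity_id -> dict of attributes
    assignment: dict entity_id -> "T" or "C"
    """
    # A sub-layer is one combination of grouping-variable values.
    layers = defaultdict(lambda: {"T": [], "C": []})
    for eid, attrs in entities.items():
        layer = tuple(attrs[k] for k in group_keys)
        layers[layer][assignment[eid]].append(attrs)

    worst = 0.0
    for groups in layers.values():
        if not groups["T"] or not groups["C"]:
            return float("inf")  # a sub-layer missing from one group
        for m in metric_keys:
            t = mean(a[m] for a in groups["T"])
            c = mean(a[m] for a in groups["C"])
            if c == 0:
                diff = 0.0 if t == 0 else float("inf")
            else:
                diff = abs(t - c) / abs(c)
            worst = max(worst, diff)
    return worst

entities = {
    "h1": {"star": 5, "sales": 100.0},
    "h2": {"star": 5, "sales": 110.0},
    "h3": {"star": 3, "sales": 40.0},
    "h4": {"star": 3, "sales": 50.0},
}
assignment = {"h1": "T", "h2": "C", "h3": "T", "h4": "C"}
score = max_relative_diff(entities, assignment, ["star"], ["sales"])
# The 3-star layer dominates: |40 - 50| / 50 = 0.2
```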
2.2 User Traffic Isolation
Enforce entity‑level exclusivity (an entity belongs to only one group) and user‑level isolation (a user should be exposed to entities from only one group as much as possible).
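User‑level isolation can be measured with a cross‑rate. The sketch below assumes the cross‑rate is the share of exposed users who saw entities from both groups; the article does not spell out its exact definition, and the exposure log format here is hypothetical. Entity‑level exclusivity is enforced by the data structure itself, since each entity maps to exactly one group.

```python
def user_cross_rate(exposures, assignment):
    """Share of users exposed to entities from more than one group.

    exposures:  iterable of (user_id, entity_id) pairs
    assignment: dict entity_id -> group label (one group per entity)
    """
    groups_seen = {}
    for user, entity in exposures:
        groups_seen.setdefault(user, set()).add(assignment[entity])
    if not groups_seen:
        return 0.0
    crossed = sum(1 for groups in groups_seen.values() if len(groups) > 1)
    return crossed / len(groups_seen)

exposures = [
    ("u1", "h1"), ("u1", "h2"),   # u1 sees both groups
    ("u2", "h1"),
    ("u3", "h3"), ("u3", "h4"),   # u3 sees two entities in the same group
    ("u4", "h2"),
]
assignment = {"h1": "T", "h2": "C", "h3": "T", "h4": "T"}
rate = user_cross_rate(exposures, assignment)  # 1 of 4 users crossed -> 0.25
```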
Method
The solution consists of three independent, plug‑in modules that together form the final split algorithm.
3.1 Partitioned Random Sampling
An optimized random‑sampling initializer creates high‑quality candidate groups by stratifying entities, sorting them by prioritized metrics, and evenly bucketizing within each sub‑layer. This accelerates convergence of subsequent optimization.
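A minimal sketch of such an initializer follows, assuming two equally sized groups and buckets of two neighbouring entities after sorting; the article does not state the bucket size or how ties are broken, so both are assumptions here, as are the field names.

```python
import random
from collections import defaultdict

def partitioned_random_split(entities, group_keys, metric_keys, seed=0):
    """Stratify, sort by prioritized metrics, then randomize within small buckets."""
    rng = random.Random(seed)

    # 1. Stratify: one sub-layer per combination of grouping variables.
    layers = defaultdict(list)
    for eid, attrs in entities.items():
        layers[tuple(attrs[k] for k in group_keys)].append(eid)

    assignment = {}
    for layer in sorted(layers):
        # 2. Sort by metrics in priority order (entity ID breaks ties).
        ids = sorted(
            layers[layer],
            key=lambda e: tuple(entities[e][m] for m in metric_keys) + (e,),
        )
        # 3. Bucketize neighbours and assign one to each group at random.
        for start in range(0, len(ids), 2):
            labels = ["T", "C"]
            rng.shuffle(labels)
            for eid, label in zip(ids[start:start + 2], labels):
                assignment[eid] = label
    return assignment

entities = {
    f"h{i}": {"star": 5 if i < 4 else 3, "sales": float(10 * i)}
    for i in range(8)
}
assignment = partitioned_random_split(entities, ["star"], ["sales"], seed=42)
```

Because each bucket holds entities with adjacent metric values, every sub‑layer starts with matched counts and roughly matched metric means, which leaves less work for the later optimization.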
3.2 Greedy Exchange
Starting from the initial solution, the greedy exchange iteratively swaps pairs of entities within exchange buckets, selecting the pair that yields the largest reduction in the maximum relative metric difference, until convergence or a maximum iteration count is reached.
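The loop described above might look like the sketch below. It is a simplified illustration, not the production algorithm: the objective uses group means with the control mean as denominator (an assumption), every candidate pair is scored by brute force, and `bucket_of` is a hypothetical callback that returns an entity's exchange bucket.

```python
from itertools import product
from statistics import mean

def objective(entities, assignment, metric_keys):
    """Maximum relative difference of metric means between T and C."""
    worst = 0.0
    for m in metric_keys:
        t = mean(entities[e][m] for e, g in assignment.items() if g == "T")
        c = mean(entities[e][m] for e, g in assignment.items() if g == "C")
        worst = max(worst, abs(t - c) / max(abs(c), 1e-12))
    return worst

def greedy_exchange(entities, assignment, metric_keys, bucket_of, max_iter=100):
    assignment = dict(assignment)  # do not mutate the caller's solution
    best = objective(entities, assignment, metric_keys)
    for _ in range(max_iter):
        treat = [e for e, g in assignment.items() if g == "T"]
        ctrl = [e for e, g in assignment.items() if g == "C"]
        best_pair = None
        for t, c in product(treat, ctrl):
            if bucket_of(t) != bucket_of(c):
                continue  # swaps are only allowed inside one exchange bucket
            assignment[t], assignment[c] = "C", "T"   # trial swap
            score = objective(entities, assignment, metric_keys)
            assignment[t], assignment[c] = "T", "C"   # undo
            if score < best - 1e-12:
                best, best_pair = score, (t, c)
        if best_pair is None:
            break  # converged: no swap improves the objective
        t, c = best_pair
        assignment[t], assignment[c] = "C", "T"
    return assignment, best

entities = {
    "h1": {"sales": 100.0}, "h2": {"sales": 90.0},
    "h3": {"sales": 20.0}, "h4": {"sales": 10.0},
}
initial = {"h1": "T", "h2": "T", "h3": "C", "h4": "C"}
start_score = objective(entities, initial, ["sales"])
final, final_score = greedy_exchange(entities, initial, ["sales"], lambda e: 0)
```

Swapping a pair keeps group sizes fixed, so the sample size ratio set by the initializer is preserved while the metric gap shrinks.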
3.3 Graph‑Based Greedy Split
A weighted entity graph is built from historical user exposure data: nodes are entities, and the weight of an edge reflects the number of users the two entities have in common. Louvain community detection clusters frequently co‑visited entities into communities, which become the new split units. The greedy exchange is then applied to these community units, and the final assignment is mapped back to the original entities.
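The surrounding steps can be sketched without a graph library. The code below builds the co‑visit graph, scores a given partition with the standard weighted modularity, and maps a community‑level assignment back to entities. It does not implement Louvain itself; the partition in the example is supplied by hand as a stand‑in for the output of a community detection library. The exposure log format and all names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def build_covisit_graph(exposures):
    """Return {(a, b): weight}, a < b, weight = users exposed to both a and b."""
    by_user = defaultdict(set)
    for user, entity in exposures:
        by_user[user].add(entity)
    edges = defaultdict(int)
    for seen in by_user.values():
        for a, b in combinations(sorted(seen), 2):
            edges[(a, b)] += 1
    return dict(edges)

def modularity(edges, community_of):
    """Weighted modularity: sum over communities of L_c/m - (d_c/(2m))^2."""
    m = sum(edges.values())
    if m == 0:
        return 0.0
    intra = defaultdict(float)   # L_c: edge weight inside community c
    degree = defaultdict(float)  # d_c: total weighted degree of community c
    for (a, b), w in edges.items():
        degree[community_of[a]] += w
        degree[community_of[b]] += w
        if community_of[a] == community_of[b]:
            intra[community_of[a]] += w
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)

def map_back(community_of, community_group):
    """Expand a community-level assignment to the original entities."""
    return {e: community_group[c] for e, c in community_of.items()}

exposures = [
    ("u1", "h1"), ("u1", "h2"), ("u2", "h1"), ("u2", "h2"),
    ("u3", "h1"), ("u3", "h2"), ("u4", "h3"), ("u4", "h4"),
    ("u5", "h3"), ("u5", "h4"), ("u6", "h3"), ("u6", "h4"),
    ("u7", "h2"), ("u7", "h3"),  # the only user bridging the two clusters
]
edges = build_covisit_graph(exposures)
community_of = {"h1": "A", "h2": "A", "h3": "B", "h4": "B"}  # hand-made partition
q = modularity(edges, community_of)
entity_assignment = map_back(community_of, {"A": "T", "B": "C"})
```

Splitting by community keeps h1 and h2 together and h3 and h4 together, so in this toy log only the bridging user u7 would see both groups.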
Empirical Evaluation
The algorithm was deployed in a Ctrip hotel marketing AB test. Three methods were compared: (1) Graph‑greedy split (proposed), (2) Prior‑knowledge greedy split, and (3) Prior‑knowledge random split.
Results show that the graph‑greedy method achieved a 93.3% pass rate (42 of 45 attempts), with an average metric deviation of 2.37% and a modularity of 0.300, while reducing the user (UV) cross‑rate to 38‑40%. The prior‑knowledge greedy split reached a 44% pass rate with a lower modularity (0.105), and the prior‑knowledge random split never met the precision requirement.
Conclusion
The graph‑greedy split algorithm dramatically improves split efficiency and reduces user traffic cross‑over without sacrificing metric balance, providing a practical solution for non‑user AB experiments in complex business scenarios.
Ctrip Technology
Official Ctrip Technology account, sharing and discussing growth.
