How Ctrip’s Non‑User AB Testing Split Algorithm Boosts Experiment Efficiency

This article presents Ctrip’s novel non‑user AB testing split algorithm that combines optimized random sampling, greedy exchange, and graph‑based community detection to achieve balanced metric distribution, reduce user traffic cross‑over, and dramatically improve split efficiency in real‑world hotel marketing experiments.

Ctrip Technology

Jul 3, 2025

Background

AB testing is the gold standard for evaluating effects and iterating on strategy in the internet industry. Most experiments split traffic by user identifier, but many business scenarios call for non‑user experiments, in which the experimental unit is a business entity such as a product, a piece of content, or a merchant.

1.1 Non‑User AB Experiment Overview

In a non‑user experiment the split unit is a business entity, such as a hotel or a product, rather than a user. Each entity must stay in the same group over time, and the same user should, as far as possible, not be exposed to entities from both groups.

1.2 Differences Between Non‑User and User Split

Compared with users, non‑user entities are fewer in number, have more stable and more concentrated features, and are often strongly correlated with one another. Direct hash‑based splitting therefore leads to imbalanced groups, unstable pre‑splits, and severe user‑level traffic cross‑over.

Problem Definition

Given N non‑user entities with grouping variables (e.g., category, price) and metric variables (e.g., historical sales, conversion), the goal is to assign each entity to either the Treatment Group or the Control Group such that the sample‑size ratio is close to its target, metric distributions are aligned across groups, and user‑level traffic cross‑over is minimized.

2.1 Balanced Stratified Metrics

For each sub‑layer defined by the cross‑product of grouping variables, ensure that the number of entities and the distribution of each metric variable are similar between groups, minimizing the maximum relative difference.
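The balance objective above can be sketched as a single score. This is a minimal illustration, not the article's implementation: it compares group means as a simple proxy for the metric distribution, and the field names ("id", "stratum"), the group labels, and the function name max_relative_difference are all assumptions made for the example.

```python
from collections import defaultdict

def max_relative_difference(entities, assignment, metrics):
    """Largest relative gap between group means over all strata and metrics.

    entities:   list of dicts with "id", "stratum", and one value per metric
                (hypothetical field names)
    assignment: dict mapping entity id -> "treatment" or "control"
    metrics:    names of the metric variables to balance
    """
    sums = defaultdict(float)   # (stratum, group, metric) -> running total
    counts = defaultdict(int)   # (stratum, group) -> number of entities
    strata = set()
    for e in entities:
        group = assignment[e["id"]]
        strata.add(e["stratum"])
        counts[(e["stratum"], group)] += 1
        for m in metrics:
            sums[(e["stratum"], group, m)] += e[m]

    worst = 0.0
    for s in strata:
        n_t, n_c = counts[(s, "treatment")], counts[(s, "control")]
        if n_t == 0 or n_c == 0:
            return float("inf")  # a stratum missing one group cannot be balanced
        for m in metrics:
            mean_t = sums[(s, "treatment", m)] / n_t
            mean_c = sums[(s, "control", m)] / n_c
            if mean_c == 0:
                gap = 0.0 if mean_t == 0 else float("inf")
            else:
                gap = abs(mean_t - mean_c) / abs(mean_c)
            worst = max(worst, gap)
    return worst
```

A lower score means a better balanced split. Using the control mean as the denominator is one possible convention; a pooled mean would work equally well.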

2.2 User Traffic Isolation

Enforce entity‑level exclusivity (an entity belongs to only one group) and user‑level isolation (a user should be exposed to entities from only one group as much as possible).
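One way to quantify user‑level isolation is the share of users exposed to entities from more than one group. The sketch below assumes exposure logs are available as (user_id, entity_id) pairs; that log shape and the function name cross_exposure_rate are illustrative assumptions. Storing the assignment as a dict keyed by entity id gives entity‑level exclusivity by construction.

```python
def cross_exposure_rate(exposures, assignment):
    """Share of users who were exposed to entities from more than one group.

    exposures:  iterable of (user_id, entity_id) pairs (assumed log shape)
    assignment: dict mapping entity id -> group label; one label per entity,
                so an entity can never belong to two groups
    """
    groups_seen = {}
    for user_id, entity_id in exposures:
        group = assignment.get(entity_id)
        if group is None:
            continue  # entity is not part of the experiment
        groups_seen.setdefault(user_id, set()).add(group)
    if not groups_seen:
        return 0.0
    crossed = sum(1 for groups in groups_seen.values() if len(groups) > 1)
    return crossed / len(groups_seen)
```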

Method

The solution consists of three independent, pluggable modules that together form the final split algorithm.

3.1 Partitioned Random Sampling

An optimized random‑sampling initializer creates high‑quality candidate groups by stratifying entities, sorting them by prioritized metrics, and dividing them evenly into buckets within each sub‑layer. Starting from a well‑balanced candidate accelerates the convergence of the subsequent optimization.
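A minimal sketch of such an initializer follows, assuming a 50/50 target ratio and a single priority metric. Within each sub‑layer, entities are sorted by the metric, cut into consecutive buckets of two, and each bucket is split at random. The function name partitioned_random_split and the field names are hypothetical.

```python
import random
from collections import defaultdict

def partitioned_random_split(entities, sort_metric, seed=0):
    """Stratify, sort by one metric, then randomize within adjacent pairs."""
    rng = random.Random(seed)  # seeded so the split is reproducible
    by_stratum = defaultdict(list)
    for e in entities:
        by_stratum[e["stratum"]].append(e)

    assignment = {}
    for stratum in sorted(by_stratum):
        ranked = sorted(by_stratum[stratum], key=lambda e: e[sort_metric])
        # Each bucket holds two neighbours with similar metric values.
        for start in range(0, len(ranked), 2):
            bucket = ranked[start:start + 2]
            labels = ["treatment", "control"]
            rng.shuffle(labels)
            # A leftover single entity simply takes the first shuffled label.
            for e, label in zip(bucket, labels):
                assignment[e["id"]] = label
    return assignment
```

Because similar entities are paired before randomizing, group sizes within a sub‑layer differ by at most one and the group means start out close, which leaves less work for the later optimization step.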

3.2 Greedy Exchange

Starting from the initial solution, the greedy exchange iteratively swaps pairs of entities within exchange buckets, selecting the pair that yields the largest reduction in the maximum relative metric difference, until convergence or a maximum iteration count is reached.
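The exchange loop can be sketched as follows for the entities of one exchange bucket. This is an illustration under stated assumptions rather than the production algorithm: the imbalance score compares group means, metrics are assumed to be positive, both groups are assumed to be non‑empty, and the names imbalance and greedy_exchange are hypothetical.

```python
def imbalance(entities, assignment, metrics):
    """Largest relative gap between treatment and control means (assumed score)."""
    worst = 0.0
    for m in metrics:
        t = [e[m] for e in entities if assignment[e["id"]] == "treatment"]
        c = [e[m] for e in entities if assignment[e["id"]] == "control"]
        mean_t, mean_c = sum(t) / len(t), sum(c) / len(c)
        worst = max(worst, abs(mean_t - mean_c) / abs(mean_c))
    return worst

def greedy_exchange(entities, assignment, metrics, max_iters=100):
    """Repeatedly apply the single swap that lowers the imbalance the most."""
    assignment = dict(assignment)  # work on a copy, leave the input untouched
    current = imbalance(entities, assignment, metrics)
    for _ in range(max_iters):
        treat = [e["id"] for e in entities if assignment[e["id"]] == "treatment"]
        ctrl = [e["id"] for e in entities if assignment[e["id"]] == "control"]
        best_pair, best_score = None, current
        for a in treat:
            for b in ctrl:
                # Try the swap, score it, then undo it.
                assignment[a], assignment[b] = "control", "treatment"
                score = imbalance(entities, assignment, metrics)
                assignment[a], assignment[b] = "treatment", "control"
                if score < best_score:
                    best_pair, best_score = (a, b), score
        if best_pair is None:
            break  # converged: no swap improves the score
        a, b = best_pair
        assignment[a], assignment[b] = "control", "treatment"
        current = best_score
    return assignment, current
```

Swapping pairs rather than moving single entities keeps the group sizes fixed, so the sample‑size ratio set by the initializer is preserved while the metric gap shrinks.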

3.3 Graph‑Based Greedy Split

A weighted entity graph is built from historical user exposure data: nodes are entities, and each edge weight reflects the number of users the two entities have in common. Louvain community detection clusters frequently co‑visited entities into communities, which become the new split units. The greedy exchange is then applied to these community units, and the final assignment is mapped back to the original entities.
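The graph construction and the mapping back to entities can be sketched with the standard library alone. Note the substitution: Louvain itself is not implemented here. The function threshold_communities is a deliberately simple stand‑in that groups entities connected by edges at or above a weight threshold, and all function names and the (user_id, entity_id) log shape are assumptions for illustration.

```python
from collections import defaultdict
from itertools import combinations

def co_exposure_weights(exposures):
    """Edge weight between two entities = number of users exposed to both."""
    seen_by_user = defaultdict(set)
    for user_id, entity_id in exposures:
        seen_by_user[user_id].add(entity_id)
    weights = defaultdict(int)
    for entity_ids in seen_by_user.values():
        for a, b in combinations(sorted(entity_ids), 2):
            weights[(a, b)] += 1
    return dict(weights)

def threshold_communities(entity_ids, weights, min_weight):
    """Stand-in for Louvain: connected components over edges >= min_weight."""
    parent = {e: e for e in entity_ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), w in weights.items():
        if w >= min_weight:
            parent[find(a)] = find(b)
    groups = defaultdict(list)
    for e in entity_ids:
        groups[find(e)].append(e)
    return sorted(sorted(members) for members in groups.values())

def expand_to_entities(communities, community_assignment):
    """Map one group label per community index back to every member entity."""
    return {
        entity: community_assignment[index]
        for index, members in enumerate(communities)
        for entity in members
    }
```

In practice the stand‑in would be replaced by a real Louvain implementation, for example the louvain_communities function in NetworkX, with the co‑exposure counts passed as edge weights. Each community's metrics can then be summed so that the greedy exchange treats a whole community as one unit.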

Empirical Evaluation

The algorithm was deployed in a Ctrip hotel marketing AB test. Three methods were compared: (1) Graph‑greedy split (proposed), (2) Prior‑knowledge greedy split, and (3) Prior‑knowledge random split.

Results show that the graph‑greedy method achieved a 93.3% pass rate (42 of 45 attempts), an average metric deviation of 2.37%, and a modularity of 0.300, while reducing the user (UV) cross‑rate to between 38% and 40%. Prior‑knowledge greedy reached a 44% pass rate with lower modularity (0.105), and prior‑knowledge random never met the precision requirement.
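For readers unfamiliar with the modularity figures quoted above, the standard definition for a weighted graph is given below. It is an assumption that the article reports this standard form; the source does not state its formula.

```latex
Q = \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)
```

Here A_{ij} is the weight of the edge between entities i and j, k_i is the sum of the edge weights attached to entity i, m is the total edge weight in the graph, and \delta(c_i, c_j) equals 1 when i and j belong to the same community and 0 otherwise. A higher Q means that more of the co‑exposure weight falls inside communities than would be expected at random.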

Conclusion

The graph‑greedy split algorithm substantially improves split efficiency and reduces user traffic cross‑over without sacrificing metric balance, providing a practical solution for non‑user AB experiments in complex business scenarios.
