Pre‑Experiment User Stratification Model for Improving AB Test Uniformity in Vivo Game Center
The paper introduces a pre‑experiment user stratification model that uses covariate‑balancing algorithms to build separate strata for distribution metrics and revenue metrics. Drawing equal numbers of users from each stratum into the treatment and control groups of Vivo game‑center AB tests reduces metric variance, improves gray‑release effectiveness, and saves significant investigation effort.
AB testing is a core method for validating product and version updates in the Vivo game center. However, user imbalance—differences in the distribution of user attributes between treatment and control groups—can severely bias the evaluation of experiment effects, leading to misleading business decisions.
The article first defines user imbalance, explains its causes (grouping methods, sample size, and metric characteristics such as sparsity and non‑normality), and illustrates how it manifests in both version‑gray‑scale experiments and strategy‑optimization experiments.
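As an illustration of why hash‑based grouping struggles with sparse, heavy‑tailed metrics, the following minimal simulation (not from the paper; the payer rate, Pareto spend distribution, and bucketing salt are illustrative assumptions) assigns users by a hash of their ID and measures the resulting ARPU gap in an AA setting:

```python
import hashlib
import random

random.seed(7)

def hash_bucket(user_id: str, salt: str = "exp1") -> str:
    """Conventional hash-based grouping: bucket by a hash of the user ID."""
    h = int(hashlib.md5(f"{salt}:{user_id}".encode()).hexdigest(), 16)
    return "treatment" if h % 2 == 0 else "control"

# Simulate a sparse, heavy-tailed revenue metric: most users pay nothing,
# a few "whales" dominate total revenue (typical of game ARPU).
users = [f"u{i}" for i in range(20_000)]
revenue = {u: (random.paretovariate(1.2) * 10 if random.random() < 0.02 else 0.0)
           for u in users}

groups = {"treatment": [], "control": []}
for u in users:
    groups[hash_bucket(u)].append(revenue[u])

arpu = {g: sum(v) / len(v) for g, v in groups.items()}
overall = sum(revenue.values()) / len(users)
diff_pct = abs(arpu["treatment"] - arpu["control"]) / overall * 100
print(f"AA-test ARPU gap under hash grouping: {diff_pct:.1f}% of overall ARPU")
```

Even with identical treatment and control logic, the gap is driven by which group the few big spenders happen to hash into, which is exactly the imbalance the article describes.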
To address this, the authors propose a “pre‑experiment user stratification” solution that leverages a stratified sampling (covariate‑balancing) algorithm developed by the Hawking experiment team. The approach builds separate stratification models for distribution‑related metrics and revenue‑related metrics, then draws equal numbers of users from each stratum into the treatment and control groups.
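The allocation step can be sketched as follows. This is a simplified illustration, not the Hawking team's implementation; the covariate (historical 30‑day spend) and the stratum boundaries are hypothetical:

```python
import random
from collections import defaultdict

random.seed(42)

def stratify(users, covariate, boundaries):
    """Assign each user to a stratum based on a covariate and bin boundaries."""
    strata = defaultdict(list)
    for u in users:
        k = sum(covariate[u] >= b for b in boundaries)  # stratum index
        strata[k].append(u)
    return strata

def allocate_balanced(strata):
    """Draw equal numbers of users from each stratum into both groups."""
    treatment, control = [], []
    for members in strata.values():
        shuffled = list(members)
        random.shuffle(shuffled)
        half = len(shuffled) // 2
        treatment.extend(shuffled[:half])
        control.extend(shuffled[half:2 * half])  # drop an odd leftover user
    return treatment, control

# Hypothetical covariate: historical 30-day spend per user.
users = [f"u{i}" for i in range(10_000)]
spend = {u: max(0.0, random.gauss(5, 10)) for u in users}
strata = stratify(users, spend, boundaries=[1.0, 20.0, 100.0])  # 4 spend tiers
treatment, control = allocate_balanced(strata)
print(len(treatment), len(control))  # equal group sizes by construction
```

Because every stratum contributes the same number of users to each group, the covariate distribution is balanced by construction rather than left to the luck of a hash.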
The conventional stratified sampling formula is presented, followed by detailed designs of the revenue‑stratification model (using intermediate variables to segment users for ARPU balance) and the distribution‑stratification model (using similar variables for download/activation metrics). Visual diagrams of both models are included.
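The summary does not reproduce the formula, but the textbook form of stratified sampling, which the paper's presentation presumably follows, estimates the overall mean and its variance from per‑stratum statistics:

```latex
\bar{y}_{\mathrm{st}} = \sum_{h=1}^{H} W_h \,\bar{y}_h,
\qquad W_h = \frac{N_h}{N},
\qquad \operatorname{Var}(\bar{y}_{\mathrm{st}}) = \sum_{h=1}^{H} W_h^2 \,\frac{s_h^2}{n_h},
```

where $N_h$ and $n_h$ are the population and sample sizes of stratum $h$, $\bar{y}_h$ its sample mean, and $s_h^2$ its sample variance; under proportional allocation, $n_h = n \, N_h / N$. Because the between‑stratum component of the variance is removed, $\operatorname{Var}(\bar{y}_{\mathrm{st}})$ is at most the variance of simple random sampling.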
Implementation required integration with the Hawking experiment platform and the version‑release system. The stratification logic was embedded into the platforms to ensure uniform user allocation during experiments and gray releases.
Two AA tests were conducted:
- On the Hawking experiment platform, the stratified model preserved the stability of distribution metrics while reducing revenue‑metric fluctuation from 11.6% under hash‑based grouping to 4.8%/1.9% and 3.3%/1.5% for two ARPU calculations.
- On the version‑release system, the stratified model eliminated the significant fluctuations in distribution metrics seen under the previous phone‑ID‑tail‑number grouping.
After deployment, the model yielded measurable business benefits: gray‑release effectiveness increased by 9 percentage points, roughly 35 person‑days of annual anomaly‑investigation effort were saved, and positive strategy experiments contributed an estimated +0.2% to yearly revenue.
The authors acknowledge remaining challenges—subjectivity in manual stratification and limited indicator coverage—and suggest future work incorporating more features and machine‑learning‑based stratification.
Overall, the pre‑user stratification model provides a practical, data‑driven method to improve AB test uniformity and reliability in large‑scale game analytics.
vivo Internet Technology