
How Kuaishou Uses Causal Inference to Optimize Live‑Streaming Experiments

This article describes how Kuaishou applies causal inference to its live‑streaming ecosystem. It covers the two main causal frameworks, techniques for observational and experimental data (DID, double machine learning, causal forests, and uplift meta‑learners), and complex experiment designs such as dual‑sided and time‑slice rotation experiments for evaluating product and recommendation strategies.

Kuaishou Tech

Aug 13, 2021


Causal Inference Problems and Technical Framework in Kuaishou Live

Kuaishou live streaming faces four recurring analytical challenges: (1) designing user incentives, (2) evaluating recommendation strategies, (3) iterating product features, and (4) estimating long‑term ecosystem value. Three complementary solution families are applied:

Causal inference from observational data – extracting treatment effects from existing experiment and non‑experiment logs.

Rigorous A/B experiments – constructing randomised control groups, defining metrics, and measuring product impact.

Hybrid counterfactual reasoning – integrating economic models, machine‑learning algorithms, and experimental data to answer long‑term ecological questions.

The methodological core is causal inference, implemented through two widely used frameworks.

Rubin Potential‑Outcome Model

This model treats each user as having two potential outcomes: Y(1) under treatment and Y(0) under control. Because only one of the two is ever observed for a given user, a comparable control group must stand in for the missing counterfactual. Common implementations include randomized controlled trials (RCTs), of which A/B tests are the standard online form, and observational matching (e.g., propensity‑score matching, nearest‑neighbor matching). The average treatment effect (ATE) is the expected difference between the two potential outcomes, E[Y(1) − Y(0)].
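
As a minimal sketch of the randomized case, the snippet below estimates the ATE as a difference in means with a normal‑approximation confidence interval. The data are simulated, and the metric (watch minutes), sample size, and effect size are illustrative assumptions, not Kuaishou figures.

```python
import random
import statistics

def estimate_ate(treated, control):
    """Difference in means with a normal-approximation 95% confidence interval."""
    ate = statistics.fmean(treated) - statistics.fmean(control)
    se = (statistics.variance(treated) / len(treated)
          + statistics.variance(control) / len(control)) ** 0.5
    return ate, (ate - 1.96 * se, ate + 1.96 * se)

# Simulated A/B test on a hypothetical "watch minutes" metric.
rng = random.Random(0)
TRUE_EFFECT = 2.0
treated, control = [], []
for _ in range(20_000):
    baseline = rng.gauss(30.0, 5.0)              # potential outcome Y(0)
    if rng.random() < 0.5:                       # random assignment
        treated.append(baseline + TRUE_EFFECT)   # only Y(1) is observed
    else:
        control.append(baseline)                 # only Y(0) is observed

ate, ci = estimate_ate(treated, control)
print(f"ATE estimate: {ate:.3f}, 95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
```

Random assignment is what makes the control mean a valid stand‑in for the treated users' unobserved Y(0); with observational data the same subtraction would generally be biased.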

Pearl Causal‑Graph Model

Pearl’s approach represents causal relationships as a directed acyclic graph (DAG). By applying the do‑calculus and computing conditional distributions along the graph, one can remove bias from confounders and identify distributional changes caused by interventions. Unlike Rubin’s focus on average effects, the graph model enables identification of full post‑intervention distributions and multi‑variable interactions.
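
The simplest instance of this idea is backdoor adjustment on a three‑node graph Z → X, Z → Y, X → Y. The sketch below simulates such a graph with binary variables and compares the naive conditional contrast with the adjusted quantity P(Y | do(X)) = Σ_z P(Y | X, z) P(z). The graph, the variable meanings, and all probabilities are assumptions chosen for illustration.

```python
import random
from collections import Counter

rng = random.Random(1)
rows = []
for _ in range(100_000):
    z = rng.random() < 0.5                        # confounder, e.g. "heavy user"
    x = rng.random() < (0.8 if z else 0.2)        # exposure depends on Z
    y = rng.random() < 0.1 + 0.2 * x + 0.4 * z    # outcome depends on X and Z
    rows.append((int(z), int(x), int(y)))

counts = Counter(rows)
n = len(rows)

def p_y_given_xz(x, z):
    """P(Y=1 | X=x, Z=z) from cell counts."""
    hits, misses = counts[(z, x, 1)], counts[(z, x, 0)]
    return hits / (hits + misses)

def p_z(z):
    """Marginal P(Z=z)."""
    return sum(c for (zz, _, _), c in counts.items() if zz == z) / n

def p_y_given_x(x):
    """P(Y=1 | X=x), ignoring the confounder."""
    hits = sum(c for (_, xx, yy), c in counts.items() if xx == x and yy == 1)
    total = sum(c for (_, xx, _), c in counts.items() if xx == x)
    return hits / total

def p_y_do_x(x):
    """Backdoor adjustment: average P(Y=1 | x, z) over the marginal of Z."""
    return sum(p_y_given_xz(x, z) * p_z(z) for z in (0, 1))

naive = p_y_given_x(1) - p_y_given_x(0)      # confounded contrast
adjusted = p_y_do_x(1) - p_y_do_x(0)         # interventional contrast
print(f"naive: {naive:.3f}, adjusted: {adjusted:.3f}, simulated truth: 0.200")
```

The naive contrast is inflated because heavy users are both more likely to be exposed and more likely to convert; adjusting for Z closes that backdoor path.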

The two frameworks are complementary: Rubin's model gives a clear ATE estimate, while Pearl's graph supports causal discovery and the estimation of distributional shifts.

Causal‑Inference Techniques on Observational or Experimental Data

1. Product‑Feature Evaluation – Difference‑in‑Differences (DID) and Extensions

DID compares the before‑and‑after change in the treated group with the before‑and‑after change in the control group. Differencing over time removes unobservable, time‑invariant fixed effects; differencing across groups removes time shocks common to both. The key assumptions are (a) parallel trends, meaning that without treatment the two groups would have moved in parallel, which is usually checked on pre‑treatment data, and (b) a time‑invariant treatment effect. Extensions address heterogeneous user states by stratifying users, estimating separate DID effects, and aggregating them with appropriate weights. When a single control group is unavailable, synthetic‑control methods learn pre‑treatment weights to construct a virtual control that mimics the counterfactual trajectory.
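
A minimal two‑group, two‑period DID can be sketched as follows. The simulation builds in a permanent gap between groups and a common time trend so that the naive post‑period comparison is visibly wrong; all constants are illustrative assumptions.

```python
import random
import statistics

def did_estimate(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """(treated post - treated pre) minus (control post - control pre)."""
    treated_change = statistics.fmean(treat_post) - statistics.fmean(treat_pre)
    control_change = statistics.fmean(ctrl_post) - statistics.fmean(ctrl_pre)
    return treated_change - control_change

rng = random.Random(2)
TRUE_EFFECT, COMMON_TREND, GROUP_GAP = 1.5, 3.0, 5.0

def simulate_group(n, is_treated):
    """Panel of n users observed once before and once after the launch."""
    pre, post = [], []
    for _ in range(n):
        # User-level fixed effect plus a permanent gap between the groups.
        level = rng.gauss(20.0, 3.0) + (GROUP_GAP if is_treated else 0.0)
        pre.append(level + rng.gauss(0.0, 1.0))
        post.append(level + COMMON_TREND
                    + (TRUE_EFFECT if is_treated else 0.0)
                    + rng.gauss(0.0, 1.0))
    return pre, post

treat_pre, treat_post = simulate_group(5000, True)
ctrl_pre, ctrl_post = simulate_group(5000, False)

did = did_estimate(treat_pre, treat_post, ctrl_pre, ctrl_post)
naive_post_gap = statistics.fmean(treat_post) - statistics.fmean(ctrl_post)
print(f"DID: {did:.3f}, naive post-period gap: {naive_post_gap:.3f}")
```

The naive gap mixes the treatment effect with the pre‑existing group difference, while the DID estimate recovers the simulated effect because the parallel‑trends assumption holds by construction here.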

2. Recommendation‑Strategy Evaluation – Causal Inference + Machine Learning

Pure predictive models excel at accuracy but do not satisfy causal‑identification assumptions. The following methods combine ML flexibility with rigorous causal estimation.

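Double Machine Learning (DML): Split the data into a training set (to learn high‑dimensional nuisance functions for treatment and outcome) and an estimation set. On the estimation set, residualise both the treatment and the outcome against the covariates using the fitted nuisance models, then regress the outcome residuals on the treatment residuals; this is the orthogonal moment condition. Swapping the roles of the splits and pooling the residuals (cross‑fitting) lets every observation contribute. The result is an approximately unbiased treatment‑effect estimate with valid confidence intervals.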
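
The residual‑on‑residual logic can be sketched for a partially linear model Y = θT + g(X) + ε with two‑fold cross‑fitting. To stay dependency‑free, a univariate least‑squares fit (`fit_line`, a hypothetical helper) stands in for the machine‑learning nuisance models; in practice any flexible learner would take its place. The data‑generating process is an assumption for illustration.

```python
import random
import statistics

def fit_line(xs, ys):
    """Univariate least squares; a stand-in for any ML nuisance learner."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    slope = (sum((a - mx) * (b - my) for a, b in zip(xs, ys))
             / sum((a - mx) ** 2 for a in xs))
    return lambda v: my + slope * (v - mx)

def dml_effect(x, t, y, n_folds=2):
    """Cross-fitted residual-on-residual estimate of theta and its standard error."""
    n = len(x)
    t_res, y_res = [0.0] * n, [0.0] * n
    for fold in range(n_folds):
        train = [i for i in range(n) if i % n_folds != fold]
        held_out = [i for i in range(n) if i % n_folds == fold]
        m_hat = fit_line([x[i] for i in train], [t[i] for i in train])  # E[T|X]
        l_hat = fit_line([x[i] for i in train], [y[i] for i in train])  # E[Y|X]
        for i in held_out:               # residualise on data not used for fitting
            t_res[i] = t[i] - m_hat(x[i])
            y_res[i] = y[i] - l_hat(x[i])
    denom = sum(a * a for a in t_res)
    theta = sum(a * b for a, b in zip(t_res, y_res)) / denom
    scores = [a * (b - theta * a) for a, b in zip(t_res, y_res)]
    se = (sum(s * s for s in scores) / n) ** 0.5 / (denom / n) / n ** 0.5
    return theta, se

rng = random.Random(3)
TRUE_THETA = 1.0
x = [rng.gauss(0.0, 1.0) for _ in range(4000)]
t = [0.8 * xi + rng.gauss(0.0, 1.0) for xi in x]            # treatment depends on X
y = [TRUE_THETA * ti + 2.0 * xi + rng.gauss(0.0, 1.0) for ti, xi in zip(t, x)]

theta, se = dml_effect(x, t, y)
line = fit_line(t, y)
naive_slope = line(1.0) - line(0.0)                          # ignores confounding
print(f"DML theta: {theta:.3f} (se {se:.3f}), naive slope: {naive_slope:.3f}")
```

Regressing Y on T directly absorbs the effect of X into the slope; partialling X out of both variables first removes that confounding.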

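Causal Forest: Build a decision‑tree ensemble on one data split to define heterogeneous subpopulations. Use a second, disjoint split to estimate the conditional average treatment effect (CATE) and its variance within each leaf, employing variance‑adjusted splitting criteria. Keeping the two splits separate (often called "honest" estimation) prevents the same data from both choosing the partition and evaluating it.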
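
The two‑split idea can be illustrated with a single one‑level tree rather than a full forest. In the sketch below, one half of the data chooses the split point and the other half estimates the per‑leaf effect and its variance. The splitting score (squared effect gap minus estimation variance) is a simplified assumption, not the exact criterion of any library, and the data are simulated.

```python
import random
import statistics

def diff_in_means(rows):
    """rows: list of (x, t, y). Returns (effect, variance of the effect estimate)."""
    y1 = [y for _, t, y in rows if t == 1]
    y0 = [y for _, t, y in rows if t == 0]
    effect = statistics.fmean(y1) - statistics.fmean(y0)
    var = statistics.variance(y1) / len(y1) + statistics.variance(y0) / len(y0)
    return effect, var

def choose_split(rows, candidates):
    """Pick the threshold whose two leaves differ most in estimated effect."""
    best, best_score = None, float("-inf")
    for c in candidates:
        e_left, v_left = diff_in_means([r for r in rows if r[0] <= c])
        e_right, v_right = diff_in_means([r for r in rows if r[0] > c])
        score = (e_left - e_right) ** 2 - (v_left + v_right)  # penalise noisy leaves
        if score > best_score:
            best, best_score = c, score
    return best

rng = random.Random(4)
data = []
for _ in range(8000):
    x = rng.random()                          # a single user covariate
    t = 1 if rng.random() < 0.5 else 0        # randomised treatment
    effect = 3.0 if x > 0.5 else 0.0          # simulated heterogeneous effect
    data.append((x, t, 1.0 + effect * t + rng.gauss(0.0, 1.0)))

split_sample, estimation_sample = data[:4000], data[4000:]
threshold = choose_split(split_sample, [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8])
left = diff_in_means([r for r in estimation_sample if r[0] <= threshold])
right = diff_in_means([r for r in estimation_sample if r[0] > threshold])
print(f"split at {threshold}: left CATE {left[0]:.2f}, right CATE {right[0]:.2f}")
```

A causal forest repeats this over many subsamples and deeper trees, then averages the leaf estimates that each query point falls into.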

Meta‑Learners (S‑Learner, T‑Learner, X‑Learner): The S‑Learner fits a single outcome model with the treatment indicator as a feature; the T‑Learner fits separate models for treated and control outcomes. The X‑Learner builds on the T‑Learner: it imputes individual treatment effects in each group using the other group's outcome model, fits a second‑stage learner to those imputed effects, and combines the two resulting estimates with propensity‑score weights. These approaches are computationally efficient but may incur higher bias if the base learners are misspecified.
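
The T‑Learner and X‑Learner steps can be sketched as below (the S‑Learner is omitted for brevity). A univariate least‑squares fit is used as the base learner and the propensity is fixed at 0.5 because treatment is randomised in the simulation; both choices are assumptions for illustration.

```python
import random
import statistics

def fit_line(xs, ys):
    """Univariate least squares; a stand-in for any base learner."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    slope = (sum((a - mx) * (b - my) for a, b in zip(xs, ys))
             / sum((a - mx) ** 2 for a in xs))
    return lambda v: my + slope * (v - mx)

def t_learner(rows):
    """rows: list of (x, t, y). Fits one outcome model per arm."""
    models = {}
    for arm in (0, 1):
        xs = [x for x, t, _ in rows if t == arm]
        ys = [y for _, t, y in rows if t == arm]
        models[arm] = fit_line(xs, ys)
    return models[0], models[1]

def x_learner(rows, propensity=0.5):
    """Impute per-user effects with the opposite arm's model, then refit and blend."""
    mu0, mu1 = t_learner(rows)
    x1 = [x for x, t, _ in rows if t == 1]
    d1 = [y - mu0(x) for x, t, y in rows if t == 1]   # treated: observed minus predicted Y(0)
    x0 = [x for x, t, _ in rows if t == 0]
    d0 = [mu1(x) - y for x, t, y in rows if t == 0]   # control: predicted Y(1) minus observed
    tau1, tau0 = fit_line(x1, d1), fit_line(x0, d0)
    return lambda v: propensity * tau0(v) + (1 - propensity) * tau1(v)

rng = random.Random(7)
rows = []
for _ in range(4000):
    x = rng.random()
    t = 1 if rng.random() < 0.5 else 0
    y = 2.0 * x + (1.0 + 2.0 * x) * t + rng.gauss(0.0, 1.0)  # true CATE = 1 + 2x
    rows.append((x, t, y))

mu0, mu1 = t_learner(rows)

def cate_t(v):
    """T-Learner CATE: difference of the two outcome models."""
    return mu1(v) - mu0(v)

cate_x = x_learner(rows)
print(f"CATE at x=0.5: T-Learner {cate_t(0.5):.2f}, X-Learner {cate_x(0.5):.2f}")
```

With balanced arms and a well‑specified base learner the two estimates nearly coincide; the X‑Learner's advantage is usually argued for cases where one arm is much smaller than the other.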

3. User‑Behavior Chain Research – Causal Graphs

To uncover multi‑step interactions and effective variables, causal‑graph discovery algorithms are employed. Two main families are used:

Constraint‑based algorithms (e.g., PC, FCI) that test conditional independencies to prune edges.

Score‑based algorithms (e.g., Greedy Equivalence Search) that assign a penalised likelihood score to candidate DAGs and search for the optimum.
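
The edge‑pruning step of a constraint‑based algorithm can be sketched with one conditional‑independence test. For a simulated chain X → Y → Z, the snippet checks whether the X–Z edge survives once Y is conditioned on, using partial correlation and a Fisher z statistic. The linear‑Gaussian data and the 1.96 cut‑off are assumptions; full algorithms such as PC repeat this test over growing conditioning sets for every pair of variables.

```python
import math
import random
import statistics

def pearson(a, b):
    """Sample Pearson correlation."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    return cov / math.sqrt(sum((u - ma) ** 2 for u in a)
                           * sum((v - mb) ** 2 for v in b))

def partial_corr(a, b, c):
    """Correlation of a and b after linearly removing c from both."""
    r_ab, r_ac, r_bc = pearson(a, b), pearson(a, c), pearson(b, c)
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))

def fisher_z(r, n, n_conditioned):
    """Approximately standard normal under the null of zero (partial) correlation."""
    return 0.5 * math.log((1 + r) / (1 - r)) * math.sqrt(n - n_conditioned - 3)

# Simulated behaviour chain X -> Y -> Z (e.g. exposure -> watch -> follow).
rng = random.Random(6)
n = 5000
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [xi + rng.gauss(0.0, 1.0) for xi in x]
z = [yi + rng.gauss(0.0, 1.0) for yi in y]

r_xz = pearson(x, z)
r_xz_given_y = partial_corr(x, z, y)
z_marginal = fisher_z(r_xz, n, 0)
z_conditional = fisher_z(r_xz_given_y, n, 1)

# Keep the X-Z edge only if dependence survives conditioning on Y.
keep_edge_xz = abs(z_marginal) > 1.96 and abs(z_conditional) > 1.96
print(f"corr(X,Z)={r_xz:.3f}, corr(X,Z|Y)={r_xz_given_y:.3f}, keep edge: {keep_edge_xz}")
```

X and Z are strongly correlated marginally, but the partial correlation given Y is close to zero, which is the signal a constraint‑based method uses to drop the direct edge.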

Complex Experiment Designs for Network Effects

1. Dual‑Sided Experiments

Streamers (anchors) and viewers are randomised simultaneously and independently. On each side, one arm receives the treatment (e.g., a pendant) while the other does not. Comparing the resulting streamer‑arm by viewer‑arm cells captures cross‑side spillover and attributes treatment effects more accurately than a single‑sided test, because it shows how changes on the streamer side affect viewer behaviour and vice versa.

When spillover becomes highly entangled (e.g., live PK moments), dual‑sided designs lose power, motivating alternative designs.
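
One way to read a dual‑sided experiment is as a 2×2 table of cells (streamer arm by viewer arm). The sketch below simulates such a table and derives a direct effect and a spillover contrast from the cell means. The assumptions are loud ones: the pendant is taken to be visible only when both the streamer and the viewer are in their treatment arms, treated viewers are taken to spend slightly less time on control streamers, and all effect sizes are invented for illustration.

```python
import random
import statistics
from collections import defaultdict

rng = random.Random(8)
N_STREAMERS, N_VIEWERS, VISITS_PER_VIEWER = 200, 2000, 10

# Independent randomisation on each side of the marketplace.
streamer_treated = {s: rng.random() < 0.5 for s in range(N_STREAMERS)}
viewer_treated = {v: rng.random() < 0.5 for v in range(N_VIEWERS)}

cells = defaultdict(list)   # (streamer arm, viewer arm) -> watch minutes
for v in range(N_VIEWERS):
    for s in rng.sample(range(N_STREAMERS), VISITS_PER_VIEWER):
        minutes = rng.gauss(10.0, 2.0)
        if streamer_treated[s] and viewer_treated[v]:
            minutes += 2.0      # simulated direct effect: pendant is visible
        elif viewer_treated[v]:
            minutes -= 0.5      # simulated spillover: attention shifts away
        cells[(streamer_treated[s], viewer_treated[v])].append(minutes)

mean = {cell: statistics.fmean(values) for cell, values in cells.items()}
baseline = mean[(False, False)]
direct_effect = mean[(True, True)] - baseline
spillover_on_control_streamers = mean[(False, True)] - baseline
streamer_only = mean[(True, False)] - baseline   # zero by construction here

print(f"direct: {direct_effect:.2f}, "
      f"spillover: {spillover_on_control_streamers:.2f}, "
      f"streamer-only: {streamer_only:.2f}")
```

A viewer‑side‑only test would blend the first two contrasts into one number, which is the attribution problem the dual‑sided design is meant to separate.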

2. Time‑Slice Rotation Experiments

Experiment and control conditions rotate across predefined time slices. Key design parameters are:

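Length of each slice (shorter slices provide more randomisation units and so lower variance, but leave less time for carry‑over from the previous slice to fade, which increases bias).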

Total experiment duration.

Randomisation of the switch point to prevent user anticipation.

Optimal‑design assumptions:

Outcome variable has a known absolute upper bound.

Users cannot predict whether the next slice will be treatment.

Inter‑slice interference is fixed and limited.

If the optimal switch point is unknown, a two‑stage procedure is used: (1) a pilot experiment estimates a lower bound for carry‑over effects; (2) the estimated bound informs the choice of slice length and switch timing, followed by a confirmatory experiment. This approach lengthens the overall cycle and may obscure heterogeneous treatment effects.
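
A basic rotation schedule and its analysis can be sketched as follows. Each hour‑of‑day slot is assigned to treatment on a random half of the days, so strong time‑of‑day patterns cancel in the comparison. The slice length, duration, and metric are illustrative assumptions, and carry‑over between slices is deliberately not simulated.

```python
import math
import random
import statistics

rng = random.Random(5)
DAYS, SLICES_PER_DAY, TRUE_EFFECT = 14, 24, 5.0

# schedule[day][slot] is True when that one-hour slice runs the treatment.
# Balancing within each slot keeps hour-of-day patterns out of the estimate.
schedule = [[False] * SLICES_PER_DAY for _ in range(DAYS)]
for slot in range(SLICES_PER_DAY):
    for day in rng.sample(range(DAYS), DAYS // 2):
        schedule[day][slot] = True

treated, control = [], []
for day in range(DAYS):
    for slot in range(SLICES_PER_DAY):
        # Hypothetical platform metric with a strong daily cycle.
        baseline = 100.0 + 20.0 * math.sin(2 * math.pi * slot / SLICES_PER_DAY)
        outcome = baseline + rng.gauss(0.0, 3.0)
        if schedule[day][slot]:
            treated.append(outcome + TRUE_EFFECT)
        else:
            control.append(outcome)

effect = statistics.fmean(treated) - statistics.fmean(control)
print(f"treated slices: {len(treated)}, control slices: {len(control)}, "
      f"estimated effect: {effect:.2f}")
```

In a real deployment the slices adjacent to a switch may need to be discarded or modelled explicitly, since the pilot stage described above exists precisely because carry‑over is not negligible.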

Key Technical Q&A (Condensed)

When to prefer DID over A/B testing? Use DID when treatment timing varies across groups or when a fully randomised control is infeasible (e.g., policy changes affecting only a subset of users). A/B testing is preferred for contemporaneous, randomised assignments.

Difference between Double Machine Learning and Propensity‑Score Matching? Propensity‑Score Matching (PSM) assumes a correctly specified propensity model and directly matches treated and control units. DML handles high‑dimensional confounders by orthogonalising the treatment and outcome against nuisance estimates, using sample splitting and a moment condition, which makes the effect estimate less sensitive to errors in the nuisance models.

How to mitigate violations of the Conditional Independence Assumption (CIA)? Employ instrumental‑variable techniques, refined matching (e.g., coarsened exact matching), or incorporate domain knowledge to construct quasi‑experiments that approximate independence.

Are causal‑graph structures learned or pre‑specified? Primarily learned from data using the algorithms above, but practitioners can impose structural constraints (e.g., fixing parent‑child relationships) based on domain expertise.

How to assess causal‑graph accuracy? Validate with simulated data where the true DAG is known, and perform limited experimental checks (e.g., intervene on a subset of edges) to test robustness of inferred relationships.

