
Causal Inference and Experiment Design in Kuaishou Live Streaming: Methods and Case Studies

This article explains how Kuaishou applies causal inference frameworks, such as Rubin's potential outcomes and Pearl's causal graphs, together with machine‑learning techniques like double machine learning, causal forests, and meta‑learners to evaluate product features, recommendation strategies, and user behavior under complex network effects in live streaming.

DataFunSummit

In the Kuaishou live‑streaming environment, four major causal inference problems arise: user incentive design, recommendation strategy evaluation, product feature iteration, and long‑term value estimation.

Solutions include extracting causal relations from observational data, designing proper A/B experiments, and combining economic models with machine learning to construct counterfactuals for long‑term impact assessment.

The core of causal inference is separating correlation from causation, selecting appropriate models, estimating causal effects, and validating them statistically.

Rubin's Potential Outcome Model focuses on finding suitable control groups to estimate unobserved treatment effects, often using RCTs or A/B tests.
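As a minimal sketch (not from the talk), the Rubin-style A/B estimate is just a difference in means once assignment is randomized. The data below is simulated, with an assumed true lift of 2.0 minutes of watch time:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated A/B test: treatment lifts watch time by 2.0 minutes on average.
n = 10_000
treat = rng.integers(0, 2, size=n)                 # random assignment
watch_time = 10 + 2.0 * treat + rng.normal(0, 3, size=n)

# Under randomization, the difference in means is an unbiased ATE estimate.
ate = watch_time[treat == 1].mean() - watch_time[treat == 0].mean()

# Standard error of the difference in means (Neyman variance).
v1 = watch_time[treat == 1].var(ddof=1) / (treat == 1).sum()
v0 = watch_time[treat == 0].var(ddof=1) / (treat == 0).sum()
se = np.sqrt(v1 + v0)

print(f"ATE estimate: {ate:.2f} +/- {1.96 * se:.2f}")
```

The point of the framework is that this simple estimator is only valid because randomization makes the control group a stand-in for the treated group's unobserved counterfactual.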

Pearl's Causal Graph Model represents variables as nodes and causal relations as directed edges, allowing conditional distribution calculations to eliminate bias from confounders.
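To make the confounder-elimination point concrete, here is an illustrative backdoor adjustment on simulated data (my example, not the speaker's): a binary confounder Z drives both treatment assignment and the outcome, so the naive comparison is biased, while the adjusted estimate recovers the assumed true effect of 0.5:

```python
import numpy as np

rng = np.random.default_rng(1)

# Binary confounder Z (e.g. heavy vs light user) affects both treatment and outcome.
n = 100_000
z = rng.integers(0, 2, size=n)
# Heavy users are more likely to be treated and have higher baseline outcomes.
t = (rng.random(n) < np.where(z == 1, 0.8, 0.2)).astype(int)
y = 0.5 * t + 1.0 * z + rng.normal(0, 1, size=n)

# Naive comparison is confounded by Z and overstates the effect.
naive = y[t == 1].mean() - y[t == 0].mean()

# Backdoor adjustment: E[Y | do(T=t)] = sum_z E[Y | T=t, Z=z] * P(Z=z)
def adjusted_mean(t_val):
    return sum(
        y[(t == t_val) & (z == zv)].mean() * (z == zv).mean()
        for zv in (0, 1)
    )

adjusted = adjusted_mean(1) - adjusted_mean(0)
print(f"naive: {naive:.2f}, adjusted: {adjusted:.2f}")
```

The adjustment formula weights within-stratum comparisons by the marginal distribution of Z, which is exactly what conditioning on the right nodes in the graph licenses.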

Both frameworks are complementary: Rubin estimates average treatment effects, while Pearl identifies distributional changes and complex variable interactions.

Case studies:

Product feature evaluation using Difference‑in‑Differences (DID) and its extensions, including corrected DID that accounts for heterogeneous user response times.
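For the baseline two-period, two-group DID (before any of the talk's corrections), a hedged sketch on simulated data: the estimator differences out both a shared time trend and a fixed group gap, leaving the assumed true effect of 0.8:

```python
import numpy as np

rng = np.random.default_rng(2)

n = 5_000
group = rng.integers(0, 2, size=n)   # 1 = receives the feature at launch
post = rng.integers(0, 2, size=n)    # 1 = observation after launch
# Shared time trend (+1.0), group baseline gap (+0.5), true effect (+0.8).
y = 0.5 * group + 1.0 * post + 0.8 * group * post + rng.normal(0, 1, size=n)

def cell_mean(g, p):
    return y[(group == g) & (post == p)].mean()

# DID: (treated after - treated before) - (control after - control before)
did = (cell_mean(1, 1) - cell_mean(1, 0)) - (cell_mean(0, 1) - cell_mean(0, 0))
print(f"DID estimate: {did:.2f}")
```

The corrected DID mentioned above generalizes this by letting the "post" timing vary across users rather than assuming everyone responds at the same moment.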

Synthetic control methods to construct virtual control groups when a single control is unavailable.
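A sketch of the core synthetic-control computation, under simplifying assumptions of my own (linear combination, simplex-constrained weights fit by projected gradient descent on pre-period outcomes):

```python
import numpy as np

rng = np.random.default_rng(3)

# Pre-period outcomes: 1 treated unit, 10 donor units, 30 time points.
T_pre, n_donors = 30, 10
donors = rng.normal(0, 1, size=(T_pre, n_donors)).cumsum(axis=0)
true_w = np.array([0.5, 0.3, 0.2] + [0.0] * 7)
treated = donors @ true_w + rng.normal(0, 0.05, size=T_pre)

def project_simplex(v):
    """Euclidean projection onto the probability simplex (Duchi et al. style)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u + (1 - css) / (np.arange(len(v)) + 1) > 0)[0][-1]
    theta = (1 - css[rho]) / (rho + 1)
    return np.maximum(v + theta, 0)

# Projected gradient descent: min ||treated - donors @ w||^2 with w on the simplex,
# so weights are nonnegative and sum to one.
w = np.full(n_donors, 1 / n_donors)
lr = 1 / np.linalg.norm(donors, 2) ** 2
for _ in range(20_000):
    grad = donors.T @ (donors @ w - treated)
    w = project_simplex(w - lr * grad)

print(np.round(w, 2))
```

The fitted weights define the "virtual control": in the post period, `donors @ w` serves as the counterfactual trajectory the treated unit would have followed without the intervention.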

Recommendation strategy assessment via double machine learning, which partials high‑dimensional confounders out of both treatment and outcome (Neyman orthogonalization) to obtain unbiased causal estimates.
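A minimal DML sketch on a simulated partially linear model. The nuisance models here are plain OLS for brevity (in practice any ML regressor slots in), and cross-fitting with two folds keeps the estimate orthogonal to nuisance-model error:

```python
import numpy as np

rng = np.random.default_rng(4)

# Partially linear model: y = theta * t + g(x) + noise, with t = m(x) + noise.
n, theta = 20_000, 0.7
x = rng.normal(size=(n, 5))
t = x @ np.array([0.5, -0.3, 0.2, 0.0, 0.1]) + rng.normal(0, 1, size=n)
y = theta * t + x @ np.array([1.0, 0.5, -0.5, 0.3, -0.2]) + rng.normal(0, 1, size=n)

def fit_predict(X_tr, z_tr, X_te):
    """OLS nuisance model (stand-in for any ML regressor)."""
    coef, *_ = np.linalg.lstsq(X_tr, z_tr, rcond=None)
    return X_te @ coef

# Cross-fitting: residualize y and t on the held-out fold, then
# regress y-residuals on t-residuals (the Neyman-orthogonal score).
folds = np.array_split(rng.permutation(n), 2)
num = den = 0.0
for test_idx in folds:
    train_idx = np.setdiff1d(np.arange(n), test_idx)
    ry = y[test_idx] - fit_predict(x[train_idx], y[train_idx], x[test_idx])
    rt = t[test_idx] - fit_predict(x[train_idx], t[train_idx], x[test_idx])
    num += ry @ rt
    den += rt @ rt

theta_hat = num / den
print(f"DML estimate of theta: {theta_hat:.2f}")
```

Residual-on-residual regression is what "orthogonalizing the confounders" means operationally: whatever x explains in t and y is removed before the causal coefficient is estimated.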

Causal forest (causal tree) models that split data into separate training and estimation sets ("honest" splitting), using variance‑adjusted split criteria to capture heterogeneous treatment effects.
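To illustrate the honest-splitting idea on one depth-1 tree (a toy version of my own, far simpler than a real causal forest): the split point is chosen on one half of the data, and leaf-level treatment effects are then estimated on the other half, so the same noise is never used twice:

```python
import numpy as np

rng = np.random.default_rng(5)

# Heterogeneous effect: treatment helps users with x > 0 much more.
n = 8_000
x = rng.normal(size=n)
t = rng.integers(0, 2, size=n)
tau = np.where(x > 0, 2.0, 0.5)                  # true CATE
y = tau * t + rng.normal(0, 1, size=n)

half = n // 2
tr, est = np.arange(half), np.arange(half, n)    # honest sample split

def leaf_effect(idx, mask):
    sel = idx[mask[idx]]
    return y[sel][t[sel] == 1].mean() - y[sel][t[sel] == 0].mean()

# On the training half only, choose the split maximizing effect heterogeneity.
best_split, best_gap = None, -np.inf
for s in np.quantile(x[tr], np.linspace(0.1, 0.9, 17)):
    gap = abs(leaf_effect(tr, x <= s) - leaf_effect(tr, x > s))
    if gap > best_gap:
        best_split, best_gap = s, gap

# Honest step: estimate leaf effects on the held-out estimation half.
tau_left = leaf_effect(est, x <= best_split)
tau_right = leaf_effect(est, x > best_split)
print(f"split at x={best_split:.2f}: tau_left={tau_left:.2f}, tau_right={tau_right:.2f}")
```

A production causal forest repeats this over many subsampled trees and uses the variance-adjusted criterion mentioned above rather than this raw gap, but the train/estimate separation is the same.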

Meta‑learner uplift modeling (S‑Learner, T‑Learner, X‑Learner) for identifying sensitive user segments.
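A T-Learner sketch on simulated incentive data (assumed setup: the incentive only moves low-activity users): fit one outcome model per arm, then score uplift as the difference of predictions. Binned means stand in for the ML regressors a real pipeline would use:

```python
import numpy as np

rng = np.random.default_rng(6)

# Incentive lifts engagement only for low-activity users (x < 0).
n = 10_000
x = np.clip(rng.normal(size=n), -2.99, 2.99)     # single activity feature
t = rng.integers(0, 2, size=n)
y = 1.0 + 1.5 * (x < 0) * t + rng.normal(0, 1, size=n)

# T-Learner: separate outcome models for treated and control arms.
bins = np.linspace(-3, 3, 13)

def binned_model(mask):
    """Per-bin mean outcome (stand-in for any ML regressor)."""
    idx = np.digitize(x, bins)
    means = np.array([
        y[mask & (idx == b)].mean() if np.any(mask & (idx == b)) else 0.0
        for b in range(14)
    ])
    return lambda xs: means[np.digitize(xs, bins)]

mu1, mu0 = binned_model(t == 1), binned_model(t == 0)
uplift = mu1(x) - mu0(x)          # predicted individual treatment effect

# Target the incentive at users with high predicted uplift.
sensitive = uplift > 0.75
print(f"uplift (x<0): {uplift[x < 0].mean():.2f}, (x>=0): {uplift[x >= 0].mean():.2f}")
```

The S-Learner would instead fit one model with treatment as a feature, and the X-Learner adds a cross-imputation step that helps when the two arms are very unbalanced in size.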

Complex experiment designs address network effects:

Bilateral experiments simultaneously split both hosts and viewers, enabling detection of spillover and cross‑group effects.
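A simplified sketch of analyzing such a bilateral design (my own toy model, with assumed effect sizes): randomizing hosts and viewers independently yields four cells, from which a direct viewer-side effect and a host-side spillover onto viewers can both be read off:

```python
import numpy as np

rng = np.random.default_rng(8)

# Bilateral design: hosts and viewers are randomized independently,
# producing four cells (host on/off x viewer on/off) per viewing session.
n = 40_000
host_t = rng.integers(0, 2, size=n)     # treatment on the session's host side
viewer_t = rng.integers(0, 2, size=n)   # treatment on the viewer side
# Assumed direct viewer effect 0.3, host-side spillover onto viewers 0.2.
y = 0.3 * viewer_t + 0.2 * host_t + rng.normal(0, 1, size=n)

def cell(h, v):
    return y[(host_t == h) & (viewer_t == v)].mean()

# Average each contrast over the other side's two arms.
direct = ((cell(0, 1) - cell(0, 0)) + (cell(1, 1) - cell(1, 0))) / 2
spillover = ((cell(1, 0) - cell(0, 0)) + (cell(1, 1) - cell(0, 1))) / 2
print(f"direct viewer effect: {direct:.2f}, host-side spillover: {spillover:.2f}")
```

A viewer-only experiment would have missed the spillover term entirely, which is exactly the failure mode bilateral designs exist to detect.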

Time‑slice rotation experiments repeatedly switch treatment and control groups over time, requiring careful selection of slice length, total period, and random switch points.
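A switchback-style sketch of this design (my own simulation, not the talk's data): the whole platform flips between treatment and control on randomly assigned time slices, and analysis aggregates to the slice level because the slice, not the minute, is the unit of randomization:

```python
import numpy as np

rng = np.random.default_rng(7)

# Time-slice rotation: the whole platform alternates treatment by slice.
n_slices, slice_len = 336, 60            # hourly slices over two weeks
assignment = rng.integers(0, 2, size=n_slices)   # random per-slice assignment

# Simulated per-minute metric: daily seasonality plus an assumed +0.4 effect.
minutes = np.arange(n_slices * slice_len)
base = 5 + np.sin(2 * np.pi * minutes / (24 * 60))
treat = np.repeat(assignment, slice_len)
metric = base + 0.4 * treat + rng.normal(0, 0.5, size=len(minutes))

# Aggregate to slice level before comparing arms.
slice_means = metric.reshape(n_slices, slice_len).mean(axis=1)
effect = slice_means[assignment == 1].mean() - slice_means[assignment == 0].mean()
print(f"estimated effect: {effect:.2f}")
```

The design choices the talk highlights map directly onto this sketch: slice length trades off carryover bias against the number of randomization units, and random switch points prevent the assignment from aliasing with daily seasonality.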

Optimal design assumes bounded outcomes, treatment timing that users cannot detect, and limited, fixed interference between slices; the optimal switch point is estimated via preliminary experiments.

Q&A highlights differences between DID and A/B testing, the contrast between double‑machine‑learning and propensity‑score matching, handling violations of the CIA assumption, and whether causal graphs are pre‑specified or learned.

Overall, the talk demonstrates how integrating causal inference frameworks with modern machine‑learning tools enables systematic, data‑driven product evaluation and strategy optimization in large‑scale live‑streaming platforms.

Tags: machine learning, A/B testing, causal inference, experiment design, Kuaishou, product evaluation
Written by

DataFunSummit

Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.
