Unlocking Causal Inference: Practical AB Testing and Observational Study Techniques
This article explains how the Huolala data‑science team tackles AB‑testing challenges, pre‑experiment differences, observational (non‑AB) studies, and advanced causal‑inference methods such as CACE, heterogeneous treatment effects, mediation modeling, regression discontinuity, and instrumental variables to derive reliable business insights.
Introduction
The Huolala data‑science team frequently faces experimental challenges when supporting order distribution and product feature iteration, including the need to mitigate the impact of supply‑side competition on AB experiments, evaluate low‑traffic or low‑revenue experiments, and assess full‑population strategies in observational studies.
Contents
AB Experiments
Observational Studies (Non‑AB Experiments)
Causal Explanation
Instrumental Variables
Pre‑Intervention Confounders
Summary
Part One – AB Experiments
In AB experiments, homogeneity means that, had no intervention been applied, the treatment and control groups would show similar metrics. That no‑intervention outcome for the treatment group is the counterfactual, which can never be observed directly. When groups are heterogeneous, the observed metric difference conflates the true average treatment effect (ATE) with pre‑experiment differences.
1. Flowchart
2. Pre‑experiment Differences
When experimental groups already differ before the intervention, the observed metric gap is the sum of the true effect (ATE) and the pre‑experiment difference. Two situations produce such a difference:
With a random split, high metric variance (noise) can create a non‑negligible pre‑experiment difference purely by chance.
With an uneven split, the groups are heterogeneous by construction, which also produces a pre‑experiment difference.
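The role of variance in a random split can be seen in a small A/A simulation. The sketch below is illustrative only: it assumes a normally distributed metric with a hypothetical mean of 100, applies no intervention, and uses a made-up function name, `aa_gap_spread`, that does not come from the original article. It splits the same users at random many times and measures how far apart the two group means land by chance.

```python
import random
from statistics import pstdev


def aa_gap_spread(sigma, n_users=200, n_splits=300, seed=0):
    """Std dev of the treatment-minus-control gap over random A/A splits."""
    rng = random.Random(seed)
    # Hypothetical pre-experiment metric per user; no intervention is applied.
    users = [rng.gauss(100.0, sigma) for _ in range(n_users)]
    half = n_users // 2
    gaps = []
    for _ in range(n_splits):
        rng.shuffle(users)  # a fresh random 50/50 split
        treat, ctrl = users[:half], users[half:]
        gaps.append(sum(treat) / len(treat) - sum(ctrl) / len(ctrl))
    return pstdev(gaps)


low_noise = aa_gap_spread(sigma=1.0)
high_noise = aa_gap_spread(sigma=50.0)
print(f"gap spread, sigma=1:  {low_noise:.3f}")
print(f"gap spread, sigma=50: {high_noise:.3f}")
```

Because the true effect is zero in this setup, every gap is pure pre‑experiment difference, and its typical size grows with the noise in the metric.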
2.1 Application Scenarios
Random split with excessive variance or uneven allocation causing heterogeneous groups.
2.2 Available Techniques
The original article illustrates the available techniques with diagrams, which are not reproduced here.
3. Non‑Compliant Homogeneous Users
A typical AB experiment ensures that the treatment group is exposed to the intervention and the control group is not. In practice, some users in the treatment group never receive the intervention, and some users in the control group self‑select into it. This non‑compliance creates a gap between the expected and the actual effect, and the estimate must be corrected for it.
3.1 Available Techniques
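Use CACE (Complier Average Causal Effect) to estimate the effect on compliers, the users who take the intervention only when they are assigned to it. The ACE computed over everyone who was assigned is diluted by non‑compliers, so using it directly would understate the impact on users who actually received the intervention.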
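One common way to compute CACE is the Wald ratio: the effect of being assigned (often called intention‑to‑treat, ITT) divided by the difference in uptake between the two arms. The sketch below assumes the usual conditions for this estimator (assignment affects the outcome only through uptake, and nobody does the opposite of their assignment). The data and the name `cace_estimate` are hypothetical, not taken from the original article.

```python
from statistics import mean


def cace_estimate(assigned, received, outcome):
    """Return (itt, compliance_gap, cace) using the Wald ratio estimator."""
    y1 = [y for z, y in zip(assigned, outcome) if z == 1]
    y0 = [y for z, y in zip(assigned, outcome) if z == 0]
    d1 = [d for z, d in zip(assigned, received) if z == 1]
    d0 = [d for z, d in zip(assigned, received) if z == 0]
    itt = mean(y1) - mean(y0)             # effect of being assigned
    compliance_gap = mean(d1) - mean(d0)  # extra uptake caused by assignment
    if compliance_gap == 0:
        raise ValueError("assignment did not change uptake; CACE is undefined")
    return itt, compliance_gap, itt / compliance_gap


# Hypothetical data: half of the treatment group never received the feature.
assigned = [1, 1, 1, 1, 0, 0, 0, 0]
received = [1, 1, 0, 0, 0, 0, 0, 0]
outcome = [14.0, 14.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0]

itt, gap, cace = cace_estimate(assigned, received, outcome)
print(itt, gap, cace)  # 2.0 0.5 4.0
```

In this toy data the assigned‑group average moves by 2, but only half of the assigned users were exposed, so the estimated effect on compliers is 4.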
3.2 CACE vs. Propensity Score Matching (PSM)
CACE focuses on the compliant population, while PSM attempts to balance covariates across groups.
4. Heterogeneous Treatment Effects (HTE)
Strategies that work well for one user segment may not work for another. The workflow is: train an HTE model on AB data → identify optimal strategies per user → personalize the strategy and validate via AB testing.
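The workflow above can be sketched with a deliberately simple stand‑in for an HTE model: a per‑segment difference in means, followed by a rule that applies the strategy only where the estimated uplift is positive. The segments, outcomes, and function names are hypothetical. A production uplift model would typically condition on many user features rather than a single segment label, and the resulting policy would still need to be validated in a fresh AB test.

```python
from collections import defaultdict
from statistics import mean


def segment_uplift(rows):
    """rows: (segment, treated, outcome). Returns {segment: uplift}."""
    buckets = defaultdict(lambda: {True: [], False: []})
    for segment, treated, outcome in rows:
        buckets[segment][bool(treated)].append(outcome)
    return {
        seg: mean(groups[True]) - mean(groups[False])
        for seg, groups in buckets.items()
    }


def choose_policy(uplift):
    """Apply the strategy only where the estimated uplift is positive."""
    return {seg: effect > 0 for seg, effect in uplift.items()}


# Hypothetical AB data: (segment, treated, outcome)
rows = [
    ("new", 1, 5.0), ("new", 1, 7.0), ("new", 0, 3.0), ("new", 0, 3.0),
    ("loyal", 1, 8.0), ("loyal", 1, 8.0), ("loyal", 0, 9.0), ("loyal", 0, 11.0),
]
uplift = segment_uplift(rows)
policy = choose_policy(uplift)
print(uplift)  # {'new': 3.0, 'loyal': -2.0}
print(policy)  # {'new': True, 'loyal': False}
```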
4.1 Available Techniques
Uplift modeling
Quantile regression
4.2 Quantile Regression Case
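Standard AB evaluation reports the ATE, a single average that hides how the effect varies across the outcome distribution. Quantile Treatment Effects (QTE) compare the treatment and control outcome distributions at chosen quantiles, which exposes that heterogeneity.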
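As a rough illustration, the simplest form of QTE compares empirical quantiles of the two groups directly. This is an unconditional sketch, not a full quantile regression with covariates, and the data and function name are hypothetical. It shows how an average effect can mask an effect concentrated in the upper part of the distribution.

```python
from statistics import mean, quantiles


def quantile_treatment_effects(treat, ctrl, n=4):
    """Difference between matching quantile cut points of the two groups."""
    q_t = quantiles(treat, n=n, method="inclusive")
    q_c = quantiles(ctrl, n=n, method="inclusive")
    return [t - c for t, c in zip(q_t, q_c)]


# Hypothetical per-user outcomes.
ctrl = [10.0, 20.0, 30.0, 40.0, 50.0]
treat = [10.0, 20.0, 32.0, 46.0, 62.0]

ate = mean(treat) - mean(ctrl)
qte = quantile_treatment_effects(treat, ctrl)
print(ate)  # 4.0
print(qte)  # [0.0, 2.0, 6.0] at the 25th, 50th and 75th percentiles
```

Here the ATE of 4 suggests a uniform lift, while the quartile view shows no change at the 25th percentile and a lift of 6 at the 75th.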
Part Two – Observational Studies (Non‑AB Experiments)
Observational studies lack random assignment but have pre‑ and post‑intervention time‑series data. The goal is to estimate the counterfactual outcome Y′ (what would have happened without intervention) and compute the effect as Y – Y′.
1. Flowchart
2. Pre‑ and Post‑Intervention Time‑Series
Applicable when there is no AB experiment or only a quasi‑experiment, and when multiple time points of the outcome Y are observed.
2.1 Application Scenarios
Time‑series data with and without intervention.
2.2 Techniques
Difference‑in‑Differences (DID) – assumes parallel trends across treated and untreated units.
Synthetic Control Method (SCM) – builds a synthetic counterfactual from multiple untreated units.
Bayesian Structural Time Series (BSTS) – predicts Y′ using a Bayesian time‑series model, even without untreated series.
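Of the three techniques, DID is the easiest to show in a few lines. The sketch below assumes parallel trends, builds the counterfactual Y′ for the treated unit from the control unit's change, and reports Y minus Y′. The daily values and the function name `did_effect` are hypothetical.

```python
from statistics import mean


def did_effect(pre_treat, post_treat, pre_ctrl, post_ctrl):
    """Return (counterfactual_post, effect) under parallel trends."""
    ctrl_change = mean(post_ctrl) - mean(pre_ctrl)
    # Y': where the treated unit would be had it followed the control trend.
    counterfactual = mean(pre_treat) + ctrl_change
    return counterfactual, mean(post_treat) - counterfactual


# Hypothetical daily metric values before and after the launch.
pre_treat, post_treat = [100.0, 102.0, 98.0], [110.0, 112.0, 108.0]
pre_ctrl, post_ctrl = [90.0, 92.0, 88.0], [94.0, 96.0, 92.0]

y_prime, effect = did_effect(pre_treat, post_treat, pre_ctrl, post_ctrl)
print(y_prime, effect)  # 104.0 6.0
```

SCM and BSTS follow the same logic of estimating Y′ first, but replace the single control trend with a weighted combination of untreated units or a fitted time‑series forecast.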
Part Three – Causal Explanation
AB experiments can confirm whether variable X influences outcome Y, while mediation modeling explains *why* X affects Y.
1. Application Scenarios
AB experiments reveal causal direction.
Mediation modeling uncovers the mediating mechanism.
2. Technique
Mediation modeling quantifies the indirect effect of X on Y through a mediator (e.g., driver’s understanding of income influencing support tickets).
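A minimal sketch of the product‑of‑coefficients approach follows, assuming a binary treatment X, linear relationships, and no unmeasured confounding between the mediator M and the outcome Y. The numbers are invented for illustration and are unrelated to any result reported in this article; the function name is also hypothetical.

```python
from statistics import mean


def mediation_effects(x, m, y):
    """Binary X. Returns (total, indirect, direct, proportion_mediated)."""
    def group(values, flag):
        return [v for xi, v in zip(x, values) if xi == flag]

    a = mean(group(m, 1)) - mean(group(m, 0))      # effect of X on M
    total = mean(group(y, 1)) - mean(group(y, 0))  # total effect of X on Y
    # b: slope of Y on M holding X fixed. With binary X this equals the
    # slope after demeaning M and Y within each X group.
    m_mean = {f: mean(group(m, f)) for f in (0, 1)}
    y_mean = {f: mean(group(y, f)) for f in (0, 1)}
    dm = [mi - m_mean[xi] for xi, mi in zip(x, m)]
    dy = [yi - y_mean[xi] for xi, yi in zip(x, y)]
    b = sum(p * q for p, q in zip(dm, dy)) / sum(p * p for p in dm)
    indirect = a * b
    return total, indirect, total - indirect, indirect / total


# Hypothetical data: X = feature shown, M = understanding score, Y = tickets.
x = [0, 0, 0, 1, 1, 1]
m = [1.0, 2.0, 3.0, 2.0, 3.0, 4.0]
y = [18.0, 16.0, 14.0, 13.0, 11.0, 9.0]

total, indirect, direct, share = mediation_effects(x, m, y)
print(total, indirect, direct, share)  # -5.0 -2.0 -3.0 0.4
```

The last value is the proportion mediated: the share of the total effect that flows through the mediator.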
3. Case Study
Adding an income‑trend chart to drivers’ UI increased their understanding of earnings, which reduced support tickets. Mediation analysis showed that improved understanding accounted for 19 % of the total treatment effect.
Part Four – Instrumental Variables
A variable that affects the intervention, and influences the outcome only through the intervention, can serve as an instrumental variable (IV).
1. Application Scenarios
Identify variables that affect whether the treatment is received and are related to the outcome only through the treatment, with no direct path to the outcome and no confounders shared with it.
2. Technique – Two‑Stage Least Squares (2SLS)
First stage predicts the endogenous treatment using the IV; second stage estimates the outcome using the predicted treatment.
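The two stages can be sketched with a single instrument and a single treatment variable. The data below are constructed, not real: an unobserved confounder U drives both treatment and outcome, so a naive regression is biased, while the instrument Z is independent of U. All names are hypothetical, and the standard errors from a hand‑rolled second stage would need correction in practice.

```python
from statistics import mean


def ols(x, y):
    """Simple one-regressor OLS. Returns (intercept, slope)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    slope = cov / var
    return my - slope * mx, slope


def two_stage_least_squares(z, t, y):
    """2SLS estimate of the effect of t on y using instrument z."""
    a0, a1 = ols(z, t)                  # stage 1: treatment on instrument
    t_hat = [a0 + a1 * zi for zi in z]  # predicted treatment
    _, effect = ols(t_hat, y)           # stage 2: outcome on prediction
    return effect


# Constructed data: T = 1 + 2Z + U and Y = 3T + 5U, with U unobserved.
z = [0.0, 0.0, 1.0, 1.0]
t = [0.0, 2.0, 2.0, 4.0]
y = [-5.0, 11.0, 1.0, 17.0]

naive = ols(t, y)[1]
iv = two_stage_least_squares(z, t, y)
print(naive, iv)  # 5.5 3.0
```

The naive slope of 5.5 absorbs the confounder, while the 2SLS estimate recovers the coefficient of 3 used to generate the data.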
Part Five – Pre‑Intervention Confounders
Even without AB experiments, confounders that affect both treatment occurrence (T) and outcome (Y) can be addressed with methods such as Propensity Score Matching (PSM), Inverse Probability of Treatment Weighting (IPTW), and Doubly‑Robust Estimation. PSM and IPTW rely on a propensity model. Doubly‑robust estimation combines a propensity model with an outcome model and stays consistent if either one is correctly specified. All three assume that the relevant confounders are observed.
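As an illustration of IPTW, the sketch below uses a single discrete confounder and estimates the propensity score as the share of treated units in each stratum. That is a simplifying assumption: with many or continuous confounders the propensity would usually come from a fitted model such as logistic regression. The data and the function name `iptw_ate` are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def iptw_ate(confounder, treated, outcome):
    """IPTW estimate of the ATE with stratum-frequency propensity scores.

    Requires every stratum to contain both treated and untreated units.
    """
    by_stratum = defaultdict(list)
    for c, t in zip(confounder, treated):
        by_stratum[c].append(t)
    # Propensity e(c): share of treated units within each stratum.
    propensity = {c: mean(ts) for c, ts in by_stratum.items()}

    n = len(outcome)
    rows = list(zip(confounder, treated, outcome))
    treated_term = sum(t * y / propensity[c] for c, t, y in rows) / n
    control_term = sum(
        (1 - t) * y / (1 - propensity[c]) for c, t, y in rows
    ) / n
    return treated_term - control_term


# Constructed data: Y = 2*T + 10*C, and C also raises the chance of treatment.
confounder = [0, 0, 0, 0, 1, 1, 1, 1]
treated = [1, 0, 0, 0, 1, 1, 1, 0]
outcome = [2.0, 0.0, 0.0, 0.0, 12.0, 12.0, 12.0, 10.0]

naive = (mean(y for t, y in zip(treated, outcome) if t == 1)
         - mean(y for t, y in zip(treated, outcome) if t == 0))
ate = iptw_ate(confounder, treated, outcome)
print(naive, ate)  # 7.0 2.0
```

The naive comparison of 7 mixes the treatment effect with the confounder, while the weighted estimate recovers the effect of 2 built into the data.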
Summary
The article emphasizes the growing importance of causal‑inference techniques for solving complex business problems, highlights the team’s efforts to embed these methods into the AB platform, and encourages practitioners to adopt advanced causal tools to drive better product decisions.
