Debiased Deep Learning and Double Machine Learning for Multi‑Experiment Causal Inference
This article presents an approach that combines debiased deep learning with double machine learning to estimate and infer average treatment effects when many online experiments run concurrently. It covers the problem definition, a semi-parametric theoretical framework, and field-experiment validation on a large video-platform dataset.
Problem definition: Online platforms often run several A/B tests at the same time (e.g., one on a reward button and one on a gift button), so each user is exposed to one of a combinatorial set of treatment combinations. The core task is to estimate the average treatment effect (ATE) of any combination and to identify the best combination, even when only some combinations are actually observed.
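To make the combinatorial growth concrete, here is a minimal sketch that enumerates every on/off assignment for m binary experiments. The function name `treatment_combinations` is an illustrative choice, not something from the talk.

```python
from itertools import product


def treatment_combinations(m):
    """Return all 2**m on/off assignments for m binary experiments."""
    return list(product((0, 1), repeat=m))


# Two concurrent tests, e.g. (reward button, gift button).
combos = treatment_combinations(2)
for combo in combos:
    print(combo)

# The number of combinations doubles with every added experiment.
for m in (2, 3, 10):
    print(m, len(treatment_combinations(m)))
```

With ten concurrent tests there are already 1,024 combinations, which is why observing all of them is rarely practical.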
Theoretical framework: Assuming m binary A/B experiments, the outcome Y is modeled by a semi-parametric function Y = G(θ*(x), t), where x denotes user covariates, t is the vector of the m treatment assignments, G is a known link function (chosen as a generalized sigmoid), and θ*(x) is an unknown high-dimensional representation learned via deep neural networks. Under an overlap condition, the method requires only m+2 observed experiment combinations.
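The summary does not spell out the exact form of G, so the sketch below assumes one plausible "generalized sigmoid": a logistic function of a linear index in t, multiplied by a scale parameter. The parameter layout (bias, one weight per experiment, scale) and the names `generalized_sigmoid` and `plug_in_ate` are assumptions made for illustration. Given per-user parameter vectors, the plug-in ATE of a combination is the average difference in predicted outcomes against a baseline combination.

```python
import math


def generalized_sigmoid(theta, t):
    """Assumed link: scale / (1 + exp(-(bias + sum_j w_j * t_j))).

    theta is laid out as (bias, w_1, ..., w_m, scale); this layout is an
    assumption for illustration, not the form confirmed by the source.
    """
    bias, *weights, scale = theta
    if len(weights) != len(t):
        raise ValueError("theta must hold one weight per experiment")
    index = bias + sum(w * ti for w, ti in zip(weights, t))
    return scale / (1.0 + math.exp(-index))


def plug_in_ate(thetas, t, baseline):
    """Average G(theta_i, t) - G(theta_i, baseline) over users."""
    diffs = [
        generalized_sigmoid(th, t) - generalized_sigmoid(th, baseline)
        for th in thetas
    ]
    return sum(diffs) / len(diffs)


# Hypothetical per-user parameters for m = 2 experiments.
thetas = [(0.0, 1.0, -1.0, 2.0), (0.5, 0.8, -0.2, 1.5)]
print(plug_in_ate(thetas, t=(1, 0), baseline=(0, 0)))
```

Because G is known, a fitted θ(x) can be evaluated at any t, including combinations that were never run, which is what makes extrapolation to unobserved combinations possible.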
Estimation procedure: Stage 1 trains a deep neural network with MSE loss to estimate θ*(·). Stage 2 applies double machine learning, using a Neyman-orthogonal score and cross-fitting, to debias the plug-in ATE estimator. Because the orthogonal score is insensitive to first-order errors in the first stage, the network only needs to converge at rate o(n^(-1/4)); the bias it passes on to the debiased estimator is then o(n^(-1/2)), small enough for valid inference.
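The two-stage logic can be illustrated on a much simpler problem. The sketch below is not the DeDL estimator: it swaps in the textbook AIPW (doubly robust) score for a single randomized binary treatment with a known assignment probability, and it replaces the neural network with a toy cell-mean model. The names `fit_cell_means` and `cross_fit_aipw` are hypothetical. What carries over is the pattern: fit the nuisance model on some folds, evaluate an orthogonal score on the held-out fold, then average.

```python
import random
from statistics import mean


def fit_cell_means(rows):
    """Toy nuisance model: mean outcome per (x, t) cell.

    Stands in for the stage-1 neural network in this illustration.
    """
    cells = {}
    for x, t, y in rows:
        cells.setdefault((x, t), []).append(y)
    overall = mean(y for _, _, y in rows)
    table = {key: mean(values) for key, values in cells.items()}
    # Fall back to the overall mean for cells unseen in training.
    return lambda x, t: table.get((x, t), overall)


def cross_fit_aipw(rows, n_folds=2, propensity=0.5, seed=0):
    """Cross-fitted AIPW estimate of the ATE and its standard error."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    folds = [rows[k::n_folds] for k in range(n_folds)]
    scores = []
    for k, held_out in enumerate(folds):
        # Fit the nuisance model only on the other folds.
        train = [r for j, fold in enumerate(folds) if j != k for r in fold]
        mu = fit_cell_means(train)
        for x, t, y in held_out:
            plug_in = mu(x, 1) - mu(x, 0)
            # Residual-based correction that makes the score orthogonal.
            correction = (t / propensity) * (y - mu(x, 1)) - (
                (1 - t) / (1 - propensity)
            ) * (y - mu(x, 0))
            scores.append(plug_in + correction)
    estimate = mean(scores)
    variance = sum((s - estimate) ** 2 for s in scores) / (len(scores) - 1)
    return estimate, (variance / len(scores)) ** 0.5
```

On simulated rows of the form (x, t, y) with a known effect, the returned estimate and standard error can be turned into an approximate 95% interval as estimate ± 1.96 × standard error.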
Empirical validation: A two-week field experiment on a short-video platform with three binary features (m = 3) yields 2^3 = 8 possible experiment combinations and covers about two million users. Five combinations (the m + 2 minimum) are used for training and the remaining three for validation. The Debiased Deep Learning (DeDL) model outperforms the linear additive, linear regression, and pure deep learning baselines on both MAPE and MAE, and the debiasing step provides substantial performance gains when the neural network is well trained.
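For reference, the two error metrics can be computed as below, comparing predicted ATEs with the ATEs measured on held-out combinations. The numbers are made up for illustration and are not results from the experiment.

```python
def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical measured vs. predicted ATEs for three held-out combinations.
measured = [2.0, 4.0, 5.0]
predicted = [1.0, 5.0, 5.0]
print(mae(measured, predicted), mape(measured, predicted))
```

MAPE weights errors relative to the size of the true effect, so it becomes unstable when a measured ATE is close to zero; reporting MAE alongside it guards against that.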
Conclusions: The DeDL method is both theoretically justified and practically effective for large‑scale causal inference in online experimentation; the code has been open‑sourced, and careful selection of the link function G is essential for reliable results.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.
