
How to Estimate Long‑Term Heterogeneous Dose‑Response Curves with Unobserved Confounding

This article presents a data‑fusion framework that combines long‑term observational data and short‑term randomized experiments to identify and estimate long‑term heterogeneous dose‑response curves under continuous treatments and unobserved confounders, using reweighting, optimal transport, and balanced representation learning.

Didi Tech

Jul 24, 2025


Background

Advances in causal inference from observational data now support applications such as subsidy pricing, where the goal is to estimate the effect of a treatment variable (e.g., coupon amount) on an outcome (e.g., gross merchandise volume, GMV). Short‑term treatment effects can be estimated with mature models such as double machine learning (DML), generalized random forests (GRF), and counterfactual regression (CFR). Long‑term effects in complex commercial settings are harder: existing methods rely on idealized assumptions, such as binary treatments or the absence of unobserved confounders, that are often violated in practice.

Problem Setting

We consider a one‑dimensional continuous treatment A, observed confounders X, unobserved confounders U, short‑term potential outcomes S(a), and long‑term potential outcomes Y(a). The data consist of a large historical observational set (containing A, X, S, and Y) and a small randomized experimental set (containing only A, X, and S). The target is the long‑term heterogeneous dose‑response curve (Long‑term HDRC): the expected long‑term outcome Y(a) as a function of the dose a, conditional on the covariates X.
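In symbols, the setup above can be written as follows. The names D^o, D^e, n_o, n_e, and μ are notation chosen here for illustration and are not taken from the article; reading the HDRC as a conditional expectation follows the usual definition of heterogeneous dose‑response curves.

```latex
\mathcal{D}^{o} = \{(a_i, x_i, s_i, y_i)\}_{i=1}^{n_o}
\quad \text{(observational, confounded by } U\text{)},
\qquad
\mathcal{D}^{e} = \{(a_j, x_j, s_j)\}_{j=1}^{n_e}
\quad \text{(randomized, no long-term outcome)},
\qquad n_e \ll n_o,

\mu(x, a) \;=\; \mathbb{E}\big[\, Y(a) \mid X = x \,\big].
```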

Challenges

Identifiability of the target estimand: Unobserved confounders in the observational data break the standard identifiability assumptions.

Generalization error of counterfactual estimation: Continuous treatments create a huge counterfactual space, making it difficult for a model trained only on factual data to generalize.

Theoretical Framework

We introduce a reweighting scheme that makes the long‑term HDRC identifiable under six standard causal assumptions (SUTVA, Overlap, and four additional assumptions about independence given observed and unobserved variables). Proposition 1 shows that, with appropriate weights w_i for observational samples and uniform weights for experimental samples, the weighted observational distribution satisfies the required conditional independence.

Using Proposition 1, Theorem 1 proves that the weighted distribution yields an identifiable Long‑term HDRC.

To learn the weights efficiently, we formulate a conditional optimal transport (OT) problem. Theorem 2 proves that the conditional OT distance is bounded by an OT distance on the joint distribution, which lets us replace the intractable conditional OT with a tractable joint OT.

Theorem 3 further shows that solving a mini‑batch OT problem (m‑OT) provides an upper bound on the full‑sample OT distance, making the computation compatible with stochastic deep‑learning pipelines.
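To make the mini‑batch idea concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the Sinkhorn solver, the squared Euclidean cost, the `(a, s)` toy features, and all function names and hyperparameters (`eps`, `batch_size`, `n_batches`) are assumptions chosen for illustration. Each observational mini‑batch is compared against the full experimental set, and the per‑batch costs are averaged.

```python
import math
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two feature tuples."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))


def sinkhorn_cost(xs, ys, a, b, eps=0.5, iters=200):
    """Transport cost <P, C> of the entropy-regularized OT plan between
    points xs (weights a) and points ys (weights b), via Sinkhorn scaling."""
    n, m = len(xs), len(ys)
    C = [[sq_dist(x, y) for y in ys] for x in xs]
    K = [[math.exp(-c / eps) for c in row] for row in C]  # Gibbs kernel
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return sum(u[i] * K[i][j] * v[j] * C[i][j]
               for i in range(n) for j in range(m))


def minibatch_ot(obs, weights, exp, batch_size=16, n_batches=8, seed=0):
    """Average OT cost between weighted observational mini-batches and the
    full experimental set, which carries uniform weights."""
    rng = random.Random(seed)
    b = [1.0 / len(exp)] * len(exp)
    total = 0.0
    for _ in range(n_batches):
        idx = rng.sample(range(len(obs)), batch_size)
        batch_w = [weights[i] for i in idx]
        z = sum(batch_w)
        a = [w / z for w in batch_w]  # renormalize weights within the batch
        total += sinkhorn_cost([obs[i] for i in idx], exp, a, b)
    return total / n_batches


# Toy data (hypothetical): (treatment, short-term outcome) pairs.
rng = random.Random(1)
obs = [(rng.random(), rng.random()) for _ in range(64)]
exp_near = [(rng.random(), rng.random()) for _ in range(20)]
exp_far = [(p + 3.0, q + 3.0) for p, q in exp_near]  # shifted copy
uniform = [1.0 / len(obs)] * len(obs)

near_cost = minibatch_ot(obs, uniform, exp_near)
far_cost = minibatch_ot(obs, uniform, exp_far)
```

In this toy setup `far_cost` exceeds `near_cost`, because the shifted experimental set lies farther from every observational batch. A production pipeline would more likely use a tensor library and an OT package than nested Python lists.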

Finally, Theorem 4 derives a generalization bound for the counterfactual error under the reweighted distribution: the error is bounded by the factual error plus a balanced‑representation IPM term that measures the dependence between the treatment A and the representation Z.

Model: LEARN

Based on the theory, we propose LEARN (Long‑term hEterogeneous dose‑response curve estimAtor with Reweighting and represeNtation learning), which consists of three modules:

OT‑weighting module: Learns sample weights by minimizing the mini‑batch OT distance between the weighted observational batch and the full experimental set, with an entropy regularizer. The optimization is performed via projected gradient descent.

Balanced‑representation module: Learns a representation Z of the observable confounders X using an MLP and minimizes a Wasserstein IPM between the treatment‑conditioned distributions of Z, thereby reducing observable confounding bias.

Long‑term estimation module: Predicts long‑term outcomes Y(a) from short‑term representations. A GRU encodes the sequence of short‑term representations, a shared MLP predicts short‑term outcomes, an attention mechanism aggregates the sequence into a long‑term representation, and a final MLP outputs the long‑term prediction. Continuous treatments are handled via a variable‑coefficient architecture.
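For the OT‑weighting module, the projected‑gradient update on the sample weights can be sketched as below. This is a simplified illustration under loud assumptions. The real objective is the mini‑batch OT distance, but here it is replaced by a stand‑in linear cost `sum_i w_i * cost_i` with hypothetical per‑sample costs. The entropy regularizer is assumed to act on the weights themselves, and `lam`, `lr`, and `steps` are made‑up values. What carries over is the mechanics: take a gradient step, then project back onto the probability simplex.

```python
import math


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_i >= 0, sum_i w_i = 1}."""
    u = sorted(v, reverse=True)
    cumsum, theta = 0.0, 0.0
    for k, u_k in enumerate(u, start=1):
        cumsum += u_k
        t = (cumsum - 1.0) / k
        if u_k - t > 0:  # holds exactly for the first rho sorted entries
            theta = t
    return [max(x - theta, 0.0) for x in v]


def learn_weights(cost, lam=0.5, lr=0.01, steps=2000):
    """Projected gradient descent on the stand-in objective
    sum_i w_i * cost_i + lam * sum_i w_i * log(w_i) over the simplex."""
    n = len(cost)
    w = [1.0 / n] * n
    for _ in range(steps):
        # Gradient of the objective; the clamp avoids log(0) after projection.
        grad = [c + lam * (math.log(max(w_i, 1e-12)) + 1.0)
                for c, w_i in zip(cost, w)]
        w = project_to_simplex([w_i - lr * g for w_i, g in zip(w, grad)])
    return w


cost = [0.1, 0.5, 2.0]  # hypothetical per-sample transport costs
w = learn_weights(cost)
```

For this stand‑in objective the minimizer is known in closed form, `w_i` proportional to `exp(-cost_i / lam)`, which makes the sketch easy to sanity‑check: samples that are cheap to transport receive larger weights, and the entropy term keeps the weights from collapsing onto a single sample.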

The overall loss combines the weighted factual loss, the OT regularization term, and the balanced‑representation IPM term (see the loss figure).
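One plausible way to write such a combined objective is shown below. This is a schematic, not the article's exact loss: the per‑sample loss ℓ, the trade‑off coefficients λ_OT and λ_IPM, and the shorthand for the two regularizers are all notation assumed here for illustration, and the factual term may in practice cover both short‑term and long‑term predictions.

```latex
\mathcal{L}
\;=\;
\underbrace{\sum_{i} w_i \,\ell\big(\hat{y}_i,\, y_i\big)}_{\text{weighted factual loss}}
\;+\;
\lambda_{\mathrm{OT}}\,
\underbrace{\mathrm{OT}_{m}\big(w\big)}_{\text{mini-batch OT term}}
\;+\;
\lambda_{\mathrm{IPM}}\,
\underbrace{\mathrm{IPM}\big(A,\, Z\big)}_{\text{balanced-representation term}}
```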

Evaluation

We conduct extensive experiments on synthetic data, semi‑synthetic data (News, TCGA), and real‑world Didi pricing data. The primary metric is the Mean Integrated Squared Error (MISE) of the estimated Long‑term HDRC on a held‑out test set.
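As a reference for how MISE is typically computed, the sketch below averages, over test units, the squared gap between the estimated and true curves integrated over a dose grid. The function and variable names, the trapezoidal integration, and the toy curves are assumptions for illustration. The true curve is only available on synthetic or semi‑synthetic data.

```python
def mise(true_curve, est_curve, covariates, doses):
    """Mean over units of the integrated squared error between two curves.
    The integral over the dose is approximated by the trapezoidal rule."""
    total = 0.0
    for x in covariates:
        sq_err = [(est_curve(x, a) - true_curve(x, a)) ** 2 for a in doses]
        total += sum(
            0.5 * (sq_err[k] + sq_err[k + 1]) * (doses[k + 1] - doses[k])
            for k in range(len(doses) - 1)
        )
    return total / len(covariates)


def truth(x, a):
    """Hypothetical ground-truth dose-response curve."""
    return x * a ** 2


def estimate(x, a):
    """Hypothetical estimate with a constant offset of 0.1."""
    return x * a ** 2 + 0.1


doses = [k / 100 for k in range(101)]  # dose grid on [0, 1]
covariates = [0.5, 1.0, 1.5]
score = mise(truth, estimate, covariates, doses)
```

Because the toy estimate is off by a constant 0.1 on a dose range of length 1, the squared error integrates to about 0.01 for every unit, so `score` is about 0.01.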

Results show that LEARN consistently outperforms baselines, achieving lower MISE on all datasets. In synthetic experiments, the OT‑weighting module removes ~70% of unobserved confounding bias, while the balanced‑representation module eliminates ~83% of observable bias (measured by HSIC). Real‑world A/B tests confirm significant improvements in GMV, call volume, and TSH.
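For readers unfamiliar with HSIC (the Hilbert‑Schmidt Independence Criterion), the sketch below computes a standard biased empirical estimate between two scalar samples, for example a treatment and a one‑dimensional representation. The Gaussian kernel, the bandwidth `sigma`, the 1/n² normalization, and the toy data are assumptions; the article does not state which variant was used.

```python
import math
import random


def centered_gram(values, sigma):
    """Gaussian-kernel Gram matrix of scalar samples, double-centered."""
    n = len(values)
    K = [[math.exp(-(u - v) ** 2 / (2 * sigma ** 2)) for v in values]
         for u in values]
    row = [sum(r) / n for r in K]  # K is symmetric: row means = column means
    grand = sum(row) / n
    return [[K[i][j] - row[i] - row[j] + grand for j in range(n)]
            for i in range(n)]


def hsic(a, z, sigma=0.5):
    """Biased empirical HSIC, trace(K H L H) / n^2, where H is the
    centering matrix and K, L are the Gram matrices of a and z."""
    n = len(a)
    Kc, Lc = centered_gram(a, sigma), centered_gram(z, sigma)
    return sum(Kc[i][j] * Lc[i][j] for i in range(n) for j in range(n)) / n ** 2


rng = random.Random(0)
a = [rng.random() for _ in range(100)]                   # treatment
z_dep = [2.0 * x + 0.05 * rng.gauss(0.0, 1.0) for x in a]  # depends on a
z_ind = [rng.random() for _ in range(100)]               # independent of a

dependent = hsic(a, z_dep)
independent = hsic(a, z_ind)
```

Values near zero indicate little detectable dependence, so `independent` should come out much smaller than `dependent`. A percentage reduction like the one reported above would then compare HSIC before and after balancing.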

Conclusion

We present a complete solution for estimating long‑term heterogeneous dose‑response curves under continuous treatments and unobserved confounding by combining reweighting, optimal transport, and balanced representation learning. Theoretical guarantees and empirical results demonstrate the effectiveness of the proposed framework.

References

Yang Z, Chen W, Cai R, et al. Estimating long‑term heterogeneous dose‑response curve: Generalization bound leveraging optimal transport weights. arXiv preprint arXiv:2406.19195, 2024.

Athey S, Chetty R, Imbens G. Combining experimental and observational data to estimate treatment effects on long term outcomes. arXiv preprint arXiv:2006.09676, 2020.

Yang Y, Gu X, Sun J. Prototypical partial optimal transport for universal domain adaptation. AAAI 2023.

Yan Y, Yang Z, Chen W, et al. Exploiting geometry for treatment effect estimation via optimal transport. AAAI 2024.

Cheng L, Guo R, Liu H. Long‑term effect estimation with surrogate representation. WSDM 2021.

Nie L, Ye M, Nicolae D. VCNet and Functional Targeted Regularization For Learning Causal Effects of Continuous Treatments. ICLR.

