
Causal Inference and Tree‑Based Uplift Modeling for Intelligent Subsidy in Ride‑Sharing Services

The paper applies causal inference and tree‑based uplift modeling to identify coupon‑responsive riders, comparing T‑, S‑, and X‑Learners with a proprietary Treelift model that directly optimizes per‑user utility. Online, Treelift achieved a 4.7% lift over manual rules and a 2.3% lift over prior response models.


This article describes how causal inference techniques, in particular uplift (incremental‑effect) modeling, improve the efficiency of intelligent subsidy (coupon) allocation, using the hotel‑marketing scenario of a ride‑sharing platform as the case study.

Background: The platform aims to maximize total utility by issuing coupons that convert users who would not otherwise purchase. Traditional rule‑based or CTR (response) models estimate purchase probability but cannot measure the causal effect of a coupon: a high predicted purchase probability may simply flag users who would have purchased anyway.

Problem Statement: Identify the marketing‑sensitive user segment (those who only purchase when offered a coupon) and allocate subsidies to maximize per‑user utility while minimizing waste.

Uplift Modeling: The core idea is to predict the causal effect of the intervention (coupon) on each user, i.e., the difference between that user's outcomes with and without the coupon. Several uplift learning strategies are described:

T‑Learner – trains separate models for treatment and control groups and takes the difference of predictions.

S‑Learner – trains a single model with a treatment indicator as a feature.

X‑Learner – combines the two previous ideas and uses cross‑prediction to estimate counterfactual outcomes.
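The three meta‑learner strategies can be sketched with a simple least‑squares base learner standing in for any regressor. The function names, the base learner, and the constant‑propensity blending weight in the X‑Learner are illustrative choices of mine, not the paper's implementation:

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares base learner; returns a predict(Z) closure."""
    A = np.column_stack([X, np.ones(len(X))])  # add intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda Z: np.column_stack([Z, np.ones(len(Z))]) @ coef

def t_learner(X, y, w):
    """Separate models for treatment (w=1) and control (w=0); uplift = difference."""
    m1 = fit_linear(X[w == 1], y[w == 1])
    m0 = fit_linear(X[w == 0], y[w == 0])
    return m1(X) - m0(X)

def s_learner(X, y, w):
    """One model with the treatment flag as a feature; toggle the flag for uplift."""
    m = fit_linear(np.column_stack([X, w]), y)
    return m(np.column_stack([X, np.ones(len(X))])) - \
           m(np.column_stack([X, np.zeros(len(X))]))

def x_learner(X, y, w):
    """Cross-predict counterfactuals, model the imputed effects, then blend."""
    m1 = fit_linear(X[w == 1], y[w == 1])
    m0 = fit_linear(X[w == 0], y[w == 0])
    d1 = y[w == 1] - m0(X[w == 1])  # imputed effect for treated users
    d0 = m1(X[w == 0]) - y[w == 0]  # imputed effect for control users
    t1 = fit_linear(X[w == 1], d1)
    t0 = fit_linear(X[w == 0], d0)
    p = w.mean()  # constant propensity used as the blending weight
    return p * t0(X) + (1 - p) * t1(X)
```

On data with a constant true effect, all three estimators recover roughly the same per‑user uplift; they diverge when treatment and control samples are imbalanced, which is the case the X‑Learner was designed for.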

These methods are indirect because they compute the uplift as a difference of two predictions. Direct approaches, such as tree‑based uplift models (including the proprietary Treelift model) and deep learning methods like DragonNet, are also mentioned.

Tree‑Based Uplift Model: By modifying the split criterion of decision trees to align with the business goal (maximizing the squared difference in mean per‑user utility between treatment and control), the model directly optimizes the target metric. The algorithm proceeds as follows:

Compute the pre‑split criterion: the squared difference in mean per‑user utility between treatment and control in the current node.

For each candidate feature split, partition the data and compute the size‑weighted sum of the child nodes' criteria.

Calculate the gain as the weighted post‑split criterion minus the pre‑split criterion.

Select the split with the highest gain and recurse on the resulting child nodes.
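The split search on a single feature can be sketched as below. The `delta` criterion (squared difference in mean utility between treatment and control) and the size‑weighted combination follow the steps above; the function names are mine, not the paper's:

```python
import numpy as np

def delta(util, w):
    """Squared difference in mean per-user utility, treatment vs control."""
    if w.sum() == 0 or (1 - w).sum() == 0:
        return 0.0  # a node lacking either group carries no uplift signal
    return (util[w == 1].mean() - util[w == 0].mean()) ** 2

def best_split(x, util, w):
    """Scan candidate thresholds on one feature; return (threshold, gain)."""
    pre = delta(util, w)
    best_t, best_gain = None, 0.0
    n = len(x)
    for t in np.unique(x)[:-1]:  # each distinct value is a candidate threshold
        left = x <= t
        # size-weighted post-split criterion across the two child nodes
        post = (left.sum() / n) * delta(util[left], w[left]) \
             + ((~left).sum() / n) * delta(util[~left], w[~left])
        gain = post - pre
        if gain > best_gain:
            best_t, best_gain = t, gain
    return best_t, best_gain
```

A full tree would apply `best_split` over all features at each node and recurse until the gain falls below a threshold or a minimum node size is reached.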

Evaluation: Because true uplift labels are unavailable (each user is observed under only one of treatment or control), the offline metric AUUC (Area Under the Uplift Curve) is used. The procedure scores the test set, sorts users by predicted uplift, bins them, and integrates the incremental conversion gain across bins.
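A minimal version of that AUUC computation, assuming binary conversions and a randomized treatment flag; the function name, normalization, and default bin count are illustrative choices, not the paper's exact metric definition:

```python
import numpy as np

def auuc(score, y, w, n_bins=10):
    """Area under the uplift curve: sort by predicted uplift, bin,
    and accumulate incremental conversions among users targeted so far."""
    order = np.argsort(-score)  # highest predicted uplift first
    y, w = y[order], w[order]
    curve = []
    nt = ct = nc = cc = 0.0  # running treatment/control counts and conversions
    for idx in np.array_split(np.arange(len(y)), n_bins):
        nt += w[idx].sum()
        ct += y[idx][w[idx] == 1].sum()
        nc += (1 - w[idx]).sum()
        cc += y[idx][w[idx] == 0].sum()
        # estimated incremental conversions if everyone seen so far were treated
        curve.append((ct / max(nt, 1) - cc / max(nc, 1)) * (nt + nc))
    return float(np.sum(curve)) / (n_bins * len(y))  # discrete normalized area
```

A model that ranks truly responsive users first pushes the incremental gain into the early bins, which is what makes the area under this curve a usable offline proxy for online lift.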

Experimental Setup: Small‑traffic randomized experiments were conducted to collect unbiased treatment/control data, which were then used to train the various uplift models (T‑Learner, S‑Learner, tree‑based, Treelift). In offline evaluation, the Treelift model achieved the highest AUUC.

Online Results: Deploying the Treelift model in the real‑time subsidy decision pipeline yielded a 4.7% lift over manual rules and a 2.3% improvement over the previous response model.

Future Work: Address the high training cost of tree models, explore pruning and regularization techniques, investigate deep‑learning uplift methods, improve data efficiency via propensity‑score matching, and consider cost‑aware optimization (e.g., integer programming for coupon budget allocation).

Tags: AI, machine learning, causal inference, uplift modeling, marketing optimization, tree-based models
Written by

HelloTech

Official Hello technology account, sharing tech insights and developments.
