Mastering Difference-in-Differences: From Theory to Meituan’s Real‑World Cases
This article is the fifth in the Trusted Experiment Whitepaper series. The previous article covered random rotation experiments; this one introduces quasi‑experiments and focuses on the Difference‑in‑Differences (DID) method, including an overview, evaluation principles, and Meituan case studies.
Table of Contents
5.1 Difference-in-Differences
5.1.1 Method Overview
5.1.2 Evaluation Principle
5.1.3 Parallel‑Trend Grouping
5.1.4 Experiment Case
5.2 Extensions and Outlook
5.2.1 DID Extensions
5.2.2 Other Quasi‑Experimental Methods
Quasi‑Experiment
Quasi‑experiments apply when the experiment designer can intervene in group assignment but cannot randomly allocate units to treatment and control groups. Classical randomized controlled trials use randomization to balance observable and unobservable characteristics across groups, so the difference in outcomes can be attributed to the intervention. When randomization is infeasible, the groups may already differ before treatment, and quasi‑experimental methods, valid under certain assumptions, are needed to estimate the policy effect accurately.
Challenges in Meituan Fulfillment Scenarios
Factors such as spillover effects and small sample sizes often prevent spatial‑temporal random experiments.
Spillover Effect: The fulfillment business is a multi‑sided platform where units influence each other, violating the Stable Unit Treatment Value Assumption (SUTVA) and causing bias. Geographic isolation (e.g., splitting a city into two half‑cities) can mitigate spillover.
Small Sample: Certain cities have fewer than the required 20 delivery zones, making random grouping impossible.
Strategy and Product Specificity: Some strategies (e.g., delivery‑area optimization) impose constraints that prevent independent random assignment of zones.
5.1 Difference-in-Differences
5.1.1 Method Overview
The basic idea of DID is to estimate the average treatment effect on the treated (ATT) by subtracting the pre‑treatment difference between treatment and control groups from the post‑treatment difference. This double differencing removes time‑invariant differences between the groups as well as time trends common to both.
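As a minimal sketch (with hypothetical numbers, not Meituan data), the double difference can be computed directly from the four group means:

```python
# Minimal sketch of the 2x2 DID computation; the group means below are
# illustrative, not real fulfillment data.

def did_estimate(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """ATT = (treatment's post-pre change) - (control's post-pre change)."""
    return (treat_post - treat_pre) - (ctrl_post - ctrl_pre)

# Example: the treatment group improves from 10 to 15, the control from 10 to 12.
effect = did_estimate(10.0, 15.0, 10.0, 12.0)
print(effect)  # 3.0: the shared background change (+2) is differenced out
```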
5.1.2 Evaluation Principle
The traditional DID model can be expressed as:
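In its canonical two‑period form, the model is:

```latex
Y_{it} = \alpha + \beta_1 D_i + \beta_2 \mathit{Post}_t + \beta \,(D_i \times \mathit{Post}_t) + \varepsilon_{it}
```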
where i indexes individuals, t indexes time, Y_{it} is the outcome, D_i is the treatment‑group dummy, Post_t is the post‑treatment dummy, and β captures the policy effect.
Regression on panel data yields estimates of the effect, its variance, confidence interval, and minimum detectable effect. For relative impact, a counterfactual mean for the treatment group without the policy is computed.
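A hedged sketch of the relative‑impact calculation, assuming the counterfactual is the treatment group's pre‑period mean shifted by the control group's observed change (numbers are illustrative; in practice the means come from the panel data):

```python
# Sketch: converting a DID estimate into a relative effect via a
# counterfactual mean for the treatment group absent the policy.

def relative_effect(treat_pre, treat_post, ctrl_pre, ctrl_post):
    # Counterfactual: what the treatment group would have averaged in the
    # post-period had it followed the control group's change.
    counterfactual = treat_pre + (ctrl_post - ctrl_pre)
    absolute = treat_post - counterfactual  # equals the DID estimate
    return absolute / counterfactual

print(relative_effect(10.0, 15.0, 10.0, 12.0))  # 3.0 / 12.0 = 0.25
```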
Fixed‑Effect Models
Adding time and individual fixed effects refines the estimate and reduces variance. The time‑fixed‑effect model:
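With time fixed effects $\lambda_t$, which absorb the $\mathit{Post}_t$ dummy, the standard specification is:

```latex
Y_{it} = \alpha + \beta_1 D_i + \beta \,(D_i \times \mathit{Post}_t) + \lambda_t + \varepsilon_{it}
```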
With both time and individual fixed effects:
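In the two‑way version, individual fixed effects $\mu_i$ additionally absorb the group dummy $D_i$:

```latex
Y_{it} = \beta \,(D_i \times \mathit{Post}_t) + \lambda_t + \mu_i + \varepsilon_{it}
```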
When many individuals are present, the model can be estimated via within‑individual differencing to avoid excessive dummy variables.
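The within transformation can be sketched as demeaning each individual's outcomes, which makes the individual fixed effects drop out of the regression without creating a dummy per individual (a minimal illustration, not production code):

```python
# Sketch of the within (fixed-effects) transformation: subtract each
# individual's own mean so individual fixed effects cancel out.
from collections import defaultdict

def within_transform(panel):
    """panel: list of (individual_id, t, y). Returns demeaned observations."""
    sums, counts = defaultdict(float), defaultdict(int)
    for i, _, y in panel:
        sums[i] += y
        counts[i] += 1
    means = {i: sums[i] / counts[i] for i in sums}
    return [(i, t, y - means[i]) for i, t, y in panel]

panel = [("a", 0, 1.0), ("a", 1, 3.0), ("b", 0, 10.0), ("b", 1, 14.0)]
print(within_transform(panel))
# [('a', 0, -1.0), ('a', 1, 1.0), ('b', 0, -2.0), ('b', 1, 2.0)]
```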
Parallel‑Trend Assumption Testing
The key assumption of DID is that, absent the policy, the difference between the treatment and control groups would remain constant over time. Graphical checks are coarse; a formal test adds a pre‑trend interaction term to the regression and examines its significance. If the interaction coefficient is not statistically different from zero (e.g., p > 0.05), the data are consistent with the parallel‑trend assumption; a significant coefficient indicates the assumption is violated.
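One simple placebo‑style probe (a complement to, not a substitute for, the formal interaction‑term test) is to pretend the treatment occurred midway through the pre‑period and check that the resulting "fake" DID is near zero. A sketch on hypothetical series:

```python
# Placebo check on pre-period data: split the pre-period at its midpoint,
# compute a "fake" DID, and flag large values. A formal test would instead
# add a group x pre-trend interaction term and inspect its standard error.

def placebo_did(treat_series, ctrl_series):
    mid = len(treat_series) // 2
    mean = lambda xs: sum(xs) / len(xs)
    return (mean(treat_series[mid:]) - mean(treat_series[:mid])) - (
        mean(ctrl_series[mid:]) - mean(ctrl_series[:mid]))

# Parallel pre-trends: both groups drift upward at the same rate.
treat = [10, 11, 12, 13]
ctrl = [20, 21, 22, 23]
print(placebo_did(treat, ctrl))  # 0.0 -> consistent with parallel trends
```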
5.1.3 Parallel‑Trend Grouping
1. Randomly split a city into two half‑cities as treatment and control.
2. Use pre‑experiment data to test parallel trends for all target and guard metrics; score each candidate grouping on the test results and the magnitude of the between‑group difference.
3. Repeat steps 1–2 several times and select the highest‑scoring grouping.
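The steps above can be sketched as a search over random splits; the scoring function here is a placeholder (a real version would run parallel‑trend tests on every target and guard metric, not just compare one pre‑period mean):

```python
# Illustrative grouping search: repeatedly split zones into two halves and
# keep the split with the smallest pre-period imbalance. The history dict
# and scoring rule are hypothetical stand-ins.
import random

def score_split(half_a, half_b, history):
    """Lower is better: absolute gap in average pre-period metric."""
    mean = lambda zones: sum(history[z] for z in zones) / len(zones)
    return abs(mean(half_a) - mean(half_b))

def best_split(zones, history, n_trials=200, seed=42):
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        shuffled = zones[:]
        rng.shuffle(shuffled)
        half_a, half_b = shuffled[: len(zones) // 2], shuffled[len(zones) // 2 :]
        s = score_split(half_a, half_b, history)
        if best is None or s < best[0]:
            best = (s, half_a, half_b)
    return best

history = {z: float(z) for z in range(10)}  # hypothetical pre-period metric per zone
score, a, b = best_split(list(range(10)), history)
print(score)
```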
Even with careful grouping, risks remain: limited sample size may prevent finding a valid grouping, and external shocks can break the parallel‑trend assumption.
5.1.4 Experiment Case: Delivery‑Area Optimization
Background: Redesign delivery zones to improve efficiency and avoid fragmented merchant heatmaps.
Goal: Reduce orders crossing delivery‑area boundaries and increase delivery efficiency.
Metrics:
Target Metric: xxxx
Guard Metric: xxxx
Challenges: The optimization strategy must keep overall coverage unchanged and zones non‑overlapping, making random assignment infeasible.
Solution: Split the city into two half‑cities, apply the optimization only in the treatment half, and evaluate using DID.
5.2 Extensions and Outlook
5.2.1 DID Extensions
When treatment timing varies across units (staggered adoption), a multi‑time‑point DID model can be used:
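A common specification lets the treatment indicator switch on at each unit's own adoption time:

```latex
Y_{it} = \beta D_{it} + \lambda_t + \mu_i + \varepsilon_{it}
```

where $D_{it} = 1$ in periods after unit $i$ adopts the policy and 0 otherwise.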
Heterogeneous treatment effects can be modeled by interacting the treatment‑post term with group indicators.
Additional observable time‑varying covariates (unaffected by the policy) can be added to improve precision and control for environmental changes:
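For example, with covariate vector $X_{it}$ and coefficients $\gamma$ added to the two‑way fixed‑effects specification:

```latex
Y_{it} = \beta \,(D_i \times \mathit{Post}_t) + \gamma^{\top} X_{it} + \lambda_t + \mu_i + \varepsilon_{it}
```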
If the parallel‑trend assumption fails, robust approaches such as Honest DID, conditional parallel‑trend via propensity‑score matching, or moving to a triple‑difference design can be considered.
5.2.2 Other Quasi‑Experimental Methods
Regression Discontinuity Design (RDD) : Uses a cutoff on an observable variable to create treatment and control groups, assuming local randomization around the cutoff.
Interrupted Time Series Analysis (ITSA) : Models pre‑intervention trends (e.g., ARIMA) to predict counterfactual outcomes, then compares post‑intervention observations to these predictions.
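A simplified ITSA sketch: the article mentions ARIMA, but an ordinary linear fit on the pre‑period is used here purely to keep the example short (the data are hypothetical):

```python
# Simplified ITSA: fit a linear trend to pre-intervention data, extrapolate
# the counterfactual, and measure the post-intervention gap.

def linear_fit(ys):
    """OLS slope and intercept of ys against t = 0, 1, 2, ..."""
    n = len(ys)
    t_mean, y_mean = (n - 1) / 2, sum(ys) / n
    slope = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(ys)) / sum(
        (t - t_mean) ** 2 for t in range(n))
    return slope, y_mean - slope * t_mean

pre = [10.0, 12.0, 14.0, 16.0]   # observed before the intervention
post = [20.0, 23.0]              # observed after the intervention
slope, intercept = linear_fit(pre)
counterfactual = [intercept + slope * t
                  for t in range(len(pre), len(pre) + len(post))]
gaps = [obs - cf for obs, cf in zip(post, counterfactual)]
print(gaps)  # [2.0, 3.0]: lift above the extrapolated pre-trend
```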
Meituan Technology Team
Over 10,000 engineers power China’s leading lifestyle‑services e‑commerce platform, supporting hundreds of millions of consumers and millions of merchants across 2,000+ industries. This is the public channel for the tech teams behind Meituan, Dianping, Meituan Waimai, Meituan Select, and related services.