Iterative Development of Delivery Time Estimation Models: Tree Model, Vector Retrieval, and End‑to‑End Deep Learning

The paper chronicles Meituan’s three‑stage evolution of delivery‑time estimation—from a hierarchical address tree with local linear regression, through a vector‑retrieval system that boosts recall, to a lightweight end‑to‑end deep‑learning model that meets sub‑5 ms latency while delivering progressively lower error and full coverage.

Meituan Technology Team

The article introduces three successive versions of delivery‑time estimation models used by Meituan’s delivery platform: a tree‑based model built on address hierarchy, a vector‑retrieval solution, and a lightweight end‑to‑end deep‑learning network. It discusses the trade‑offs between performance and prediction metrics and shares the evolution of model strategies to inspire practitioners.

1. Background

Accurate delivery‑time estimation is critical for Meituan’s on‑demand food delivery. The platform dispatches millions of orders to hundreds of thousands of couriers in real time, and the delivery interval (from the courier arriving near the user to handing over the food) is a key part of each estimate. The main challenges are sparse, non‑numeric input features; strict latency requirements (average < 5 ms, TP99 ≈ 10 ms); and address‑related difficulties such as multi‑user buildings, elevators, and walking distance.

2. Technical Iteration Path

2.1 Tree Model

• Technical choice: Use a hierarchical address tree (addr, building, unit, floor) to aggregate data and fit a local linear regression at the floor level. When an address level is missing, the prediction falls back to a higher‑level node or to the regional average.

• Iteration path: Split a node when its data volume exceeds a threshold and the summed MAE after the split is lower than the MAE before it. A weighted linear regression emphasizes recent data.
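The fallback and recency weighting described above can be sketched roughly as follows. This is a minimal illustration, not the production design: the `Node` class, the `min_samples` cutoff, and the exponential half‑life weighting (`half_life_days`) are all assumptions introduced here, and the MAE‑based node‑splitting rule is not implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of the address tree (root -> addr -> building -> unit)."""
    samples: list = field(default_factory=list)   # (floor, seconds, age_days)
    children: dict = field(default_factory=dict)  # child key -> Node


def add_sample(root, path, floor, seconds, age_days):
    """Store the sample on the root and on every node along the address path."""
    node = root
    node.samples.append((floor, seconds, age_days))
    for key in path:
        node = node.children.setdefault(key, Node())
        node.samples.append((floor, seconds, age_days))


def weighted_linear_fit(samples, half_life_days=30.0):
    """Fit seconds ~ a + b * floor; a sample's weight halves every half_life_days."""
    weights = [0.5 ** (age / half_life_days) for _, _, age in samples]
    total = sum(weights)
    mean_x = sum(w * f for w, (f, _, _) in zip(weights, samples)) / total
    mean_y = sum(w * s for w, (_, s, _) in zip(weights, samples)) / total
    sxx = sum(w * (f - mean_x) ** 2 for w, (f, _, _) in zip(weights, samples))
    sxy = sum(w * (f - mean_x) * (s - mean_y) for w, (f, s, _) in zip(weights, samples))
    slope = sxy / sxx if sxx > 0 else 0.0  # single distinct floor: use the weighted mean
    return mean_y - slope * mean_x, slope


def predict(root, path, floor, min_samples=3):
    """Descend while the path matches and data suffices, then fit at the deepest such node."""
    best = root
    node = root
    for key in path:
        node = node.children.get(key)
        if node is None or len(node.samples) < min_samples:
            break  # missing or data-poor level: fall back to the last good node
        best = node
    if not best.samples:
        return None
    intercept, slope = weighted_linear_fit(best.samples)
    return intercept + slope * floor


root = Node()
for floor in (1, 2, 3, 4):
    add_sample(root, ("addr_A", "building_1", "unit_1"), floor, 60 + 20 * floor, age_days=floor)

known = predict(root, ("addr_A", "building_1", "unit_1"), floor=5)
fallback = predict(root, ("addr_A", "building_1", "unit_9"), floor=5)  # unseen unit
```

Here the unseen unit is served by the building‑level node, so both calls draw on the same samples. In this sketch the root node stands in for the regional average.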

2.2 Tree Model + Vector Retrieval

• Technical choice: Replace exact address matching with high‑dimensional vector similarity (Word2Vec embeddings of address tokens combined with GPS information). This improves recall for addresses that the tree model cannot handle.

• Result: Recall rate ↑ 12.20 pp; ME ↓ 87.14 s, MAE ↓ 38.13 s; 1‑minute absolute deviation ↓ 14.01 pp, 2‑minute ↓ 18.45 pp, 3‑minute ↓ 15.90 pp.
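To make the retrieval idea concrete, here is a small sketch that averages token embeddings, appends scaled GPS offsets, and looks up the closest known address by brute force. Everything specific in it is an assumption for illustration: the toy `TOKEN_VECS` table stands in for trained Word2Vec vectors, `origin` and `gps_scale` are made‑up normalization choices, and L2 distance is used because the summary does not say which similarity measure the system uses.

```python
import math

# Hypothetical toy token embeddings; stand-ins for trained Word2Vec vectors.
TOKEN_VECS = {
    "sunshine": [0.9, 0.1, 0.0],
    "garden": [0.8, 0.2, 0.1],
    "tower": [0.1, 0.9, 0.2],
    "plaza": [0.0, 0.8, 0.3],
}
DIM = 3


def address_vector(tokens, lat, lng, origin=(39.90, 116.40), gps_scale=100.0):
    """Mean of the known token embeddings, concatenated with scaled GPS offsets."""
    known = [TOKEN_VECS[t] for t in tokens if t in TOKEN_VECS]
    if known:
        text_part = [sum(v[i] for v in known) / len(known) for i in range(DIM)]
    else:
        text_part = [0.0] * DIM  # no known token: rely on GPS alone
    gps_part = [(lat - origin[0]) * gps_scale, (lng - origin[1]) * gps_scale]
    return text_part + gps_part


def retrieve(query_vec, index, k=1):
    """Brute-force search: return the k index entries closest in L2 distance."""
    return sorted(index, key=lambda entry: math.dist(query_vec, entry["vec"]))[:k]


index = [
    {"id": "sunshine garden",
     "vec": address_vector(["sunshine", "garden"], 39.91, 116.41),
     "seconds": 180.0},
    {"id": "tower plaza",
     "vec": address_vector(["tower", "plaza"], 39.95, 116.45),
     "seconds": 420.0},
]

# "court" is not in the vocabulary, so an exact match on the address would fail.
query = address_vector(["sunshine", "court"], 39.9101, 116.4101)
nearest = retrieve(query, index, k=1)[0]
```

The query shares only one token with any indexed address, yet it still resolves to a nearby neighbor whose historical delivery time can be reused. That is the recall gain the section describes.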

2.3 End‑to‑End Lightweight Deep Learning

• Technical choice: An end‑to‑end model consumes raw character‑level address strings, GPS coordinates, order time, city/region IDs, etc. It uses a robust LSTM for address encoding, a custom bilinear embedding for GPS, and low‑FLOP fusion layers to stay within the 5 ms latency budget.

• Engineering considerations: The model must not increase CPU inference time; therefore, lightweight ops (e.g., LSTMBlockFusedCell) are preferred over heavier alternatives.

• Result: Coverage reaches 100 %; ME ↓ 4.96 s, MAE ↓ 8.17 s; 1‑minute deviation ↓ 2.38 pp, 2‑minute ↓ 5.08 pp, 3‑minute ↓ 3.46 pp.
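For readers unfamiliar with the reported metrics, the sketch below shows one plausible way to compute them. The definitions are assumptions, since the summary does not spell them out: error is taken as predicted minus actual, in seconds, and the N‑minute deviation is read as the share of orders whose absolute error exceeds N minutes.

```python
def evaluate(predicted, actual, thresholds_min=(1, 2, 3)):
    """Return ME, MAE, and the share of orders off by more than N minutes."""
    errors = [p - a for p, a in zip(predicted, actual)]  # seconds, signed
    n = len(errors)
    report = {
        "ME": sum(errors) / n,                    # mean error (bias)
        "MAE": sum(abs(e) for e in errors) / n,   # mean absolute error
    }
    for minutes in thresholds_min:
        # Assumed definition: fraction of orders with |error| above the threshold.
        report[f"dev>{minutes}min"] = sum(abs(e) > 60 * minutes for e in errors) / n
    return report


# Illustrative numbers only, not data from the article.
report = evaluate(predicted=[100, 200, 400, 50], actual=[130, 200, 250, 300])
```

Under this reading, a drop of a few percentage points in a deviation rate means that fewer orders fall outside the given tolerance.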

3. Model‑Related Analysis

3.1 Vector Retrieval Performance

Nearest‑Neighbor Search (NNS) is required for high‑dimensional vectors. Approximate NNS (ANN) methods such as LSH, PQ, and tree‑based indexes are evaluated. Faiss is selected as the benchmark tool for its speed, memory efficiency, and GPU support.
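As a rough illustration of one ANN family named above, the following sketch implements random‑hyperplane LSH using only the standard library. It is not how Faiss works internally and not the article's implementation. The class name, `n_bits`, and `seed` are hypothetical, and candidates inside a bucket are re‑ranked by exact L2 distance for simplicity.

```python
import math
import random
from collections import defaultdict


class HyperplaneLSH:
    """Random-hyperplane LSH: vectors with a similar direction tend to share a bucket."""

    def __init__(self, dim, n_bits=8, seed=0):
        rng = random.Random(seed)
        # Each random hyperplane contributes one bit of the signature.
        self.planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
        self.buckets = defaultdict(list)

    def signature(self, vec):
        """One bit per plane: which side of the hyperplane the vector lies on."""
        return tuple(sum(p * x for p, x in zip(plane, vec)) >= 0 for plane in self.planes)

    def add(self, key, vec):
        self.buckets[self.signature(vec)].append((key, vec))

    def query(self, vec):
        """Scan only the matching bucket; return the closest key, or None if it is empty."""
        candidates = self.buckets.get(self.signature(vec), [])
        best = min(candidates, key=lambda kv: math.dist(vec, kv[1]), default=None)
        return best[0] if best else None


lsh = HyperplaneLSH(dim=3)
lsh.add("a", [1.0, 0.5, -0.2])
lsh.add("b", [-1.0, -0.5, 0.2])   # opposite direction, so it lands in another bucket

hit = lsh.query([1.0, 0.5, -0.2])
same_bucket = lsh.signature([1.0, 0.5, -0.2]) == lsh.signature([2.0, 1.0, -0.4])
```

The trade‑off is the usual one for approximate search: a query inspects one bucket instead of the whole index, at the cost of missing neighbors that hash elsewhere. Practical systems typically use several hash tables or a library index to recover that recall.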

3.2 Sequence Module Performance

TensorFlow profiling shows that the sequence module (LSTM/GRU) dominates runtime. Several LSTM implementations are compared: LSTMBlockFusedCell performs best on CPU, while replacing the sequence module with a FullyConnected layer is fastest but degrades accuracy.
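When comparing candidate implementations against the latency targets quoted in the Background section, a simple timing harness can help. The sketch below is a generic standard‑library example, not the TensorFlow profiler. `fake_model` is a hypothetical stand‑in for an inference call, and treating the quoted figures (average < 5 ms, TP99 ≈ 10 ms) as hard limits is an assumption.

```python
import time


def percentile(values, pct):
    """Nearest-rank percentile; pct is an integer between 1 and 100."""
    ordered = sorted(values)
    rank = (pct * len(ordered) + 99) // 100  # ceil(pct * n / 100) in integer arithmetic
    return ordered[max(rank, 1) - 1]


def benchmark(fn, runs=1000, warmup=50):
    """Return (mean_ms, tp99_ms) for repeated calls to fn."""
    for _ in range(warmup):  # discard warm-up calls (caches, lazy initialization)
        fn()
    latencies_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return sum(latencies_ms) / runs, percentile(latencies_ms, 99)


def within_budget(mean_ms, tp99_ms, mean_budget_ms=5.0, tp99_budget_ms=10.0):
    """Check measured latency against the assumed budget."""
    return mean_ms < mean_budget_ms and tp99_ms <= tp99_budget_ms


def fake_model():
    """Hypothetical stand-in for a model inference call."""
    return sum(i * i for i in range(1000))


mean_ms, tp99_ms = benchmark(fake_model, runs=200, warmup=10)
ok = within_budget(mean_ms, tp99_ms)
```

Reporting TP99 alongside the mean matters here because a sequence module can look acceptable on average while still breaching the tail‑latency target on long address strings.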

3.3 Vector Effect Analysis

Comparisons between the end‑to‑end model’s char embeddings and Word2Vec embeddings reveal that the performance gain mainly stems from leveraging additional address information (redundant parts) and extra features rather than superior embedding quality.

4. Summary and Outlook

The three‑stage model evolution demonstrates how lightweight solutions can meet strict latency constraints while progressively improving recall and accuracy. The study also offers insights into vector retrieval, deep‑learning op selection, and feature importance for logistics‑related machine‑learning tasks.

5. Related Reading

Links to further articles on ETA estimation, large‑scale feature construction, and AI techniques behind Meituan delivery are provided.

6. Author Information

Ji Ze – Technical Expert, Meituan‑Dianping; Yan Cong – Algorithm Engineer, Meituan‑Dianping.
