Pedestrian Trajectory Prediction: Methodology and Experience from the ICRA 2020 TrajNet++ Competition
The ICRA 2020 TrajNet++ competition challenged teams to predict 4.8‑second pedestrian paths from 3.6‑second observations. Meituan’s winning solution used a Seq2Seq world model that encodes past trajectories, updates a spatio‑temporal interaction map, and decodes future positions. It achieved a final displacement error of 1.24 m, a result that shows promise for real‑world unmanned delivery.
Pedestrian trajectory prediction is a crucial component of autonomous driving and has become a hot research topic in recent years. At the ICRA 2020 conference, Meituan’s unmanned delivery team won first place among more than one hundred teams in the pedestrian trajectory prediction competition.
Background
On June 2, 2020, the second Long-Term Human Motion Prediction Workshop was held at ICRA 2020, jointly organized by Bosch, Erlangen University, Stuttgart University, and ETH Zurich. The competition provided trajectory datasets from ten complex scenes (streets, entrances, campuses, etc.) and required participants to predict each pedestrian’s trajectory over the next 4.8 seconds from the preceding 3.6 seconds of observation. Ranking was based on Final Displacement Error (FDE), the Euclidean distance between the predicted and ground‑truth positions at the final predicted timestep.
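As a minimal sketch of the ranking metric, the snippet below computes FDE for one trajectory and averages it over several pedestrians. The function names and the averaging step are illustrative assumptions; the competition's own evaluation code may aggregate differently.

```python
import math

def final_displacement_error(predicted, ground_truth):
    """Euclidean distance between the last predicted and last true position."""
    (px, py), (gx, gy) = predicted[-1], ground_truth[-1]
    return math.hypot(px - gx, py - gy)

def mean_fde(predictions, ground_truths):
    """Average FDE over several pedestrians (assumed aggregation)."""
    errors = [final_displacement_error(p, g)
              for p, g in zip(predictions, ground_truths)]
    return sum(errors) / len(errors)

# Toy data: the first prediction ends 5 m from the truth, the second is exact.
predictions = [[(0.0, 0.0), (3.0, 4.0)], [(1.0, 1.0), (2.0, 2.0)]]
ground_truths = [[(0.0, 0.0), (0.0, 0.0)], [(1.0, 1.0), (2.0, 2.0)]]
score = mean_fde(predictions, ground_truths)
```

Only the endpoints enter the metric, so two predictions that take different routes to the same final position score identically.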
Competition Details
The dataset contains trajectories sampled at 2.5 Hz (0.4 s interval) and classifies each trajectory into categories such as static obstacle, linear motion, following, avoidance, and group movement. Participants receive nine historical timesteps (3.6 s) and must predict twelve future timesteps (4.8 s). Both single‑modal (one deterministic trajectory) and multi‑modal (multiple plausible trajectories) metrics are used, but the final ranking relies on the single‑modal FDE.
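The timing figures above fit together as 9 × 0.4 s = 3.6 s of observation and 12 × 0.4 s = 4.8 s of prediction. The sketch below splits one track accordingly; the constant names and the list-of-tuples layout are assumptions chosen for illustration, not the competition's data format.

```python
SAMPLE_INTERVAL_S = 0.4   # 2.5 Hz sampling
OBS_LEN = 9               # observed timesteps (3.6 s)
PRED_LEN = 12             # timesteps to predict (4.8 s)

def split_track(track):
    """Split a list of (x, y) positions into observed and future parts."""
    if len(track) < OBS_LEN + PRED_LEN:
        raise ValueError("track is shorter than OBS_LEN + PRED_LEN timesteps")
    return track[:OBS_LEN], track[OBS_LEN:OBS_LEN + PRED_LEN]

# Toy track: a pedestrian moving 0.5 m per timestep along the x axis.
track = [(0.5 * t, 0.0) for t in range(OBS_LEN + PRED_LEN)]
observed, future = split_track(track)
```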
Method Overview
Meituan’s approach builds a global “world model” that captures interactions among all agents and between agents and the environment. The model consists of three modules within a Seq2Seq framework:
Encoder: encodes each pedestrian’s past trajectory.
Interaction Module (World Model): maintains and updates a spatio‑temporal map of the scene.
Decoder: predicts future trajectories based on encoded features and interaction cues.
During encoding, the world model is updated with the latest LSTM hidden states and position information; an MLP, max‑pooling, and a GRU then produce a global feature R. Each pedestrian’s current observation is combined with R through an attention mechanism, and the result is fed back into the LSTM.
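Two of the building blocks named here can be sketched without any learned weights: max‑pooling over agents, which gives a summary that does not depend on agent order, and an attention step that weights features by similarity to a query. This is a simplification under stated assumptions: the MLP and GRU are omitted, scaled dot‑product attention is assumed because the source does not specify the attention form, and all function names are hypothetical.

```python
import math

def max_pool(features):
    """Element-wise max over per-agent feature vectors (order-independent)."""
    return [max(column) for column in zip(*features)]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention: weight values by key/query similarity."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

# Toy per-agent features, standing in for MLP outputs of hidden states.
agent_features = [[0.1, 0.9], [0.7, 0.2], [0.4, 0.4]]
pooled = max_pool(agent_features)
context = attend([1.0, 0.0], agent_features, agent_features)
```

In the described model the pooled summary would pass through a GRU to carry scene state across timesteps; that recurrent part is left out of this sketch.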
The decoding stage mirrors the encoding process but differs in two ways: (1) the LSTM is initialized with the final hidden state from the encoder plus noise, and (2) the decoder uses the previously predicted positions rather than the ground‑truth observations.
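The two differences can be illustrated with a short rollout loop. In this sketch a constant‑velocity update stands in for the LSTM cell, and the hidden state is just a velocity vector; both are assumptions made to keep the example self‑contained, as are the function names and the Gaussian form of the noise.

```python
import random

def add_noise(hidden, sigma, rng):
    """Perturb the encoder's final hidden state before decoding."""
    return [h + rng.gauss(0.0, sigma) for h in hidden]

def rollout(step_fn, last_position, hidden, n_steps=12):
    """Autoregressive decoding: each step consumes the previous prediction."""
    position, trajectory = last_position, []
    for _ in range(n_steps):
        position, hidden = step_fn(position, hidden)
        trajectory.append(position)
    return trajectory

def constant_velocity_step(position, hidden):
    """Toy stand-in for the LSTM cell: hidden holds a per-step displacement."""
    x, y = position
    vx, vy = hidden
    return (x + vx, y + vy), hidden

rng = random.Random(0)
# sigma=0.0 keeps this example deterministic; a positive sigma varies the output.
hidden = add_noise([0.5, 0.0], sigma=0.0, rng=rng)
future = rollout(constant_velocity_step, (4.0, 0.0), hidden)
```

Because each step feeds on the previous output, errors can compound over the twelve steps, which is one reason endpoint error is a demanding metric.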
Data Pre‑processing & Post‑processing
To reduce the domain gap between training and test sets, two preprocessing steps were applied: balanced sampling and scene normalization (trajectory interpolation, centering, and random rotation). Post‑processing involved trajectory clipping and non‑maximum suppression to eliminate implausible predictions that cross scene boundaries.
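A minimal sketch of the centering and random‑rotation steps follows. Which point the scene is centered on is not stated in the source; centering on the last observed position of one pedestrian is an assumption here, and the function names are hypothetical. Interpolation, clipping, and non‑maximum suppression are not shown.

```python
import math
import random

def rotate(track, angle):
    """Rotate every (x, y) point about the origin by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in track]

def normalize_scene(tracks, origin, angle):
    """Shift all tracks so `origin` is at (0, 0), then apply one shared rotation."""
    ox, oy = origin
    shifted = [[(x - ox, y - oy) for x, y in track] for track in tracks]
    return [rotate(track, angle) for track in shifted]

rng = random.Random(42)
angle = rng.uniform(0.0, 2.0 * math.pi)   # random rotation used as augmentation
scene = [[(1.0, 1.0), (2.0, 1.0)], [(0.0, 3.0), (0.0, 4.0)]]
normalized = normalize_scene(scene, origin=scene[0][-1], angle=angle)
```

Applying the same shift and angle to every pedestrian in a scene preserves their relative distances, so interaction cues survive the augmentation.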
Training employed K‑Fold cross‑validation and grid search for hyper‑parameter tuning. The final model achieved an FDE of 1.24 m on the test set, while the second‑place method recorded 1.30 m.
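The combination of K‑Fold cross‑validation and grid search can be sketched as below. The fold layout, the parameter names (`lr`, `hidden`), and the toy scoring function are all illustrative assumptions; the source does not list the hyper‑parameters that were actually tuned.

```python
from itertools import product

def k_fold_indices(n_samples, k):
    """Yield (train, validation) index lists for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, val
        start += size

def grid_search(param_grid, n_samples, k, evaluate):
    """Return the parameter combination with the lowest mean validation score."""
    names = sorted(param_grid)
    best_params, best_score = None, float("inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        scores = [evaluate(params, train, val)
                  for train, val in k_fold_indices(n_samples, k)]
        mean_score = sum(scores) / len(scores)
        if mean_score < best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score

def toy_validation_error(params, train, val):
    """Hypothetical stand-in for training on `train` and scoring FDE on `val`."""
    return abs(params["lr"] - 0.01) + abs(params["hidden"] - 64) / 100

best_params, best_score = grid_search(
    {"lr": [0.1, 0.01], "hidden": [32, 64]},
    n_samples=10, k=3, evaluate=toy_validation_error,
)
```

In practice `evaluate` would train the model on the training indices and return the validation FDE, so the selected configuration is the one with the lowest average error across folds.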
Conclusion
Pedestrian trajectory prediction remains a vibrant research area. The world‑model‑based approach demonstrated strong performance on a competitive benchmark and shows promise for real‑world deployment in Meituan’s unmanned delivery services.
Meituan Technology Team
Over 10,000 engineers powering China’s leading lifestyle services e‑commerce platform. Supporting hundreds of millions of consumers, millions of merchants across 2,000+ industries. This is the public channel for the tech teams behind Meituan, Dianping, Meituan Waimai, Meituan Select, and related services.
