Pedestrian Trajectory Prediction: Methodology and Experience from the ICRA 2020 TrajNet++ Competition

The ICRA 2020 TrajNet++ competition challenged teams to predict 4.8‑second pedestrian paths from 3.6‑second observations. Meituan’s winning solution used a Seq2Seq world model that encodes past trajectories, updates a spatio‑temporal interaction map, and decodes future positions. It achieved a final displacement error of 1.24 m and shows promise for real‑world unmanned delivery.

Meituan Technology Team

Pedestrian trajectory prediction is a crucial component of autonomous driving and has become a hot research topic in recent years. At the ICRA 2020 conference, Meituan’s unmanned delivery team won first place among more than one hundred teams in the pedestrian trajectory prediction competition.

Background

On June 2, 2020, the second Long-Term Human Motion Prediction Workshop was held at ICRA 2020, jointly organized by Bosch, Erlangen University, Stuttgart University, and ETH Zurich. The competition provided trajectory datasets from ten complex scenes (streets, entrances, campuses, etc.) and required participants to predict future trajectories for the next 4.8 seconds based on the past 3.6 seconds of observation. Ranking was based on Final Displacement Error (FDE), the Euclidean distance between predicted and ground‑truth endpoints.
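As a rough illustration of the ranking metric, the Python sketch below computes FDE for one trajectory and averages it over several. The function names and the choice to average over pedestrians are assumptions made for illustration; the text above only defines FDE as the endpoint distance.

```python
import math


def final_displacement_error(predicted, ground_truth):
    """Euclidean distance between the last predicted and last true (x, y) point."""
    (px, py), (gx, gy) = predicted[-1], ground_truth[-1]
    return math.hypot(px - gx, py - gy)


def mean_fde(predictions, ground_truths):
    """Average FDE over several trajectories (averaging is an assumption here)."""
    errors = [
        final_displacement_error(p, g) for p, g in zip(predictions, ground_truths)
    ]
    return sum(errors) / len(errors)
```

Only the final point of each trajectory matters for this metric, so two predictions with very different shapes can have the same FDE.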

ICRA 2020 TrajNet++ competition

Competition Details

The dataset contains trajectories sampled at 2.5 Hz (0.4 s interval) and classifies each trajectory into categories such as static obstacle, linear motion, following, avoidance, and group movement. Participants receive nine historical timesteps (3.6 s) and must predict twelve future timesteps (4.8 s). Both single‑modal (one deterministic trajectory) and multi‑modal (multiple plausible trajectories) metrics are used, but the final ranking relies on the single‑modal FDE.
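The timestep arithmetic above can be checked with a short Python sketch that splits one track into its observed and future parts. The constant and function names are hypothetical, and the track is assumed to be a list of (x, y) points sampled at 2.5 Hz.

```python
SAMPLE_HZ = 2.5   # sampling rate stated in the text
OBS_STEPS = 9     # observed timesteps (3.6 s)
PRED_STEPS = 12   # timesteps to predict (4.8 s)


def duration_seconds(steps):
    """Time span covered by `steps` samples at SAMPLE_HZ."""
    return steps / SAMPLE_HZ


def split_track(track):
    """Split a track into (observed, future) point lists."""
    if len(track) < OBS_STEPS + PRED_STEPS:
        raise ValueError("track is shorter than 21 timesteps")
    observed = track[:OBS_STEPS]
    future = track[OBS_STEPS:OBS_STEPS + PRED_STEPS]
    return observed, future
```

A complete sample therefore needs 21 consecutive points, covering 8.4 s in total.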

Method Overview

Meituan’s approach builds a global “world model” that captures interactions among all agents and between agents and the environment. The model consists of three modules within a Seq2Seq framework:

Encoder: encodes each pedestrian’s past trajectory.

Interaction Module (World Model): maintains and updates a spatio‑temporal map of the scene.

Decoder: predicts future trajectories based on encoded features and interaction cues.

During encoding, an Update step refreshes the world model with the latest LSTM hidden states and positional information, and an MLP, max pooling, and a GRU turn these into a global feature R. Each pedestrian’s current observation is then combined with R through an attention mechanism, and the result is fed back into the LSTM.
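The pooling step in this update can be sketched in plain Python. This is only an illustration of why max pooling suits a scene with a varying number of pedestrians, not the team's implementation: the MLP is replaced by simple concatenation, the GRU and attention stages are omitted, and all names are hypothetical.

```python
def build_agent_features(hidden_states, positions):
    """Stand-in for the MLP: concatenate each hidden state with its (x, y)."""
    return [list(h) + list(p) for h, p in zip(hidden_states, positions)]


def max_pool_agents(agent_features):
    """Element-wise max over equal-length per-agent feature vectors.

    The result has a fixed size regardless of how many agents are present
    and does not depend on the order in which they are listed.
    """
    return [max(column) for column in zip(*agent_features)]
```

Because the pooled vector has the same length for any number of pedestrians, it can be passed to a recurrent unit that carries scene context from one timestep to the next.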

World Model Architecture

The decoding stage mirrors the encoding process but differs in two ways: (1) the LSTM is initialized with the final hidden state from the encoder plus noise, and (2) the decoder uses the previously predicted positions rather than the ground‑truth observations.
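The two differences can be made concrete with a minimal autoregressive rollout in Python. The step function here is a hypothetical constant-velocity stand-in for the LSTM cell and interaction module, and Gaussian noise is an assumption, since the text does not say how the noise is drawn.

```python
import random


def rollout(step_fn, encoder_state, last_observed, n_steps=12,
            noise_std=0.0, rng=None):
    """Predict n_steps positions, feeding each prediction back as input."""
    if rng is None:
        rng = random.Random(0)
    # (1) Start from the encoder's final state plus noise.
    state = [s + rng.gauss(0.0, noise_std) for s in encoder_state]
    position = last_observed
    predictions = []
    for _ in range(n_steps):
        # (2) The input is the previous prediction, not a ground-truth point.
        state, position = step_fn(state, position)
        predictions.append(position)
    return predictions


def constant_velocity_step(state, position):
    """Hypothetical stand-in for the LSTM: state is a per-step (vx, vy)."""
    vx, vy = state
    return state, (position[0] + vx, position[1] + vy)
```

With a nonzero `noise_std`, repeated calls using different random generators yield different trajectories from the same observation, which is one way to obtain multi-modal predictions.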

Decoding Phase

Data Pre‑processing & Post‑processing

To reduce the domain gap between training and test sets, two preprocessing steps were applied: balanced sampling and scene normalization (trajectory interpolation, centering, and random rotation). Post‑processing involved trajectory clipping and non‑maximum suppression to eliminate implausible predictions that cross scene boundaries.
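The centering and random-rotation parts of scene normalization might look like the Python sketch below. It assumes each scene is a list of tracks of (x, y) points and that the first track's last observed point is used as the origin; the text does not specify the centering point, so that choice and all function names are illustrative only.

```python
import math
import random


def center_scene(tracks, origin):
    """Shift every point so that `origin` becomes (0, 0)."""
    ox, oy = origin
    return [[(x - ox, y - oy) for x, y in track] for track in tracks]


def rotate_scene(tracks, angle):
    """Rotate every point counter-clockwise by `angle` radians about (0, 0)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[(c * x - s * y, s * x + c * y) for x, y in track]
            for track in tracks]


def normalize_scene(tracks, obs_steps=9, rng=None):
    """Center on the first track's last observed point, then rotate randomly.

    Assumption: tracks[0] is the pedestrian of interest.
    """
    if rng is None:
        rng = random.Random()
    centered = center_scene(tracks, tracks[0][obs_steps - 1])
    return rotate_scene(centered, rng.uniform(0.0, 2.0 * math.pi))
```

The same shift and rotation are applied to every track in a scene, so distances between pedestrians are preserved while the absolute position and heading of the scene are removed.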

Training vs. Test Trajectory Distribution

Training employed K‑Fold cross‑validation and grid search for hyper‑parameter tuning. The final model achieved an FDE of 1.24 m on the test set, while the second‑place method recorded 1.30 m.
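A generic way to combine K‑Fold cross‑validation with grid search is sketched below in Python. The hyper-parameter names, the fold assignment, and the `evaluate` callback are all hypothetical; the text does not state which parameters were tuned or how folds were drawn.

```python
import itertools


def k_fold_indices(n_samples, k):
    """Yield (train_indices, val_indices) for each of k interleaved folds."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        train = [j for fold in (folds[:i] + folds[i + 1:]) for j in fold]
        yield train, folds[i]


def grid_search(param_grid, evaluate, n_samples, k=5):
    """Return (best_params, best_score), where lower scores are better.

    `evaluate(params, train_idx, val_idx)` is a caller-supplied function
    that trains on train_idx and returns a validation error such as FDE.
    """
    names = sorted(param_grid)
    best_params, best_score = None, None
    for values in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        scores = [evaluate(params, tr, va)
                  for tr, va in k_fold_indices(n_samples, k)]
        mean_score = sum(scores) / len(scores)
        if best_score is None or mean_score < best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score
```

Averaging the validation error over all folds makes the chosen setting less sensitive to any single train/validation split, at the cost of training k models per grid point.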

Conclusion

Pedestrian trajectory prediction remains a vibrant research area. The world‑model‑based approach demonstrated strong performance on a competitive benchmark and shows promise for real‑world deployment in Meituan’s unmanned delivery services.
