Trajectory Similarity Measures: Euclidean, DTW, LCSS, EDR, Hausdorff, Fréchet, OWD, and LIP
The article introduces various trajectory similarity metrics—including Euclidean distance, Dynamic Time Warping, Longest Common Sub‑Sequence, Edit Distance on Real sequences, Hausdorff distance, Fréchet distance, One‑Way Distance, and Locality In‑between Polylines—explaining their definitions, advantages, limitations, and visual illustrations for spatio‑temporal data analysis.
In early 2020, the sudden COVID‑19 outbreak drew widespread attention to public health safety and created an urgent need for quickly tracing virus transmission paths and identifying close contacts of confirmed cases.
The JUST platform, built on large‑scale trajectory data, offers a "related‑people query" feature that matches trajectories in space‑time to find individuals who have potentially contacted a confirmed case.
Trajectory data, a type of spatio‑temporal data, is expressed as a sequence of GPS points tr = <p1 → p2 → … → pn> , where each point pi = (lat, lng, t) records latitude, longitude, and timestamp.
Massive trajectory datasets generated by vehicle navigation systems enable applications such as traffic flow analysis, road clustering, and stay‑point detection, but they also pose challenges due to volume, noise, and heterogeneous collection methods.
Measuring the similarity between two trajectories is a foundational algorithmic service that supports these higher‑level applications. The article outlines several common similarity measures:
Euclidean Distance assumes equal‑length trajectories and computes the average point‑wise spatial distance. It is simple but cannot handle differing lengths and is sensitive to noisy points.
Dynamic Time Warping (DTW) aligns two sequences by allowing non‑linear time stretching, enabling many‑to‑many point mapping and handling trajectories of different lengths. However, it does not explicitly filter noise.
Longest Common Sub‑Sequence (LCSS) counts the maximum number of point pairs whose distance is below a threshold, tolerating different lengths and providing some robustness to noise, though the threshold choice can be difficult.
Edit Distance on Real sequence (EDR) measures the minimum number of insert, delete, or replace operations needed to transform one trajectory into another, based on a distance threshold; it is intuitive but sensitive to noise.
Hausdorff Distance is defined as the maximum of the minimum distances from each point of one trajectory to the other, capturing the worst‑case deviation.
Fréchet Distance (the "dog‑leash" distance) measures the minimum leash length required for a person and a dog to walk along two trajectories simultaneously, reflecting both spatial and temporal alignment.
One‑Way Distance (OWD) computes the area between two trajectories divided by the length of the polyline, offering an intuitive measure where larger enclosed area indicates lower similarity.
Locality In‑between Polylines (LIP) aggregates weighted areas between intersecting segments of two trajectories; weights are derived from the proportion of segment perimeter to total length, providing noise resistance.
The JD Urban Spatio‑Temporal Data Engine (JUST) already implements DTW, Hausdorff, and Fréchet distances, allowing users to execute these algorithms with a single SQL statement and visualize results via its web interface.
For more details, visit https://just.urban-computing.cn/ or contact [email protected] .
JD Tech Talk
Official JD Tech public account delivering best practices and technology innovation.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.