Efficient Detection of Suspected Infected Crowds Using Spatio‑Temporal Trajectory Analysis
The article presents a novel spatio‑temporal trajectory risk metric and an efficient query framework, including a space‑first index (SFT) and distributed storage, to identify high‑risk contacts of COVID‑19 patients from massive movement data, with real‑world deployment results in several Chinese cities.
Background COVID‑19 has caused a global health crisis; early detection, reporting, isolation, and treatment are the most effective control measures. Traditional contact tracing relies on manual interviews or ticket records, which are incomplete and memory‑dependent.
Problem Statement With the proliferation of mobile devices, massive spatio‑temporal data can be leveraged to trace patient movements and identify close contacts. Existing trajectory similarity measures (DTW, EDR, Hausdorff) are unsuitable for fine‑grained exposure detection, and exhaustive city‑wide queries are computationally prohibitive.
Proposed Solution The authors introduce a novel trajectory risk metric that splits a patient’s path into propagation sub‑segments, evaluates contact probability for each segment, and aggregates these probabilities to estimate infection risk. A space‑first spatio‑temporal index (SFT) clusters nearby segments to accelerate queries.
System Architecture The solution consists of three modules: data preprocessing, candidate extraction, and infection risk exploration. Data preprocessing filters noise, segments trajectories, and stores segments in a NoSQL store using the XZ2T index. Candidate extraction uses the SFT index to retrieve relevant trajectory fragments with low I/O overhead. Infection risk exploration prunes unlikely candidates, computes per‑segment infection probabilities, aggregates them, and filters low‑risk trajectories.
Data Preprocessing Trajectory noise is removed, then trajectories are partitioned based on temporal (e.g., >30 min) or spatial (e.g., >50 m) gaps. Segments are indexed with XZ2T, which combines fixed‑interval time partitions with a 2‑D spatial index mapped to a 1‑D integer key, and stored in a distributed database such as HBase.
Candidate Extraction The SFT index clusters segments with similar spatio‑temporal ranges using a quadtree, and each leaf node maintains a 1‑D R‑tree for time. This design reduces I/O and redundancy when querying large volumes of trajectory data.
Infection Risk Computation Each trajectory point l = (p, t) defines a spatio‑temporal influence region STR(l, θ_d, θ_t). The contact degree between a candidate and the patient is calculated using a weighted combination of spatial distance and temporal interval, controlled by a parameter λ∈[0,1]. Local segment risk is summed across all segments, with longer dwell times receiving higher weights.
Pruning Strategies Four pruning techniques are employed to limit the computational cost of evaluating every spatio‑temporal point against all candidates (details in the referenced paper).
Case Study The system was deployed in Beijing, Wuhan, Guangzhou, Nanjing, Chengdu and other cities. It identified over 500 high‑risk contacts in Beijing and a quarter of confirmed cases in Suqian, demonstrating the practical impact of the approach.
References He, H. et al. (2020). Efficient Suspected Infected Crowds Detection Based on Spatio‑Temporal Trajectories. Additional URLs: http://just.urban‑computing.com/ and related articles.
JD Tech Talk
Official JD Tech public account delivering best practices and technology innovation.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.