
Trajectory-Based Population Flow Analysis for COVID‑19 Prevention Using HBase and Spark

The article presents a comprehensive big‑data solution that stores massive GPS trajectory records in HBase, processes them with Spark to identify individuals who visited a pandemic source region, and visualizes their spatio‑temporal distribution in target cities to support precise epidemic control measures.

JD Tech Talk

As population mobility increases, analyzing migration patterns has become crucial for urban planning and, in particular, for responding to public health emergencies such as the COVID‑19 pandemic. The article relies on GPS trajectory data from vehicles and mobile devices to obtain precise location information for large populations.

The problem addressed is to determine the current distribution of people who have visited a virus source region (e.g., Wuhan) during a specific time window, and to map their presence across another city (e.g., Beijing) at the grid and hourly levels.
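To make "grid and hourly levels" concrete, the following minimal Python sketch shows one way a single GPS fix could be bucketed into an equal-sized grid cell and an hour. The 0.01-degree cell size, the use of Unix epoch seconds for timestamps, and all function names are assumptions for illustration; the article does not specify them.

```python
import math
from typing import Tuple

CELL_DEG = 0.01          # assumed cell edge length in degrees (not from the article)
SECONDS_PER_HOUR = 3600


def to_grid_cell(lon: float, lat: float, cell_deg: float = CELL_DEG) -> Tuple[int, int]:
    """Return the (column, row) index of the equal-sized cell containing the point."""
    # floor() keeps negative coordinates in the correct cell, unlike int() truncation
    return (math.floor(lon / cell_deg), math.floor(lat / cell_deg))


def to_hour_bucket(epoch_seconds: int) -> int:
    """Return the number of whole hours since the Unix epoch."""
    return epoch_seconds // SECONDS_PER_HOUR


def project(oid: str, lon: float, lat: float, epoch_seconds: int):
    """Turn one GPS fix into an (OID, grid cell, hour) record."""
    return (oid, to_grid_cell(lon, lat), to_hour_bucket(epoch_seconds))
```

Every fix that falls in the same cell during the same hour yields the same (grid cell, hour) pair, which is what allows the later per-grid, per-hour aggregation.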

The proposed solution builds on a prior efficient storage system for massive trajectory data in HBase. It then employs the Spark distributed in‑memory engine to analyze the data, using OID as a unique identifier for each person. The workflow consists of three steps: (1) trajectory data storage with a compressed horizontal format and spatio‑temporal indexing; (2) projection of trajectories onto equal‑sized spatial grids to obtain OID‑grid‑time records, followed by Spark joins to intersect source‑region and target‑city datasets; (3) output of analysis results into HBase tables that support fast queries of individual stay points and aggregated counts per grid per hour.
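Steps (2) and (3) can be sketched in plain Python as follows. This is not Spark code and not the authors' implementation: it only mirrors the logic of joining source-region and target-city records on OID and then producing the two outputs described above. The record layout `(oid, cell, hour)`, the inclusive hour window, and the function names are assumptions, and the sketch does not check whether a target-city record occurs after the source-region visit, since the article does not say how that is handled.

```python
from collections import defaultdict


def visited_source(source_records, window):
    """Return the OIDs seen in the source region within the inclusive hour window."""
    start, end = window
    return {oid for oid, _cell, hour in source_records if start <= hour <= end}


def analyze(source_records, target_records, window):
    """Intersect source and target records on OID, then build both outputs.

    Returns:
      stay_points: {oid: [(hour, cell), ...]} sorted by hour, per individual
      counts:      {(cell, hour): number of distinct OIDs}
    """
    risky = visited_source(source_records, window)
    stay_points = defaultdict(list)
    people_per_cell_hour = defaultdict(set)

    for oid, cell, hour in target_records:
        if oid in risky:  # equivalent to an inner join on OID
            stay_points[oid].append((hour, cell))
            people_per_cell_hour[(cell, hour)].add(oid)

    for points in stay_points.values():
        points.sort()

    # count distinct people, not raw records, per grid cell per hour
    counts = {key: len(oids) for key, oids in people_per_cell_hour.items()}
    return dict(stay_points), counts
```

In a distributed setting the membership test would typically become a join keyed on OID and the dictionary of sets a grouped distinct count, but the shape of the two results stays the same: one keyed by person, one keyed by grid cell and hour.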

These results enable authorities to quickly retrieve high‑risk population distributions for heat‑map visualizations, trace individual movement histories, and take timely containment actions. The approach demonstrates how big‑data technologies can provide actionable insights for epidemic prevention and broader demographic and economic studies.

Reference: Li R. et al., “TrajMesa: A Distributed NoSQL Storage Engine for Big Trajectory Data,” ICDE 2020.
