
How Didi Scales HBase for Real‑Time Orders, Geo‑Tracking, ETA and Monitoring

This article explains how Didi leverages HBase’s distributed architecture, multi‑language APIs, and custom rowkey designs to support online order queries, driver‑passenger trajectory tracking with GeoHash, real‑time ETA calculations, and a monitoring platform, while managing multi‑tenant resources through DHS and RS Group.

21CTO

Jun 19, 2017

Background

Business Types – HBase, built on the Hadoop ecosystem, serves both offline batch jobs (e.g., daily reports, security analysis, model training) and online services that require low‑latency random access such as order and customer‑service queries.

Multi‑Language Support – Didi provides a native Java API, Thrift Server (for C++, PHP, and Python clients), Phoenix JDBC, Phoenix QueryServer, MapReduce, Spark, and Streaming interfaces to accommodate diverse development preferences.

Data Types

Statistical and report data – small volume, flexible SQL queries via Phoenix.

Raw factual data – orders, GPS traces, logs; large volume, high consistency, low latency.

Intermediate results – model‑training inputs; large volume, high throughput.

Backup data – HBase used as an off‑site disaster‑recovery store.

Use‑Case Introduction

Scenario 1: Order Events

Requirements include online order‑lifecycle queries, historical order‑detail look‑ups, and offline order‑status analysis, at a load of 10K writes/sec and 1K reads/sec, with newly written data available to readers within 5 s.

Order Status Table

Rowkey: reverse(order_id) + (MAX_LONG - TS)

Columns: various order states.

Order History Table

Rowkey: reverse(passenger_id|driver_id) + (MAX_LONG - TS)

Columns: orders and related info per user within a time range.
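A minimal sketch of how both rowkeys above could be built, assuming IDs are fixed‑width ASCII strings and TS is a millisecond timestamp stored as an 8‑byte big‑endian long (neither detail is stated in the article; the function name is hypothetical). Reversing the ID spreads sequential IDs across regions, and subtracting TS from MAX_LONG makes the newest record for an ID sort first.

```python
import struct

MAX_LONG = 2**63 - 1  # same value as Java's Long.MAX_VALUE


def order_rowkey(entity_id: str, ts_millis: int) -> bytes:
    """Build reverse(id) + (MAX_LONG - TS) as raw bytes.

    entity_id may be an order_id, passenger_id or driver_id.
    Assumption: IDs have a fixed width, so keys of different IDs
    never interleave in byte order.
    """
    reversed_id = entity_id[::-1].encode("ascii")
    # Big-endian encoding keeps byte order equal to numeric order
    # for non-negative values.
    inverted_ts = struct.pack(">q", MAX_LONG - ts_millis)
    return reversed_id + inverted_ts


older = order_rowkey("1234567890", 1_497_830_400_000)
newer = order_rowkey("1234567890", 1_497_830_460_000)

# HBase sorts rows by byte order, so the newer event comes first.
assert newer < older
```

With this layout, a scan that starts at reverse(id) and stops after the first row returns the latest state for that ID.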

Scenario 2: Driver‑Passenger Trajectory

Supports real‑time or near‑real‑time coordinate queries, large‑scale offline analysis, and geographic range queries. GeoHash converts latitude/longitude into strings representing rectangular areas, enabling coarse‑grained indexing while preserving privacy.
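For illustration, the standard GeoHash encoding can be sketched as follows. This is the public algorithm, not Didi's code; the function name and the default precision are assumptions. Longitude and latitude bits are interleaved, and every 5 bits become one base‑32 character, so a shorter string denotes a larger rectangle.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat: float, lon: float, precision: int = 6) -> str:
    """Encode a coordinate as a GeoHash string of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # 5 bits -> one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


cell = geohash_encode(57.64911, 10.40744, precision=5)  # "u4pru"
```

Because a longer hash always extends a shorter one, truncating the string widens the cell, which is what makes it usable as a coarse index.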

Because GeoHash blocks may not perfectly match circular query areas, a second‑stage filter checks the actual distance between GPS points and the query centre.
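The second‑stage filter could look roughly like this. It is a sketch that assumes candidates arrive as (lat, lon) tuples and uses the haversine great‑circle distance; the article does not say which distance formula Didi uses, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two coordinates, in metres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot.
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(min(1.0, a)))


def within_radius(points, center, radius_m):
    """Keep only the points that really lie inside the query circle."""
    clat, clon = center
    return [p for p in points
            if haversine_m(clat, clon, p[0], p[1]) <= radius_m]


# Candidates returned by the coarse GeoHash scan (made-up coordinates).
candidates = [(39.9087, 116.3975), (39.9187, 116.3975), (39.9587, 116.3975)]
nearby = within_radius(candidates, center=(39.9087, 116.3975), radius_m=2000)
```

Here the third candidate sits about 5.5 km north of the centre, so only the first two survive the filter.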

Rowkey designs:

Single‑user query: reverse(user_id) + (MAX_LONG - TS/1000)

Range query: reverse(geohash) + TS/1000 + user_id
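As a sketch of how the range‑query key supports "who was in this cell during this time window", the snippet below builds the key and the start/stop rows for a scan. The 4‑byte big‑endian encoding of the seconds value, the ASCII user ID, and the in‑memory sorted list standing in for an HBase table are all assumptions made for illustration.

```python
import struct
from bisect import bisect_left


def range_rowkey(geohash: str, ts_millis: int, user_id: str) -> bytes:
    """reverse(geohash) + TS/1000 + user_id, as raw bytes."""
    return (geohash[::-1].encode("ascii")
            + struct.pack(">I", ts_millis // 1000)  # seconds, big-endian
            + user_id.encode("ascii"))


def scan_bounds(geohash: str, start_millis: int, end_millis: int):
    """Start row (inclusive) and stop row (exclusive) for one cell."""
    prefix = geohash[::-1].encode("ascii")
    return (prefix + struct.pack(">I", start_millis // 1000),
            prefix + struct.pack(">I", end_millis // 1000))


def scan(sorted_keys, start: bytes, stop: bytes):
    """Stand-in for an HBase scan over byte-ordered rowkeys."""
    return sorted_keys[bisect_left(sorted_keys, start):
                       bisect_left(sorted_keys, stop)]


table = sorted([
    range_rowkey("wx4g0e", 1_000_000, "u001"),
    range_rowkey("wx4g0e", 2_000_000, "u002"),
    range_rowkey("wx4g0e", 3_000_000, "u003"),
    range_rowkey("wx4g0f", 2_000_000, "u004"),  # neighbouring cell
])
start, stop = scan_bounds("wx4g0e", 1_500_000, 3_500_000)
hits = scan(table, start, stop)
```

The scan returns the rows of u002 and u003 only: u001 falls before the window and u004 belongs to a different cell. A circular query would repeat this scan for every cell that overlaps the circle.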

Scenario 3: ETA

ETA (estimated time of arrival) was originally computed offline. It now runs in real time with HBase as a key‑value cache, which reduces training time, supports multi‑city parallelism, and minimizes manual intervention.

Spark trains the model for each city every 30 minutes.

The first stage reads all of a city's data from HBase within 5 minutes.

The second stage completes the ETA calculation within 25 minutes.

HBase data is periodically persisted to HDFS for new‑model testing and feature extraction.

Rowkey: salting + city + type0 + type1 + type2 + TS

Columns: order, feature.
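One common way to implement the salting prefix is sketched below. The bucket count, the "|" separator, the CRC32 hash and both function names are assumptions; the article only states that a salt precedes the city. Salting spreads writes for a busy city across regions, at the cost of readers having to fan out over every bucket.

```python
import zlib

NUM_BUCKETS = 16  # hypothetical; usually sized to the number of regions


def salted_rowkey(city: str, type0: str, type1: str, type2: str,
                  ts_seconds: int) -> bytes:
    """salt + city + type0 + type1 + type2 + TS, '|'-separated."""
    body = f"{city}|{type0}|{type1}|{type2}|{ts_seconds:010d}"
    # A stable hash (not Python's randomized hash()) so that every
    # writer and reader derives the same salt.
    salt = zlib.crc32(body.encode("utf-8")) % NUM_BUCKETS
    return f"{salt:02d}|{body}".encode("utf-8")


def city_scan_prefixes(city: str):
    """One prefix per bucket; reading a whole city needs all of them."""
    return [f"{salt:02d}|{city}|".encode("utf-8")
            for salt in range(NUM_BUCKETS)]


key = salted_rowkey("beijing", "a", "b", "c", 1_497_830_400)
prefixes = city_scan_prefixes("beijing")
```

Each prefix can be handed to a separate task, which fits the per‑city parallel read described above.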

Scenario 4: Monitoring Tool DCM

DCM monitors Hadoop cluster resources (NameNode, Yarn containers) and stores the metrics in HBase via Phoenix, so that queries return within seconds and can feed front‑end dashboards.

Didi’s Multi‑Tenant Management on HBase

Didi treats a single HBase cluster with multiple tenants as the most efficient solution, but HBase lacks built‑in multi‑tenant controls. Challenges include resource visibility, project lifecycle management, and contention.

The Didi HBase Service (DHS) platform provides project lifecycle management, permission control, cluster resource allocation, and table‑level monitoring (read/write rates, memstore, block cache, locality). Users register projects, estimate resource needs, and receive a project overview page.

Using RS Group, the cluster is divided into logical sub‑clusters, allowing exclusive or shared resource pools. Table 1 (omitted) compares pros and cons of shared vs. exclusive resources.

Resource allocation strategy:

Latency‑tolerant, low‑volume data with modest availability requirements → shared pool.

Latency‑sensitive, high‑throughput, high‑availability online services → dedicated RegionServer Group with 20‑30% headroom.

Periodic usage accounting generates billing for tenants.
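The allocation rule and the headroom figure can be sketched as a small sizing helper. All names, the per‑server capacity, and the treatment of mixed profiles are assumptions for illustration; the article gives only the two extreme cases and the 20‑30% range.

```python
import math
from dataclasses import dataclass


@dataclass
class Project:
    name: str
    latency_sensitive: bool
    high_throughput: bool
    high_availability: bool


def choose_pool(project: Project) -> str:
    """Shared pool only if the project is undemanding on all three axes.

    Assumption: mixed profiles are placed conservatively in a
    dedicated group; the article does not cover that case.
    """
    demanding = (project.latency_sensitive
                 or project.high_throughput
                 or project.high_availability)
    return "dedicated" if demanding else "shared"


def servers_needed(peak_ops: float, ops_per_server: float,
                   headroom: float = 0.25) -> int:
    """RegionServers required so that `headroom` of capacity stays free."""
    usable = ops_per_server * (1.0 - headroom)
    return math.ceil(peak_ops / usable)


orders = Project("order-events", True, True, True)
pool = choose_pool(orders)
# 10K writes/s + 1K reads/s, with a made-up 2,000 ops/s per server.
count = servers_needed(peak_ops=11_000, ops_per_server=2_000)
```

With these made‑up numbers the order workload lands in a dedicated group of 8 RegionServers; without headroom it would need only 6.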

RS Group

RS Group assigns a specific list of RegionServers to a group, and each table is assigned to a group. When a RegionServer fails, its regions are reassigned only within the same group and never migrate to other groups, which achieves logical isolation and reduces management overhead.
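The isolation property can be illustrated with a toy model. This is not HBase's balancer or the RS Group API, just a sketch with hypothetical class and method names showing that a failure moves regions only to servers in the same group.

```python
class RSGroupCluster:
    """Toy model: groups own servers, tables belong to one group."""

    def __init__(self, groups):
        self.groups = {g: list(servers) for g, servers in groups.items()}
        self.assignment = {}  # region name -> server name

    def assign_table(self, table: str, group: str, num_regions: int):
        """Spread a table's regions round-robin over its group's servers."""
        servers = self.groups[group]
        for i in range(num_regions):
            self.assignment[f"{table},r{i}"] = servers[i % len(servers)]

    def fail_server(self, server: str):
        """Reassign the failed server's regions within its own group."""
        group = next(g for g, s in self.groups.items() if server in s)
        self.groups[group].remove(server)
        survivors = self.groups[group]
        moved = [r for r, s in self.assignment.items() if s == server]
        for i, region in enumerate(moved):
            # If the group has no servers left, the regions stay
            # unassigned rather than spilling into another group.
            self.assignment[region] = (
                survivors[i % len(survivors)] if survivors else None)
        return moved


cluster = RSGroupCluster({"orders": ["rs1", "rs2"], "shared": ["rs3", "rs4"]})
cluster.assign_table("order_status", "orders", num_regions=4)
cluster.assign_table("metrics", "shared", num_regions=2)
before = {r: s for r, s in cluster.assignment.items() if r.startswith("metrics")}
moved = cluster.fail_server("rs1")
```

After rs1 fails, every order_status region sits on rs2 and the metrics regions in the shared group are untouched.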

Conclusion

Successful HBase adoption at Didi hinges on two key factors: guiding users to design effective table schemas and controlling resource allocation. Clear architecture knowledge, proactive platform support, and appropriate isolation (shared vs. exclusive) lower failure risk, reduce operational costs, and create a virtuous cycle that improves user experience and business growth.
