Best Practices for Building an International Ride‑Hailing Data Metric System at Didi
This article presents Didi's comprehensive approach to designing, implementing, and governing a global data metric system for international ride‑hailing, covering business scenarios, metric‑related challenges, organizational structures, process flows, model architecture, time‑zone handling, tooling, and multi‑level governance.
Didi's international ride‑hailing business operates across five continents and 15 time zones, serving markets with vastly different regulations and competition, which necessitates a unified global data metric system.
The presentation outlines three main topics: the international ride‑hailing business scenario, the pain points of metric construction in such a scenario, and the proposed metric‑building solution.
Key pain points include difficulty defining metrics due to diverse market needs, technical challenges of producing metrics in local time zones, management difficulties caused by fragmented metric‑management processes, evaluation challenges when metrics change, and high assurance costs for multi‑time‑zone data production.
The solution is structured around five dimensions: establishing an organization, defining a standardized process, designing a robust model, deploying metric‑management tools, and implementing comprehensive governance.
Organizationally, the metric production team comprises data analysts (who define global metrics), data product owners (who manage metric catalogs), data developers (who design models and develop metrics), and data platform engineers (who build a one‑stop development and management platform).
The process consists of three stages: metric conversion (collecting requirements and defining metric specifications), metric development (producing and monitoring metric data), and metric delivery/acceptance (validating metric relevance and dashboard suitability).
Model design follows a top‑down architecture: separating business domains, extracting atomic metrics, deriving composite metrics, and ensuring metric uniqueness across domains to avoid duplicate definitions.
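As a rough illustration of the uniqueness rule described above, the sketch below keeps one registry for all domains and rejects a second definition of an existing metric name. It is not Didi's implementation: the class names (`AtomicMetric`, `CompositeMetric`, `MetricRegistry`), the fields, and the example metric names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AtomicMetric:
    name: str        # globally unique metric name (hypothetical convention)
    domain: str      # owning business domain
    definition: str  # human-readable business definition


@dataclass(frozen=True)
class CompositeMetric:
    name: str
    domain: str
    inputs: tuple    # names of metrics this one is derived from
    formula: str     # descriptive formula; not evaluated in this sketch


class MetricRegistry:
    """Single catalog across domains, so a name can be defined only once."""

    def __init__(self):
        self._metrics = {}

    def register(self, metric):
        # Reject duplicate definitions, even when they come from another domain.
        if metric.name in self._metrics:
            owner = self._metrics[metric.name].domain
            raise ValueError(
                f"{metric.name!r} is already defined in domain {owner!r}"
            )
        # A composite metric may only be derived from registered metrics.
        if isinstance(metric, CompositeMetric):
            missing = [n for n in metric.inputs if n not in self._metrics]
            if missing:
                raise KeyError(f"unknown input metrics: {missing}")
        self._metrics[metric.name] = metric

    def names(self):
        return sorted(self._metrics)
```

Registering atomic metrics before the composite metrics that depend on them mirrors the top‑down order in the text: domains first, then atomic metrics, then derived ones.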
The metric model is divided into five layers: source layer (raw business and log data), fact‑detail layer (cleaned, domain‑specific atomic metrics), core processing layer (rapid metric assembly), thematic analysis layer (cross‑domain aggregation), and application layer (dashboards, APIs, feature platforms).
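One way to make the five layers checkable is to order them and flag any table that reads from a layer above its own. The rule that data may only flow upward through the layers is an assumption common in layered warehouse designs, not something the talk states; the layer identifiers and table names below are hypothetical.

```python
# Layers from the text, ordered from lowest (raw) to highest (consumption).
LAYERS = [
    "source",
    "fact_detail",
    "core_processing",
    "thematic_analysis",
    "application",
]
LAYER_RANK = {name: rank for rank, name in enumerate(LAYERS)}


def invalid_dependencies(tables):
    """Return (table, upstream) pairs where a table reads from a higher layer.

    `tables` maps a table name to a (layer, [upstream table names]) tuple.
    """
    violations = []
    for table, (layer, upstreams) in tables.items():
        for upstream in upstreams:
            upstream_layer = tables[upstream][0]
            if LAYER_RANK[upstream_layer] > LAYER_RANK[layer]:
                violations.append((table, upstream))
    return violations
```

A check like this could run when a model is submitted for review, so that a fact‑detail table cannot quietly start depending on an application‑layer table.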
For multi‑time‑zone support, an SDK‑based time‑zone service records local‑time offsets, and an hour‑to‑day conversion tool transforms hourly Beijing‑time data into local‑day partitions, enabling a single data pipeline to serve all regions.
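The hour‑to‑day idea can be sketched as follows: each hourly value stamped in Beijing time (UTC+8) is shifted by the region's offset and summed into the local calendar day it falls on. This is a minimal sketch under stated assumptions, not the actual tool: the region codes, the offset table, and the function names are hypothetical, and fixed offsets ignore daylight‑saving changes that a real time‑zone service would need to handle.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

BEIJING = timezone(timedelta(hours=8))

# Hypothetical stand-in for the time-zone service: region -> UTC offset in hours.
REGION_UTC_OFFSET_HOURS = {"region_west": -6, "region_east": 10}


def local_day(beijing_hour, region):
    """Map a naive Beijing-time hour to the region's local calendar day."""
    local_tz = timezone(timedelta(hours=REGION_UTC_OFFSET_HOURS[region]))
    aware = beijing_hour.replace(tzinfo=BEIJING)
    return aware.astimezone(local_tz).date().isoformat()


def hours_to_local_days(hourly_rows):
    """Sum (region, beijing_hour, value) rows into (region, local_day) totals."""
    totals = defaultdict(int)
    for region, beijing_hour, value in hourly_rows:
        totals[(region, local_day(beijing_hour, region))] += value
    return dict(totals)
```

Because the hourly data is produced once in a single reference time zone, only this final grouping step depends on the region, which is what lets one pipeline serve every market.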
The metric‑management tool provides an end‑to‑end workflow, from demand collection through metric cataloging and development to production, ensuring standardization, lineage tracking, and integration with data‑development, dashboard, and modeling tools.
Governance is organized into three quality dimensions (accuracy, timeliness, and historical completeness) and applies tiered monitoring (T1, T2, T3) with appropriate alerting mechanisms to maintain metric reliability at scale.
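To make the combination of quality dimensions and tiers concrete, the sketch below evaluates one metric against all three dimensions and routes each failure to a channel chosen by tier. The talk does not specify thresholds or channels, so the tolerance, the SLA fields, and the tier‑to‑channel mapping are assumptions made purely for illustration.

```python
from dataclasses import dataclass

# Hypothetical routing: more important tiers get more intrusive alerts.
ALERT_CHANNEL = {"T1": "phone_call", "T2": "instant_message", "T3": "email"}


@dataclass
class MetricCheck:
    metric: str
    tier: str             # "T1", "T2", or "T3"
    value: float          # value produced by the pipeline
    reference: float      # value from an independent reconciliation source
    ready_delay_min: int  # minutes after the deadline basis that data was ready
    sla_min: int          # allowed delay in minutes
    days_present: int     # historical partitions actually available
    days_expected: int    # historical partitions that should exist


def failed_dimensions(check, tolerance=0.01):
    """Return the quality dimensions this check violates."""
    failures = []
    if abs(check.value - check.reference) > tolerance * abs(check.reference):
        failures.append("accuracy")
    if check.ready_delay_min > check.sla_min:
        failures.append("timeliness")
    if check.days_present < check.days_expected:
        failures.append("historical_completeness")
    return failures


def alerts(checks):
    """Return (metric, dimension, channel) for every failed dimension."""
    return [
        (check.metric, dimension, ALERT_CHANNEL[check.tier])
        for check in checks
        for dimension in failed_dimensions(check)
    ]
```

Keeping the dimension checks separate from the routing table means a metric can be moved between tiers without touching the quality rules themselves.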
The article concludes with a summary of the shared practices and thanks the audience.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.