
Standardizing Metric Management in Didi’s Data Platform

This article outlines Didi's end‑to‑end metric lifecycle: the background, requirements, and current pain points, followed by a multi‑stage solution that introduces a unified metric dictionary, a management tool, logical modeling, and a consumption layer to achieve accurate, timely, consistent, and efficiently managed metrics across the data warehouse ecosystem.


Metrics are the core products of a data warehouse, measuring business processes and outcomes; a complete metric system supports fine‑grained operations and strategic decision‑making.

The presentation, delivered by Didi expert engineer Zeng Jing, covers five topics: metric‑management background, Didi data‑product overview, metric standardization construction, future plans, and a Q&A session.

The background section identifies four essential requirements for metrics—accuracy, timeliness, consistency, and efficient management—and describes the current 1.0 workflow, in which metric definitions are scattered across products, leading to high production cost, low consumption efficiency, inconsistent definitions, and weak governance.

Solution development introduces the 2.0 phase with three pillars:

Metric dictionary: a unified platform for metric entry and definition.

Metric‑management tool: automated naming, ownership and permission definition, and strict change‑control processes.

Standardized metric definition: logical modeling that separates metric production from consumption, enabling automatic lineage, cascade updates, and tiered approval.
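Automatic lineage and cascade updates follow naturally once dependencies between models and metrics are recorded explicitly: when an upstream model changes, every derived metric can be found by walking the lineage graph. A minimal sketch of that idea, with all model and metric names hypothetical:

```python
from collections import defaultdict, deque

class LineageGraph:
    """Records which metrics derive from which models or other metrics."""
    def __init__(self):
        self.downstream = defaultdict(set)  # node -> nodes that depend on it

    def add_edge(self, upstream, downstream):
        self.downstream[upstream].add(downstream)

    def cascade(self, changed):
        """Return every metric affected by a change, in BFS order."""
        affected, queue, seen = [], deque([changed]), {changed}
        while queue:
            node = queue.popleft()
            for dep in self.downstream[node]:
                if dep not in seen:
                    seen.add(dep)
                    affected.append(dep)
                    queue.append(dep)
        return affected

g = LineageGraph()
g.add_edge("orders_model", "gmv")
g.add_edge("gmv", "gmv_per_city")
g.add_edge("gmv", "gmv_7d_avg")
print(g.cascade("orders_model"))  # every metric touched by the model change
```

A tiered-approval workflow can then route each affected metric to the right approver based on its tier before the change is published.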

Logical modeling abstracts physical tables and aggregation methods, supports multiple engines, granularities, and architectures, and allows configuration‑as‑code development, reducing duplicated tables and improving reuse.
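Configuration-as-code means a logical model is declared once as data—its dimensions, measures, and the physical tables backing it per engine—rather than re-implemented in each consuming product. A minimal sketch of such a declaration (table names, engines, and fields are all hypothetical):

```python
from dataclasses import dataclass

@dataclass
class LogicalModel:
    """A logical model abstracting physical tables and aggregation methods."""
    name: str
    physical_tables: dict   # engine -> physical table backing this model
    dimensions: list        # granularities the model can serve
    measures: dict          # metric name -> aggregation expression

trip_model = LogicalModel(
    name="trips",
    physical_tables={"hive": "dwd.trip_detail", "clickhouse": "dws.trip_agg"},
    dimensions=["city", "dt", "product_line"],
    measures={"gmv": "SUM(fare)", "orders": "COUNT(order_id)"},
)
```

Because the definition is plain data, it can be versioned, reviewed, and reused across engines, which is what reduces duplicated tables.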

Unified consumption builds a three‑layer architecture: a unified query entry (DSL), model selection (based on dimensions, permissions, and time range), and model‑level query generation (engine SQL and MPP SQL). The data‑virtualization layer executes federated queries, pushes down aggregation, and supports custom functions.
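The three layers above can be sketched end to end: a DSL request names a metric and dimensions, a model that covers those dimensions is selected, and engine SQL is generated from the model's aggregation expression. All table and field names below are hypothetical:

```python
def select_model(models, needed_dims):
    """Layer 2: pick the first model whose dimensions cover the request."""
    for m in models:
        if set(needed_dims) <= set(m["dimensions"]):
            return m
    raise LookupError("no model covers the requested dimensions")

def generate_sql(model, metric, dims):
    """Layer 3: translate the unified DSL request into engine SQL."""
    expr = model["measures"][metric]
    cols = ", ".join(dims)
    return (f"SELECT {cols}, {expr} AS {metric} "
            f"FROM {model['table']} GROUP BY {cols}")

models = [
    {"table": "dws.trip_city", "dimensions": ["city", "dt"],
     "measures": {"gmv": "SUM(fare)"}},
]

# Layer 1: a DSL request such as {metric: gmv, dims: [city]} resolves to:
sql = generate_sql(select_model(models, ["city"]), "gmv", ["city"])
print(sql)  # SELECT city, SUM(fare) AS gmv FROM dws.trip_city GROUP BY city
```

In practice the selector would also weigh permissions, freshness, and query cost before the virtualization layer executes the generated SQL, possibly federating across engines.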

The approach yields benefits on both production and consumption sides: lower development and maintenance costs, higher data quality, flexible self‑service analysis, and consistent metric usage across dashboards, BI tools, and analytical applications.

Future planning outlines three stages—exploration, expansion, and deepening—focusing on ecosystem integration, efficiency gains, real‑time metric support, and automated production pipelines.

The session concludes with a Q&A addressing multi‑model queries, upstream product admission, and upcoming enhancements.

Big Data · Standardization · Data Modeling · Data Warehouse · Metric Management
Written by

DataFunSummit

Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.
