How to Build a Robust Data Metric System: From Atomic to Derived Indicators
This article explains what a metric (indicator) is, distinguishes atomic, derived and composite indicators, outlines the OSM and UJM modeling methods, describes the steps for constructing a metric system, its integration with data warehouses, and details the design and technical implementation of a metric management platform.
What Is a Metric?
A metric, also called an indicator, is a field in a statistical report generated by a data team, such as weekly GMV or a funnel conversion rate. The calculation of a metric typically involves several steps:
Metric calculation logic, e.g., aggregation functions like max(), sum(), count().
Analysis dimensions (store, region, time) which correspond to GROUP BY clauses in SQL.
Business qualifiers or filters, such as payment channel or order type, which correspond to WHERE conditions.
Metrics can be further classified based on these characteristics.
Atomic Metric : Defined on a single business process and a single measurement value, with a specific aggregation logic. Example: total order amount = sum(order_amount). Atomic metrics are conceptual and usually do not correspond to a direct reporting need.
Derived Metric : Directly tied to a reporting requirement and can be calculated from atomic metrics. It reflects a concrete business need.
Composite Metric : Formed by applying logical operations on one or more derived metrics, such as ratios or percentages, and also serves a reporting need.
Methods for Building a Metric System
OSM Model
O stands for Business Objective – the purpose of the product or feature and the problem it solves.
S stands for Business Strategy – the approach taken to achieve the objective.
M stands for Business Measure – how the strategy’s effectiveness is measured. It includes:
KPI: Directly measures strategy effectiveness.
Target: A predefined value used to judge whether expectations are met.
When selecting targets, follow the DUMB principles:
Doable
Understandable
Manageable
Beneficial
UJM Model
The User Journey Map (UJM) is used to align the business objectives, strategies, and measures with the user’s lifecycle stages, ensuring that the metric system meets real user needs.
Scenario‑Based Approach
Scenario‑based design helps modularize and structure the implementation of the metric system, allowing strategic goals to be broken down into actionable, front‑line scenarios.
Pre‑Metric‑System Situation
Inconsistent metric definitions; knowledge is lost when personnel change.
Metrics are tightly coupled, making deprecation and adjustment difficult.
Duplicate calculations across reports increase cluster load.
Lack of traceability for metric data sources hampers troubleshooting.
Data teams spend excessive time on report development rather than model refinement.
Macro Steps for Metric Construction
Organize Metric System Based on Business Purpose
Identify metrics relevant to management‑level financial flows.
Identify order‑fulfillment metrics for the order team.
Identify conversion metrics for advertising.
Support Entry and Management Through a System
Build a platform that enables metric definition entry, lineage tracing, deprecation management, and other lifecycle functions.
Metric Management Position in the Data Warehouse
Metrics are generally computed on top of a star‑schema data warehouse.
Significance of the Metric System for Data‑Warehouse Modeling
Most statistical needs can be expressed using atomic, derived, and composite metrics. When many requirements share the same derived metric, storing these common metrics in the DWS layer reduces duplicate computation and improves data reuse.
Metric Management System Product Modules
Metric Definition
Divided into product configuration and technical configuration.
Select metric category and associated business module.
Provide a description of the metric’s business need.
Write a textual definition of the metric.
Technical configuration: define atomic and derived metrics as basic metrics , and composite metrics as complex metrics .
Basic Metrics
Choose fact tables, dimension tables, and their join relationships (equivalent to FROM and GROUP BY).
Write aggregation logic such as count, sum, avg.
Specify filter conditions (equivalent to WHERE).
Composite Metrics
Select basic metrics, e.g.,
total promotion audienceand
order conversion count.
Write processing logic, e.g.,
order conversion count / total promotion audience.
Metric Query
Search metric definitions and lineage by metric name.
Metric Decommission
Identify unused metrics via lineage analysis and retire them.
External Interface
Expose metric definitions and data to external systems such as BI tools.
Metric Permissions
Control which users or applications can access specific metrics.
Report Configuration
Drag‑and‑drop metrics from the metric pool to quickly build reports.
Technical Implementation of the Metric Management System
Metric Processing Logic
Automatically generate SQL for metric processing from the technical definition and deploy it to the scheduling system.
Metric Output Design
Two common storage approaches:
Wide Table in DWS
Store all user‑level metrics in a large wide table (e.g., recent 5‑day order count, 30‑day order amount). Drawbacks include limited flexibility for report composition, difficulty adding new metrics, and the need for intermediate storage to merge results.
Unified Metric Table
Store all metric definitions in a single table with columns: metric_id, metric_code, metric_value, date_partition.
After each metric job finishes, insert results into this table. To avoid small‑file issues, merge partitions after all jobs complete.
<code>insert overwrite table t_test PARTITION(dt='20240421')
SELECT t.foo, t.bar from t_test t where t.dt='20240421';</code>Overall Mind Map
Data Thinking Notes
Sharing insights on data architecture, governance, and middle platforms, exploring AI in data, and linking data with business scenarios.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.