E‑commerce Metric Management Practice Based on DataLeap
This article details ByteDance Volcano Engine's DataLeap‑driven e‑commerce metric management practice, covering the background of metric system construction, challenges of inconsistency, the six‑point platform solution, metric DSL query language, consumption workflow, and future plans for intelligent automation and large‑model integration.
The presentation outlines the development of an e‑commerce metric system, beginning with an overview of the business context and the need for a unified metric platform to support diverse internal and external data consumers.
Three major obstacles are identified: inconsistent metric naming, unclear metric definitions, and fragmented metric consumption, which hinder data quality and efficiency.
To address these, a six‑point platform design is proposed, emphasizing consistency, standardized production and management processes, comprehensive asset management, unified metric consumption metadata, consistent downstream services, and extensible consumption capabilities.
The architecture comprises a three‑layer data warehouse (ODS, public, and application layers) and a metric definition layer. The definition layer builds atomic metrics from basic elements (domain, time period, modifiers, and so on) and derives composite metrics from them. MetricDSL is introduced as a unified query language that supports OLAP, detail, and custom‑function queries across models and data sources.
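As a rough illustration of the metric definition layer, the sketch below models an atomic metric as a combination of basic elements and a composite metric as a formula over atomic metrics. The class names, field names, and the naming rule are assumptions made for this example; the source does not describe DataLeap's actual schema.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class AtomicMetric:
    """A metric assembled from basic elements (all field names are illustrative)."""
    domain: str                       # business domain, e.g. "trade"
    measure: str                      # what is measured, e.g. "pay_amount"
    time_period: str                  # statistical window, e.g. "1d" or "7d"
    modifiers: Tuple[str, ...] = ()   # qualifying conditions, e.g. ("app",)

    @property
    def name(self) -> str:
        # Deriving the name from the elements keeps naming consistent by construction.
        return "_".join([self.domain, *self.modifiers, self.measure, self.time_period])


@dataclass(frozen=True)
class CompositeMetric:
    """A metric derived from other metrics through an arithmetic formula."""
    name: str
    formula: str                      # refers to operand names
    operands: Tuple[AtomicMetric, ...]


pay_amount = AtomicMetric("trade", "pay_amount", "1d", ("app",))
pay_orders = AtomicMetric("trade", "pay_order_cnt", "1d", ("app",))
avg_order_value = CompositeMetric(
    name="trade_app_avg_order_value_1d",
    formula=f"{pay_amount.name} / {pay_orders.name}",
    operands=(pay_amount, pay_orders),
)

print(pay_amount.name)
print(avg_order_value.formula)
```

Generating names from the elements, rather than letting authors type them freely, is one way to address the inconsistent-naming obstacle described earlier.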
A single MetricDSL query can request multiple metrics that span several models.
Execution proceeds in four steps: metadata retrieval, metric decomposition, SQL generation, and in‑memory calculation.
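The four execution steps can be sketched end to end with an in-memory SQLite database. This is a minimal illustration under stated assumptions, not MetricDSL itself: the query is a plain dictionary, and the `ATOMIC` and `COMPOSITE` registries, the table names, and the `conversion_rate` metric are all hypothetical.

```python
import sqlite3

# Hypothetical metadata: atomic metric -> (model/table, aggregate SQL expression).
ATOMIC = {
    "pay_amount": ("orders", "SUM(amount)"),
    "pay_order_cnt": ("orders", "COUNT(*)"),
    "visitor_cnt": ("visits", "COUNT(DISTINCT user_id)"),
}
# Hypothetical composite metrics: name -> (operand names, in-memory function).
COMPOSITE = {
    "conversion_rate": (("pay_order_cnt", "visitor_cnt"),
                        lambda orders, visitors: orders / visitors if visitors else None),
}


def decompose(metrics):
    """Step 2: expand composite metrics into the atomic metrics they need."""
    atomic = []
    for metric in metrics:
        operands = COMPOSITE[metric][0] if metric in COMPOSITE else (metric,)
        for name in operands:
            if name not in atomic:
                atomic.append(name)
    return atomic


def generate_sql(atomic, dimension):
    """Step 3: emit one aggregate query per model that owns a requested metric."""
    by_model = {}
    for name in atomic:
        model, expr = ATOMIC[name]   # step 1: metadata lookup
        by_model.setdefault(model, []).append(f"{expr} AS {name}")
    return {
        model: f"SELECT {dimension}, {', '.join(cols)} FROM {model} GROUP BY {dimension}"
        for model, cols in by_model.items()
    }


def run_query(conn, query):
    atomic = decompose(query["metrics"])
    rows = {}  # dimension value -> {atomic metric: value}
    for sql in generate_sql(atomic, query["dimension"]).values():
        cur = conn.execute(sql)
        cols = [c[0] for c in cur.description]
        for rec in cur.fetchall():
            rows.setdefault(rec[0], {}).update(zip(cols[1:], rec[1:]))
    # Step 4: merge per-model results and compute composites in memory.
    result = {}
    for key, values in rows.items():
        out = {}
        for metric in query["metrics"]:
            if metric in COMPOSITE:
                operands, fn = COMPOSITE[metric]
                out[metric] = fn(*[values.get(o) or 0 for o in operands])
            else:
                out[metric] = values.get(metric)
        result[key] = out
    return result


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (dt TEXT, user_id INTEGER, amount REAL);
CREATE TABLE visits (dt TEXT, user_id INTEGER);
INSERT INTO orders VALUES ('2024-01-01', 1, 30.0), ('2024-01-01', 2, 20.0);
INSERT INTO visits VALUES ('2024-01-01', 1), ('2024-01-01', 2),
                          ('2024-01-01', 3), ('2024-01-01', 4);
""")

query = {"metrics": ["pay_amount", "conversion_rate"], "dimension": "dt"}
result = run_query(conn, query)
print(result)
```

The caller names only metrics and a dimension. Which tables are scanned, and that `conversion_rate` needs two models, is resolved from metadata.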
Consumption is streamlined through a metric dictionary, routing mechanisms, and unified services, allowing downstream users to select metrics directly without worrying about underlying tables, improving consistency and reducing development effort.
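One plausible reading of the routing mechanism is a lookup that, given a metric and the requested dimensions, picks the cheapest table able to answer the query. The sketch below assumes that reading; the `Source` record, the table names, and the `cost` field are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Source:
    """A dictionary entry describing what one table can serve (illustrative)."""
    table: str
    metrics: FrozenSet[str]
    dimensions: FrozenSet[str]
    cost: int  # relative scan cost; lower is cheaper


DICTIONARY: List[Source] = [
    Source("app_trade_daily", frozenset({"pay_amount"}),
           frozenset({"dt"}), cost=1),
    Source("dws_trade_shop_daily", frozenset({"pay_amount", "pay_order_cnt"}),
           frozenset({"dt", "shop_id"}), cost=10),
    Source("dwd_order_detail", frozenset({"pay_amount", "pay_order_cnt"}),
           frozenset({"dt", "shop_id", "user_id"}), cost=100),
]


def route(metric: str, dimensions: Iterable[str]) -> str:
    """Return the cheapest table that holds the metric at the requested grain."""
    wanted = set(dimensions)
    candidates = [s for s in DICTIONARY
                  if metric in s.metrics and wanted <= s.dimensions]
    if not candidates:
        raise LookupError(f"no source serves {metric!r} by {sorted(wanted)}")
    return min(candidates, key=lambda s: s.cost).table


print(route("pay_amount", ["dt"]))
print(route("pay_amount", ["dt", "shop_id"]))
print(route("pay_order_cnt", ["user_id"]))
```

Because the choice of table lives in the router, downstream users can keep selecting metrics by name even when the underlying tables change.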
Future plans focus on intelligent automation: auto‑generating materialized views from query logs, semantic model inference, automated metric decomposition using large language models, and leveraging LLMs to translate natural language queries into SQL, enhancing performance and usability.
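The first of these plans, deriving materialized views from query logs, could start from a simple frequency count over query shapes. The sketch below is a guess at that first step only, assuming a hypothetical log format in which each entry records the model, dimensions, and metrics of one query; it says nothing about how DataLeap actually selects or builds views.

```python
from collections import Counter


def suggest_views(query_log, min_hits=2):
    """Return query shapes seen at least `min_hits` times, most frequent first."""
    counts = Counter(
        # Sorting makes the shape independent of the order used in the query.
        (q["model"], tuple(sorted(q["dimensions"])), tuple(sorted(q["metrics"])))
        for q in query_log
    )
    return [
        {"model": model, "dimensions": list(dims), "metrics": list(metrics), "hits": hits}
        for (model, dims, metrics), hits in counts.most_common()
        if hits >= min_hits
    ]


# Hypothetical log entries.
log = [
    {"model": "orders", "dimensions": ["dt", "shop_id"], "metrics": ["pay_amount"]},
    {"model": "orders", "dimensions": ["shop_id", "dt"], "metrics": ["pay_amount"]},
    {"model": "orders", "dimensions": ["user_id"], "metrics": ["pay_order_cnt"]},
]

suggestions = suggest_views(log)
print(suggestions)
```

A real system would also have to weigh storage cost, refresh latency, and whether a coarser view can serve several query shapes; the count above only surfaces candidates.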
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.