
Origin Data Governance Platform: Architecture, Modules, and Implementation at Meituan

The article describes Meituan's Origin Data Governance Platform, detailing its background, challenges, architectural redesign, core modules such as data storage, metadata, business, security, and application management, as well as its internal workflow, achievements, and future roadmap for unified, secure, and high‑performance data services.

Big Data Technology & Architecture

Background

Meituan, a highly digitalized and technology‑driven company, places great importance on extracting value from data. Over the years, its hotel‑travel division built a comprehensive solution consisting of a data warehouse and various data platforms (self‑service reporting, professional analytics, CRM, performance assessment, etc.) to eliminate data silos and support diverse analytical needs.

While the early architecture (Figure 1) efficiently met business demands, long‑term use revealed inconsistencies in metric definitions, calculation logic, and data sources, leading to low trust in indicator data and hampering decision‑making.

Challenges

The data‑governance project faced four main challenges: (1) determining where the governance platform should be inserted in the architecture with minimal intrusion, (2) designing a concise, efficient management process for unified metric and dimension information, (3) integrating various storage engines to provide a high‑concurrency, high‑availability data outlet, and (4) ensuring data security across business lines.

Solution Approach

The platform was positioned between the data‑warehouse (or data‑mart) layer and the data‑application layer, acting as a bridge that enforces rules, makes interactions queryable and monitorable, and transforms chaotic exchanges into orderly processes (Figure 2).

Platform Architecture

The platform consists of several functional modules—data storage, query, cache, metadata management, business management, security management, application management, and external APIs—organized to reduce development difficulty and improve maintainability (Figure 3).

Data Storage

The platform manages data in the Topic layer of the warehouse and the application layer, supporting Hive, MySQL, Kylin, Palo, Elasticsearch, and Druid (Figure 4). Storage decisions are made by data engineers based on space, query performance, and model organization, while the platform oversees metadata, monitoring, and alerts.

Metadata Management

Metadata is split into business metadata (metric and dimension definitions) and data metadata (table, model, and field bindings). Four sub‑modules—table management, model management, metric management, and dimension management—handle creation, maintenance, and governance of these assets.
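As a rough illustration of this split, the sketch below models business metadata (metrics, dimensions) separately from data metadata (tables, models) and connects the two through bindings. All class and field names are hypothetical and chosen for illustration; the article does not describe the platform's actual schema.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Metric:
    """Business metadata: what a metric means."""
    code: str
    name: str
    definition: str


@dataclass(frozen=True)
class Dimension:
    """Business metadata: an axis along which metrics are analyzed."""
    code: str
    name: str


@dataclass
class Table:
    """Data metadata: a physical table registered with the platform."""
    name: str
    kind: str  # "fact" or "dimension"
    columns: list[str]


@dataclass
class Model:
    """Data metadata: binds business concepts to physical columns."""
    name: str
    fact_table: Table
    metric_bindings: dict[str, str] = field(default_factory=dict)
    dimension_bindings: dict[str, str] = field(default_factory=dict)

    def bind_metric(self, metric: Metric, expression: str) -> None:
        # The expression is stored as-is; a real system would validate it.
        self.metric_bindings[metric.code] = expression

    def bind_dimension(self, dimension: Dimension, column: str) -> None:
        # Reject bindings to columns the fact table does not have.
        if column not in self.fact_table.columns:
            raise ValueError(f"{column!r} is not a column of {self.fact_table.name}")
        self.dimension_bindings[dimension.code] = column
```

Keeping the business definitions free of any table or column names is what lets one metric definition be bound to several models, which is the separation the paragraph describes.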

Table Management

Manages database connections, table schemas, types (fact or dimension), usage, ETL links, owners, recommendation scores, monitoring configurations, and sample data.

Model Management

Captures table relationships (join types), ER diagrams, field‑to‑dimension bindings, and metric‑to‑model bindings, supporting star/snowflake schemas and OLAP (MOLAP/ROLAP) models (Figures 5‑6).
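One way to picture how recorded join relationships could be used is to render them into a SQL FROM clause. The sketch below is an assumption-laden example: the `Join` record, the `from_clause` helper, and the table names are hypothetical, not the platform's API.

```python
from dataclasses import dataclass
from typing import List

ALLOWED_JOINS = {"INNER", "LEFT", "RIGHT", "FULL"}


@dataclass(frozen=True)
class Join:
    """One edge of the ER diagram: left_table.left_key = right_table.right_key."""
    left_table: str
    left_key: str
    right_table: str
    right_key: str
    join_type: str = "LEFT"


def from_clause(fact_table: str, joins: List[Join]) -> str:
    """Render the fact table plus its joins, in the order given."""
    parts = [fact_table]
    for j in joins:
        if j.join_type not in ALLOWED_JOINS:
            raise ValueError(f"unsupported join type: {j.join_type}")
        parts.append(
            f"{j.join_type} JOIN {j.right_table} "
            f"ON {j.left_table}.{j.left_key} = {j.right_table}.{j.right_key}"
        )
    return " ".join(parts)
```

In a star schema every join starts from the fact table; a snowflake schema simply adds joins whose left side is itself a dimension table, which this representation handles without change.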

Dimension Management

Separates business information (name, definition, classification) from technical details (whether a dimension table exists, date dimension flag, code/name mappings). Supports both enumerated dimensions and dimension tables.
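The difference between the two supported kinds of dimension can be sketched as follows. This is a minimal illustration under assumed names (`EnumDimension`, `TableDimension`, and their fields are hypothetical): an enumerated dimension carries its code/name mapping inline, while a table-backed dimension only records where the mapping lives.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class EnumDimension:
    """Dimension whose values are a small, fixed code -> name mapping."""
    code: str
    name: str
    values: Dict[str, str]

    def display_name(self, value_code: str) -> Optional[str]:
        # Returns None for codes that are not part of the enumeration.
        return self.values.get(value_code)


@dataclass
class TableDimension:
    """Dimension whose values live in a dimension table."""
    code: str
    name: str
    table: str
    code_column: str
    name_column: str
    is_date: bool = False  # mirrors the "date dimension flag" in the text

    def lookup_sql(self) -> str:
        # Query that would load the code -> name mapping from the table.
        return f"SELECT {self.code_column}, {self.name_column} FROM {self.table}"
```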

Metric Management

Collects business attributes (name, classification, frequency, precision, unit, definition, calculation logic, analysis method) and technical attributes (data type, code, model bindings, virtual model creation, monitoring thresholds). Also tracks related metrics and applications for impact analysis (Figure 7).
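Impact analysis of this kind is commonly implemented as a graph traversal over "depends on" relationships. The sketch below assumes a simple adjacency map from each metric to the metrics and applications that use it; the node naming scheme (`metric:`, `app:`) is hypothetical.

```python
from collections import deque
from typing import Dict, List


def impacted(dependents: Dict[str, List[str]], changed: str) -> List[str]:
    """Return everything downstream of `changed`, in breadth-first order."""
    seen = {changed}
    order: List[str] = []
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for dep in dependents.get(node, []):
            if dep not in seen:  # guards against cycles and duplicates
                seen.add(dep)
                order.append(dep)
                queue.append(dep)
    return order
```

For example, if a derived metric and a report both use `metric:gmv`, and a CRM application uses the derived metric, changing `metric:gmv` flags all three.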

Business Management

Divided into business‑line management, theme management, and ticket (work‑order) management, ensuring proper permissions, resource isolation, and traceability of data‑processing requests.

Business‑Line & Theme Management

Controls visibility of metrics, dimensions, tables, and models per business line, with role‑based access (regular user vs. admin) and multi‑level review for new metrics.

Ticket Management

Defines a standard workflow for requesting, reviewing, developing, and approving changes to metrics, dimensions, and models, with every step logged automatically for auditability (Figure 8).
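A ticket workflow with automatic logging can be sketched as a small state machine. The state names and allowed transitions below are assumptions derived from the four steps named in the text (request, review, development, approval); the article does not list the platform's actual states.

```python
from datetime import datetime, timezone

# Hypothetical transition table: current state -> states it may move to.
TRANSITIONS = {
    "requested": {"in_review"},
    "in_review": {"in_development", "rejected"},
    "in_development": {"pending_approval"},
    "pending_approval": {"approved", "in_development"},
}


class Ticket:
    def __init__(self, ticket_id: str, requester: str) -> None:
        self.ticket_id = ticket_id
        self.state = "requested"
        self.log = []  # (timestamp, actor, state) entries, append-only
        self._record(requester)

    def _record(self, actor: str) -> None:
        self.log.append((datetime.now(timezone.utc), actor, self.state))

    def advance(self, new_state: str, actor: str) -> None:
        """Move to new_state if the transition is allowed, and log it."""
        if new_state not in TRANSITIONS.get(self.state, set()):
            raise ValueError(f"cannot go from {self.state} to {new_state}")
        self.state = new_state
        self._record(actor)
```

Because the log is written inside `advance`, no state change can happen without leaving an audit entry.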

Security Management

Provides platform‑operation permission control integrated with the corporate “General Order” system and API‑call permission management, covering page access, business‑line/data‑line user rights, and application‑level rights, along with approval and audit modules.
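To make application-level API rights concrete, here is a minimal sketch of a permission gate that records every decision for later audit. The `ApiGate` class, the grant structure, and the application keys are hypothetical; the integration with the corporate permission system is not modeled.

```python
from typing import Dict, List, Set, Tuple


class ApiGate:
    """Checks whether an application may query a metric, and audits the call."""

    def __init__(self, grants: Dict[str, Set[str]]) -> None:
        self._grants = grants  # application key -> metric codes it may read
        self.audit: List[Tuple[str, str, bool]] = []

    def check(self, app_key: str, metric_code: str) -> bool:
        allowed = metric_code in self._grants.get(app_key, set())
        # Denied calls are audited too, so probing attempts stay visible.
        self.audit.append((app_key, metric_code, allowed))
        return allowed
```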

Application Management

Consists of data applications, external applications, and a data map, recording relationships among metrics, dimensions, models, tables, and external services, and enabling query services, ETL production, and API exposure (Figure 9).

External APIs

Expose metadata (metric, dimension, table, model information), data (query services with aggregation, comparative analysis, cross‑engine support), and monitoring/statistics to downstream systems, ensuring consistency and reliability.
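Comparative analysis usually means reporting a metric alongside its value in a reference period. The helper below is a hedged sketch of that calculation over already-aggregated results; the function name and result shape are assumptions, not the platform's response format.

```python
from typing import Dict, Optional


def compare_periods(
    current: Dict[str, float], previous: Dict[str, float]
) -> Dict[str, Dict[str, Optional[float]]]:
    """For each key in `current`, compute the change versus `previous`."""
    result = {}
    for key, cur in current.items():
        prev = previous.get(key)
        # No baseline at all: neither change nor rate can be computed.
        change = None if prev is None else cur - prev
        # A zero or missing baseline has no meaningful growth rate.
        rate = None if not prev else change / prev
        result[key] = {
            "current": cur,
            "previous": prev,
            "change": change,
            "rate": rate,
        }
    return result
```

Returning `None` instead of raising on a zero or missing baseline lets a downstream report render the row with a blank growth rate rather than failing the whole query.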

Internal Working Principle

The platform maps business metric/dimension information to data‑model calculations, dynamically generates optimal SQL or query statements, and executes them via a distributed query engine built on Akka Cluster, Redis‑backed task queues, load‑balanced workers, and automatic degradation/monitoring (Figure 11).
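The mapping from business terms to a query can be sketched in two steps: pick a model that covers the requested metrics and dimensions, then render SQL from its bindings. Everything below is illustrative: the `ModelBinding` record, the numeric `cost` used to rank candidates, and the table names are assumptions, and the distributed execution layer (Akka Cluster, Redis-backed queues) is not modeled.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ModelBinding:
    table: str
    engine: str
    metrics: Dict[str, str]     # metric code -> aggregate expression
    dimensions: Dict[str, str]  # dimension code -> column name
    cost: int                   # hypothetical relative query cost


def pick_model(models: List[ModelBinding], metrics: List[str],
               dimensions: List[str]) -> ModelBinding:
    """Choose the cheapest model that covers every requested metric and dimension."""
    candidates = [
        m for m in models
        if set(metrics) <= set(m.metrics) and set(dimensions) <= set(m.dimensions)
    ]
    if not candidates:
        raise LookupError("no model covers the requested metrics and dimensions")
    return min(candidates, key=lambda m: m.cost)


def build_sql(model: ModelBinding, metrics: List[str], dimensions: List[str]) -> str:
    """Render a grouped aggregate query from the model's bindings."""
    dim_cols = [model.dimensions[d] for d in dimensions]
    select = [f"{model.dimensions[d]} AS {d}" for d in dimensions]
    select += [f"{model.metrics[m]} AS {m}" for m in metrics]
    sql = f"SELECT {', '.join(select)} FROM {model.table}"
    if dim_cols:
        sql += f" GROUP BY {', '.join(dim_cols)}"
    return sql
```

With a pre-aggregated cube and a detailed Hive table both bound to the same metric, a request that the cube can answer is routed there, and only requests needing extra dimensions fall back to the more expensive table.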

Management Process

Two roles are involved: business owners and data engineers (RD). Business owners maintain the business information for metrics; engineers create tables and models, bind metrics to them, and then build data applications for end users (Figure 12).

Results

The platform has been deployed to support more than ten data platforms within the hotel‑travel division. It provides unified metric and dimension definitions, a single data outlet, unified monitoring and alerting, flexible query capabilities, data‑lineage visualization, and provenance analysis.

Future Outlook

As part of the broader “Tian‑Gong” ecosystem (including a universal reporting system and a data‑query system), the platform aims to provide plug‑and‑play standards for metadata, query, and visualization components, enabling modular expansion and faster service development (Figure 13).

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

Big Data Technology & Architecture

Wang Zhiwu, a big data expert, dedicated to sharing big data technology.