How Meituan Built DataMan: A Scalable Data Quality Monitoring Platform for Big Data
This article details Meituan's DataMan platform, describing the background of data quality challenges, the eight-step PDCA-driven solution, architectural design, technical stack, monitoring standards, and the resulting improvements in data governance and operational efficiency across their massive data warehouse ecosystem.
Background
Data has become a critical asset for internet companies. Meituan’s large‑scale data warehouse processes tens of thousands of offline and real‑time jobs each day, requiring a unified, efficient data‑quality monitoring solution.
Challenges
Lack of a unified monitoring view for offline and real‑time jobs, leading to fragmented insights.
Missing data‑quality metrics and delayed validation, causing inconsistent data definitions.
Non‑closed‑loop fault handling and absence of a centralized knowledge base.
Insufficient monitoring of data‑model quality, resulting in duplicated models and information silos.
Rapid growth of storage resources without fine‑grained monitoring.
PDCA Process for Data Quality Lifecycle
Quality demand: discover data issues, collect requirements, define validation rules.
Rule refinement: identify effective indicators and measurement standards.
Rule-engine construction: configure objects, schedules, and scopes.
Execute checks: schedule and run validation code.
Problem detection: display, classify, and grade issues.
Analysis reports: generate quality reports, trend analysis, and solution consensus.
Implement fixes: execute, track, review, and standardize solutions.
Knowledge-base formation: summarize experiences into a standardized repository.
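As a rough illustration of the middle steps (rule-engine construction, check execution, and problem grading), the sketch below configures one rule and runs it over a handful of rows. Everything named here is hypothetical: the `Rule` and `Issue` classes, the `dw.orders` table, the cron string, and the two thresholds are assumptions made for the example, not DataMan's actual rule model.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Rule:
    name: str                         # hypothetical rule identifier
    target: str                       # monitored object, e.g. a table name
    schedule: str                     # cron-style string, stored but not evaluated here
    metric: Callable[[list], float]   # computes the measured indicator from rows
    warn_at: float                    # indicator value that yields a warning
    critical_at: float                # indicator value that yields a critical issue


@dataclass
class Issue:
    rule: str
    target: str
    value: float
    grade: str


def null_ratio(field: str) -> Callable[[list], float]:
    """Build a metric that returns the share of rows where `field` is missing."""
    def metric(rows: list) -> float:
        if not rows:
            return 1.0  # an empty input is treated as fully missing
        missing = sum(1 for row in rows if row.get(field) is None)
        return missing / len(rows)
    return metric


def run_check(rule: Rule, rows: list) -> Optional[Issue]:
    """Execute one rule and grade the result; None means the check passed."""
    value = rule.metric(rows)
    if value >= rule.critical_at:
        grade = "critical"
    elif value >= rule.warn_at:
        grade = "warning"
    else:
        return None
    return Issue(rule.name, rule.target, value, grade)


rows = [
    {"order_id": 1, "city": "Beijing"},
    {"order_id": 2, "city": None},
    {"order_id": 3, "city": "Shanghai"},
    {"order_id": 4, "city": None},
]
rule = Rule("city_not_null", "dw.orders", "0 6 * * *",
            null_ratio("city"), warn_at=0.1, critical_at=0.4)
issue = run_check(rule, rows)
```

Here half of the sample rows lack a city, so the check produces a critical issue. In a real deployment a scheduler would trigger `run_check`, and the resulting issues would feed the reporting and fix-tracking steps.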
Quality Inspection Standards
Completeness – missing entities, attributes, records, or field values.
Accuracy – consistency with expected or acceptable values.
Reasonableness – correct format, type, domain, and business‑rule validity.
Consistency – alignment across systems and unified business metrics.
Timeliness – ETL latency, job runtime, and dependency promptness.
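The five dimensions above can each be reduced to a small measurable check. The following is a minimal sketch under stated assumptions: the function names, the sample fields (`amount`, `phone`), the accepted amount range, and the phone pattern are all invented for illustration and do not come from the original platform.

```python
import re
from datetime import datetime


def _ratio(hits: int, total: int) -> float:
    """Share of hits; an empty population counts as fully passing."""
    return hits / total if total else 1.0


def completeness(rows, field):
    """Share of rows in which the field is populated."""
    return _ratio(sum(r.get(field) is not None for r in rows), len(rows))


def accuracy(rows, field, low, high):
    """Share of populated values that fall inside the accepted range."""
    values = [r[field] for r in rows if r.get(field) is not None]
    return _ratio(sum(low <= v <= high for v in values), len(values))


def reasonableness(rows, field, pattern):
    """Share of populated values that match the expected format."""
    values = [r[field] for r in rows if r.get(field) is not None]
    return _ratio(sum(re.fullmatch(pattern, v) is not None for v in values),
                  len(values))


def consistency(total_a, total_b, tolerance=0.0):
    """True when one metric computed in two systems agrees within tolerance."""
    return abs(total_a - total_b) <= tolerance


def timeliness(finished_at, deadline):
    """True when the job finished on or before its deadline."""
    return finished_at <= deadline


rows = [
    {"amount": 25.0, "phone": "13800000000"},
    {"amount": -3.0, "phone": "1380000"},
    {"amount": None, "phone": "13900000000"},
    {"amount": 40.0, "phone": None},
]
report = {
    "completeness": completeness(rows, "amount"),
    "accuracy": accuracy(rows, "amount", 0, 10000),
    "reasonableness": reasonableness(rows, "phone", r"1\d{10}"),
    "consistency": consistency(100.0, 100.0),
    "timeliness": timeliness(datetime(2024, 1, 1, 5, 30),
                             datetime(2024, 1, 1, 6, 0)),
}
```

Expressing each dimension as a ratio or a boolean keeps the results comparable across tables, which makes them straightforward to threshold and to plot as trends.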
Overall Architecture
Four‑layer Design
Data source and data mart layer – collects metadata, logs, and real-time streams from Hive, Spark, Storm, Kafka, MySQL, etc., and builds a unified quality data mart.
Storage model layer – stores rule‑engine results and quality metrics in relational databases.
System function layer – provides configuration management, process monitoring, issue tracking, fault workflow, real‑time monitoring, and knowledge‑base creation.
Presentation layer – UI for dashboards, recommendation engine, issue reporting, fault tracking, and permission management.
The front end is built with Bootstrap and FreeMarker and runs on Tomcat, following an MVC structure. The back end relies on Spring 4, Spring Boot, Hibernate, and the Zebra middleware for highly available MySQL access with read/write splitting.
Data Flow
Metadata (warehouse schemas, job logs, monitoring logs) is ingested, processed by ETL jobs, and stored in the quality data mart. Application services query this mart to drive quality dashboards, alerts, and knowledge‑base updates.
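The flow described here (ingest raw logs, aggregate them with an ETL step, then let application services query the mart) can be sketched with an in-memory SQLite database. The table names, columns, job names, and the failure-rate threshold are assumptions for illustration only; the production system stores its mart in MySQL and its actual schema is not given in the source.

```python
import sqlite3


def build_mart(raw_logs):
    """Ingest raw job logs and aggregate them into a small quality mart."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE job_log (job TEXT, run_date TEXT, status TEXT, runtime_s INTEGER)"
    )
    conn.execute(
        "CREATE TABLE quality_mart ("
        "job TEXT PRIMARY KEY, runs INTEGER, failures INTEGER, avg_runtime_s REAL)"
    )
    # Ingestion: load the raw log records as they arrive.
    conn.executemany("INSERT INTO job_log VALUES (?, ?, ?, ?)", raw_logs)
    # ETL: one aggregated row per job (status = 'FAILED' evaluates to 0 or 1).
    conn.execute(
        "INSERT INTO quality_mart "
        "SELECT job, COUNT(*), SUM(status = 'FAILED'), AVG(runtime_s) "
        "FROM job_log GROUP BY job"
    )
    conn.commit()
    return conn


def jobs_needing_attention(conn, max_failure_rate):
    """Application-side query: jobs whose failure rate exceeds the threshold."""
    cur = conn.execute(
        "SELECT job FROM quality_mart "
        "WHERE CAST(failures AS REAL) / runs > ? ORDER BY job",
        (max_failure_rate,),
    )
    return [row[0] for row in cur]


raw_logs = [
    ("load_orders", "2024-01-01", "SUCCESS", 600),
    ("load_orders", "2024-01-02", "FAILED", 30),
    ("load_users", "2024-01-01", "SUCCESS", 120),
    ("load_users", "2024-01-02", "SUCCESS", 180),
]
conn = build_mart(raw_logs)
flagged = jobs_needing_attention(conn, 0.2)
```

Keeping the aggregation in the mart rather than in the dashboard code means that alerts, dashboards, and knowledge-base updates all read the same precomputed figures.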
System Features
Personal workbench – aggregates personal tasks, alerts, and pending optimizations.
Offline monitoring – scheduled checks on models, jobs, storage, and resource usage.
Real‑time monitoring – compares running jobs against baselines and triggers alerts.
Recommendation engine – automatically suggests storage, model, or job optimizations based on rule violations.
Public‑account bot – pushes notifications, task assignments, and risk assessments.
Fault handling – supports automatic and manual issue reporting, closed‑loop workflow, and review.
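To make the real-time monitoring feature more concrete, the sketch below compares a running job's runtime against a baseline derived from its recent history. The source does not say how baselines are computed, so the mean-plus-three-standard-deviations rule, the function names, and the sample numbers are all assumptions.

```python
from statistics import mean, pstdev


def build_baseline(history):
    """Return (average, standard deviation) of past runtimes in seconds."""
    return mean(history), pstdev(history)


def check_against_baseline(current, history, sigmas=3.0):
    """Return an alert dict when the current runtime exceeds the baseline
    threshold, or None when the job is within its normal range."""
    avg, std = build_baseline(history)
    threshold = avg + sigmas * std
    if current > threshold:
        return {"current": current, "threshold": threshold}
    return None


history = [100, 110, 90, 105, 95]   # hypothetical past runtimes in seconds
slow_run = check_against_baseline(180, history)
normal_run = check_against_baseline(115, history)
```

A statistical threshold like this adapts to each job's own variability, at the cost of needing enough history to be meaningful. A fixed per-job limit would be a simpler alternative.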
Technical Stack
Front‑end: Bootstrap (jQuery‑based UI components), FreeMarker template engine, Tomcat servlet container. Back‑end: Spring 4, Spring Boot, Hibernate, Druid connection pool, and Zebra middleware for MySQL high‑availability, read/write splitting, and sharding.
Zebra Middleware
Zebra is the JDBC-based data access layer officially recommended by Meituan's DBA team. It provides dynamic configuration, monitoring, read/write splitting, and database sharding. It connects directly to MySQL instances, works with RDS for configuration management, and integrates with MHA for master-slave high availability.
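The idea behind read/write splitting can be shown with a small routing sketch. This is not Zebra's API or implementation: the `ReadWriteRouter` class and the node names are hypothetical, and the sketch ignores cases a real middleware must handle, such as transactions, `SELECT ... FOR UPDATE`, and replication lag.

```python
import itertools


class ReadWriteRouter:
    """Send SELECT statements to replicas in round-robin order and
    everything else to the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Cycle through replicas; fall back to the primary if there are none.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql):
        parts = sql.split(None, 1)
        verb = parts[0].upper() if parts else ""
        if verb == "SELECT" and self._replicas is not None:
            return next(self._replicas)
        return self.primary


router = ReadWriteRouter("primary", ["replica-1", "replica-2"])
targets = [
    router.route("SELECT * FROM quality_mart"),
    router.route("select job FROM quality_mart"),
    router.route("UPDATE quality_mart SET runs = runs + 1"),
    router.route("SELECT 1"),
]
```

Placing this decision in the data access layer lets application code issue ordinary SQL without knowing how many replicas exist.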
Data Model and Layers
Data source and data mart layer gathers metadata from Hive, Spark, Storm, Kafka, MySQL, and BI applications, then creates a unified quality data mart.
Storage model layer stores rule‑engine configurations and quality‑metric results in relational databases, enabling fast queries for monitoring and impact analysis.
System function layer implements configuration management, process monitoring, issue tracking, fault workflow, real‑time monitoring, and knowledge‑base creation.
Presentation layer offers dashboards, recommendation modules, quality analysis views, issue submission, fault tracking, and permission management.
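For the storage model layer, a relational layout that separates rule configuration from check results might look like the sketch below. The table and column names, the index, and the sample thresholds are assumptions for illustration; SQLite stands in for the MySQL storage mentioned in the article.

```python
import sqlite3

SCHEMA = """
CREATE TABLE rule_config (
    rule_id   INTEGER PRIMARY KEY,
    target    TEXT NOT NULL,      -- monitored table or job
    dimension TEXT NOT NULL,      -- e.g. completeness, timeliness
    threshold REAL NOT NULL       -- minimum acceptable metric value
);
CREATE TABLE check_result (
    result_id  INTEGER PRIMARY KEY,
    rule_id    INTEGER NOT NULL REFERENCES rule_config(rule_id),
    checked_at TEXT NOT NULL,
    value      REAL NOT NULL
);
-- Supports fast lookups of one rule's results over time.
CREATE INDEX idx_result_rule_time ON check_result(rule_id, checked_at);
"""


def open_store():
    """Create an empty in-memory store with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def violations_on(conn, day):
    """Return (target, dimension, value) for results below threshold on a day."""
    cur = conn.execute(
        "SELECT c.target, c.dimension, r.value "
        "FROM check_result r JOIN rule_config c ON c.rule_id = r.rule_id "
        "WHERE r.checked_at = ? AND r.value < c.threshold "
        "ORDER BY c.target",
        (day,),
    )
    return cur.fetchall()


store = open_store()
store.executemany(
    "INSERT INTO rule_config VALUES (?, ?, ?, ?)",
    [(1, "dw.orders", "completeness", 0.99),
     (2, "dw.users", "completeness", 0.95)],
)
store.executemany(
    "INSERT INTO check_result (rule_id, checked_at, value) VALUES (?, ?, ?)",
    [(1, "2024-01-01", 0.995), (1, "2024-01-02", 0.97), (2, "2024-01-02", 0.98)],
)
violations = violations_on(store, "2024-01-02")
```

Storing thresholds next to the rule definition, rather than in the check code, lets the monitoring query decide what counts as a violation without redeploying any job.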
Key Processes
Quality demand – capture data problems and define validation rules.
Rule refinement – select effective indicators and define measurement standards.
Rule‑engine construction – configure objects, schedules, and scopes.
Execute checks – run scheduled validation jobs.
Problem detection – display, classify, and grade issues.
Analysis reports – produce quality reports, trend analysis, and consensus on solutions.
Implement fixes – execute, track, review, and standardize remediation.
Knowledge‑base formation – consolidate experiences into a reusable repository.
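The closed-loop character of these steps can be modeled as a small state machine in which an issue cannot be archived until it has been fixed and reviewed. The state names, the allowed transitions, and the in-memory `knowledge_base` list are assumptions made for this sketch, not the platform's actual workflow definition.

```python
from enum import Enum


class State(Enum):
    DETECTED = "detected"
    ANALYZED = "analyzed"
    FIXING = "fixing"
    REVIEWED = "reviewed"
    ARCHIVED = "archived"


# Allowed transitions; a failed review sends the ticket back to fixing.
TRANSITIONS = {
    State.DETECTED: {State.ANALYZED},
    State.ANALYZED: {State.FIXING},
    State.FIXING: {State.REVIEWED},
    State.REVIEWED: {State.ARCHIVED, State.FIXING},
    State.ARCHIVED: set(),
}


class Ticket:
    def __init__(self, title):
        self.title = title
        self.state = State.DETECTED
        self.history = [State.DETECTED]

    def advance(self, new_state):
        """Move to new_state, rejecting any transition that skips a step."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(
                f"cannot move from {self.state.value} to {new_state.value}"
            )
        self.state = new_state
        self.history.append(new_state)


knowledge_base = []


def archive(ticket, lesson):
    """Close the loop: archive the ticket and record what was learned."""
    ticket.advance(State.ARCHIVED)
    knowledge_base.append({"title": ticket.title, "lesson": lesson})


ticket = Ticket("dw.orders city field missing")
for step in (State.ANALYZED, State.FIXING, State.REVIEWED):
    ticket.advance(step)
archive(ticket, "Add a not-null check on city before loading")
```

Enforcing the transitions in code is what makes the loop closed: an issue that was detected but never reviewed cannot silently disappear, and every archived issue leaves an entry behind for later reuse.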
Outcome
The DataMan platform provides a closed‑loop data‑quality governance mechanism that improves visibility of data assets, reduces job latency, optimizes storage usage, and creates a knowledge‑driven feedback loop, thereby enhancing overall business performance.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
