
How Meituan Built DataMan: A Scalable Data Quality Monitoring Platform for Big Data

This article details Meituan's DataMan platform, describing the background of data quality challenges, the eight-step PDCA-driven solution, architectural design, technical stack, monitoring standards, and the resulting improvements in data governance and operational efficiency across their massive data warehouse ecosystem.

dbaplus Community

Background

Data has become a critical asset for internet companies. Meituan’s large‑scale data warehouse processes tens of thousands of offline and real‑time jobs each day, requiring a unified, efficient data‑quality monitoring solution.

Challenges

Lack of a unified monitoring view for offline and real‑time jobs, leading to fragmented insights.

Missing data‑quality metrics and delayed validation, causing inconsistent data definitions.

Non‑closed‑loop fault handling and absence of a centralized knowledge base.

Insufficient monitoring of data‑model quality, resulting in duplicated models and information silos.

Rapid growth of storage resources without fine‑grained monitoring.

PDCA Process for Data Quality Lifecycle

Quality demand: discover data issues, collect requirements, define validation rules.

Rule refinement: identify effective indicators and measurement standards.

Rule‑engine construction: configure objects, schedules, and scopes.

Execute checks: schedule and run validation code.

Problem detection: display, classify, and grade issues.

Analysis reports: generate quality reports, trend analysis, and solution consensus.

Implement fixes: execute, track, review, and standardize solutions.

Knowledge‑base formation: summarize experiences into a standardized repository.
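To make the middle of this cycle concrete (rule‑engine construction, executing checks, and grading the resulting issues), here is a minimal Python sketch. It is not DataMan's implementation: the Rule and Issue classes, the null_ratio check, the thresholds, and the severity labels are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical illustration only: none of these names come from DataMan.

@dataclass
class Rule:
    name: str                              # rule identifier
    target: str                            # monitored object, e.g. a table name
    schedule: str                          # when the check should run, e.g. "daily"
    check: Callable[[List[dict]], float]   # returns the share of bad rows (0.0 to 1.0)
    warn_at: float = 0.01                  # assumed threshold for a warning
    critical_at: float = 0.10              # assumed threshold for a critical issue


@dataclass
class Issue:
    rule: str
    target: str
    bad_ratio: float
    severity: str


def null_ratio(column: str) -> Callable[[List[dict]], float]:
    """Build a check that measures the share of rows where `column` is missing."""
    def check(rows: List[dict]) -> float:
        if not rows:
            return 0.0
        missing = sum(1 for row in rows if row.get(column) is None)
        return missing / len(rows)
    return check


def run_rules(rules: List[Rule], tables: Dict[str, List[dict]]) -> List[Issue]:
    """Execute every rule against its target and grade any violation."""
    issues = []
    for rule in rules:
        ratio = rule.check(tables.get(rule.target, []))
        if ratio >= rule.critical_at:
            severity = "critical"
        elif ratio >= rule.warn_at:
            severity = "warning"
        else:
            continue  # within tolerance, nothing to report
        issues.append(Issue(rule.name, rule.target, ratio, severity))
    return issues
```

In this sketch a rule bundles the object, schedule, and scope mentioned above with a check function, so new validation logic can be added without touching the grading loop.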

Quality Inspection Standards

Completeness – missing entities, attributes, records, or field values.

Accuracy – consistency with expected or acceptable values.

Reasonableness – correct format, type, domain, and business‑rule validity.

Consistency – alignment across systems and unified business metrics.

Timeliness – ETL latency, job runtime, and dependency promptness.
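Several of these standards translate directly into small measurable checks. The Python sketch below shows one possible shape for completeness, reasonableness, consistency, and timeliness checks. The function names, the regular‑expression approach, and the tolerance parameter are assumptions made for illustration and are not taken from DataMan.

```python
import re
from datetime import datetime, timedelta
from typing import List

# Hypothetical helper functions; names and signatures are illustrative only.

def completeness(rows: List[dict], required: List[str]) -> float:
    """Share of rows in which every required field is present and non-empty."""
    if not rows:
        return 1.0
    ok = sum(
        1 for row in rows
        if all(row.get(field) not in (None, "") for field in required)
    )
    return ok / len(rows)


def reasonableness(rows: List[dict], field: str, pattern: str) -> float:
    """Share of rows whose `field` fully matches the expected format."""
    if not rows:
        return 1.0
    ok = sum(1 for row in rows if re.fullmatch(pattern, str(row.get(field, ""))))
    return ok / len(rows)


def consistency(source_total: float, target_total: float, tolerance: float = 0.0) -> bool:
    """True when the same metric from two systems agrees within a relative tolerance."""
    if source_total == target_total:
        return True
    base = max(abs(source_total), abs(target_total))
    return abs(source_total - target_total) / base <= tolerance


def timeliness(finished_at: datetime, deadline: datetime) -> timedelta:
    """How far past its deadline a job finished (zero if it was on time)."""
    return max(finished_at - deadline, timedelta(0))
```

Accuracy is omitted here because it depends on business‑specific expected values rather than on a generic check.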

Overall Architecture

Four‑layer Design

Data source and data mart layer – collects metadata, logs, and real‑time streams from Hive, Spark, Storm, Kafka, MySQL, etc., and builds a unified quality data mart.

Storage model layer – stores rule‑engine results and quality metrics in relational databases.

System function layer – provides configuration management, process monitoring, issue tracking, fault workflow, real‑time monitoring, and knowledge‑base creation.

Presentation layer – UI for dashboards, recommendation engine, issue reporting, fault tracking, and permission management.

The front end is built with Bootstrap and FreeMarker and is served from Tomcat in an MVC arrangement. The back end uses Spring 4, Spring Boot, and Hibernate, with the Zebra middleware providing highly available MySQL access with read/write splitting.

Data Flow

Metadata (warehouse schemas, job logs, monitoring logs) is ingested, processed by ETL jobs, and stored in the quality data mart. Application services query this mart to drive quality dashboards, alerts, and knowledge‑base updates.
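As a rough illustration of this flow, the Python sketch below loads a few job‑log records into a small relational table and then queries it the way a dashboard or alerting service might. It uses an in‑memory SQLite database in place of MySQL, and the table name, column names, and log fields are all hypothetical.

```python
import sqlite3
from typing import List

# Hypothetical schema and field names, chosen only to illustrate the flow
# from raw job logs to a queryable quality data mart.

def build_mart(conn: sqlite3.Connection, job_logs: List[dict]) -> None:
    """ETL step: derive per-run metrics from raw logs and store them in the mart."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS job_quality (
               job_name TEXT, run_date TEXT, runtime_sec INTEGER, status TEXT)"""
    )
    rows = [
        (log["job"], log["date"], log["end"] - log["start"], log["status"])
        for log in job_logs
    ]
    conn.executemany("INSERT INTO job_quality VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def failed_or_slow(conn: sqlite3.Connection, max_runtime_sec: int) -> list:
    """Application query: runs that failed or exceeded the runtime limit."""
    cursor = conn.execute(
        "SELECT job_name, run_date, runtime_sec, status FROM job_quality "
        "WHERE status != 'SUCCESS' OR runtime_sec > ? ORDER BY job_name",
        (max_runtime_sec,),
    )
    return cursor.fetchall()
```

Keeping the derived metrics in a relational table is what lets the application layer answer such questions with a plain SQL query instead of rescanning raw logs.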

System Features

Personal workbench – aggregates personal tasks, alerts, and pending optimizations.

Offline monitoring – scheduled checks on models, jobs, storage, and resource usage.

Real‑time monitoring – compares running jobs against baselines and triggers alerts.

Recommendation engine – automatically suggests storage, model, or job optimizations based on rule violations.

Public‑account bot – pushes notifications, task assignments, and risk assessments.

Fault handling – supports automatic and manual issue reporting, closed‑loop workflow, and review.
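The article does not say how the real‑time monitor compares a running job against its baseline. One common approach, shown here purely as an assumption, is to treat the mean of recent observations as the baseline and alert when the current value deviates by more than a set number of standard deviations. The function name and the default threshold are hypothetical.

```python
from statistics import mean, pstdev
from typing import Sequence

# Hypothetical baseline check; DataMan's actual comparison method is not
# described in the source.

def deviates_from_baseline(
    history: Sequence[float], current: float, max_sigma: float = 3.0
) -> bool:
    """Return True when `current` is far enough from the baseline to alert.

    The baseline is the mean of `history`; distance is measured in
    population standard deviations of `history`.
    """
    baseline = mean(history)
    spread = pstdev(history)
    if spread == 0:
        # A perfectly flat history: any change at all is a deviation.
        return current != baseline
    return abs(current - baseline) / spread > max_sigma
```

For example, with a history of per‑minute record counts around 100, a reading of 101 would pass while a reading of 150 would trigger an alert.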

Technical Stack

Front‑end: Bootstrap (jQuery‑based UI components), FreeMarker template engine, Tomcat servlet container. Back‑end: Spring 4, Spring Boot, Hibernate, Druid connection pool, and Zebra middleware for MySQL high‑availability, read/write splitting, and sharding.

Zebra Middleware

Zebra is the JDBC‑layer database access middleware recommended by Meituan's DBA team. It provides dynamic configuration, monitoring, read/write splitting, and database sharding. It connects directly to MySQL instances, works with RDS for configuration management, and integrates with MHA for master‑slave high availability.
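Read/write splitting in general means sending writes to the primary database and spreading reads across replicas. The Python sketch below illustrates that idea only. It is not Zebra's API (Zebra is a Java JDBC layer), and the ReadWriteRouter class and its route method are hypothetical.

```python
import itertools
from typing import List

# Hypothetical router illustrating the general read/write-splitting idea.
# It does not reflect how Zebra is implemented or configured.

class ReadWriteRouter:
    def __init__(self, primary: str, replicas: List[str]):
        self.primary = primary
        # Rotate through replicas in round-robin order; None if there are none.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql: str) -> str:
        """Pick the database that should execute this statement."""
        statement = sql.strip().rstrip(";").strip()
        if not statement:
            return self.primary
        first_word = statement.split(None, 1)[0].upper()
        # Locking reads must see the latest data, so they go to the primary.
        locking_read = statement.upper().endswith("FOR UPDATE")
        if first_word == "SELECT" and not locking_read and self._replicas is not None:
            return next(self._replicas)
        return self.primary
```

A production router also has to account for transactions and replication lag, which this sketch deliberately ignores.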

Data Model and Layers

Data source and data mart layer gathers metadata from Hive, Spark, Storm, Kafka, MySQL, and BI applications, then creates a unified quality data mart.

Storage model layer stores rule‑engine configurations and quality‑metric results in relational databases, enabling fast queries for monitoring and impact analysis.

System function layer implements configuration management, process monitoring, issue tracking, fault workflow, real‑time monitoring, and knowledge‑base creation.

Presentation layer offers dashboards, recommendation modules, quality analysis views, issue submission, fault tracking, and permission management.

Key Processes

Quality demand – capture data problems and define validation rules.

Rule refinement – select effective indicators and define measurement standards.

Rule‑engine construction – configure objects, schedules, and scopes.

Execute checks – run scheduled validation jobs.

Problem detection – display, classify, and grade issues.

Analysis reports – produce quality reports, trend analysis, and consensus on solutions.

Implement fixes – execute, track, review, and standardize remediation.

Knowledge‑base formation – consolidate experiences into a reusable repository.
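The later steps of this process (detect, analyze, fix, review, archive into the knowledge base) form a closed loop, which can be modeled as a simple state machine. The Python sketch below is one hypothetical way to do that. The stage names, the Fault class, and the knowledge‑base list are assumptions and do not describe DataMan's actual workflow engine.

```python
from enum import Enum
from typing import List

# Hypothetical stages for a closed-loop fault workflow.

class Stage(Enum):
    DETECTED = 1
    ANALYZED = 2
    FIXING = 3
    REVIEWED = 4
    ARCHIVED = 5


# Each stage has exactly one successor, so no step can be skipped.
NEXT_STAGE = {
    Stage.DETECTED: Stage.ANALYZED,
    Stage.ANALYZED: Stage.FIXING,
    Stage.FIXING: Stage.REVIEWED,
    Stage.REVIEWED: Stage.ARCHIVED,
}


class Fault:
    def __init__(self, title: str):
        self.title = title
        self.stage = Stage.DETECTED
        self.notes: List[tuple] = []  # (stage name, note) pairs

    def advance(self, note: str) -> None:
        """Record what was done in the current stage and move to the next one."""
        if self.stage is Stage.ARCHIVED:
            raise ValueError("fault is already closed")
        self.notes.append((self.stage.name, note))
        self.stage = NEXT_STAGE[self.stage]


def archive_to_knowledge_base(fault: Fault, knowledge_base: List[dict]) -> None:
    """Store a closed fault's history so later incidents can reuse it."""
    if fault.stage is not Stage.ARCHIVED:
        raise ValueError("only closed faults can enter the knowledge base")
    knowledge_base.append({"title": fault.title, "history": list(fault.notes)})
```

Forcing every fault through the same ordered stages is what makes the loop closed: an issue cannot reach the knowledge base without having been analyzed, fixed, and reviewed.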

Outcome

The DataMan platform provides a closed‑loop data‑quality governance mechanism that improves visibility of data assets, reduces job latency, optimizes storage usage, and creates a knowledge‑driven feedback loop, thereby enhancing overall business performance.

Figure: Data quality monitoring platform overall framework
Figure: Data quality PDCA process
Figure: Data quality monitoring function diagram
Figure: System UI screenshot
Figure: Real‑time job monitoring chart
