Master Data Management: Concepts, Architecture, and Practical Implementation in Baidu Smart Mini Programs
The article outlines master data management concepts and maturity levels, then details Baidu Smart Mini Program’s practical architecture—spanning analysis, domain‑driven design, high‑availability services, transaction handling, caching, real‑time sync, and governance—that eliminates data silos, ensures consistency, and supports over 9,000 QPS with 99.99% SLA.
Introduction
The article discusses the challenges of data silos, shared data management, and data-service performance in enterprises. It presents a practical summary of the Baidu Smart Mini Program team's master data (MD) experience, focusing on improving data model quality, enabling collaborative sharing, and evolving a high‑availability master data management service.
1. Master Data Concept
Master Data (MD) refers to data shared across systems (e.g., customers, accounts, organizations). Its core value lies in enabling data sharing and guaranteeing consistency.
MDM maturity can be divided into six levels; the article outlines Levels 0‑3:
Level 0: No MDM; each application manages its own data.
Level 1: List provision – manual registration of data (add, delete, update, conflict handling).
Level 2: Equal access via interfaces – centralized metadata repository, still separate storage but with unified definitions.
Level 3: Centralized bus processing – a common platform enforces strong constraints, guarantees performance, stability, and transaction safety.
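The four levels above form an ordered progression, which can be sketched as a simple enum. The names and the helper below are illustrative, not part of any formal MDM standard:

```python
from enum import IntEnum

class MdmMaturity(IntEnum):
    """Hypothetical encoding of the MDM maturity levels described above."""
    NONE = 0            # Level 0: each application manages its own data
    LIST_PROVISION = 1  # Level 1: manual registration of shared data
    EQUAL_ACCESS = 2    # Level 2: centralized metadata, separate storage
    CENTRAL_BUS = 3     # Level 3: common platform with strong constraints

def needs_central_governance(level: MdmMaturity) -> bool:
    # Below Level 3, consistency still depends on per-application discipline.
    return level < MdmMaturity.CENTRAL_BUS
```

Because the levels are ordinal, an organization can assess where it stands and compare levels directly (`EQUAL_ACCESS < CENTRAL_BUS`).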
1.2 Why Master Data?
Four main problems drive the need for MD:
Data fragmentation and redundancy waste storage and degrade quality.
Inconsistent data makes reconciliation difficult, wasting time and resources.
Low‑efficiency business collaboration.
Frequent changes cause data loss if not centrally governed.
These stem from the lack of top‑down data governance.
2. Master Data Architecture – Practical Summary
2.1 Business Background Analysis
The Baidu Smart Mini Program ecosystem faces several pressures:
Rapid business growth, driving ever-larger data-change and retrieval volumes.
Inconsistent SLA standards across services.
Mesh‑style data storage, producing redundancy, inconsistency, security risks, and data islands.
Divergent understanding of the same data across systems.
2.2 Overall Architecture Design
The design follows an analysis‑design‑implementation flow. In the analysis phase, data flow diagrams (DFD) and event‑storming are used to map requirements to domain boundaries. The design phase produces use‑case, state, entity, sequence, and ER diagrams, and defines service boundaries using a “cake‑cutting” approach.
2.2.1 Analysis Phase
Requirements are abstracted into a problem space, then mapped to a solution space. Domain boundaries are identified via event‑storming.
2.2.2 Design & Implementation Phase
Business domains are split into four sub‑domains:
Product Domain – core data models for mini‑programs, packages, categories, levels, benefits.
Base Data Domain – basic data such as categories and hosts.
User Domain – users, roles, permissions.
Customer Domain – entities, qualifications, etc.
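The four sub‑domains above can be sketched as separate entity models with explicit boundaries. The field names below are assumptions for illustration, not the actual Baidu Smart Mini Program schema:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class MiniProgram:      # Product Domain: core product entity
    app_id: str
    name: str
    category_id: int    # references the Base Data Domain, not a nested object

@dataclass
class Category:         # Base Data Domain: shared reference data
    category_id: int
    name: str

@dataclass
class User:             # User Domain: users, roles, permissions
    user_id: str
    roles: List[str] = field(default_factory=list)

@dataclass
class Customer:         # Customer Domain: entities and qualifications
    customer_id: str
    qualification: str
```

Note that cross‑domain links are by identifier (`category_id`) rather than by embedded object, which keeps each sub‑domain independently owned and deployable.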
2.3 Design Goals
Eliminate data redundancy and ensure consistency & security.
Unify data understanding.
Provide high‑availability unified data management.
Enable data sharing across systems, products, and departments.
2.3.1 Architecture & Practice
Data transmission service (supporting RDBMS, NoSQL, OLAP) provides migration, real‑time sync, and subscription capabilities across public cloud, private cloud, and PaaS.
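The subscription capability mentioned above can be sketched as a minimal fan-out channel. The class and method names are illustrative, not the actual service API; in the real service, changes would originate from RDBMS/NoSQL/OLAP sources and cross cloud boundaries:

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

class DataTransmissionService:
    """Minimal sketch of a change-subscription channel (names assumed)."""

    def __init__(self) -> None:
        # table name -> list of subscriber callbacks
        self._subscribers: DefaultDict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, table: str, handler: Callable[[dict], None]) -> None:
        """Register a downstream consumer for changes on one table."""
        self._subscribers[table].append(handler)

    def publish(self, table: str, change: dict) -> None:
        """Fan a change record out to every subscriber of that table."""
        for handler in self._subscribers[table]:
            handler(change)
```

A downstream system subscribes once and then receives every change to the tables it cares about, instead of polling or copying the data wholesale.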
2.3.3 Key Technical Implementations
2.3.3.1 Transaction/Compensation – Large transactions are split into small batches (100‑500 rows per commit) to avoid latency and IO overload. Compensation mechanisms handle delayed or failed operations.
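The batching rule above (100–500 rows per commit) can be sketched as follows; `commit` stands in for a real transactional write, and the function name is illustrative:

```python
from typing import Callable, List, Sequence

def commit_in_batches(rows: Sequence, commit: Callable[[Sequence], None],
                      batch_size: int = 500) -> int:
    """Split one large write into small transactions to cap lock time
    and per-transaction IO, per the 100-500 rows-per-commit guideline.
    Returns the number of batches committed."""
    if not 100 <= batch_size <= 500:
        raise ValueError("batch_size should stay within the 100-500 range")
    batches = 0
    for start in range(0, len(rows), batch_size):
        commit(rows[start:start + batch_size])  # one small transaction
        batches += 1
    return batches
```

If any single batch fails, only that batch needs to be retried or repaired by the compensation mechanism, rather than rolling back the entire dataset.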
2.3.3.2 High‑Performance Read/Write & Search – Multi‑level caching (distributed & local) for read‑heavy scenarios, database optimizations (batch updates, pagination, sub‑queries, indexing), and Elasticsearch for fuzzy and multi‑table searches.
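The local-plus-distributed cache layering can be sketched as a read-through chain. Here `remote` stands in for a shared distributed cache (e.g. Redis) and `loader` for the database read; all names are assumptions:

```python
from typing import Any, Callable, Dict

class TwoLevelCache:
    """Sketch of a two-level read-through cache (names illustrative)."""

    def __init__(self, remote: Dict[str, Any], loader: Callable[[str], Any]):
        self.local: Dict[str, Any] = {}  # in-process cache, fastest tier
        self.remote = remote             # shared cache, survives restarts
        self.loader = loader             # source of truth (the database)

    def get(self, key: str) -> Any:
        if key in self.local:            # tier 1: process-local hit
            return self.local[key]
        value = self.remote.get(key)     # tier 2: distributed cache
        if value is None:
            value = self.loader(key)     # miss: read through to the DB
            self.remote[key] = value     # populate shared tier on the way back
        self.local[key] = value          # populate local tier
        return value
```

In read-heavy scenarios this keeps hot keys entirely in process memory, while the distributed tier shields the database from cold-start stampedes across instances.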
2.3.3.3 Availability – Micro‑service architecture with service governance, unified configuration, distributed tracing, load balancing, retry, rate‑limiting, circuit‑breaking, and self‑healing mechanisms.
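Of the mechanisms listed, circuit breaking can be sketched in a few lines. Thresholds and timing below are illustrative defaults, not the values used by the actual service:

```python
import time
from typing import Any, Callable, Optional

class CircuitBreaker:
    """Minimal circuit breaker: fail fast while a dependency is unhealthy."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # seconds before a trial call is allowed
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None         # half-open: allow one trial call
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()  # trip the breaker
            raise
        self.failures = 0                 # success resets the failure count
        return result
```

Failing fast on an open circuit protects callers' latency budgets and gives the struggling dependency time to self-heal, which is the point of the breaker in an availability design.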
2.3.3.4 Process Mechanisms – Dedicated design‑review team, coding standards, regular code reviews, comprehensive system‑level testing, staged rollout (small traffic, gray release, automated rollback), and robust operations (lifecycle management, permission control, observability).
2.3.3.5 Real‑Time Data Sync – Binlog listening, concurrent MQ writes, version‑controlled data distribution, and compensation services ensure timely and reliable data propagation.
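The version-controlled distribution step can be sketched as a consumer that drops out-of-order or replayed MQ messages. The class is illustrative; a real consumer would read binlog-derived messages from the queue rather than take direct calls:

```python
from typing import Any, Dict, Tuple

class VersionedSyncConsumer:
    """Sketch of version-controlled data distribution (names assumed).

    Binlog-derived messages may arrive out of order or be redelivered by
    the MQ; applying only strictly newer versions keeps the replica
    consistent without requiring ordered delivery."""

    def __init__(self) -> None:
        self.store: Dict[str, Tuple[int, Any]] = {}  # key -> (version, payload)

    def apply(self, key: str, version: int, payload: Any) -> bool:
        current = self.store.get(key)
        if current is not None and version <= current[0]:
            return False                 # stale or duplicate message: drop it
        self.store[key] = (version, payload)
        return True
```

Messages the consumer drops or fails to receive are later repaired by the compensation service, which reconciles replicas against the source of truth.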
Summary
Since 2019, Baidu's master data service has supported more than 9,000 QPS, maintained a >99.99% SLA, and served more than ten core business lines. Data consistency reaches four nines (99.99%) through continuous monitoring and compensation.
3. Extensions & Team Capability
Beyond real‑time online services, master data can be used for data‑asset auditing, monitoring, analytics, and business profiling. To improve team capability, the article suggests mandatory master‑data training, establishing a dedicated neutral team, and strengthening model & architecture reviews.
Baidu Geek Talk