Metadata Management and Governance Practices at Wing Payment: Architecture, Techniques, and Future Outlook
This article explains how Wing Payment uses metadata as the foundation of its data‑governance practice, describing the challenges of data quality, efficiency, cost and security, the four‑step governance framework, the design of its metadata platform, and future directions such as multi‑source management and intelligent recommendation.
Metadata is the cornerstone of enterprise data governance, enabling clean data, accurate analysis, and efficient decision‑making; Wing Payment’s session introduces the role of metadata in solving data quality, consistency, cost, efficiency, and security problems.
The discussion is organized into four parts: (1) locating metadata and its relationship to data governance; (2) building a governance system based on metadata, covering core data protection, master data governance, data standards, and product architecture; (3) key technologies of the metadata platform; (4) future outlook.
Core data protection addresses low data quality and timeliness by prioritizing high‑value tasks through a four‑step workflow: business submission, dependency analysis, priority adjustment, and 24/7 operations support. This workflow is complemented by resource‑allocation strategies such as project‑space management, queue division, and time‑based resource policies.
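The dependency-analysis and priority-adjustment steps above can be sketched as follows. This is a minimal illustration, not Wing Payment's actual scheduler: the task names, the use of a business-submitted priority score, and the rule of boosting a task by its count of transitive downstream consumers are all assumptions for demonstration.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Task:
    name: str
    business_priority: int                         # set at business submission
    deps: List[str] = field(default_factory=list)  # upstream task names

def downstream_counts(tasks: Dict[str, Task]) -> Dict[str, int]:
    """Dependency analysis: count transitive downstream consumers per task."""
    counts = {name: 0 for name in tasks}

    def upstream_closure(name: str, seen: set) -> set:
        for dep in tasks[name].deps:
            if dep not in seen:
                seen.add(dep)
                upstream_closure(dep, seen)
        return seen

    for name in tasks:
        for up in upstream_closure(name, set()):
            counts[up] += 1
    return counts

def schedule_order(tasks: Dict[str, Task]) -> List[str]:
    """Priority adjustment: boost tasks that many downstream jobs rely on."""
    boost = downstream_counts(tasks)
    return sorted(tasks,
                  key=lambda n: tasks[n].business_priority + boost[n],
                  reverse=True)

# Illustrative pipeline: ingestion feeds cleaning, which feeds a report.
tasks = {
    "ods_ingest": Task("ods_ingest", 1),
    "dwd_clean":  Task("dwd_clean", 2, deps=["ods_ingest"]),
    "ads_report": Task("ads_report", 5, deps=["dwd_clean"]),
}
print(schedule_order(tasks))
```

Note how the low-priority ingestion task is promoted relative to its submitted score because two downstream tasks depend on it, which is the essence of protecting core tasks.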
Master data governance tackles difficulty in identifying core data and poor consistency by defining a single authoritative source, improving data quality, integrating master data across systems, and providing services for data registration, lineage, and consumption.
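The "single authoritative source" idea can be sketched as a small resolution routine. Everything here is a hypothetical illustration: the entity types, source-system names, and the first-non-null fallback merge are assumptions, not the talk's actual implementation.

```python
from typing import Dict

# Illustrative registration table: which system is authoritative per entity type.
AUTHORITATIVE_SOURCE = {
    "merchant": "crm",
    "account":  "core_banking",
}

def resolve(entity_type: str, records_by_source: Dict[str, dict]) -> dict:
    """Return the authoritative system's record; otherwise merge the rest,
    letting the first non-missing value win for each field."""
    source = AUTHORITATIVE_SOURCE.get(entity_type)
    if source in records_by_source:
        return records_by_source[source]
    merged: dict = {}
    for rec in records_by_source.values():
        for key, value in rec.items():
            merged.setdefault(key, value)
    return merged

# The CRM record wins for merchants even if another system disagrees.
record = resolve("merchant", {
    "risk": {"merchant_id": "M1", "name": "Wing Cafe (old)"},
    "crm":  {"merchant_id": "M1", "name": "Wing Cafe"},
})
print(record["name"])
```

In a real deployment the registration table itself would live in the metadata platform, so lineage and consumption services can report who reads each master-data entity.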
The data‑standard system establishes basic, asset, security, and lineage metadata, plus derived metrics like query counts, to ensure consistent data quality and compliance.
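One way to picture the four metadata categories plus a derived metric is as fields on a single table-metadata record, with a compliance check over them. The field names, the "core" asset level, and the numeric security-tier convention below are illustrative assumptions, not the article's schema.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class TableMetadata:
    # basic metadata
    name: str
    owner: str
    # asset metadata ("core" vs "normal" is an assumed classification)
    asset_level: str = "normal"
    # security metadata (lower tier = more sensitive; assumed convention)
    security_tier: int = 3
    # lineage metadata
    upstream: List[str] = field(default_factory=list)
    # derived metric, e.g. rolling 7-day query count
    query_count_7d: int = 0

def is_compliant(meta: TableMetadata) -> bool:
    """Toy standard check: every table needs an owner, and core assets
    must carry a strict (tier 1 or 2) security classification."""
    if not meta.owner:
        return False
    if meta.asset_level == "core" and meta.security_tier > 2:
        return False
    return True
```

A governance job could sweep all registered tables through such a check and surface the violations as quality findings.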
The metadata platform follows a three‑layer architecture: storage (HBase for basic metadata, Elasticsearch for indexes, graph DB for lineage), service (query, lineage analysis, external platform APIs), and ingestion (plugins for various sources, message‑queue‑driven processing). It supports post‑hoc collection, blind writes, batch and real‑time updates, and full‑link field‑level lineage via hooks on processing engines.
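The message-queue-driven ingestion path can be sketched end to end with in-memory stand-ins for the three stores. This is a sketch under stated assumptions: the event shape, the `qualified_name` key, and the dict/list substitutes for HBase, Elasticsearch, and the graph database are all hypothetical.

```python
import json
import queue

# In-memory stand-ins for the storage layer (HBase / Elasticsearch / graph DB).
hbase_rows: dict = {}      # basic metadata, keyed like an HBase row key
es_index: dict = {}        # searchable fields
lineage_edges: list = []   # (upstream_field, downstream_field) graph edges

event_bus: "queue.Queue[str]" = queue.Queue()  # stands in for the message queue

def ingest(event: dict) -> None:
    """Route one metadata event to the three stores, as the ingestion
    plugins feeding the service layer would."""
    key = event["qualified_name"]
    hbase_rows[key] = event                          # basic metadata -> KV store
    es_index[key] = event.get("description", "")     # text fields -> search index
    for up in event.get("upstream", []):             # field-level lineage -> graph
        lineage_edges.append((up, key))

# A hook on a processing engine would emit events like this one:
event_bus.put(json.dumps({
    "qualified_name": "dw.orders.amount",
    "description": "order amount in cents",
    "upstream": ["ods.raw_orders.amt"],
}))

while not event_bus.empty():
    ingest(json.loads(event_bus.get()))
```

Because each event is self-describing and keyed by qualified name, replaying the queue supports both batch backfills and real-time updates, and out-of-order "blind writes" simply overwrite the row with the latest state.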
Future plans include managing heterogeneous multi‑source data (including unstructured assets), cross‑DC disaster recovery, and intelligent recommendation for faster metadata discovery.
A Q&A session clarifies the distinction between core and master data, task prioritization for offline jobs, manual identification of core tasks, data‑security governance components, and technical details of metadata collection and HBase storage.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, and regularly offering downloadable resource packs.