How Kuaishou Scaled Metadata Management for Big Data: Architecture & Lessons
This article traces how Kuaishou's metadata management evolved from an early Hive‑centric stage to a unified 2.0 platform. It covers the system architecture, key technologies, and challenges, and closes with the 3.0 vision for low‑code, automated, and intelligent data governance.
1 Kuaishou Metadata Management Background
Metadata is data that describes other data assets, for example table schemas, BI dashboards, datasets, and metric models. Kuaishou's metadata platform collects dozens of metadata types, including tables, metric models, A/B test tasks, and analysis dashboards.
Metadata management at Kuaishou evolved through three stages:
Initial stage. In 2017 the data platform was in its early days and the warehouse was built on Hive; the metadata platform mainly integrated Hive metadata and covered only a few types.
Platform 1.0. In 2019 business growth brought in Kafka, Druid, and other engines; the metadata platform expanded to cover multiple storage engines and added product capabilities such as data discovery.
Platform 2.0. In 2020 Kuaishou invested heavily in a data middle platform, adding real‑time and batch development, unified scheduling, data services, metric models, and BI. The metadata platform now collected metadata across the whole chain from production to consumption and offered richer capabilities such as lineage analysis and data governance.
2 Metadata Management 2.0 Architecture and Key Technologies
2.1 System Architecture
Kuaishou built the 2.0 platform around unification and proactiveness:
Unification: consolidate siloed components to improve efficiency.
Proactiveness: start from business value and build metadata services.
Challenges faced:
Business complexity: dozens of heterogeneous entities and relationships.
Massive scale: billions of entities with high incremental volume.
Collaboration overhead: cross‑department communication.
Diverse applications: many metadata use cases with high service‑quality requirements.
The platform adopts a “3+1” structure: three unified layers plus an application layer.
Unified Ingestion: an integrated workflow (ingest → parse → process → output) that supports both incremental and full loads of entities and lineage.
Unified Storage: JanusGraph + Atlas as the primary store, backed by HBase, with Elasticsearch for high‑performance queries.
Unified Services: grouped by service type into unified API, messaging, and data‑warehouse services.
Metadata Applications: products for data discovery, governance, and remediation.
2.2 Key Technologies
Core techniques include unified ingestion, unified storage, quality assurance, lineage analysis, and automated, metadata‑driven tiering.
Unified Ingestion
Standardized metadata ingestion eliminates siloed development, reuses base entity definitions, and streamlines ETL processing.
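The ingest → parse → process → output workflow can be sketched as below. This is only an illustration under stated assumptions: the `Entity` shape, the per-source parser functions, and `run_pipeline` are hypothetical names, not Kuaishou's actual interfaces. The point is that every source reuses one base entity definition and one pipeline, and only the parser differs.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Entity:
    """Hypothetical base entity definition shared by all sources."""
    type_name: str        # e.g. "hive_table", "kafka_topic"
    qualified_name: str   # unique key within a type
    attributes: tuple = ()


def parse_hive(raw: dict) -> Entity:
    # Source-specific parsing is the only part each integration writes.
    name = f"{raw['db']}.{raw['table']}"
    return Entity("hive_table", name, (("owner", raw.get("owner", "")),))


def parse_kafka(raw: dict) -> Entity:
    return Entity("kafka_topic", raw["topic"], (("cluster", raw.get("cluster", "")),))


PARSERS = {"hive": parse_hive, "kafka": parse_kafka}


def run_pipeline(source: str, records: Iterable[dict],
                 sink: Callable[[Entity], None]) -> int:
    """Run ingest -> parse -> process -> output; return entities written."""
    parser = PARSERS[source]
    seen = set()
    written = 0
    for raw in records:                       # ingest
        entity = parser(raw)                  # parse
        key = (entity.type_name, entity.qualified_name)
        if key in seen:                       # process: drop duplicates
            continue
        seen.add(key)
        sink(entity)                          # output
        written += 1
    return written
```

A full load and an incremental load would differ only in which `records` are passed in; the sink could be any callable, such as a list's `append` in a test or a writer for the metadata store.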
Unified Storage
With over 30 heterogeneous entity types and billions of records, Kuaishou stores metadata in Atlas + JanusGraph on HBase, adding Elasticsearch for hot queries.
Quality Assurance
Two mechanisms ensure high‑quality metadata:
Entity consistency: periodic full refreshes compare the stored metadata against the external source systems and apply tiered repair, prioritizing critical entities.
Lineage accuracy: SQL parsing via ANTLR builds table/field/point lineage; critical changes are version‑compared and auto‑attributed, and unresolved anomalies block the change.
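The entity-consistency check can be pictured as a diff between a source system and the metadata store, followed by repair in priority order. The sketch below is a simplified assumption of how such a check might look; `diff_entities`, `order_by_tier`, and the tier numbering (0 meaning most critical) are hypothetical and not taken from the original system.

```python
def diff_entities(source: dict, stored: dict) -> list:
    """Compare source-of-truth entities with stored metadata.

    Both arguments map an entity key to its attribute dict.
    Returns a list of (action, key) repair actions.
    """
    actions = []
    for key, attrs in source.items():
        if key not in stored:
            actions.append(("create", key))    # missing in the metadata store
        elif stored[key] != attrs:
            actions.append(("update", key))    # attributes drifted
    for key in stored:
        if key not in source:
            actions.append(("delete", key))    # stale entity left behind
    return actions


def order_by_tier(actions: list, tier_of: dict, default_tier: int = 9) -> list:
    """Repair critical entities first; a lower tier number is more critical."""
    return sorted(actions, key=lambda action: tier_of.get(action[1], default_tier))


source = {"a": {"cols": 1}, "b": {"cols": 2}, "c": {"cols": 3}}
stored = {"a": {"cols": 1}, "b": {"cols": 9}, "d": {"cols": 4}}
repairs = order_by_tier(diff_entities(source, stored), {"c": 0, "b": 2})
print(repairs)  # [('create', 'c'), ('update', 'b'), ('delete', 'd')]
```

Because `sorted` is stable, actions within the same tier keep the order in which the diff found them.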
Lineage Analysis
Global lineage enables impact analysis and fault tracing. Simple synchronous queries preview nearby upstream/downstream entities, while multi‑dimensional asynchronous analysis uses BFS traversal with pruning, decoupled from the graph engine for scalability.
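A BFS traversal with pruning, run outside the graph engine, might look like the following minimal sketch. It assumes the lineage edges have already been exported into a plain adjacency dict; the function name `downstream`, the `max_depth` limit, and the `keep` predicate are illustrative choices, not the platform's actual API.

```python
from collections import deque


def downstream(graph: dict, start: str, max_depth: int = 3,
               keep=lambda node: True) -> dict:
    """Breadth-first impact analysis over an adjacency dict.

    Returns {node: depth} for every reachable downstream node.
    Pruning: nodes rejected by keep() are neither reported nor expanded,
    and nodes at max_depth are reported but not expanded further.
    """
    depth = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depth[node] >= max_depth:
            continue                          # depth pruning
        for nxt in graph.get(node, ()):
            if nxt in depth or not keep(nxt):
                continue                      # already visited, or filtered out
            depth[nxt] = depth[node] + 1
            queue.append(nxt)
    del depth[start]                          # report only downstream nodes
    return depth


lineage = {
    "ods.a": ["dwd.b", "tmp.x"],
    "dwd.b": ["dws.c"],
    "tmp.x": ["dws.y"],
    "dws.c": ["ads.d"],
}
impact = downstream(lineage, "ods.a", max_depth=2,
                    keep=lambda n: not n.startswith("tmp."))
print(impact)  # {'dwd.b': 1, 'dws.c': 2}
```

The visited check also makes the traversal safe on cyclic lineage. Upstream fault tracing would use the same function on the reversed edge map.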
Metadata‑Driven Tiering
Automatic tiering classifies entities based on benchmark metadata, propagating levels upward to prioritize consistency and timeliness for high‑priority (P0) assets.
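One way to read "propagating levels upward" is that every upstream producer inherits the most critical level of anything it feeds. The sketch below follows that assumption; `propagate_tiers`, the `upstream_of` map, and the numeric tiers (0 standing for P0) are hypothetical, and the seed levels stand in for whatever benchmark metadata the platform actually uses.

```python
from collections import deque


def propagate_tiers(upstream_of: dict, seeds: dict) -> dict:
    """Propagate tiers from seeded assets to all of their upstream producers.

    upstream_of maps a node to its direct upstream nodes.
    seeds maps benchmark assets to a tier, where 0 is the most critical (P0).
    Each upstream node ends up with the most critical tier among its consumers.
    """
    tier = dict(seeds)
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        for parent in upstream_of.get(node, ()):
            # Only ever raise criticality, so the loop terminates on cycles too.
            if parent not in tier or tier[node] < tier[parent]:
                tier[parent] = tier[node]
                queue.append(parent)
    return tier


upstream_of = {
    "ads.report": ["dws.sum"],
    "ads.adhoc": ["dws.sum", "dwd.misc"],
    "dws.sum": ["ods.raw"],
}
tiers = propagate_tiers(upstream_of, {"ads.report": 0, "ads.adhoc": 2})
print(tiers["dws.sum"], tiers["ods.raw"], tiers["dwd.misc"])  # 0 0 2
```

Here `dws.sum` feeds both a P0 report and a lower-priority ad hoc table, so it and its own upstream `ods.raw` are treated as P0.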
3 Metadata‑Driven Asset Applications and Data Governance
Metadata supports three major scenarios: production efficiency, consumption efficiency, and cross‑domain data management.
3.1 Data Map
The data map offers three capabilities:
Searchable: indexed multi‑attribute search with a 90% hit rate.
Discoverable: curated business catalog in a tree structure.
Understandable: enriched context such as production lineage, usage cases, and sample data.
3.2 Asset Management
The platform manages over 20 asset types and provides three kinds of capability: managing assets, analyzing them, and mining them for insight.
3.3 Data Cost Governance
Metadata drives cost governance by linking resource billing to lifecycle stages, enabling both manual and automated policies such as lifecycle‑based cleanup and tiered storage.
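An automated lifecycle policy of this kind could be as simple as mapping idle time to an action. The thresholds, the action names, and the function `lifecycle_action` below are assumptions made for illustration; the source does not specify the actual rules.

```python
from datetime import date


def lifecycle_action(last_access: date, today: date,
                     cold_after_days: int = 90,
                     delete_after_days: int = 365) -> str:
    """Pick a storage action from how long an asset has been idle.

    Returns "delete" (lifecycle-based cleanup), "cold" (move to a cheaper
    storage tier), or "keep". Thresholds are hypothetical defaults.
    """
    idle_days = (today - last_access).days
    if idle_days >= delete_after_days:
        return "delete"
    if idle_days >= cold_after_days:
        return "cold"
    return "keep"


today = date(2024, 1, 1)
print(lifecycle_action(date(2023, 12, 1), today))  # keep
print(lifecycle_action(date(2023, 9, 1), today))   # cold
print(lifecycle_action(date(2022, 12, 1), today))  # delete
```

In practice the last-access date would come from usage metadata, and a manual policy could override the result for specific assets.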
4 Outlook for Metadata Management 3.0
Version 1.0 was manual, 2.0 introduced a unified platform, and 3.0 will focus on low‑code, automation, and intelligence to create a proactive metadata platform and enable smart data management.
Low‑code & Automation: automatic collection of all metadata without system‑level integration.
Intelligence: a metadata cloud connects data for on‑demand access, supporting intelligent scheduling optimization and global timeliness improvements.