
How Kuaishou Built a Standardized Data Governance Evaluation System

This article outlines Kuaishou’s approach to establishing a standardized data governance evaluation framework, detailing the challenges of large‑scale data management, the design of assessment metrics across model, quality, and cost dimensions, and the practical strategies and operational mechanisms used to improve data asset health and business value.

DataFunTalk

Kuaishou operates a massive data platform, and its daily decision‑making and analysis rely heavily on that data. As data volume and usage scenarios grow, problems such as rising costs and declining data quality emerge, so targeted governance is needed to simplify the data landscape and serve business needs efficiently.

The governance process faces four main challenges: standardizing and prioritizing work along a long data lineage, quantifying governance outcomes, shifting focus as governance moves through different stages (standardization, cost control, quality), and motivating teams that are already under heavy business pressure.

To address these, Kuaishou devised a standardized evaluation system that transforms the four challenges into a concrete assessment framework, enabling systematic measurement and improvement.

The overall implementation strategy includes four objectives: problem standardization, quantifiable governance, process‑oriented strategy, and operational levers. Tactics involve metadata‑driven governance, an asset‑health scoring model covering model, quality, cost, and service dimensions, a scoring strategy to translate diverse metrics into a unified percentage, and operational mechanisms to ensure execution.

**Asset health – Model**: Issues such as unavailable data, inconsistent metrics, and low retrieval efficiency are tackled by defining service goals, establishing model design standards, and instituting review groups to ensure models meet business requirements.

**Asset health – Quality**: Quality problems span the entire pipeline, from lack of awareness to insufficient monitoring. Solutions include front‑end metadata monitoring, proactive issue detection, and comprehensive monitoring rule coverage to reduce fault recurrence.

**Asset health – Cost**: High data volume, low cost awareness, and stale data lead to waste. Kuaishou optimizes the big‑data engine (compression, replica reduction, hot‑data handling), classifies assets into tiers (A1‑A3) with tailored storage and lifecycle policies, and implements quota management, cost billing, and governance ranking to drive cost efficiency.
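The tiered lifecycle idea can be illustrated with a minimal sketch. The tier names A1 to A3 come from the talk; everything else here is an assumption made for illustration, including the `TierPolicy` fields, the replica and retention values, the ordering of tiers by importance, and the `expired_partitions` helper. None of it is Kuaishou's actual configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierPolicy:
    """Storage and lifecycle settings for one asset tier (hypothetical fields)."""
    replicas: int
    retention_days: int
    compress_cold_data: bool


# Hypothetical policy values: A1 is assumed to be the most important tier.
POLICIES = {
    "A1": TierPolicy(replicas=3, retention_days=730, compress_cold_data=False),
    "A2": TierPolicy(replicas=3, retention_days=365, compress_cold_data=True),
    "A3": TierPolicy(replicas=2, retention_days=90, compress_cold_data=True),
}


def expired_partitions(tier, partition_ages_days):
    """Return the partition ages (in days) that exceed the tier's retention window."""
    policy = POLICIES[tier]
    return [age for age in partition_ages_days if age > policy.retention_days]


# Example: an A3 table with partitions that are 10, 100, and 400 days old.
stale = expired_partitions("A3", [10, 100, 400])
```

Keeping the policy in one lookup table means a change to a tier's retention or replica count applies to every asset in that tier without touching per-table settings.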

The scoring strategy unifies disparate metrics using min‑max normalization, coefficient‑of‑variation weighting, and manual adjustments that reflect stage‑specific priorities, creating a feedback loop in which governance actions translate directly into score improvements.
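A minimal sketch of how these three steps could fit together is shown here. The talk names the techniques but not their exact formulas, so the details are assumptions: the metric names, the use of population standard deviation, the multiplicative form of the manual adjustment, and the 0 to 100 scale are all illustrative.

```python
from statistics import mean, pstdev


def min_max_normalize(values, higher_is_better=True):
    """Scale values to [0, 1]; flip the scale for metrics where lower is better."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # No spread across assets: treat every asset as equally good (assumption).
        return [1.0] * len(values)
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled if higher_is_better else [1.0 - s for s in scaled]


def cv_weights(metrics):
    """Weight each metric by its coefficient of variation (std / |mean|)."""
    cvs = {}
    for name, values in metrics.items():
        m = mean(values)
        cvs[name] = pstdev(values) / abs(m) if m != 0 else 0.0
    total = sum(cvs.values())
    if total == 0:
        # Every metric is constant: fall back to equal weights.
        return {name: 1.0 / len(metrics) for name in metrics}
    return {name: cv / total for name, cv in cvs.items()}


def adjust_weights(weights, multipliers):
    """Apply manual stage-specific multipliers, then renormalize to sum to 1."""
    adjusted = {name: w * multipliers.get(name, 1.0) for name, w in weights.items()}
    total = sum(adjusted.values())
    return {name: w / total for name, w in adjusted.items()}


def health_scores(metrics, higher_is_better, multipliers=None):
    """Combine per-asset metrics into one score per asset on a 0-100 scale."""
    weights = adjust_weights(cv_weights(metrics), multipliers or {})
    normalized = {
        name: min_max_normalize(values, higher_is_better[name])
        for name, values in metrics.items()
    }
    n_assets = len(next(iter(normalized.values())))
    return [
        100.0 * sum(weights[name] * normalized[name][i] for name in metrics)
        for i in range(n_assets)
    ]


# Hypothetical metrics for three assets.
metrics = {"model_compliance": [0.5, 0.9, 0.7], "fault_count": [4, 0, 2]}
direction = {"model_compliance": True, "fault_count": False}
scores = health_scores(metrics, direction, multipliers={"fault_count": 1.5})
```

Coefficient-of-variation weighting gives more influence to metrics that actually differ between assets, while the multipliers let a team raise the weight of, say, quality during a quality-focused stage without redefining the metrics themselves.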

Operational mechanisms combine awareness campaigns, governance‑driving activities, incentives (recognition, rewards), and a blend of soft (training) and hard (access restrictions) enforcement to ensure sustained governance compliance.

Results show the data‑warehouse health index rising from 58 to 77 points, over 95% participation in governance, significant storage and compute cost savings, a >40% reduction in quality faults, and noticeably higher business satisfaction with data services.

Future plans focus on pre‑emptive governance, which embeds standards and checks into production, and on a one‑click, fully integrated governance platform to further reduce time and effort.

In summary, Kuaishou’s comprehensive data‑governance evaluation system and its operational rollout demonstrate measurable improvements in cost, quality, efficiency, and overall data value.
