Data Governance Practices and Logical Closed‑Loop at KuaiKan

This talk outlines KuaiKan's data governance journey: the challenges created by rapid business growth, the three‑step governance path, the logical closed‑loop framework, practical experience in business scope management, data asset governance, and collaboration, and a closing look at evaluation metrics, current limitations, and future plans.

DataFunTalk

Aug 13, 2022

Speaker : Qu Shichao, Head of Data Development at KuaiKan.

Introduction : KuaiKan, founded in 2014, has evolved from a comics platform into a Gen Z interest community with over 340 million users and 50 million monthly active users. That growth brought rapid data volume expansion and, with it, governance challenges.

Governance Background : Fast‑growing business lines put pressure on data construction. Data sources were scattered, data quality was low, and development was inefficient because modeling was not standardized.

Three‑Step Governance Path :

Step 1 – Start with a single core business line, the one with the most severe data quality issues, and solve its problems first.

Step 2 – Consolidate governance strategies from the first line and migrate them across other lines.

Step 3 – Replicate the MVP solution to other businesses, reducing redundant effort.

Logical Closed‑Loop : The framework consists of three parts – business scope management, data asset governance, and application feedback – forming a continuous improvement cycle.

Business Scope Management : Track business feature changes, manage indicator priorities, and maintain a knowledge‑base of process and data source changes to keep data teams aligned.

Data Asset Governance : Establish governance standards, develop platform tools (e.g., metadata, lineage, quality monitoring), and enforce end‑to‑end data pipeline norms from source to warehouse.

Collaboration Techniques : Align data product analysts, developers, and business owners, start with high‑impact core services, define MVP processes, and set unified output standards with regular retrospectives.

Summary & Outlook :

Effectiveness is measured by reduced indicator duplication, higher warehouse data reuse, and shorter development cycles.

Current limitations include slow tool iteration due to limited manpower and incomplete cross‑domain governance.

Future plans focus on incremental improvements, expanding lineage management, and strengthening data quality at the source.
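The talk names indicator duplication and warehouse data reuse as effectiveness measures but gives no formulas. As a rough illustration only, the sketch below assumes one possible definition of each: duplication as the share of indicator definitions that are redundant copies of another, and reuse as the share of warehouse tables read by at least two downstream jobs. The function names, input formats, and sample data are all hypothetical and are not KuaiKan's actual metrics.

```python
from collections import Counter


def indicator_duplication_rate(indicators):
    """Share of indicator definitions that are redundant copies of another.

    `indicators` maps an indicator name to a normalized definition string
    (a hypothetical format). A definition that appears k times counts as
    k - 1 duplicates.
    """
    if not indicators:
        return 0.0
    counts = Counter(indicators.values())
    duplicates = sum(n - 1 for n in counts.values())
    return duplicates / len(indicators)


def table_reuse_rate(downstream_refs, min_consumers=2):
    """Share of warehouse tables read by at least `min_consumers` distinct jobs.

    `downstream_refs` maps a table name to the list of jobs that read it
    (again a hypothetical format).
    """
    if not downstream_refs:
        return 0.0
    reused = sum(
        1 for consumers in downstream_refs.values()
        if len(set(consumers)) >= min_consumers
    )
    return reused / len(downstream_refs)


# Made-up sample data, for illustration only.
indicators = {
    "dau_app": "count(distinct user_id) where event='open'",
    "daily_active_users": "count(distinct user_id) where event='open'",
    "paid_orders": "count(order_id) where status='paid'",
    "read_chapters": "count(chapter_id) where event='read'",
}
refs = {
    "dwd_user_event": ["job_dau", "job_retention", "job_read"],
    "dwd_order": ["job_revenue"],
}

dup = indicator_duplication_rate(indicators)  # one redundant copy out of four
reuse = table_reuse_rate(refs)                # one of two tables has 2+ readers
print(f"duplication rate: {dup:.2f}, reuse rate: {reuse:.2f}")
```

Tracking both numbers before and after a governance round on one business line would give a simple trend to compare against development cycle time.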

Q&A : Data lineage is stored in MySQL. At the current scale of a few thousand tasks per day, this has not caused any performance issues so far.
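The talk does not describe the lineage schema, so the following is only a minimal sketch of how lineage could be kept in a relational store: an edge table with one row per upstream/downstream pair, queried with a recursive common table expression to find everything affected by a change. The table name `lineage_edge`, its columns, and the sample table names are assumptions. SQLite from the Python standard library stands in for MySQL so the example runs without a server; MySQL 8.0 also supports `WITH RECURSIVE`.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL lineage store.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE lineage_edge (
        upstream   TEXT NOT NULL,
        downstream TEXT NOT NULL,
        PRIMARY KEY (upstream, downstream)
    )
    """
)

# Hypothetical edges: each row says "downstream is built from upstream".
conn.executemany(
    "INSERT INTO lineage_edge VALUES (?, ?)",
    [
        ("ods_event_log", "dwd_user_event"),
        ("dwd_user_event", "dws_user_daily"),
        ("dws_user_daily", "ads_dau_report"),
        ("ods_order", "dwd_order"),
    ],
)


def downstream_tables(conn, table):
    """Return every table that directly or transitively depends on `table`."""
    rows = conn.execute(
        """
        WITH RECURSIVE impacted(name) AS (
            SELECT downstream FROM lineage_edge WHERE upstream = ?
            UNION  -- UNION (not UNION ALL) drops repeats, so cycles terminate
            SELECT e.downstream
            FROM lineage_edge AS e
            JOIN impacted AS i ON e.upstream = i.name
        )
        SELECT name FROM impacted ORDER BY name
        """,
        (table,),
    ).fetchall()
    return [row[0] for row in rows]


impacted = downstream_tables(conn, "ods_event_log")
print(impacted)
```

With a few thousand tasks the edge table stays small, which is consistent with the speaker's remark that a plain relational store is enough at this scale.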

Thank you for attending the session.
