Data Governance Practices and Logical Closed‑Loop at KuaiKan
This talk traces KuaiKan's data governance journey: the challenges created by rapid business growth, a three‑step governance path, and a logical closed‑loop framework. It then covers practical experience in business scope management, data asset governance, and cross‑team collaboration, and closes with evaluation metrics, current limitations, and future plans.
Speaker : Qu Shichao, Head of Data Development at KuaiKan.
Introduction : KuaiKan, founded in 2014, has evolved from a comic platform into a Gen Z interest community with over 340 million users and 50 million monthly active users. That growth brought rapid data volume expansion and, with it, governance challenges.
Governance Background : Fast‑growing business lines put pressure on data construction. Data sources were scattered, data quality was low, and development was inefficient because modeling was not standardized.
Three‑Step Governance Path :
Step 1 – Break through a single core business line with the most severe data quality issues.
Step 2 – Consolidate governance strategies from the first line and migrate them across other lines.
Step 3 – Replicate the MVP solution to other businesses, reducing redundant effort.
Logical Closed‑Loop : The framework consists of three parts – business scope management, data asset governance, and application feedback – forming a continuous improvement cycle.
Business Scope Management : Track business feature changes, manage indicator priorities, and maintain a knowledge‑base of process and data source changes to keep data teams aligned.
Data Asset Governance : Establish governance standards, develop platform tools for metadata, lineage, and quality monitoring, and enforce end‑to‑end data pipeline norms from source to warehouse.
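As an illustration of what a quality‑monitoring tool might check, here is a minimal sketch, not KuaiKan's actual tooling. The QualityRule class, the evaluate function, the rule names, and the thresholds are all hypothetical; the talk does not describe how its rules are defined.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]  # one record from a warehouse table (assumed shape)


@dataclass
class QualityRule:
    """A hypothetical row-level rule with a tolerated failure rate."""
    name: str
    check: Callable[[Row], bool]   # returns True when the row passes
    max_failure_rate: float        # fraction of failing rows still accepted


def evaluate(rows: Iterable[Row], rules: List[QualityRule]) -> Dict[str, dict]:
    """Apply every rule to every row and report the failure rate per rule."""
    rows = list(rows)
    report = {}
    for rule in rules:
        failures = sum(1 for row in rows if not rule.check(row))
        rate = failures / len(rows) if rows else 0.0
        report[rule.name] = {
            "failure_rate": rate,
            "passed": rate <= rule.max_failure_rate,
        }
    return report


# Example rules (illustrative only): a strict key check and a looser field check.
RULES = [
    QualityRule("user_id_not_null", lambda r: r.get("user_id") is not None, 0.0),
    QualityRule("event_non_empty", lambda r: bool(r.get("event")), 0.5),
]
```

Keeping a tolerated failure rate on each rule lets strict checks (such as key fields) block a pipeline while softer checks only raise a warning.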
Collaboration Techniques : Align data product analysts, developers, and business owners, start with high‑impact core services, define MVP processes, and set unified output standards with regular retrospectives.
Summary & Outlook :
Effectiveness is measured by reduced indicator duplication, higher warehouse data reuse, and shorter development cycles.
Current limitations include slow tool iteration due to limited manpower and incomplete cross‑domain governance.
Future plans focus on incremental improvements, expanding lineage management, and strengthening data quality at the source.
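The effectiveness metrics named in this summary, indicator duplication and warehouse data reuse, could be computed along the following lines. This is a sketch under stated assumptions: the talk gives no formulas, so the definitions here (indicators count as duplicates when their normalized definitions match; a table counts as reused when at least two downstream jobs read it) and all names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


def duplicate_indicator_groups(indicators: Dict[str, str]) -> List[List[str]]:
    """Group indicator names whose definitions match after normalization.

    Normalization here is only lowercasing and whitespace collapsing,
    an assumed and deliberately simple notion of "same definition".
    """
    groups = defaultdict(list)
    for name, definition in indicators.items():
        key = " ".join(definition.lower().split())
        groups[key].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


def reuse_rate(table_consumers: Dict[str, Set[str]]) -> float:
    """Fraction of warehouse tables read by at least two downstream jobs."""
    if not table_consumers:
        return 0.0
    reused = sum(1 for jobs in table_consumers.values() if len(jobs) >= 2)
    return reused / len(table_consumers)
```

Tracking both numbers over time, rather than at a single point, is what would show whether governance is reducing duplication and raising reuse.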
Q&A : Data lineage is stored in MySQL. At the current scale of a few thousand daily tasks, this has not caused performance issues.
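To show why a relational store can be enough at this scale, here is a minimal sketch of lineage kept as an edge table and queried with a recursive common table expression. The talk does not describe the real schema, so the lineage_edge table, its columns, and the table names in the usage note are hypothetical. The sketch uses Python's built‑in sqlite3 module as a stand‑in so that it runs without a server; MySQL 8.0 and later also support WITH RECURSIVE.

```python
import sqlite3
from typing import Iterable, List, Tuple


def build_lineage_db(edges: Iterable[Tuple[str, str]]) -> sqlite3.Connection:
    """Create an in-memory edge table: one row per upstream -> downstream link."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE lineage_edge ("
        " upstream TEXT NOT NULL,"
        " downstream TEXT NOT NULL,"
        " PRIMARY KEY (upstream, downstream))"
    )
    conn.executemany("INSERT INTO lineage_edge VALUES (?, ?)", edges)
    return conn


def downstream_of(conn: sqlite3.Connection, table: str) -> List[str]:
    """Return every table that depends on `table`, directly or transitively."""
    sql = """
        WITH RECURSIVE reach(name) AS (
            SELECT downstream FROM lineage_edge WHERE upstream = ?
            UNION  -- UNION (not UNION ALL) removes repeats, so cycles terminate
            SELECT e.downstream
            FROM lineage_edge AS e
            JOIN reach AS r ON e.upstream = r.name
        )
        SELECT name FROM reach ORDER BY name
    """
    return [row[0] for row in conn.execute(sql, (table,))]
```

With edges such as ods_events to dwd_events and dwd_events to dws_user_daily, calling downstream_of(conn, "ods_events") lists every table affected by a change at the source, which is the typical impact‑analysis query over lineage data.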
Thank you for attending the session.
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
