
Data Governance Practices at NetEase Cloud Music: Warehouse Overview, Data Standards, Event Tracking, and Asset Management

This article details NetEase Cloud Music's data governance journey, covering the challenges of massive and complex data, the design of a multi‑layered data warehouse, the establishment of data and event‑tracking standards, asset lifecycle management, and future automation plans.

DataFunTalk

Introduction – In the era of big data, NetEase Cloud Music recognized the value of data assets and began exploring data governance to improve data quality, standardization, and cost efficiency.

Music Data Warehouse Overview – The warehouse has to handle large data volumes, complex business scenarios, and legacy design debt, which together create challenges in data quality, compute control, and storage cost.

Data Standards – Governance starts with design and development standards: top‑level data modeling, domain segmentation (participants, services/products, facts, agreements, public data), layered architecture (ODS, DIM, DWD, DWS, application layer), and platform‑driven enforcement via the NetEase Data Platform.
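As an illustration of what platform-driven enforcement of the layered architecture could look like, here is a minimal sketch of a dependency check. It assumes that table names start with a layer prefix and that a table may only read from the same or a lower layer; the `ads` prefix for the application layer, the ranking of the layers, and the example table names are all hypothetical and do not come from the NetEase Data Platform.

```python
from typing import List, Optional

# Layers named in the article, ranked from lowest to highest.
# Assumptions: "ads" stands in for the application layer, and DIM and DWD
# share a rank so they may reference each other.
LAYER_RANK = {"ods": 0, "dim": 1, "dwd": 1, "dws": 2, "ads": 3}


def layer_of(table: str) -> Optional[str]:
    """Return the layer prefix of a table name, or None if it has none."""
    prefix = table.split("_", 1)[0].lower()
    return prefix if prefix in LAYER_RANK else None


def check_dependency(downstream: str, upstream: str) -> List[str]:
    """Return the rule violations for one 'downstream reads upstream' edge."""
    problems = []
    down, up = layer_of(downstream), layer_of(upstream)
    if down is None:
        problems.append(f"{downstream}: missing layer prefix")
    if up is None:
        problems.append(f"{upstream}: missing layer prefix")
    if down and up and LAYER_RANK[up] > LAYER_RANK[down]:
        problems.append(f"{downstream} reads from higher-layer table {upstream}")
    return problems


if __name__ == "__main__":
    # Hypothetical table names, used only to exercise the check.
    print(check_dependency("dws_user_play_1d", "dwd_play_log"))
    print(check_dependency("dwd_play_log", "dws_user_play_1d"))
```

A check like this would typically run when a task is submitted, so that a violation blocks the change instead of being found later in an audit.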

Event‑Tracking Governance – Issues such as inconsistent field definitions, low data quality, and manual implementation are addressed by a unified SDK, object‑oriented event models, standardized JSON payloads, and a workflow that integrates planning, front‑end configuration, SDK reporting, QA validation, and production deployment.
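The QA validation step for standardized JSON payloads can be sketched as a small schema check. The article does not publish the payload schema, so the field names and types below (`event`, `object_id`, `page`, `ts`, `params`) are assumptions chosen only to show the idea.

```python
import json

# Hypothetical required fields and their JSON types; not the real schema.
REQUIRED_FIELDS = {
    "event": str,      # action name, for example "click" or "impression"
    "object_id": str,  # identifier of the UI object the event belongs to
    "page": str,       # page on which the object is rendered
    "ts": int,         # client timestamp in milliseconds
    "params": dict,    # free-form business parameters
}


def validate_event(raw: str) -> list:
    """Return a list of problems found in one JSON-encoded event."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(payload, dict):
        return ["payload must be a JSON object"]
    problems = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in payload:
            problems.append(f"missing field: {name}")
        elif type(payload[name]) is not expected:
            # Exact type match, so that a JSON boolean is not accepted as an int.
            problems.append(f"wrong type for {name}: expected {expected.__name__}")
    return problems


if __name__ == "__main__":
    sample = json.dumps({
        "event": "click", "object_id": "btn_play", "page": "playlist",
        "ts": 1700000000000, "params": {"song_id": "123"},
    })
    print(validate_event(sample))                 # no problems expected
    print(validate_event('{"event": "click"}'))   # lists the missing fields
```

Running the same check in QA and again at ingestion would let malformed events be rejected before they reach the warehouse.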

Asset Governance – Asset governance aims to reduce compute and storage costs through data‑flow governance, task decomposition, lifecycle management, and automated cleanup; the examples presented show significant reductions in task runtime and storage usage.
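Lifecycle management with automated cleanup can be illustrated by a retention-based partition plan. This is a minimal sketch assuming date-partitioned tables and a per-table retention policy; the table names, the retention values, and the rule that tables without a policy are never cleaned are all hypothetical.

```python
from datetime import date, timedelta

# Hypothetical retention policy in days, keyed by table name.
RETENTION_DAYS = {"ods_play_log": 30, "dws_user_play_1d": 365}


def expired_partitions(partitions, retention_days, today):
    """Return the partition dates (YYYY-MM-DD) older than the retention window."""
    cutoff = today - timedelta(days=retention_days)
    return sorted(p for p in partitions if date.fromisoformat(p) < cutoff)


def cleanup_plan(catalog, today):
    """Map each table with a retention policy to the partitions it can drop."""
    plan = {}
    for table, partitions in catalog.items():
        days = RETENTION_DAYS.get(table)
        if days is None:
            continue  # no policy registered: leave the table untouched
        old = expired_partitions(partitions, days, today)
        if old:
            plan[table] = old
    return plan


if __name__ == "__main__":
    catalog = {
        "ods_play_log": ["2023-11-01", "2023-12-20", "2024-01-09"],
        "tmp_adhoc": ["2020-01-01"],
    }
    print(cleanup_plan(catalog, date(2024, 1, 10)))
```

Producing a plan first, instead of deleting directly, leaves room for a review or dry-run step before any storage is reclaimed.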

Outlook – Future goals include visualizing data assets, static code analysis with performance alerts, and a health‑score system to guide efficient data production and usage.

Q&A Highlights – Governance has produced standardized data models and improved event tracking, and more than 100 redundant reports have been decommissioned, saving more than 1 PB of storage.

Tags: Big Data, Data Warehouse, event tracking, data governance, Asset Management, Cloud Music