
User Segmentation Data‑Driven Operations Intelligent Decision Engine: Data Development Practice at NetEase Cloud Music

This article details the design, challenges, and implementation of NetEase Cloud Music's user‑segmentation data‑driven operation decision engine, covering project background, product architecture, data‑warehouse responsibilities, development workflow, optimization strategies, and the resulting performance and future outlook.

DataFunSummit

01 Project Background

The "User Segmentation Data‑Driven Operations Intelligent Decision Engine" (code‑named "Nuo‑ren") was launched in late August to drive user growth for NetEase Cloud Music, focusing on three goals: new‑user acquisition, existing‑user activation, and churn recovery.

2. Product Flow

We decompose the product flow into user layering, business-status assessment, gap analysis, and cohort strategy. The strategy combines user activity, consumption, and production data to define targeted actions such as content or benefit outreach, with precise matching of users, items, and contexts.
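The layering and gap-analysis steps can be sketched in a few lines. This is a minimal illustration only: the layer names, the 30-day window, and the activity thresholds below are assumptions, not the platform's actual rules.

```python
from dataclasses import dataclass

# Hypothetical activity layers and minimum 30-day active-day thresholds.
LAYERS = [("churned", 0), ("low_active", 1), ("mid_active", 8), ("high_active", 21)]

@dataclass
class UserProfile:
    """One user row combining activity, consumption, and production signals."""
    user_id: str
    active_days_30d: int   # activity
    paid_orders_30d: int   # consumption
    uploads_30d: int       # production

def layer_of(active_days):
    """User layering: map a 30-day active-day count onto a layer name."""
    name = LAYERS[0][0]
    for layer, threshold in LAYERS:
        if active_days >= threshold:
            name = layer
    return name

def gap_to_next_layer(active_days):
    """Gap analysis: how many more active days lift the user one layer up."""
    for layer, threshold in LAYERS:
        if active_days < threshold:
            return threshold - active_days
    return 0  # already in the top layer
```

A cohort strategy would then group users by `layer_of(...)` and attach the content or benefit action appropriate to that layer's gap.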

3. Product Architecture

The architecture consists of a data input layer, an intelligent decision layer, a delivery channel layer (currently push‑based), and a data feedback layer. Multiple teams (data product, data development, front‑end/back‑end, testing) collaborate on the system.

4. Data‑Warehouse Responsibilities

The data warehouse serves as the lowest‑level data input layer, providing stable, fast data streams to the Nuo‑ren platform for continuous strategy iteration.

02 Project Challenges

1. Data‑Warehouse Role

Daily strategy data (user profiles, content matching, scenario data) are generated and fed to Nuo‑ren for user group selection, strategy matching, and one‑click delivery. The warehouse also handles full‑link effect analysis and feedback for strategy monitoring and optimization.

2. Challenges

Business complexity: multiple user identities (consumers, creators, artists, fans), diverse content (songs, playlists, videos, podcasts), and varied behavior (social interaction, playback) across many time windows.

Strategy complexity: need for both statistical/predictive tags and fine‑grained scenario data to support diverse copy.

3. Data‑Warehouse Challenges

Functional: clear metric definitions, data quality (validity, consistency, completeness), and strict interface standards.

Non‑functional: stable, scalable architecture; strict timeliness (daily scheduled pushes); resource‑cost management.

03 Project Solution

1. Prerequisites – Data Middle‑Platform & Standard System

All development revolves around NetEase Cloud Music's full‑link data middle‑platform, which includes a proprietary big‑data storage and compute platform, standardized data construction, cost‑effective data‑product tools, CI/CD pipelines, and user‑facing OLAP/Easyfetch tools.

We adopt a dimensional‑modeling approach, building independent layers for content and user domains, aggregating lightweight metrics at the DWS layer, and creating wide tables for downstream analysis.
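The layering described above can be illustrated with toy in-memory data. The table and field names (`dwd_play_log`, `dim_item`, `plays`) are invented for the sketch; in production these would be tables on the big-data platform, not Python lists.

```python
from collections import defaultdict

# Toy DWD-level fact rows for the user domain: (user_id, item_id, plays).
dwd_play_log = [
    ("u1", "song_a", 3),
    ("u1", "song_b", 1),
    ("u2", "song_a", 5),
]

# Toy dimension table for the content domain.
dim_item = {"song_a": {"genre": "pop"}, "song_b": {"genre": "rock"}}

def build_dws_user_day(events):
    """Aggregate lightweight per-user metrics at the DWS layer."""
    agg = defaultdict(lambda: {"plays": 0, "items": set()})
    for user_id, item_id, plays in events:
        agg[user_id]["plays"] += plays
        agg[user_id]["items"].add(item_id)
    return {u: {"plays": m["plays"], "distinct_items": len(m["items"])}
            for u, m in agg.items()}

def build_wide_table(events, dims):
    """Join fact rows with dimension attributes into wide-table rows
    for downstream analysis."""
    return [{"user_id": u, "item_id": i, "plays": p, **dims[i]}
            for u, i, p in events]
```

The same shape holds at scale: facts stay in the user and content domains, DWS keeps lightweight aggregates, and the wide table is a denormalized join for consumers.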

2. Data‑Development Process & Mechanism

From requirement analysis to CDM layer delivery, the workflow includes data research, bus matrix design, model review, testing, quality monitoring, scheduling, and operation, leveraging CI/CD tools for quality assurance and efficiency.
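The quality-monitoring step can be sketched as a small gate run before a table is published. The check names mirror the dimensions mentioned earlier (validity, consistency, completeness); the thresholds and the `user_id` key are illustrative assumptions.

```python
def check_quality(rows, required_fields, expected_min_rows):
    """Minimal data-quality gate: completeness (row count), validity
    (non-null required fields), and consistency (unique primary key).
    Returns a list of issue descriptions; empty means the gate passes."""
    issues = []
    if len(rows) < expected_min_rows:
        issues.append(f"completeness: {len(rows)} rows < {expected_min_rows}")
    seen = set()
    for row in rows:
        for field in required_fields:
            if row.get(field) in (None, ""):
                issues.append(f"validity: empty {field!r} in {row}")
        key = row.get("user_id")
        if key in seen:
            issues.append(f"consistency: duplicate key {key!r}")
        seen.add(key)
    return issues
```

In a CI/CD pipeline such a gate would run as a post-task check, blocking downstream scheduling when issues are non-empty.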

3. Data‑Warehouse Optimization & Assurance

Cost‑reduction: plug‑in strategy packaging, effect‑based resource release.

Task optimization: dependency reduction, schedule adjustment, node‑level independence, SQL and engine tuning.

Model optimization: partitioning large tables, decoupling heavy models for faster output.

Non‑functional ops: baseline‑level operation, intelligent alerting, acceleration pools, visual monitoring for end‑to‑end stability.
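Two of the optimizations above, daily partitioning and dependency reduction, can be sketched compactly. The path layout and task names are hypothetical; the talk does not specify the real scheduler or storage format.

```python
import datetime

def daily_partition_path(table, ds):
    """Partition a large table by day so downstream tasks read only the
    partitions they need instead of scanning the whole table."""
    return f"warehouse/{table}/ds={ds:%Y-%m-%d}"

def prune_dependencies(task_deps, target):
    """Dependency reduction: keep only the upstream tasks the target output
    actually needs, dropping unrelated nodes from the schedule."""
    kept, stack = set(), [target]
    while stack:
        task = stack.pop()
        if task in kept:
            continue
        kept.add(task)
        stack.extend(task_deps.get(task, ()))
    return kept
```

Pruning the dependency graph this way shortens the critical path, which is what makes the schedule adjustments and node-level independence above pay off.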

04 Project Outcomes

1. Overall Data Flow Architecture

The final pipeline shows clear separation of user‑profile metrics, dimension layers, and fact tables, forming a stable, well‑structured data model.

2. Production Timeliness

Daily data delivery is kept within defined windows; early‑stage fluctuations were mitigated through CDM‑ and data‑mart‑layer optimizations, and baseline operations further stabilized output timing.
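A baseline check of this kind can be sketched as follows. The 09:00 deadline is an invented example; a real system would page on-call rather than return a string.

```python
import datetime

# Hypothetical baseline: daily strategy data must land before the push window.
DEADLINE = datetime.time(hour=9, minute=0)

def check_baseline(finished_at, deadline=DEADLINE):
    """Return an alert message if a task finished after its baseline
    deadline, else None."""
    if finished_at > deadline:
        today = datetime.date.today()
        late = (datetime.datetime.combine(today, finished_at)
                - datetime.datetime.combine(today, deadline))
        return f"baseline breach: finished {late} past {deadline}"
    return None
```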

3. Delivery Effectiveness

Push performance ranks: social interaction > asset change > platform reminder > content recommendation, achieving up to 3% click‑through rates with delivery volumes ranging from tens of thousands to millions.
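The channel ranking reduces to a per-channel click-through-rate computation. The send and click counts below are made up purely to reproduce the reported ordering and the roughly 3% ceiling; they are not the actual figures.

```python
def ctr_by_channel(stats):
    """Compute click-through rate per push channel from (sends, clicks)
    pairs and rank channels best-first."""
    rates = {ch: clicks / sends for ch, (sends, clicks) in stats.items() if sends}
    return sorted(rates.items(), key=lambda kv: kv[1], reverse=True)

# Illustrative numbers only, chosen to match the reported ranking.
stats = {
    "social_interaction": (100_000, 3_000),        # 3.0% CTR
    "asset_change": (200_000, 4_000),              # 2.0% CTR
    "platform_reminder": (500_000, 5_000),         # 1.0% CTR
    "content_recommendation": (1_000_000, 5_000),  # 0.5% CTR
}
```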

4. Summary of Achievements

Standardized data system guiding development.

Rigorous end‑to‑end R&D process.

Extensive compute optimizations (dependency, engine, SQL).

Baseline operation ensuring reliable production.

DataOps tooling improving quality and efficiency.

5. Future Outlook

Strengthen baseline operations.

Iterate product features to broaden strategy coverage.

Enhance data service capabilities and asset governance.

Thank you for listening.

Tags: Big Data · user segmentation · data warehouse · cloud music · DataOps · intelligent decision engine
Written by

DataFunSummit

Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.
