Building and Evolving a Data Service Platform for NetEase Cloud Music
This article describes how NetEase Cloud Music co-built a unified data service platform with NetEase YouShu. It covers the platform's architecture, its phased development from internal use to high-concurrency online services, feature enhancements such as an API marketplace, multi-source support, and parameter conversion, and the roadmap toward broader data products.
Data service, as the top layer of a unified data middle platform, provides data‑warehouse information as services, shielding storage and computation details, simplifying usage, avoiding siloed construction, and improving API development efficiency and data utilization. NetEase Cloud Music's data team co‑built this platform with NetEase YouShu and applied it in real scenarios.
Data Service Platform Overview
The platform sits above the data warehouse. It exposes unified APIs that read from online stores (HBase, Redis, DDB, etc.), which are populated with data synchronized from Hive and processed by the real-time and offline compute layers. The data map, metadata center, and data quality center manage that data.
The development platform integrates API creation, testing, deployment, and invocation in a one‑stop solution and is incorporated into the NetEase YouShu big‑data suite.
Users are divided into two roles: service callers (business, front‑end, algorithm teams) who discover APIs in the API marketplace, view details, and request permissions; and service creators (data developers) who rapidly generate APIs via configuration or simple SQL, bind resources, test, publish, and set up alarm monitoring, reducing a previously Java‑code‑heavy workflow to minutes.
Phase 1: Internal Scenario Use and Functional Enrichment
Starting in Q4 2020, V1 of the platform was built on the open-source Kong gateway and covered API creation, testing, and publishing. A management service handled data source, API, and application authorization, as well as service governance and monitoring.
APIs were accessed through Kong domains, decoupled from the platform after publishing, and initially served internal data‑reporting and product systems with low call volume.
Key enhancements included:
API Marketplace for browsing and requesting APIs.
Support for additional data sources (ClickHouse, Druid, Kylin, Elasticsearch) via JDBC, integrated with the metadata center.
Custom input‑parameter conversion and optional return formats using Java code, hiding storage details such as HBase rowkeys.
Implementation of code generation, dynamic compilation, and reflection for function parameters.
UDF support through uploaded JAR packages.
MyBatis‑style dynamic SQL for conditional queries.
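Two of the enhancements above, parameter conversion that hides rowkeys and MyBatis-style conditional SQL, can be illustrated with a minimal sketch. It is written in Python for brevity, whereas the platform itself uses Java for these functions. The rowkey layout (a two-character MD5 prefix plus the user ID), the function names `to_rowkey` and `render_query`, and the table and column names are all assumptions made for illustration, not the platform's actual design.

```python
import hashlib


def to_rowkey(user_id: str) -> str:
    """Convert a caller-facing user_id into a storage rowkey.

    Hypothetical layout: 2-hex-char MD5 prefix + '_' + user_id.
    Callers only ever pass user_id; the rowkey scheme stays internal.
    """
    prefix = hashlib.md5(user_id.encode("utf-8")).hexdigest()[:2]
    return f"{prefix}_{user_id}"


def render_query(table: str, filters: dict) -> tuple:
    """Build a parameterized SELECT, skipping filters whose value is None.

    Mirrors the idea of MyBatis <if test="..."> inside <where>:
    a condition appears only when its parameter was supplied.
    """
    clauses, values = [], []
    for column, value in filters.items():
        if value is None:  # parameter not supplied: omit the condition
            continue
        clauses.append(f"{column} = ?")
        values.append(value)
    sql = f"SELECT * FROM {table}"
    if clauses:  # emit WHERE only when at least one condition remains
        sql += " WHERE " + " AND ".join(clauses)
    return sql, values
```

With this sketch, `render_query("song_stats", {"artist_id": 7, "dt": None})` produces a query with a single `artist_id = ?` condition, and values are bound separately rather than concatenated into the SQL text.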
Result of Phase 1: over 30 APIs, more than 10 callers, all deployed in a single resource group behind Kong.
Phase 2: Online Data Service and Platform Stability
From Q1 2021, the platform expanded to online scenarios requiring high concurrency, low latency, and integration with Cloud Music's backend stack. Stability features such as service governance, monitoring, and automated testing were added.
3.1 Support for Batch HTTP, RPC Frameworks
An ingress layer was introduced to decouple various HTTP gateways and RPC protocols from the core service layer, allowing easy addition of new protocols (e.g., Dubbo) and enabling per‑API caching via SpringBoot.
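The decoupling described here can be sketched roughly as follows: protocol-specific adapters translate requests into one protocol-agnostic call, and the ingress applies a per-API cache before reaching the core. This is a Python stand-in, not the platform's SpringBoot implementation; the class names (`CoreQueryService`, `CachingIngress`), the adapter functions, the TTL-based cache policy, and the `user_tags` API are hypothetical.

```python
import json
import time
from typing import Callable, Dict, Tuple


class CoreQueryService:
    """Protocol-agnostic core: takes an API name and params, returns a dict."""

    def __init__(self, handlers: Dict[str, Callable[[dict], dict]]):
        self.handlers = handlers
        self.calls = 0  # counts real backend executions

    def query(self, api: str, params: dict) -> dict:
        self.calls += 1
        return self.handlers[api](params)


class CachingIngress:
    """Ingress that fronts the core and applies a per-API TTL cache."""

    def __init__(self, core, ttl_by_api: Dict[str, float], clock=time.monotonic):
        self.core = core
        self.ttl_by_api = ttl_by_api  # APIs absent from this map are never cached
        self.clock = clock
        self._cache: Dict[Tuple[str, str], Tuple[float, dict]] = {}

    def call(self, api: str, params: dict) -> dict:
        ttl = self.ttl_by_api.get(api)
        if ttl is None:
            return self.core.query(api, params)
        key = (api, json.dumps(params, sort_keys=True))
        now = self.clock()
        hit = self._cache.get(key)
        if hit is not None and hit[0] > now:  # entry exists and has not expired
            return hit[1]
        result = self.core.query(api, params)
        self._cache[key] = (now + ttl, result)
        return result


def handle_http(ingress, path: str, body: str) -> str:
    """HTTP-style adapter: last path segment is the API name, body is JSON."""
    api = path.rsplit("/", 1)[-1]
    return json.dumps(ingress.call(api, json.loads(body)))


def handle_rpc(ingress, method: str, **kwargs) -> dict:
    """RPC-style adapter: method name is the API name, kwargs are the params."""
    return ingress.call(method, kwargs)
```

Because both adapters funnel into `CachingIngress.call`, supporting another protocol in this sketch means writing one more small adapter function, and the core service is untouched.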
3.2 Weak Dependency Design
The query service now retrieves API and data‑source metadata asynchronously from the management service and caches it in memory, reducing latency and making the query service tolerant to management‑service outages.
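A minimal sketch of this weak-dependency pattern, assuming a periodic background refresh and a last-known-good snapshot held in memory. The `MetadataCache` class, its `fetch` callback (standing in for a call to the management service), and the refresh interval are hypothetical; the source does not describe the actual refresh mechanism.

```python
import threading
from typing import Callable, Dict, Optional


class MetadataCache:
    """In-memory metadata snapshot that survives management-service outages."""

    def __init__(self, fetch: Callable[[], Dict[str, dict]]):
        self._fetch = fetch  # hypothetical call to the management service
        self._snapshot: Dict[str, dict] = {}
        self._lock = threading.Lock()
        self._stop = threading.Event()

    def refresh(self) -> bool:
        """Pull fresh metadata; on failure keep serving the last good snapshot."""
        try:
            fresh = self._fetch()
        except Exception:
            return False  # management service unavailable: snapshot unchanged
        with self._lock:
            self._snapshot = fresh
        return True

    def get(self, api: str) -> Optional[dict]:
        """Read path used by queries: memory only, no remote call."""
        with self._lock:
            return self._snapshot.get(api)

    def start(self, interval_s: float) -> threading.Thread:
        """Refresh in a daemon thread so queries never wait on the fetch."""
        def loop():
            # Event.wait returns False on timeout, True once stop() is called.
            while not self._stop.wait(interval_s):
                self.refresh()

        thread = threading.Thread(target=loop, daemon=True)
        thread.start()
        return thread

    def stop(self) -> None:
        self._stop.set()
```

In this sketch, a failed refresh leaves the previous snapshot in place, so lookups keep succeeding with slightly stale metadata while the management service is down.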
3.3 Resource Isolation and Independent Deployment
Online services are deployed in dedicated clusters and resource groups, with separate storage clusters (e.g., a dedicated HBase cluster for the user-profile service) to avoid interference with other workloads.
3.4 Service Governance and Monitoring
Integration with internal tools provides operational support: Mlog, the unified governance platform, the alarm system, GOAPI for inspection, and NPT for pressure testing. These tools are not yet embedded in the data-service UI.
Result of Phase 2: the user-profile online service, which exposes tags via RPC, supports multiple business scenarios (membership activities, anniversary events, recommendation playlists) with a peak QPS of 89,000 and an average response time of around 3 ms.
Future Outlook
Planned work includes building more generic data services (song and artist profiles), API versioning and rolling releases, unified multi-gateway governance with fine-grained monitoring, an online function studio for UDF editing, and moving resources onto cloud-native infrastructure.
Author Bio
Liu Yuan, senior data development engineer at NetEase Cloud Music, focuses on data service construction and its application within the platform.
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.