
DataOps 2.0: Integrated Data Development and Governance Practices at NetEase

This article recounts NetEase's presentation at the inaugural DataOps conference. It covers the evolution from the DataOps 1.0 pipeline to the DataOps 2.0 model that integrates data development with governance, the challenges faced along the way, the practical solutions adopted, and strategic advice for data managers.

DevOps Cloud Academy

To further promote the value of DataOps and foster industry practice, the first DataOps Conference was held on April 20, 2023 in Shanghai, organized by the China Information and Communication Research Institute, China Communications Standardization Association, and Shanghai Economic and Information Commission.

At the conference, NetEase Vice President and NetEase DataFan General Manager Wang Yuan delivered a keynote titled “NetEase’s Integrated Data Development and Governance Practice Based on DataOps”.

He opened with NetEase's history: the Hangzhou Research Institute was founded in 2006 and worked early on distributed databases, followed by big-data platforms from 2014 and BI products from 2015. In 2019 the company proposed the concept of "data productivity", which covers data technology, assets, applications, and operations.

The talk described three core methodologies (DataOps, DataFusion, and DataProduct) that together form a complete big-data product matrix. The matrix includes the underlying NDH platform, full-lifecycle data development, and various upper-layer modules.

DataOps 1.0 (2019-2020) focused on building a pipeline that carries data development through testing to production. It addressed missing task dependencies, the lack of automated testing, and insufficient release control, problems that had once caused asset loss during an e-commerce event.
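To make the "missing task dependencies" problem concrete, here is a minimal Python sketch of one way such gaps could be detected before release. It is an illustration under stated assumptions, not NetEase's implementation: the `Task` structure, the table names, and the `missing_dependencies` helper are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical description of a scheduled data task."""
    name: str
    reads: set[str]    # tables the task reads
    writes: set[str]   # tables the task produces
    depends_on: set[str] = field(default_factory=set)  # declared upstream tasks


def missing_dependencies(tasks: list[Task]) -> dict[str, set[str]]:
    """Return, per task, producer tasks it reads from but does not declare."""
    # Map each table to the task that writes it.
    producer = {table: t.name for t in tasks for table in t.writes}
    missing: dict[str, set[str]] = {}
    for t in tasks:
        # Upstream tasks implied by the tables this task reads.
        needed = {producer[tbl] for tbl in t.reads if tbl in producer} - {t.name}
        gap = needed - t.depends_on
        if gap:
            missing[t.name] = gap
    return missing


tasks = [
    Task("load_orders", reads={"ods.orders_raw"}, writes={"dwd.orders"}),
    # This task reads dwd.orders but never declares load_orders as upstream.
    Task("daily_gmv", reads={"dwd.orders"}, writes={"ads.daily_gmv"}),
]

print(missing_dependencies(tasks))  # {'daily_gmv': {'load_orders'}}
```

A check like this could run as part of the pipeline so that an undeclared dependency blocks a release instead of surfacing as a data error in production.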

The pipeline introduced code review, testing, and release approval, which reduced bugs. Three main challenges remained: a lack of standards (for example, naming and security), siloed development, and inconsistent quality rules.
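The three controls named above can be pictured as gates that a change must clear before it reaches production. The following Python sketch is only an illustration; the `ReleaseCandidate` fields and the gate names are assumptions, not the structure of NetEase's pipeline.

```python
from dataclasses import dataclass


@dataclass
class ReleaseCandidate:
    """Hypothetical record of a task version waiting to be released."""
    task: str
    review_approved: bool   # code review finished and approved
    tests_passed: bool      # automated tests passed
    release_approved: bool  # release approval granted


def blocking_gates(rc: ReleaseCandidate) -> list[str]:
    """List the gates that still block this release, in pipeline order."""
    gates = [
        ("code review", rc.review_approved),
        ("testing", rc.tests_passed),
        ("release approval", rc.release_approved),
    ]
    return [name for name, ok in gates if not ok]


def can_release(rc: ReleaseCandidate) -> bool:
    """A candidate is releasable only when no gate is blocking."""
    return not blocking_gates(rc)


candidate = ReleaseCandidate(
    task="daily_gmv",
    review_approved=True,
    tests_passed=False,
    release_approved=False,
)
print(can_release(candidate))     # False
print(blocking_gates(candidate))  # ['testing', 'release approval']
```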

To overcome these, NetEase evolved to DataOps 2.0, which integrates data development and governance. The key principles are "design first, develop later" and "standards first, modeling later". Standards are embedded into design and then drive downstream development, testing, and continuous quality and security monitoring.
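One way to read "standards first, modeling later" is that a table design is checked against a shared dictionary of data elements before any development starts. The Python sketch below illustrates that idea under stated assumptions: the `DataElement` and `Column` types, the security levels `L1` to `L3`, and the sample standards are hypothetical and do not come from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElement:
    """Hypothetical entry in a data standard dictionary."""
    name: str
    dtype: str
    security_level: str


@dataclass(frozen=True)
class Column:
    """A column proposed in a table design."""
    name: str
    dtype: str
    security_level: str


# Assumed standards; a real dictionary would be maintained by governance owners.
STANDARDS = {
    e.name: e
    for e in [
        DataElement("user_id", "BIGINT", "L2"),
        DataElement("order_amount", "DECIMAL(18,2)", "L1"),
        DataElement("phone_number", "STRING", "L3"),
    ]
}


def check_design(columns: list[Column],
                 standards: dict[str, DataElement] = STANDARDS) -> list[str]:
    """Return human-readable violations of the standard for a table design."""
    violations = []
    for col in columns:
        std = standards.get(col.name)
        if std is None:
            violations.append(f"{col.name}: not defined in the data standard")
            continue
        if col.dtype != std.dtype:
            violations.append(
                f"{col.name}: type {col.dtype}, standard requires {std.dtype}")
        if col.security_level != std.security_level:
            violations.append(
                f"{col.name}: security level {col.security_level}, "
                f"standard requires {std.security_level}")
    return violations


design = [
    Column("user_id", "BIGINT", "L2"),       # conforms
    Column("phone_number", "STRING", "L1"),  # security level too low
    Column("amt", "DOUBLE", "L1"),           # name not in the standard
]
for violation in check_design(design):
    print(violation)
```

Because the check runs on the design rather than on finished code, naming and security problems are caught before they spread into downstream tables.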

Integrated development‑governance yields full‑lifecycle metadata (technical, business, management), automatically generated during design, development, and consumption, helping enterprises achieve discoverability, understandability, trustworthiness, and manageability of data assets.

NetEase’s internal results include an 80% field‑standardization rate and improved efficiency through reusable common layers. External case studies, such as a securities firm, also showed gains in data standards, quality, and security.
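The source does not say how the field-standardization rate is calculated. A plausible reading, shown here purely as an assumption, is the share of fields whose names map to an entry in the data standard. The function name and the sample fields are hypothetical.

```python
from collections.abc import Iterable


def field_standardization_rate(field_names: Iterable[str],
                               standard_names: Iterable[str]) -> float:
    """Share of fields that match a standard data element (assumed definition)."""
    fields = list(field_names)
    if not fields:
        return 0.0  # avoid dividing by zero for an empty catalog
    standard = set(standard_names)
    matched = sum(1 for name in fields if name in standard)
    return matched / len(fields)


fields = ["user_id", "order_amount", "phone_number", "order_time", "amt"]
standard = {"user_id", "order_amount", "phone_number", "order_time"}

rate = field_standardization_rate(fields, standard)
print(f"{rate:.0%}")  # 80%
```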

The advice for data managers centers on three core principles: focus on data consumption to create value, establish a robust metric system, and explore intelligent automation (for example, "design-as-development" using large language models and low-code solutions).

Overall, the presentation highlighted the transition from isolated data governance projects to an integrated, long‑term governance model embedded within the data development lifecycle.
