
How Tencent Tackles Data Governance Challenges with the WeData Platform

This article outlines Tencent's data governance challenges, its internal three‑stage practice, detailed case studies such as Tencent News and PCG cost governance, and introduces the WeData platform's architecture and tools for standardization, quality, security, and metadata management, concluding with a Q&A session.

Data Thinking Notes

Introduction

The talk introduces the current stage and practical experience of Tencent's data governance and presents the WeData data governance platform built on these experiences.

Data Governance Challenges

Data governance faces three main categories of challenges:

Management challenges: data is scattered across many business departments and reporting locations, making unified management difficult; scaling with growing data volume while reducing manpower is a key issue.

Technical challenges: ensuring data quality after collection; poor quality can lead to negative business impact.

Business challenges: lack of underlying business metadata after data reporting prevents unified auditing or measurement.

Data Governance "Maslow Hierarchy"

Enterprises at different development stages run into different problems, which can be classified into five levels of need:

Timeliness – ensuring data is produced and delivered promptly.

Quality – guaranteeing accuracy, completeness, and effectiveness.

Usability – data must be usable to retain value.

Security – safeguarding data sharing and application.

Cost – reducing labor, material, and compute resources while solving problems.

Tencent Internal Data Governance Practice

Business Landscape

Tencent operates many business groups (BGs) covering enterprise, entertainment, cloud, and content, with tens of thousands of business lines, hundreds of product lines, EB‑level data storage, and thousands of data analysts.

Three Stages of Data Governance

Stage 1: Data Assetization – turning data into valuable assets, the core goal of all data work.

Stage 2: Cost Reduction & Efficiency – lowering resource consumption while maintaining or improving assetization outcomes.

Stage 3: Platformization & Productization – abstracting common practices into a governance platform and product.

Case: Tencent News Data Assetization

The Tencent News case addressed two main problems: the lack of unified data standards and the difficulty of guaranteeing data quality. The solutions were a unified tracking model, an upgraded warehouse model, and a metric model. The work covered 250 model designs or restructures, 52 dimension tables, and 270 application tables, and achieved over 95% data completeness, over 73% reuse, and under 5% cross‑layer references.
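
The source reports the cross‑layer figure without giving its formula. As a minimal sketch, assume a conventional ODS, DWD, DWS, ADS layering encoded in the table-name prefix, and assume a cross‑layer reference is a table that reads from more than one layer below it. Under those assumptions the rate could be computed from lineage edges as follows. The layer names, table names, and the definition itself are hypothetical, not Tencent's.

```python
# Hypothetical layer order; the article does not name Tencent's actual layers.
LAYER_ORDER = {"ods": 0, "dwd": 1, "dws": 2, "ads": 3}


def layer_of(table: str) -> int:
    """Return the layer index encoded in the table-name prefix (assumed convention)."""
    return LAYER_ORDER[table.split("_", 1)[0]]


def cross_layer_rate(edges: list[tuple[str, str]]) -> float:
    """Share of (upstream, downstream) edges that skip at least one layer."""
    if not edges:
        return 0.0
    skipping = sum(1 for up, down in edges if layer_of(down) - layer_of(up) > 1)
    return skipping / len(edges)


edges = [
    ("ods_click", "dwd_click"),
    ("dwd_click", "dws_user_daily"),
    ("dws_user_daily", "ads_report"),
    ("ods_click", "ads_report"),  # application table reading raw data directly
]
print(cross_layer_rate(edges))  # 0.25
```

Here one of four references skips layers, so the rate is 25%, well above the under-5% result reported for Tencent News.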

Case: PCG Data Cost Governance

Cost governance first defined its scope (the data collection, generation, analysis, and application platforms) and then optimized both resource usage and unit cost. Absolute big‑data cost fell by at least 10% despite a month‑over‑month cost increase of more than 30%.
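
Treating total cost as resource usage multiplied by unit cost shows why working on both levers compounds. The figures below are hypothetical and only illustrate the arithmetic; they are not PCG's actual numbers.

```python
def cost_reduction(usage_cut: float, unit_cost_cut: float) -> float:
    """Fractional drop in total cost when cost = usage * unit_cost.

    Both arguments are fractions in [0, 1], e.g. 0.05 for a 5% cut.
    """
    return 1.0 - (1.0 - usage_cut) * (1.0 - unit_cost_cut)


# Hypothetical example: 5% less resource usage and 6% lower unit cost.
print(f"{cost_reduction(0.05, 0.06):.1%}")  # 10.7%
```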

Platformization of Governance

Platformization was defined in four stages: (1) an overview of the data warehouse and big‑data cost, (2) a detailed inventory of asset items, (3) built‑in and custom governance solutions, and (4) one‑stop execution of governance actions. A scoring system evaluates assets on five dimensions: compliance, security, quality, cost, and application.
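
The article does not say how the five dimensions are combined into one score. One plausible reading is a weighted average, sketched below. The 0 to 100 scale, the equal default weights, and the function name are assumptions for illustration.

```python
DIMENSIONS = ("compliance", "security", "quality", "cost", "application")


def governance_score(scores, weights=None):
    """Weighted average of per-dimension scores (assumed to be on a 0-100 scale)."""
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing dimension scores: {missing}")
    weights = weights or {d: 1.0 for d in DIMENSIONS}  # equal weights by default
    total_weight = sum(weights[d] for d in DIMENSIONS)
    return sum(scores[d] * weights[d] for d in DIMENSIONS) / total_weight


table_scores = {"compliance": 100, "security": 100, "quality": 50,
                "cost": 50, "application": 100}
print(governance_score(table_scores))  # 80.0
```

Passing a custom `weights` dictionary lets a business line emphasize, say, quality over cost without changing the scoring code.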

WeData Data Governance Platform Capabilities

Architecture

The platform consists of two parts: agile data production (modeling, integration, development, services) and efficient data governance (asset governance, quality, security, metadata assets).

Services

Data flow and storage are handled in the upper layer, while processing includes data aggregation, development, operations (table creation, script development, orchestration), and data services (API generation).

Standardization Tools

Includes data management (metrics, dimensions, metadata), warehouse planning (layering, domain definition, subject areas), and materialization of logical models into physical structures, with industry templates for rapid adaptation.
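
Materializing a logical model can be pictured as rendering a column list into DDL. The sketch below is not WeData's implementation; the `Column` class, the `layer_subject_entity` naming pattern, and the generic SQL dialect are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Column:
    name: str
    dtype: str
    comment: str = ""


def to_ddl(layer: str, subject: str, entity: str, columns: list[Column]) -> str:
    """Render a logical model as a CREATE TABLE statement.

    The layer_subject_entity table name is an assumed naming convention.
    """
    table = f"{layer}_{subject}_{entity}"
    lines = []
    for col in columns:
        comment = col.comment.replace("'", "''")  # escape single quotes for SQL
        lines.append(f"  {col.name} {col.dtype} COMMENT '{comment}'")
    return f"CREATE TABLE {table} (\n" + ",\n".join(lines) + "\n);"


print(to_ddl("dwd", "news", "article_click", [
    Column("user_id", "STRING", "reader identifier"),
    Column("article_id", "STRING", "article identifier"),
    Column("click_time", "TIMESTAMP", "time of the click"),
]))
```

Deriving the table name from the layer and subject area keeps physical tables aligned with the warehouse plan instead of relying on each developer to follow it by hand.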

Quality Tools

Four steps: define quality rules (basic checks, custom SQL), bind rules to data (real‑time ETL monitoring or offline batch checks), handle incidents with alerts and work‑order flow, and generate periodic quality reports to identify recurring issues.
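
The first three steps can be sketched with an offline batch check. This is a minimal illustration that uses Python's built-in `sqlite3` as a stand-in warehouse; the `QualityRule` class, the rule names, and the table are hypothetical, and a real platform would route incidents to an alerting and work-order system rather than printing them.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class QualityRule:
    name: str
    sql: str                 # must return one row with one number: the violation count
    max_violations: int = 0  # threshold above which an incident is raised


def run_rules(conn: sqlite3.Connection, rules: list[QualityRule]) -> list[tuple[str, int]]:
    """Run each bound rule and return (rule name, violation count) for every breach."""
    incidents = []
    for rule in rules:
        violations = conn.execute(rule.sql).fetchone()[0]
        if violations > rule.max_violations:
            incidents.append((rule.name, violations))
    return incidents


# Stand-in data: one row is missing its user_id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dwd_click (user_id TEXT, article_id TEXT)")
conn.executemany("INSERT INTO dwd_click VALUES (?, ?)",
                 [("u1", "a1"), (None, "a2"), ("u3", "a3")])

# Step 1 and 2: define rules as custom SQL and bind them to the table.
rules = [
    QualityRule("user_id_not_null",
                "SELECT COUNT(*) FROM dwd_click WHERE user_id IS NULL"),
    QualityRule("table_not_empty",
                "SELECT CASE WHEN COUNT(*) = 0 THEN 1 ELSE 0 END FROM dwd_click"),
]

# Step 3: surface incidents (a real flow would alert and open a work order).
for name, count in run_rules(conn, rules):
    print(f"ALERT {name}: {count} violating rows")
```

Storing the returned incidents over time would give the raw material for the periodic quality report in step four.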

Security Tools

Three aspects: sensitive data identification (classification and labeling), privacy protection (static and dynamic masking/encryption with watermarking), and security auditing (tracking access, export, and download of sensitive data).
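
Identification and dynamic masking can be illustrated on a single field type. The sketch below is an assumption-heavy example, not WeData's rule set: it treats an 11-digit number starting with 1 as a phone number, and the function names and the two-level label are hypothetical.

```python
import re

# Assumption: an 11-digit number starting with 1 is treated as a mobile number.
PHONE_RE = re.compile(r"^1\d{10}$")


def classify(samples: list[str]) -> str:
    """Label a column 'sensitive' if every sampled value looks like a phone number."""
    if samples and all(PHONE_RE.match(s) for s in samples):
        return "sensitive"
    return "public"


def mask_phone(phone: str) -> str:
    """Keep the first 3 and last 4 characters and star out the middle."""
    if len(phone) < 8:
        return "*" * len(phone)  # too short to reveal any part safely
    return phone[:3] + "*" * (len(phone) - 7) + phone[-4:]


def read_value(value: str, label: str, can_view_sensitive: bool) -> str:
    """Dynamic masking: the stored value is untouched, the view depends on the reader."""
    if label == "sensitive" and not can_view_sensitive:
        return mask_phone(value)
    return value


label = classify(["13812345678", "13987654321"])
print(label)                                    # sensitive
print(read_value("13812345678", label, False))  # 138****5678
print(read_value("13812345678", label, True))   # 13812345678
```

Static masking would apply `mask_phone` once when the data is written, whereas the dynamic variant shown here decides at read time.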

Metadata Asset Management

Collected technical metadata (tables, fields) is linked with business metadata to provide a data catalog, lineage, change records, and data temperature (how frequently data is used), giving a comprehensive understanding of each asset.
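
Lineage is typically used for impact analysis: given a table, find everything that depends on it. A minimal sketch with a breadth-first traversal follows; the dictionary representation and table names are assumptions, not WeData's storage model.

```python
from collections import deque


def downstream(lineage: dict[str, list[str]], table: str) -> set[str]:
    """Return every table that directly or indirectly reads from `table`."""
    seen: set[str] = set()
    queue = deque([table])
    while queue:
        current = queue.popleft()
        for child in lineage.get(current, ()):
            if child not in seen:  # guards against revisiting shared descendants
                seen.add(child)
                queue.append(child)
    return seen


# Hypothetical lineage: each key maps to the tables built from it.
lineage = {
    "ods_click": ["dwd_click"],
    "dwd_click": ["dws_user_daily", "dws_article_daily"],
    "dws_user_daily": ["ads_report"],
}
print(sorted(downstream(lineage, "dwd_click")))
# ['ads_report', 'dws_article_daily', 'dws_user_daily']
```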

Usability Governance

Three steps improve data search, understanding, and application: fast global data location, presenting technical and business metadata together, and delivering data in usable forms for consumers.

Practice Outcomes

Deep governance across standards, quality, security, and usability leads to cost reduction, higher data usage, and an enterprise‑level rating system that filters cold or duplicate data, enhancing overall data value.
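
Filtering cold or duplicate data can be approximated from metadata alone. The sketch below flags tables not accessed within a cutoff window and groups tables whose column sets are identical. The 90-day threshold and the identical-schema heuristic are assumptions; the article does not describe the actual rating rules.

```python
from datetime import date, timedelta


def find_cold(last_access: dict[str, date], today: date, max_idle_days: int = 90) -> list[str]:
    """Tables whose most recent access is older than the idle threshold."""
    cutoff = today - timedelta(days=max_idle_days)
    return sorted(t for t, accessed in last_access.items() if accessed < cutoff)


def find_duplicates(schemas: dict[str, list[tuple[str, str]]]) -> list[list[str]]:
    """Group tables sharing exactly the same (column, type) set, ignoring order."""
    groups: dict[frozenset, list[str]] = {}
    for table, columns in schemas.items():
        groups.setdefault(frozenset(columns), []).append(table)
    return [sorted(g) for g in groups.values() if len(g) > 1]


last_access = {
    "tmp_export_2023": date(2023, 12, 31),
    "dws_user_daily": date(2024, 5, 30),
}
schemas = {
    "dws_user_daily": [("user_id", "STRING"), ("pv", "BIGINT")],
    "dws_user_daily_copy": [("pv", "BIGINT"), ("user_id", "STRING")],
    "dim_user": [("user_id", "STRING")],
}
print(find_cold(last_access, today=date(2024, 6, 1)))  # ['tmp_export_2023']
print(find_duplicates(schemas))  # [['dws_user_daily', 'dws_user_daily_copy']]
```

An identical schema only marks a candidate; confirming a true duplicate would still require comparing lineage or contents.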

Q&A

Q1: How are security tags on metadata propagated downstream? A1: WeData tags are independent and not propagated via lineage because upstream and downstream may belong to different business lines with differing sensitivity definitions.

Q2: How to collect and organize business metadata? A2: Business metadata is divided into five categories: normative, quality, security, cost, and application (including usage temperature).

Q3: What is the relationship between TBDS and WeData? A3: TBDS is the underlying data processing engine (EMR, CDW, etc.), while WeData is the data governance tool built on top of it.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
