Tag

data lineage

2 views collected around this technical thread.

DataFunSummit
Jun 6, 2025 · Big Data

How Unicom Digital’s Integrated Data Platform Revolutionizes Metadata Management

This article details Unicom Digital's metadata management practice on its integrated data platform. It covers the strategic context for data, key challenges, award-winning capabilities, and a three-pronged solution (automation, linking+, and AI), along with practical implementations, full-chain lineage, data responsibility, lifecycle management, and future AI-driven enhancements.

AI · Automation · Big Data
0 likes · 18 min read
DataFunSummit
Apr 13, 2025 · Big Data

Data Governance at Didi: Interview with Liu Chao on Big Data Asset Management

In this interview, Didi data governance lead Liu Chao discusses his career journey, the unique technical architecture of Didi’s big‑data governance system, cost‑driven pricing models, metadata management, lineage extraction, automation practices, and offers practical advice for enterprises seeking effective data governance.

Automation · Big Data · Cost-based Pricing
0 likes · 12 min read
DataFunSummit
Jan 1, 2025 · Big Data

Douyin Group Data Asset Management Platform: Full‑Stack Data Lineage Evolution and Applications

This article introduces Douyin Group’s end‑to‑end data asset management platform, explains the evolution and architecture of its large‑scale data lineage system, presents quality metrics and ecosystem components, and outlines practical applications and future directions for data governance, development, and security.

Big Data · Douyin · data asset platform
0 likes · 16 min read
JD Retail Technology
Dec 19, 2024 · Big Data

JD.com Data Governance: Architecture, Key Technologies, and Future Directions

JD.com's data-governance framework is built around a health-score-driven, automated platform that cross-verifies audit logs, builds full-link and operator-level lineage, and introduces standard fields. It also optimizes resource mixing, task staggering, and cross-datacenter scheduling, and targets real-time, AI-enhanced detection and full automation.

Audit Logs · Big Data · JD.com
0 likes · 15 min read
DataFunSummit
Dec 10, 2024 · Big Data

JD.com’s Big Data Governance: Practices, Key Technologies, and Future Outlook

This article presents JD.com’s comprehensive big‑data governance experience, detailing the background and challenges, the automated governance platform and its core technologies such as audit logs and full‑link lineage, strategies for resource optimization, and the roadmap toward real‑time, intelligent, and fully automated data governance.

Automation · Big Data · JD.com
0 likes · 14 min read
ByteDance Data Platform
Nov 27, 2024 · Big Data

Inside Douyin’s Data Asset Platform: Transforming Data Lineage and Governance

Douyin Group’s data asset management platform introduces a systematic "manage, find, use" approach that unifies metadata collection, full‑coverage data lineage, and a suite of applications across development, governance, asset utilization, and security, while outlining its architecture, modeling, quality metrics, and future roadmap.

Big Data · data governance · data lineage
0 likes · 14 min read
DataFunTalk
Nov 10, 2024 · Big Data

Douyin Group Data Asset Management Platform and Data Lineage Architecture Overview

This article provides a comprehensive overview of Douyin Group's data asset management platform, detailing the evolution, architecture, and applications of its large‑scale data lineage system, and discusses future directions for enhancing data quality, cost efficiency, and security across the organization.

Big Data · data governance · data lineage
0 likes · 15 min read
DataFunTalk
Oct 11, 2024 · Artificial Intelligence

E‑commerce Innovation and Data Governance: Summaries of Recent Research Topics

This article compiles concise overviews of recent e‑commerce research, covering real‑time online learning re‑ranking models, causal inference for user growth, full‑link data lineage, TikTok's data governance and attribution solutions, Volcano Engine's metric management, AI Agent applications on 1688, and XinXuan Group's live‑stream data architecture.

AI · causal inference · data governance
0 likes · 5 min read
DataFunSummit
Aug 28, 2024 · Big Data

Building Data Lineage Foundations and Applications for E‑commerce Scenarios

This article explains how to construct a full‑link data lineage platform for e‑commerce, detailing its architecture, quality metrics, and practical uses such as table migration, field‑level tracing, and automated metric decomposition to improve data governance and efficiency.

Big Data · Data Warehouse · data governance
0 likes · 14 min read
DataFunTalk
Jun 23, 2024 · Big Data

Building Full-Chain Data Lineage for E‑commerce Scenarios

This article explains how to construct a full‑chain data lineage system for e‑commerce, covering the concepts of data lineage, the design of a lineage foundation, quality measurement, application‑level lineage, and practical use cases such as table migration, field‑level tracing, and automated metric decomposition.

Big Data · Data Warehouse · data governance
0 likes · 12 min read
Beijing SF i-TECH City Technology Team
May 30, 2024 · Big Data

Data Lineage System Design and Implementation for Big Data Platforms

This article presents a comprehensive data lineage system (Data-Lineage) for big data platforms, addressing the challenges of heterogeneous data sources, multiple execution engines, and complex dependencies through a hook-based architecture and modular design.

SQL parsing · big data architecture · data lineage
0 likes · 12 min read
DataFunTalk
Mar 9, 2024 · Big Data

Construction and Application of Tencent Oula Data Lineage Platform

This article presents a comprehensive overview of Tencent Oula's data lineage system, detailing its background, goals, architecture, modular construction, key technologies such as graph databases and SQL parsing, and various internal application scenarios including data governance, cost insight, and baseline monitoring.

Big Data · SQL parsing · cost analysis
0 likes · 20 min read
DataFunTalk
Nov 23, 2023 · Big Data

Tencent PCG Data Governance System: Architecture, Asset Scoring, and One‑Stop Governance Platform

The article presents Tencent PCG's comprehensive data governance solution in four parts: a governance overview, meta-warehouse construction, an open asset-scoring system, and a one-stop governance workbench. It describes the challenges of massive, heterogeneous data and explains how lineage, scoring, and rule-engine mechanisms enable cost-effective, continuous data governance.

Asset Scoring · Big Data · data governance
0 likes · 14 min read
Bilibili Tech
Sep 15, 2023 · Big Data

Introducing Bilibili's SQLScan: Architecture, Key Technologies, and Production Impact

Bilibili's SQLScan is a static‑code analysis tool that parses Hive, Spark, Presto and Flink SQL via Antlr4, builds a unified AST, applies engine‑specific metadata plugins for rule enforcement, provides field‑lineage and cost‑analysis services, and has processed hundreds of thousands of daily queries, intercepting thousands of problematic statements to improve data quality and operational efficiency.

Big Data · Bilibili · DataOps
0 likes · 11 min read
Weimob Technology Center
Aug 1, 2023 · Big Data

How Weimob Transformed Data Asset Governance: A Practical Blueprint for Enterprises

Facing fragmented metadata, unclear ownership, and costly data duplication, Weimob implemented a comprehensive data asset governance framework (covering metadata standards, lineage visualization, metric normalization, and cost management) to boost data quality, security, and business value across its new-retail platform.

Big Data · data governance · data lineage
0 likes · 15 min read
DataFunSummit
Jun 29, 2023 · Big Data

iQIYI Data Link Governance: Offline and Real‑time Pipeline Management and Exploration

This article presents iQIYI’s comprehensive data link governance practice, covering the motivations, offline and real‑time pipeline governance strategies, monitoring mechanisms, data lineage, and exploratory work such as intelligent attribution and field‑level lineage to improve data accuracy, timeliness, and reliability.

Offline Processing · Real-time Monitoring · data governance
0 likes · 11 min read
Zhengcaiyun Technology (政采云技术)
Jun 15, 2023 · Big Data

Optimizing Data Lineage Extraction Using Spline REST API

This article discusses the practical implementation of extracting table and field lineage information via the Spline REST API, analyzing API call frequency, server load tolerance, and the strategy of re-parsing lineage only when job versions change to optimize performance.

Big Data · Data Engineering · REST API
0 likes · 5 min read
DataFunSummit
Jun 4, 2023 · Fundamentals

The Role of Metadata in Data Governance and Its Applications

Metadata serves as a foundational element of data governance, enabling analysis, monitoring, discovery, and understanding of data assets, while applications such as data lineage, impact analysis, and data mapping help organizations assess quality, trace origins, and optimize processing workflows.

Big Data · data governance · data lineage
0 likes · 5 min read
DataFunSummit
May 13, 2023 · Big Data

Expert Interview on Data Governance: Core Domains, Challenges, and Future Trends

In this interview, three data‑governance experts from Tencent, ByteDance, and Alibaba discuss the fundamental processes, core domains such as metadata, data lineage, metric systems, data quality and security, the main challenges they face, and emerging trends like DataOps, AI‑driven automation, and privacy‑preserving technologies.

Big Data · Data Security · DataOps
0 likes · 14 min read
DataFunSummit
May 10, 2023 · Big Data

Field-Level Data Lineage Extraction for FlinkSQL Using Apache Calcite

This article explains how to derive field‑level data lineage for FlinkSQL by leveraging Apache Calcite, covering the Calcite framework, FlinkSQL execution stages, the three‑step parsing approach, core source code details, practical Insert/Join examples, and extensions for lookup joins and UDTFs.

Apache Calcite · Big Data · FlinkSQL
0 likes · 12 min read