Tagged articles
58 articles
Smart Workplace Lab
Apr 26, 2026 · Industry Insights

How to Get Other Departments to Accept Your AI‑Built Model: A Cross‑Domain Mutual‑Recognition Protocol

When an AI‑generated conversion‑rate model is rejected by other teams due to mismatched data definitions, this article shows how to align measurement criteria by making data lineage and calculation scope explicit, using a three‑step cross‑domain mutual‑recognition protocol, checklists, and arbitration scripts to reduce internal friction.

AI model governance · Data Alignment · Data Lineage
0 likes · 6 min read
DataFunSummit
Mar 25, 2026 · Big Data

How Apache Gravitino and OpenLineage Transform Data Governance for AI‑Driven Enterprises

In the era of AI and multi‑cloud, this article analyzes the core challenges of data governance—data silos, quality gaps, and compliance risks—and explains how Apache Gravitino’s unified metadata architecture together with OpenLineage’s standardized lineage model provide a scalable, automated solution for intelligent, real‑time data management.

Apache Gravitino · Big Data · Data Governance
0 likes · 15 min read
DataFunTalk
Dec 17, 2025 · Artificial Intelligence

How Large Language Models Unlock Field‑Level Data Lineage at Scale

This talk explains how a data platform tackled massive, heterogeneous enterprise data by using large language models and prompt engineering to automatically extract field‑level lineage from SQL scripts, achieve over 80% coverage, and raise accuracy above 95%, dramatically cutting impact‑analysis time.

AI for data engineering · Big Data · Data Lineage
0 likes · 6 min read
DataFunSummit
Oct 29, 2025 · Big Data

Douyin’s Data Asset Platform: Transforming Big Data Lineage

This article introduces Douyin Group’s Data Asset Management Platform, explains its shift from traditional metadata to comprehensive data assets, and details the evolution, architecture, and applications of its full‑linkage data lineage, highlighting why building accurate, real‑time lineage is critical for quality, security, and cost efficiency.

Data Asset Platform · Data Lineage · Douyin
0 likes · 5 min read
DataFunSummit
Oct 22, 2025 · Big Data

How Douyin’s Data Asset Platform Revolutionizes Big Data Lineage

This article introduces Douyin Group’s comprehensive data asset management platform, explains why it emphasizes data assets over raw metadata, outlines its full‑linkage lineage capabilities, and presents practical insights on building, applying, and future‑proofing big data lineage within complex enterprise environments.

Big Data · Data Asset Management · Data Lineage
0 likes · 5 min read
DataFunSummit
Oct 19, 2025 · Big Data

How Apache Gravitino and OpenLineage Transform Data Governance in the AI Era

This article explains how the rapid rise of AI and large‑model technologies is driving a paradigm shift in data governance toward intelligent, automated, and real‑time collaboration, outlines the challenges of multi‑cloud environments, and demonstrates how Apache Gravitino and OpenLineage provide a unified metadata and lineage solution that improves data quality, compliance, and business agility.

Apache Gravitino · Big Data · Data Lineage
0 likes · 12 min read
DataFunSummit
Oct 14, 2025 · Big Data

How Douyin’s Data Asset Platform Redefines Big Data Lineage

This article introduces Douyin Group’s one‑stop Data Asset Management Platform, explains why the company focuses on data assets rather than raw metadata, and details the evolution, architecture, applications, and future outlook of its comprehensive big‑data lineage system.

Big Data · Data Asset Management · Data Governance
0 likes · 5 min read
DataFunSummit
Oct 12, 2025 · Big Data

How Douyin’s Data Asset Platform Revolutionizes Big Data Lineage

This article introduces Douyin Group’s Data Asset Management Platform, explaining its shift from traditional metadata to comprehensive data assets, detailing the evolution, architecture, and applications of its full‑link big data lineage, and offering strategic guidance for building effective lineage systems.

Data Asset · Data Governance · Data Lineage
0 likes · 5 min read
Xiaohongshu Tech REDtech
Sep 22, 2025 · Big Data

How Dataverse’s Notebook Supercharges Data+AI Development at Xiaohongshu

The article details the evolution of Xiaohongshu’s Dataverse platform into a Data+AI system, highlighting inefficiencies in algorithm and data‑science workflows, the introduction of an interactive notebook, comprehensive data lineage, AI‑coding assistance, and future DataAgent plans to automate data engineering tasks.

AI Coding · Data Lineage · Data Platform
0 likes · 21 min read
DataFunSummit
Jun 6, 2025 · Big Data

How Unicom Digital’s Integrated Data Platform Revolutionizes Metadata Management

This article details Unicom Digital’s metadata management practice on its integrated data platform, covering the strategic background of data, key challenges, award-winning capabilities, three-pronged solutions—automation, linking+, and AI—along with practical implementations, full‑chain lineage, data responsibility, lifecycle management, and future AI‑driven enhancements.

Big Data · Data Governance · Data Lineage
0 likes · 18 min read
Big Data Technology & Architecture
Feb 1, 2025 · Big Data

Douyin Group Data Asset Management Platform: Comprehensive Data Lineage Overview and Practices

This article presents a detailed overview of Douyin Group's Data Asset Management Platform, focusing on the evolution, architecture, modeling, metrics, and application scenarios of its large‑scale data lineage system, and outlines future directions for full‑coverage, fine‑grained lineage capabilities.

Big Data · Data Asset Management · Data Lineage
0 likes · 17 min read
DataFunSummit
Jan 1, 2025 · Big Data

Douyin Group Data Asset Management Platform: Full‑Stack Data Lineage Evolution and Applications

This article introduces Douyin Group’s end‑to‑end data asset management platform, explains the evolution and architecture of its large‑scale data lineage system, presents quality metrics and ecosystem components, and outlines practical applications and future directions for data governance, development, and security.

Data Asset Platform · Data Governance · Data Lineage
0 likes · 16 min read
JD Retail Technology
Dec 19, 2024 · Big Data

JD.com Data Governance: Architecture, Key Technologies, and Future Directions

JD.com’s data‑governance framework combines a health‑score‑driven, automated platform that cross‑verifies audit logs, builds full‑link and operator‑level lineage, introduces standard fields, and optimizes resource mixing, task staggering, and cross‑datacenter scheduling, while targeting real‑time AI‑enhanced detection and full automation.

Data Governance · Data Lineage · JD.com
0 likes · 15 min read
DataFunSummit
Dec 10, 2024 · Big Data

JD.com’s Big Data Governance: Practices, Key Technologies, and Future Outlook

This article presents JD.com’s comprehensive big‑data governance experience, detailing the background and challenges, the automated governance platform and its core technologies such as audit logs and full‑link lineage, strategies for resource optimization, and the roadmap toward real‑time, intelligent, and fully automated data governance.

Data Governance · Data Lineage · JD.com
0 likes · 14 min read
Huolala Tech
Dec 5, 2024 · Big Data

Huolala’s Metadata Platform: Scaling Data Lineage, AI Search & Cost Governance

Huolala’s data team details the evolution of its metadata management platform—covering architecture, stages from early Hive‑ETL to real‑time field‑level lineage, AI‑driven smart search, cost‑governance mechanisms, and security classifications—showcasing practical solutions for data discoverability, efficiency, and protection at scale.

AI search · Data Lineage · cost governance
0 likes · 27 min read
ByteDance Data Platform
Nov 27, 2024 · Big Data

Inside Douyin’s Data Asset Platform: Transforming Data Lineage and Governance

Douyin Group’s data asset management platform introduces a systematic "manage, find, use" approach that unifies metadata collection, full‑coverage data lineage, and a suite of applications across development, governance, asset utilization, and security, while outlining its architecture, modeling, quality metrics, and future roadmap.

Data Governance · Data Lineage · metadata management
0 likes · 14 min read
DataFunTalk
Nov 10, 2024 · Big Data

Douyin Group Data Asset Management Platform and Data Lineage Architecture Overview

This article provides a comprehensive overview of Douyin Group's data asset management platform, detailing the evolution, architecture, and applications of its large‑scale data lineage system, and discusses future directions for enhancing data quality, cost efficiency, and security across the organization.

Data Governance · Data Lineage · metadata management
0 likes · 15 min read
DataFunTalk
Oct 11, 2024 · Artificial Intelligence

E‑commerce Innovation and Data Governance: Summaries of Recent Research Topics

This article compiles concise overviews of recent e‑commerce research, covering real‑time online learning re‑ranking models, causal inference for user growth, full‑link data lineage, TikTok's data governance and attribution solutions, Volcano Engine's metric management, AI Agent applications on 1688, and XinXuan Group's live‑stream data architecture.

Data Governance · Data Lineage · Online Learning
0 likes · 5 min read
DataFunSummit
Aug 28, 2024 · Big Data

Building Data Lineage Foundations and Applications for E‑commerce Scenarios

This article explains how to construct a full‑link data lineage platform for e‑commerce, detailing its architecture, quality metrics, and practical uses such as table migration, field‑level tracing, and automated metric decomposition to improve data governance and efficiency.

Data Governance · Data Lineage · e‑commerce
0 likes · 14 min read
Data Thinking Notes
Jul 11, 2024 · Big Data

How to Build a Robust Data Lineage Foundation for Scalable Business Insights

This article explains how to construct a full‑chain data lineage system, covering its overall architecture, quality measurement framework, and application layer, and demonstrates practical use cases such as handling data growth, monitoring warehouse changes, accelerating development, ensuring consistency, and automating metric decomposition in real‑world business scenarios.

Big Data · Data Governance · Data Lineage
0 likes · 14 min read
Data Thinking Notes
Jul 9, 2024 · Big Data

How to Build a Robust Enterprise Data Asset Catalog for Better Governance

This article explains why a comprehensive data asset catalog is essential for modern enterprises, outlines its core components such as inventory, metadata, data lineage, standards and access control, details step‑by‑step construction methods, and highlights key applications in governance, quality, compliance, architecture and valuation.

Big Data · Data Catalog · Data Governance
0 likes · 13 min read
DataFunTalk
Jun 23, 2024 · Big Data

Building Full-Chain Data Lineage for E‑commerce Scenarios

This article explains how to construct a full‑chain data lineage system for e‑commerce, covering the concepts of data lineage, the design of a lineage foundation, quality measurement, application‑level lineage, and practical use cases such as table migration, field‑level tracing, and automated metric decomposition.

Data Lineage · Data Quality · e‑commerce
0 likes · 12 min read
DataFunTalk
Mar 9, 2024 · Big Data

Construction and Application of Tencent Oula Data Lineage Platform

This article presents a comprehensive overview of Tencent Oula's data lineage system, detailing its background, goals, architecture, modular construction, key technologies such as graph databases and SQL parsing, and various internal application scenarios including data governance, cost insight, and baseline monitoring.

Data Lineage · Graph Database · SQL parsing
0 likes · 20 min read
DataFunTalk
Nov 23, 2023 · Big Data

Tencent PCG Data Governance System: Architecture, Asset Scoring, and One‑Stop Governance Platform

The article presents Tencent PCG's comprehensive data governance solution, detailing the challenges of massive, heterogeneous data, the four‑chapter framework covering governance overview, meta‑warehouse construction, an open asset‑scoring system, and a one‑stop governance workbench, and explains how lineage, scoring, and rule‑engine mechanisms enable cost‑effective, continuous data governance.

Asset Scoring · Big Data · Data Governance
0 likes · 14 min read
HomeTech
Nov 15, 2023 · Industry Insights

How to Build Accurate Data Asset Lineage for Data Warehouse Governance

This article explains the challenges of data asset lineage in large data warehouses, presents a comprehensive approach using business‑level instrumentation, SQL interceptor plugins, and ETL script parsing to generate fine‑grained lineage graphs, and demonstrates measurable improvements in coverage and zombie‑table cleanup.

Data Governance · Data Lineage · Data Quality
0 likes · 18 min read
Bilibili Tech
Sep 15, 2023 · Big Data

Introducing Bilibili's SQLScan: Architecture, Key Technologies, and Production Impact

Bilibili's SQLScan is a static‑code analysis tool that parses Hive, Spark, Presto and Flink SQL via Antlr4, builds a unified AST, applies engine‑specific metadata plugins for rule enforcement, provides field‑lineage and cost‑analysis services, and has processed hundreds of thousands of daily queries, intercepting thousands of problematic statements to improve data quality and operational efficiency.

Big Data · Bilibili · Data Lineage
0 likes · 11 min read
Weimob Technology Center
Aug 1, 2023 · Big Data

How Weimob Transformed Data Asset Governance: A Practical Blueprint for Enterprises

Facing fragmented metadata, unclear ownership, and costly data duplication, Weimob implemented a comprehensive data asset governance framework—covering metadata standards, lineage visualization, metric normalization, and cost management—to boost data quality, security, and business value across its new‑retail platform.

Data Governance · Data Lineage · data operations
0 likes · 15 min read
DataFunSummit
Jun 29, 2023 · Big Data

iQIYI Data Link Governance: Offline and Real‑time Pipeline Management and Exploration

This article presents iQIYI’s comprehensive data link governance practice, covering the motivations, offline and real‑time pipeline governance strategies, monitoring mechanisms, data lineage, and exploratory work such as intelligent attribution and field‑level lineage to improve data accuracy, timeliness, and reliability.

Data Governance · Data Lineage · iQIYI
0 likes · 11 min read
Zhengcaiyun Technology (政采云技术)
Jun 15, 2023 · Big Data

Optimizing Data Lineage Extraction Using Spline REST API

This article discusses the practical implementation of extracting table and field lineage information via the Spline REST API, analyzing API call frequency, server load tolerance, and the strategy of re-parsing lineage only when job versions change to optimize performance.

Data Lineage · REST API · Spline
0 likes · 5 min read
Data Thinking Notes
May 31, 2023 · Big Data

Why Data Lineage Is Essential: From Concepts to Practical Implementation

This article explains what data lineage is, its components, why it matters for data quality, security, and operational efficiency, and provides a comprehensive implementation guide covering open‑source tools, commercial platforms, custom builds, graph‑database modeling, automatic and manual lineage capture, visualization, analytics, and evaluation metrics.

Data Governance · Data Lineage · ETL
0 likes · 18 min read
DataFunSummit
May 13, 2023 · Big Data

Expert Interview on Data Governance: Core Domains, Challenges, and Future Trends

In this interview, three data‑governance experts from Tencent, ByteDance, and Alibaba discuss the fundamental processes, core domains such as metadata, data lineage, metric systems, data quality and security, the main challenges they face, and emerging trends like DataOps, AI‑driven automation, and privacy‑preserving technologies.

Data Governance · Data Lineage · Data Quality
0 likes · 14 min read
DataFunSummit
May 10, 2023 · Big Data

Field-Level Data Lineage Extraction for FlinkSQL Using Apache Calcite

This article explains how to derive field‑level data lineage for FlinkSQL by leveraging Apache Calcite, covering the Calcite framework, FlinkSQL execution stages, the three‑step parsing approach, core source code details, practical Insert/Join examples, and extensions for lookup joins and UDTFs.

Apache Calcite · Data Lineage · FlinkSQL
0 likes · 12 min read
Big Data Technology & Architecture
Mar 14, 2023 · Big Data

Comprehensive Guide to Data Lineage: Model Design, Optimization, and Use Cases at ByteDance

This article presents an in‑depth overview of data lineage at ByteDance, detailing the design of storage, display, abstraction, implementation, and storage layers, optimization techniques for real‑time updates and queries, open export methods, practical use cases across asset, development, governance, and security domains, and future directions.

Apache Atlas · Data Lineage · JanusGraph
0 likes · 20 min read
Data Thinking Notes
Feb 2, 2023 · Fundamentals

Why Metadata Management Is the Key to Unlocking Data Value

This article explains how effective metadata management provides context, improves data quality, enables data lineage tracing, supports governance, and ultimately turns raw data into valuable assets for enterprises navigating complex, evolving data environments.

Data Governance · Data Lineage · Data Management
0 likes · 35 min read
DeWu Technology
Nov 30, 2022 · Big Data

Fundamentals and Implementation of Data Lineage in Big Data Environments

Data lineage in big‑data environments tracks how data moves and transforms—from source tables through SQL processing to final storage—enabling management tasks such as domain segmentation, performance tuning, anomaly detection, and dependency verification, with implementations ranging from simple regex extraction to robust AST parsing and optimization, as used by tools like Alibaba DataWorks and Apache Atlas.

AST · Big Data · Data Lineage
0 likes · 7 min read
DataFunTalk
Oct 26, 2022 · Big Data

Metadata Management and Governance Practices at Wing Payment: Architecture, Techniques, and Future Outlook

This article explains how metadata serves as the foundation of enterprise data governance, outlines common data governance challenges, describes Wing Payment's metadata governance framework and platform architecture, and presents future directions such as multi‑source management, cross‑cluster disaster recovery, and intelligent recommendation.

Big Data · Data Governance · Data Lineage
0 likes · 18 min read
Youzan Coder
Sep 29, 2022 · Big Data

Implementing Spark Data Lineage with Spline: A Step‑by‑Step Guide

This article explains the growing importance of data lineage in large data warehouses, evaluates three Spark lineage extraction approaches, and provides a detailed, step‑by‑step guide to integrating the open‑source Spline agent—including codeless and programmatic initialization, configuration, dispatcher setup, post‑processing, and known limitations.

Apache Spark · Big Data · Data Governance
0 likes · 16 min read
Big Data Technology Architecture
Jun 29, 2022 · Fundamentals

Deriving Data Lineage from Python Code Using AST and Pyflakes

This article explains how to automatically extract data lineage and code dependencies from large collections of Python scripts by leveraging the language's compilation stages, abstract syntax trees, and the Pyflakes static‑analysis library, providing practical code examples and custom parsers for SQL extraction.

AST · Big Data · Code Parsing
0 likes · 12 min read
Ctrip Technology
Jun 24, 2022 · Databases

Practical Experience of Nebula Graph in Ctrip Finance: Architecture, Use Cases, and Optimizations

This article describes how Ctrip Finance built a large‑scale Nebula Graph platform for financial risk control, data lineage, and fraud detection, detailing the system architecture, real‑world applications, performance challenges, and the engineering optimizations applied to achieve sub‑15 ms query latency.

Data Lineage · Graph Database · Nebula Graph
0 likes · 18 min read
Meituan Technology Team
Jun 16, 2022 · Artificial Intelligence

Building a Quality Model for Meituan's Recommendation System

This article presents a request‑granularity quality model for Meituan's integrated recommendation system, linking data tables, algorithm models, services, and user requests, and details its metrics, defect taxonomy, calculation formulas, data‑lineage expansion, implementation, alert routing, and operational outcomes.

Data Lineage · Meituan · Quality Modeling
0 likes · 22 min read
Architect
May 25, 2022 · Big Data

Metadata Infrastructure and Governance in Bilibili's Data Platform

The article details how Bilibili built a unified metadata infrastructure—including a URN‑based model, collection pipelines, quality assurance, storage in TiDB/ES/HugeGraph, and query services—to support data discovery, lineage, impact analysis, and governance across its growing data platform.

Big Data · Data Catalog · Data Governance
0 likes · 21 min read
Bilibili Tech
May 24, 2022 · Big Data

Metadata Infrastructure and Governance in Bilibili Data Platform

Bilibili’s data platform consolidates scattered metadata into a unified URN‑based model stored across TiDB, Elasticsearch, and HugeGraph, offering batch‑pull and embedded collection, flexible SQL‑like queries, comprehensive lineage mapping, and powering data‑map, lineage‑map, and impact‑analysis tools while planning expanded quality assurance and self‑service dictionaries.

Data Governance · Data Lineage · Data Platform
0 likes · 21 min read
ByteDance Data Platform
Apr 27, 2022 · Big Data

How ByteDance Built a Scalable Data Catalog: Key Technologies and Future Plans

ByteDance’s Data Catalog article details the system’s unified metadata model, standardized ingestion connectors, search optimization techniques, lineage capabilities, and storage layer enhancements, highlighting key technical designs, performance improvements, and future work to advance data governance and asset utilization.

Data Catalog · Data Lineage · Storage Optimization
0 likes · 12 min read
DataFunSummit
Dec 14, 2021 · Big Data

Data Map: Background, Definition, and Youzan’s Practical Implementation

This article introduces the concept of a data map, explains its background and goals, describes Youzan’s end‑to‑end data‑map practice—including full data lineage, search, management, link analysis, impact estimation, and optimization—and concludes with a summary and future outlook.

Big Data · Data Governance · Data Lineage
0 likes · 16 min read
DataFunTalk
Feb 2, 2021 · Big Data

Metadata Management: Concepts, Architecture, and Applications in Data Warehousing

This article explains the fundamentals and value of metadata, describes a comprehensive metadata management system and its layered architecture, outlines key technologies such as automatic SQL metadata extraction, and showcases practical applications like metadata query, impact analysis, data lineage, and business‑driven data needs within modern data warehouses.

Data Lineage · SQL parsing · data-warehouse
0 likes · 17 min read
DataFunTalk
Jan 21, 2021 · Big Data

Kuaishou Metadata Platform: Evolution, Architecture, and Application Scenarios

This article introduces the development history, current architecture, abstraction methods, and key application scenarios of Kuaishou's metadata platform, highlighting challenges such as heterogeneous data integration, large-scale asset management, and the platform's role in data search, lineage, governance, and future enhancements.

Data Lineage · Kuaishou · Search
0 likes · 16 min read
DataFunTalk
Dec 19, 2020 · Big Data

Evolution of iQIYI Data Warehouse from 1.0 to 2.0: Architecture, Modeling Practices, and Future Directions

This article details iQIYI's transition from a fragmented Data Warehouse 1.0 to a unified, standardized Data Warehouse 2.0, covering layered architecture, dimension and metric design, modeling workflows, metadata management, data lineage, and upcoming intelligent and automated data platform initiatives.

Data Lineage · data modeling · data-warehouse
0 likes · 25 min read
iQIYI Technical Product Team
Nov 13, 2020 · Big Data

Evolution of iQIYI Data Warehouse from 1.0 to 2.0: Architecture, Modeling, Metadata, and Data Lineage

The talk chronicles iQIYI’s shift from a fragmented five‑layer Data Warehouse 1.0 to a unified 2.0 architecture featuring a central Dimension Layer, business‑focused data marts, and subject‑oriented warehouses, while detailing platform services, rigorous metadata management, lineage tracking, and future goals of intelligent, automated, service‑oriented, model‑driven data governance.

Data Lineage · data modeling · iQIYI
0 likes · 23 min read
Tongcheng Travel Technology Center
Sep 3, 2019 · Big Data

Practical Experiences and Lessons Learned in Building a Flink‑Based Real‑Time Computing Platform at Tongcheng‑Elong

This article details the design, implementation, and optimization of a Flink‑based real‑time computing platform at Tongcheng‑Elong, covering the evolution from Storm to Flink, support for FlinkSQL and FlinkStream, metric collection, logging, data lineage, savepoint management, and numerous stability fixes contributed back to the open‑source community.

Big Data · Data Lineage · Flink
0 likes · 16 min read
dbaplus Community
Jul 25, 2018 · Big Data

How Ele.me Built a Scalable Metadata Governance System for Big Data

This article explains how Ele.me tackles big‑data challenges by designing a metadata governance platform that collects SQL execution data, parses lineage with Antlr, stores graph relationships in Neo4j, and enables table/column lineage queries, DAG scheduling, and hot‑data analysis.

Data Lineage · Ele.me · Graph Database
0 likes · 12 min read
Architecture Digest
Sep 2, 2017 · Big Data

Designing a High‑Availability, High‑Efficiency Distributed Scheduling Platform for Big Data

This article examines the principles, features, and implementation details of distributed scheduling for big‑data ETL pipelines, covering decentralised schedulers, host selection strategies, fault‑tolerance, operator abstraction, elasticity, trigger mechanisms, visual monitoring, alarm handling, data fan‑in/fan‑out, parameter consistency, real‑time quality checks, lineage tracking, and field‑level traceability.

Big Data · Data Lineage · Distributed Scheduling
0 likes · 23 min read
StarRing Big Data Open Lab
May 27, 2017 · Big Data

Simplify Big Data Governance with Data Lineage & Impact Analysis

Enterprise big‑data platforms face massive scale and complex metadata relationships, but using Transwarp Governor’s data lineage and impact analysis graphs enables precise tracing of data origins, rapid error localization, and prediction of downstream effects, dramatically improving data quality and governance efficiency.

Big Data · Data Governance · Data Lineage
0 likes · 8 min read