Tagged articles
6 articles
Big Data Technology & Architecture
Mar 14, 2023 · Big Data

Comprehensive Guide to Data Lineage: Model Design, Optimization, and Use Cases at ByteDance

This article presents an in-depth overview of data lineage at ByteDance. It covers the design of the display, abstraction, implementation, and storage layers; optimization techniques for real-time updates and queries; open export methods; practical use cases across the asset, development, governance, and security domains; and future directions.

Apache Atlas · Data Lineage · JanusGraph
20 min read
DataFunTalk
Feb 26, 2023 · Big Data

Design, Optimization, and Use Cases of Data Lineage in ByteDance's DataLeap Platform

This article presents an in‑depth overview of DataLeap's data lineage capabilities, covering the challenges, multi‑layer model design, implementation with Apache Atlas and JanusGraph, performance optimizations, diverse use cases across asset, development, governance and security domains, and future trends for lineage technology.

Apache Atlas · Big Data · Data Governance
19 min read
ByteDance Data Platform
Jun 8, 2022 · Backend Development

How ByteDance Optimized Data Catalog Performance with Apache Atlas and JanusGraph

This article details ByteDance's 2021 overhaul of its Data Catalog system, the performance regressions encountered after switching to Apache Atlas, and the step-by-step backend optimizations that reduced latency from minutes to seconds, including JanusGraph tuning, Gremlin query refactoring, parallel processing, and write-path improvements.

Apache Atlas · Data Catalog · JanusGraph
12 min read