ByteDance Data Platform
Author

The ByteDance Data Platform team empowers all ByteDance business lines by lowering the barriers to applying data, with the aim of building data‑driven intelligent enterprises, enabling digital transformation across industries, and creating greater social value. Internally, it supports most ByteDance units; externally, it delivers data‑intelligence products to enterprise customers under the Volcano Engine brand.

Recent Articles

Latest from ByteDance Data Platform

78 recent articles
ByteDance Data Platform
Sep 14, 2022 · Fundamentals

Mastering Enterprise Data Tracking: A Step‑by‑Step Design Blueprint

This guide details how to plan, design, and manage enterprise‑level data tracking projects, covering role responsibilities, initial and iterative construction phases, event and attribute specifications, best‑practice tips, and common pitfalls to ensure accurate, maintainable analytics.

Analytics · Data Tracking · Product Management
0 likes · 16 min read
ByteDance Data Platform
Sep 7, 2022 · Product Management

How to Calculate Minimum Sample Size for Reliable A/B Tests

This article explains common pain points in A/B testing, introduces essential statistical concepts such as sampling distribution, parameter estimation, confidence intervals, and hypothesis testing, and provides step‑by‑step formulas and a concrete example for calculating the minimum sample size needed to run a trustworthy experiment.

A/B testing · hypothesis testing · product experimentation
0 likes · 14 min read
ByteDance Data Platform
Aug 24, 2022 · Big Data

How ByteDance Guarantees Real‑Time Data Point Quality with Scalable Validation

This article explains ByteDance's end‑to‑end data‑point (埋点, i.e. event‑tracking) validation system, covering its technical challenges (usability, accuracy, real‑time visibility, stability, and extensibility) along with SDK integration, the QR‑code workflow, JSON‑Schema verification, push‑service architecture, SLA metrics, and future automation plans.

JSON Schema · Push Service · SDK
0 likes · 11 min read
ByteDance Data Platform
Aug 22, 2022 · Databases

How ByteHouse Supercharges ClickHouse with Upsert, Joins, and High Availability

ByteHouse, built on ClickHouse, addresses key limitations such as missing upsert/delete, weak multi‑table joins, scalability issues, and lack of resource isolation by introducing a modular, stage‑based execution engine, advanced join strategies, runtime filters, and a custom optimizer, delivering dramatically faster query performance.

ByteHouse · ClickHouse · Multi-Table Join
0 likes · 11 min read
ByteDance Data Platform
Aug 19, 2022 · Product Management

How ByteDance Boosted New User Retention with Incentives and AB Testing

This article reviews ByteDance's practical growth case where the new video recommendation product “M” used a data‑driven incentive system and extensive AB testing to improve first‑week user retention, outlining the design, implementation steps, and methods for identifying core product functions.

AB testing · Product Management · User Retention
0 likes · 9 min read
ByteDance Data Platform
Jul 18, 2022 · Big Data

Unlocking Real‑Time Data Quality: ByteDance’s Dynamic Exploration Solution

This article explains how ByteDance’s dynamic data exploration tool improves data quality assurance by replacing time‑consuming SQL validation with real‑time, sample‑based profiling, detailing its problem background, core features, technical architecture, front‑end rendering techniques, operation‑stack management, and future enhancements.

SQL Generation · big data · data exploration
0 likes · 13 min read
ByteDance Data Platform
Jun 8, 2022 · Backend Development

How ByteDance Optimized Data Catalog Performance with Apache Atlas and JanusGraph

This article details ByteDance's 2021 overhaul of its Data Catalog system, the performance regressions encountered after switching to Apache Atlas, and the step‑by‑step backend optimizations—including JanusGraph tuning, Gremlin query refactoring, parallel processing, and write‑path improvements—that reduced latency from minutes to seconds.

Apache Atlas · Data Catalog · JanusGraph
0 likes · 12 min read
ByteDance Data Platform
May 30, 2022 · Databases

How UniqueMergeTree Boosts Real-Time Updates in ClickHouse Column Stores

UniqueMergeTree, a new ClickHouse table engine, addresses real‑time data update challenges by combining upsert semantics, unique‑key enforcement, and efficient delete‑bitmap handling, offering higher query performance at a modest write cost. The article covers its design, sharding strategies, conflict resolution, and performance evaluation.

ClickHouse · Columnar Storage · Database Engine
0 likes · 14 min read