
From ClickHouse to ByteHouse: Technical Optimizations and Production Practices

The whitepaper “From ClickHouse to ByteHouse” details ByteDance’s large‑scale deployment of ClickHouse, the challenges of moving it to production, and the key optimizations ByteHouse introduces—including custom table engines, a revamped query optimizer, and elastic compute‑storage separation—to achieve petabyte‑level OLAP performance.

DataFunTalk

Recently, Volcano Engine’s ByteHouse team and InfoQ jointly released the whitepaper “From ClickHouse to ByteHouse”, focusing on the problems encountered when introducing ClickHouse into enterprise‑grade production environments and the current solutions.

ClickHouse, open‑sourced in 2016, quickly became a leading analytical database due to its outstanding performance, and many leading companies worldwide now use it extensively.

In terms of performance, ClickHouse's OLAP query speed is several times that of comparable products, delivering sub-second reporting latency on petabyte-scale raw data and write throughput of hundreds of millions of rows per second.

However, bringing ClickHouse into enterprise production still presents challenges. Not every team can afford to encounter the pitfalls of real‑world deployment, so sharing experience and choosing practical solutions—whether self‑developed or purchased—is essential.

ByteDance is a prime example: since 2017 it has been using ClickHouse at massive scale, operating the largest ClickHouse cluster in China.

Today, ByteDance’s internal ClickHouse nodes exceed 18,000, managing over 700 PB of data, with the largest single cluster comprising about 2,400 nodes.

After five years of custom modifications, ByteDance has evolved ClickHouse into ByteHouse, now offered as a commercial service through Volcano Engine.

The journey from adopting and customizing an open‑source product to launching a commercial version is arduous, making the insights and experiences highly valuable.

The jointly released whitepaper introduces the technical implementation behind ByteDance’s massive ClickHouse deployment and is divided into four chapters:

1. ClickHouse Introduction

2. Typical ClickHouse Scenarios

3. ByteHouse’s Technical Optimizations for Production‑grade ClickHouse

4. ByteHouse’s Design and Evolution

Starting from Chapter 3, the whitepaper focuses on ByteHouse’s optimization ideas.

ByteHouse has performed many upgrades and optimizations on ClickHouse; this summary highlights three particularly important areas:

1. Self‑developed Table Engine

2. Query Optimizer

3. Elastic Scalability

Regarding the self‑developed table engines: although ClickHouse already ships dozens of engines, including the MergeTree family plus special engines such as Memory and File and various interface engines, ByteDance found them insufficient for its business needs, prompting targeted enhancements.

The whitepaper details three custom engines: HaMergeTree, HaUniqueMergeTree, and HaKafka.
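The whitepaper does not reproduce the DDL for these engines, but open-source ClickHouse's ReplicatedMergeTree gives a sense of what a high-availability replicated table looks like. The sketch below composes such a statement in Python; the assumption that HaMergeTree accepts ReplicatedMergeTree-style arguments (a coordination path plus a replica name) is ours for illustration, and the actual ByteHouse syntax may differ.

```python
# Hypothetical sketch: composing a CREATE TABLE statement for a replicated
# MergeTree-family engine. The argument style mirrors open-source
# ReplicatedMergeTree (ZooKeeper path + replica name); the exact HaMergeTree
# signature in ByteHouse is an assumption here.

def replicated_table_ddl(table: str, engine: str, zk_path: str, replica: str) -> str:
    """Build a ClickHouse-style DDL string for a replicated table."""
    return (
        f"CREATE TABLE {table} "
        "(event_date Date, user_id UInt64, value Float64) "
        f"ENGINE = {engine}('{zk_path}', '{replica}') "
        "PARTITION BY toYYYYMM(event_date) "
        "ORDER BY (user_id, event_date)"
    )

ddl = replicated_table_ddl(
    "metrics.events",
    "HaMergeTree",                        # engine name from the whitepaper
    "/clickhouse/tables/{shard}/events",  # hypothetical coordination path
    "{replica}",
)
print(ddl)
```

The same shape applies to the other two engines: HaUniqueMergeTree would add a unique-key clause for upsert semantics, and HaKafka would point at a Kafka topic instead of a coordination path.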

Figure 1: HaMergeTree replica coordination principle (excerpt from the whitepaper)

In the query optimizer module, ByteHouse invested over a year to revamp the Optimizer, fully upgrading its capabilities; the whitepaper enumerates these enhancements.

To achieve extreme performance, ClickHouse tightly couples compute and storage on each node. This prevents the two from scaling independently and makes cluster expansion operationally painful, because existing data is not automatically rebalanced onto new nodes.

ByteHouse addresses these issues by decoupling storage and compute, implementing an elastic, scalable architecture.
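The rebalancing pain can be made concrete with a toy sketch. Naive hash-modulo sharding here is our illustrative assumption, not ByteHouse's actual placement scheme, but it shows why adding a node to a coupled compute/storage cluster forces large data movement, whereas with shared storage new compute nodes simply read the same data.

```python
import hashlib

# Toy illustration: with naive hash-modulo sharding in a coupled
# compute/storage cluster, adding a single node remaps most rows to a
# different node, so that data must be physically moved. With storage
# decoupled, compute nodes can be added without moving any data.

def shard(key: str, n_nodes: int) -> int:
    """Deterministically assign a key to one of n_nodes."""
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return h % n_nodes

keys = [f"row-{i}" for i in range(10_000)]
before = {k: shard(k, 4) for k in keys}   # 4-node cluster
after = {k: shard(k, 5) for k in keys}    # grow to 5 nodes
moved = sum(before[k] != after[k] for k in keys)
print(f"{moved / len(keys):.0%} of rows change node when growing 4 -> 5 nodes")
```

Consistent hashing reduces, but does not eliminate, this movement; decoupling storage from compute sidesteps it entirely, which is the direction the whitepaper describes.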

Figure 2: Compute‑storage separation architecture (excerpt from the whitepaper)

Additionally, the whitepaper presents case studies from advertising, finance, and industrial internet—typical OLAP application domains—and outlines three core considerations for enterprises selecting an OLAP data engine.

Click “Read Original” to download the full whitepaper.

Tags: scalability, ClickHouse, OLAP, Query Optimizer, Table Engine, ByteHouse, Analytical Databases
Written by

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
