How ByteHouse Supercharges ClickHouse with Upsert, Joins, and High Availability
ByteHouse, built on ClickHouse, addresses key limitations of the original engine: missing upsert/delete support, weak multi‑table joins, scalability bottlenecks, and the lack of resource isolation. It does so by introducing a modular, stage‑based execution engine, advanced join strategies, runtime filters, and a custom optimizer, delivering dramatically faster query performance.
Background
ByteDance has widely deployed ClickHouse internally and, based on the original engine, rebuilt its architecture to create a more powerful analytical platform. This series of articles introduces the enhancements made to ClickHouse in areas such as upsert, multi‑table joins, and availability.
Limitations of ClickHouse in Production
Missing complete upsert and delete operations.
Weak multi‑table join capabilities.
Availability degrades as cluster size grows, a problem that is acute at ByteDance's cluster scale.
No resource isolation.
ByteHouse Enhancement Plan
To overcome these issues, ByteHouse strengthens ClickHouse across five dimensions: upsert support, multi‑table join capability, query optimization, high availability, and resource isolation.
Improving Multi‑Table Join Capability
Traditional wide tables flatten multiple tables into one, improving latency but causing data redundancy and high maintenance cost. ByteHouse aims to make wide‑table steps optional by providing powerful multi‑table join support.
Execution Model
ClickHouse normally executes queries in two stages. ByteHouse replaces this with a stage‑based model that splits a complex query into multiple stages, exchanging data only between stages.
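To make the stage‑based model concrete, here is a minimal conceptual sketch (not ByteHouse source code; all names are illustrative) of a plan split into stages, where every upstream stage must complete its data exchange before a downstream stage consumes it:

```python
# Conceptual sketch: a query plan split into stages that exchange data
# only at stage boundaries. Stage names and fields are hypothetical.
from dataclasses import dataclass, field

@dataclass
class Stage:
    name: str        # e.g. "scan_lineorder", "hash_join"
    exchange: str    # how output leaves the stage: "shuffle", "gather", "broadcast"
    upstream: list = field(default_factory=list)  # stages this one consumes from

def execution_order(root: Stage) -> list:
    """Post-order walk: upstream stages run before the stages that read them."""
    order = []
    def visit(stage):
        for dep in stage.upstream:
            visit(dep)
        if stage not in order:
            order.append(stage)
    visit(root)
    return [s.name for s in order]

# A three-stage plan: two scans shuffled on the join key, then the join itself.
scan_left = Stage("scan_lineorder", exchange="shuffle")
scan_right = Stage("scan_customer", exchange="shuffle")
join = Stage("hash_join", exchange="gather", upstream=[scan_left, scan_right])

print(execution_order(join))  # ['scan_lineorder', 'scan_customer', 'hash_join']
```

The point of the decomposition is that each stage is a self-contained unit of work, so a complex multi-join query becomes a DAG of simple stages rather than one monolithic two-step plan.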
Stage Data Exchange Types
Shuffle: repartition data by one or more keys.
Gather: collect data from many nodes onto a single node.
Broadcast: replicate the same data to multiple nodes.
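The three exchange types can be sketched over in-memory "nodes" (plain lists of rows); this is a conceptual illustration, not a ByteHouse API:

```python
# Conceptual sketch of the three stage-to-stage exchange types.
# "Nodes" are modeled as lists of row dicts.

def shuffle(rows, key, num_nodes):
    """Route each row to a node chosen by hashing the shuffle key."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[hash(row[key]) % num_nodes].append(row)
    return nodes

def gather(nodes):
    """Collect the rows from every node onto a single node."""
    return [row for node in nodes for row in node]

def broadcast(rows, num_nodes):
    """Replicate the same rows to every node."""
    return [list(rows) for _ in range(num_nodes)]

rows = [{"k": i, "v": i * 10} for i in range(6)]
parts = shuffle(rows, "k", num_nodes=3)
assert sorted(gather(parts), key=lambda r: r["k"]) == rows  # shuffle loses no rows
assert len(broadcast(rows, 3)) == 3                         # every node gets a full copy
```

Shuffling rows with the same key to the same node is what makes a distributed join correct: matching rows from both sides are guaranteed to meet on one node.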
Join Strategies Supported by ByteHouse
Shuffle Join – the most general join.
Broadcast Join – broadcast the small table to all workers for large‑table‑small‑table joins.
Colocate Join – reduces data transfer when both tables share the same distribution on the join key.
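A planner choosing among the three strategies might reason roughly as follows. This is a hedged sketch: the row-count threshold and the table-metadata fields are assumptions for illustration, not ByteHouse's actual cost model:

```python
# Illustrative strategy selection; the cutoff and metadata shape are assumed.
BROADCAST_LIMIT_ROWS = 1_000_000  # hypothetical small-table cutoff

def choose_join_strategy(left, right, join_key):
    # Colocate: both sides already distributed on the join key -> no transfer needed.
    if left.get("distribution_key") == join_key and right.get("distribution_key") == join_key:
        return "colocate"
    # Broadcast: replicate the small side to every worker holding the big side.
    if min(left["rows"], right["rows"]) <= BROADCAST_LIMIT_ROWS:
        return "broadcast"
    # Shuffle: the general fallback -- repartition both sides on the join key.
    return "shuffle"

lineorder = {"rows": 6_000_000_000, "distribution_key": None}
small_dim = {"rows": 300_000, "distribution_key": None}
print(choose_join_strategy(lineorder, small_dim, "LO_CUSTKEY"))  # broadcast

bucketed_a = {"rows": 10**9, "distribution_key": "user_id"}
bucketed_b = {"rows": 10**9, "distribution_key": "user_id"}
print(choose_join_strategy(bucketed_a, bucketed_b, "user_id"))   # colocate
```

The ordering matters: colocate is checked first because it moves no data at all, broadcast next because it moves only the small side, and shuffle last because it repartitions both sides.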
Optimizing the Join Operator
The join operator is often the most time‑consuming one in an OLAP engine. It can be optimized along two axes: improving the join algorithm itself (e.g., better hash tables, more parallelism) and reducing the amount of data that participates in the join. Runtime filters take the second approach: they prune irrelevant rows on the probe side before the join runs.
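The probe-side pruning idea can be sketched as follows. Real engines typically use Bloom or min‑max filters; this sketch uses a plain set for clarity, and the function and column names are illustrative:

```python
# Sketch of a hash join with a runtime filter. The filter is built from
# the build side's join keys and applied on the probe side before lookup.
# A plain set stands in for a Bloom/min-max filter here.

def hash_join_with_runtime_filter(build_rows, probe_rows, build_key, probe_key):
    # Build phase: hash table plus the runtime filter from the small side.
    table = {}
    for row in build_rows:
        table.setdefault(row[build_key], []).append(row)
    runtime_filter = set(table)  # join keys that can possibly match

    # Probe phase: the filter drops rows before any hash-table lookup.
    joined = []
    for row in probe_rows:
        if row[probe_key] not in runtime_filter:  # pruned early
            continue
        for match in table[row[probe_key]]:
            joined.append({**match, **row})
    return joined

build = [{"C_CUSTKEY": 1, "C_REGION": "AMERICA"}]
probe = [{"LO_CUSTKEY": k, "LO_REVENUE": k * 100} for k in (1, 2, 3)]
result = hash_join_with_runtime_filter(build, probe, "C_CUSTKEY", "LO_CUSTKEY")
print(len(result))  # → 1: only the matching probe row survives
```

In a distributed plan the payoff is larger than this local version suggests: rows pruned by the filter are never shuffled across the network in the first place.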
Performance Benchmarks
Using the SSB 100G benchmark without wide tables, ClickHouse 22.2.3.1 often fails or exceeds 60 seconds, while ByteHouse 2.0.1 consistently returns results within 1 second.
Case 1: Hash Join with Large Right Table
Execution time reduced from 17.210 s to 1.749 s after optimization.
<code>SELECT
    sum(LO_REVENUE) - sum(LO_SUPPLYCOST) AS profit
FROM customer
INNER JOIN
(
    SELECT LO_REVENUE, LO_SUPPLYCOST, LO_CUSTKEY
    FROM lineorder
    WHERE toYear(LO_ORDERDATE) = 1997 AND toMonth(LO_ORDERDATE) = 1
) AS lineorder ON LO_CUSTKEY = C_CUSTKEY
WHERE C_REGION = 'AMERICA';</code>
Case 2: Five‑Table Join (without runtime filter)
Execution time reduced from 8.583 s to 4.464 s.
<code>SELECT
    D_YEAR,
    S_CITY,
    P_BRAND,
    sum(LO_REVENUE) - sum(LO_SUPPLYCOST) AS profit
FROM ssb1000.lineorder
INNER JOIN
(
    SELECT C_CUSTKEY FROM ssb1000.customer WHERE C_REGION = 'AMERICA'
) AS customer ON LO_CUSTKEY = C_CUSTKEY
INNER JOIN
(
    SELECT D_DATEKEY, D_YEAR FROM date WHERE (D_YEAR = 1997) OR (D_YEAR = 1998)
) AS dates ON LO_ORDERDATE = toDate(D_DATEKEY)
INNER JOIN
(
    SELECT S_SUPPKEY, S_CITY FROM ssb1000.supplier WHERE S_NATION = 'UNITED STATES'
) AS supplier ON LO_SUPPKEY = S_SUPPKEY
INNER JOIN
(
    SELECT P_PARTKEY, P_BRAND FROM ssb1000.part WHERE P_CATEGORY = 'MFGR#14'
) AS part ON LO_PARTKEY = P_PARTKEY
GROUP BY D_YEAR, S_CITY, P_BRAND
ORDER BY D_YEAR ASC, S_CITY ASC, P_BRAND ASC
SETTINGS enable_distributed_stages = 1, exchange_source_pipeline_threads = 32;</code>
Conclusion
With the enhanced multi‑table join capability, ByteHouse can support a wide range of business scenarios, allowing users to choose between wide‑table and direct join approaches while still achieving excellent analytical performance. The next article will detail the custom query optimizer developed by ByteDance.
ByteDance Data Platform
The ByteDance Data Platform team empowers all ByteDance business lines by lowering data‑application barriers, aiming to build data‑driven intelligent enterprises, enable digital transformation across industries, and create greater social value. Internally it supports most ByteDance units; externally it delivers data‑intelligence products under the Volcano Engine brand to enterprise customers.