
How UniqueMergeTree Boosts Real-Time Updates in ClickHouse Column Stores

UniqueMergeTree is a new ClickHouse table engine built for real-time data updates. It combines upsert semantics, unique-key enforcement, and efficient delete-bitmap handling, trading some write throughput for much faster queries. This article covers its design, sharding strategies, conflict resolution, and performance evaluation.

ByteDance Data Platform

UniqueMergeTree Development Background

Three typical scenarios require real-time updates: (1) the business needs to analyze transactional data such as orders in real time, so the data stream must be synchronized to an OLAP database like ClickHouse; (2) tables from a transaction-processing (TP) database are synchronized to ClickHouse in real time, which requires support for updates and deletions; (3) data streams must be deduplicated, which requires idempotent writes. All three scenarios demand second- or minute-level freshness and can be served by a mini-batch real-time sync approach.

Common Column‑Store Real‑Time Update Solutions

Key‑Based Merge on Read

This approach is similar to LSM‑Tree. Data are sorted by key and written as column files with version numbers. Reads merge multiple versions to return the latest value for each key. ClickHouse’s ReplacingMergeTree and Doris use this scheme. It simplifies the write path but suffers from poor read performance due to single‑threaded merging, high memory copy cost, and limited predicate push‑down.
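
As a rough illustration of merge on read, the Python sketch below merges two sorted runs and keeps the highest version of each key. The row layout `(key, version, value)` and the function name `merge_on_read` are assumptions made for this example; they are not taken from ReplacingMergeTree or Doris.

```python
import heapq
from typing import Any, Iterable, Iterator

Row = tuple[str, int, Any]  # (key, version, value); layout assumed for illustration


def merge_on_read(runs: Iterable[list[Row]]) -> Iterator[tuple[str, Any]]:
    """Merge runs sorted by (key, version) and yield the newest value per key."""
    current_key = None
    newest = None
    # heapq.merge streams the pre-sorted runs in global (key, version) order.
    for key, version, value in heapq.merge(*runs, key=lambda r: (r[0], r[1])):
        if key != current_key:
            if current_key is not None:
                yield current_key, newest
            current_key = key
        # Rows of one key arrive in ascending version order, so the last wins.
        newest = value
    if current_key is not None:
        yield current_key, newest


old_run = [("a", 1, "a-v1"), ("b", 1, "b-v1")]
new_run = [("a", 2, "a-v2"), ("c", 2, "c-v2")]
latest = list(merge_on_read([old_run, new_run]))
```

Every read has to walk all versions of every key in key order, which is why this style of merge is hard to parallelize and makes it difficult to skip data early with predicates.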

Mark‑Delete + Insert

Updates are expressed by marking the old rows as deleted in a bitmap and inserting new rows. In SQL Server's column store, for example, each RowGroup is an immutable column file with an associated DeleteBitmap. Queries filter out the rows flagged in the bitmap. This method sacrifices write speed, because every write must first locate the existing row for each key and must handle write-write conflicts.
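
A minimal Python sketch of the mark-delete-and-insert idea follows. The class names `RowGroup` and `MarkDeleteTable`, and the use of a `set` as a stand-in for a bitmap, are assumptions for illustration only and do not reflect SQL Server's actual implementation.

```python
class RowGroup:
    """An immutable batch of rows plus a mutable record of deleted offsets."""

    def __init__(self, rows):
        self.rows = tuple(rows)   # (key, value) pairs, never modified
        self.deleted = set()      # stand-in for a DeleteBitmap

    def scan(self):
        return [row for i, row in enumerate(self.rows) if i not in self.deleted]


class MarkDeleteTable:
    def __init__(self):
        self.groups = []

    def _locate(self, key):
        """Find the live row for a key; this lookup is the extra write cost."""
        for group in self.groups:
            for offset, (k, _) in enumerate(group.rows):
                if k == key and offset not in group.deleted:
                    return group, offset
        return None

    def upsert_batch(self, rows):
        batch = dict(rows)  # within one batch, the last value per key wins
        for key in batch:
            hit = self._locate(key)
            if hit is not None:
                group, offset = hit
                group.deleted.add(offset)  # mark the old row, do not rewrite it
        self.groups.append(RowGroup(batch.items()))

    def scan(self):
        return [row for group in self.groups for row in group.scan()]


table = MarkDeleteTable()
table.upsert_batch([("order-1", "created"), ("order-2", "created")])
table.upsert_batch([("order-1", "paid")])
rows = table.scan()
```

Reads only need to skip flagged offsets, while the linear `_locate` scan shows where an auxiliary key index would pay off on the write side.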

Variants

Both schemes can be enhanced with auxiliary indexes or buffering strategies to accelerate merges.

UniqueMergeTree Features

UniqueMergeTree introduces a UNIQUE KEY clause to enforce uniqueness. Writes follow upsert semantics: new keys are inserted, existing keys are updated. A virtual delete‑flag column enables real‑time row deletions. A version column resolves back‑fill conflicts, and the engine supports multi‑replica synchronization.
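
The semantics described above can be modeled in a few lines of Python. This is a sketch of the logical behavior only, not of the engine's storage. The field name `delete_flag`, the rule that a write with a lower version than the stored row is ignored, and the rule that an equal version overwrites are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Write:
    key: str
    value: str
    version: int = 0
    delete_flag: bool = False  # hypothetical name for the virtual delete-flag column


def apply_writes(table: dict[str, tuple[int, str]], writes: list[Write]) -> None:
    """Apply upserts and deletes to a key -> (version, value) mapping."""
    for w in writes:
        current = table.get(w.key)
        if current is not None and w.version < current[0]:
            continue  # stale back-fill: the stored row is newer, keep it
        if w.delete_flag:
            table.pop(w.key, None)   # delete the row for this key
        else:
            table[w.key] = (w.version, w.value)  # insert new key or update existing


table: dict[str, tuple[int, str]] = {}
apply_writes(table, [Write("u1", "alice", version=2), Write("u2", "bob", version=1)])
apply_writes(table, [
    Write("u1", "alice-old", version=1),           # back-fill with an older version
    Write("u2", "", version=2, delete_flag=True),  # real-time delete
    Write("u3", "carol", version=1),               # new key
])
```

After both batches the mapping holds `u1` at version 2 and the new key `u3`; the stale write to `u1` is dropped and `u2` is gone. This toy model forgets the version of a deleted key, so it says nothing about how the engine treats a stale write that arrives after a delete.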

Distributed Table Write: Sharding Options

Two sharding strategies are available:

Internal sharding: ClickHouse’s distributed table automatically routes data based on a sharding key, providing transparent, consistent partitioning across tables. This is used in ByteHouse Cloud Data Warehouse.

External sharding: The client or SDK determines shard placement, reducing the number of small files in real‑time micro‑batches and improving write throughput, but it requires careful coordination by the user.
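
With either option, rows for the same key need to reach the same shard every time. The Python sketch below shows one way a client could do this for external sharding, by hashing the key with a stable hash and splitting each micro-batch per shard. The function names and the choice of CRC32 are assumptions for illustration; the source does not specify the hash or the SDK interface.

```python
import zlib
from collections import defaultdict


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard with a hash that is stable across processes."""
    # Python's built-in hash() is randomized per process for strings,
    # so a checksum such as CRC32 is used here instead.
    return zlib.crc32(key.encode("utf-8")) % num_shards


def split_batch(rows, num_shards):
    """Group one micro-batch into a single sub-batch per target shard."""
    per_shard = defaultdict(list)
    for key, value in rows:
        per_shard[shard_for(key, num_shards)].append((key, value))
    return dict(per_shard)


batch = [("order-1", "created"), ("order-2", "created"), ("order-1", "paid")]
sub_batches = split_batch(batch, num_shards=4)
```

Sending one sub-batch per shard, rather than letting each shard receive fragments of many batches, is what keeps the number of small files down in this sketch.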

Single‑Node Read/Write Path

Write path: For each incoming key, find the part and row number that currently hold it, mark that old row in the part’s delete bitmap, and write the new data to a new part. Each part maintains a key index for fast lookup and multiple delete files representing different bitmap versions.

Read path: Load the latest delete‑bitmap snapshots for all parts, then filter out rows marked as deleted during part reads, ensuring uniqueness.
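
The two paths can be sketched together in Python. This is a simplified model under stated assumptions: the class name `UniqueTable` is hypothetical, a single table-wide dictionary stands in for the per-part key indexes, and a list of `frozenset` objects stands in for the versioned delete files.

```python
class UniqueTable:
    def __init__(self):
        self.parts = {}          # part_id -> list of (key, value) rows
        self.key_index = {}      # key -> (part_id, row_number) of the live row
        self.delete_files = {}   # part_id -> list of bitmap versions (frozensets)
        self.next_part_id = 0

    def write(self, rows):
        batch = dict(rows)       # within one batch, the last value per key wins
        new_bitmaps = {}
        for key in batch:
            location = self.key_index.get(key)
            if location is not None:
                part_id, row_number = location
                latest = self.delete_files[part_id][-1]
                new_bitmaps.setdefault(part_id, set(latest)).add(row_number)
        # Write the new data as a new part with an empty delete bitmap.
        part_id = self.next_part_id
        self.next_part_id += 1
        self.parts[part_id] = list(batch.items())
        self.delete_files[part_id] = [frozenset()]
        # Publish a new bitmap version for every older part that was touched.
        for old_part_id, bitmap in new_bitmaps.items():
            self.delete_files[old_part_id].append(frozenset(bitmap))
        for row_number, key in enumerate(batch):
            self.key_index[key] = (part_id, row_number)

    def snapshot(self):
        """Latest delete bitmap of every part, as seen at this moment."""
        return {pid: versions[-1] for pid, versions in self.delete_files.items()}

    def read(self, snapshot=None):
        snapshot = self.snapshot() if snapshot is None else snapshot
        return [row
                for pid, deleted in snapshot.items()
                for n, row in enumerate(self.parts[pid])
                if n not in deleted]


t = UniqueTable()
t.write([("k1", "v1"), ("k2", "v1")])
earlier = t.snapshot()
t.write([("k1", "v2")])
current_rows = t.read()
earlier_rows = t.read(earlier)
```

Because old bitmap versions are never modified, a reader holding the earlier snapshot still sees a consistent, duplicate-free view while newer writes proceed.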

Write‑Merge Conflict Handling

Two conflict types arise:

Write-write conflict: Concurrent upserts on the same key may both mark the original row for deletion and each write a new row, leaving duplicate keys. In AP scenarios, a simple table-level lock that serializes writes is sufficient.

Write-merge conflict: A background merge that is already running may copy rows that a concurrent foreground write has since marked as deleted, so those rows reappear after the merge. The solution adds a DeleteBuffer to each merge task, which records the keys deleted while the merge is in progress. Before committing, the merge task applies these keys to the new part's delete bitmap.
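
The DeleteBuffer idea can be illustrated with a single-threaded Python simulation. The class `MergeTask` and its method names are hypothetical, and the interleaving of merge and write is simulated by call order rather than by real threads.

```python
class MergeTask:
    def __init__(self, source_parts, delete_bitmaps):
        self.sources = source_parts
        # The merge works from the bitmaps as they were when it started.
        self.snapshot = {pid: frozenset(bm) for pid, bm in delete_bitmaps.items()}
        self.delete_buffer = set()   # keys deleted while the merge is running

    def on_concurrent_delete(self, key):
        """Called by the write path when it deletes a key from a source part."""
        self.delete_buffer.add(key)

    def run(self):
        """Copy the rows that were live at merge start into one sorted part."""
        merged = [row
                  for pid, rows in self.sources.items()
                  for n, row in enumerate(rows)
                  if n not in self.snapshot[pid]]
        merged.sort(key=lambda row: row[0])
        return merged

    def commit(self, merged):
        """Turn buffered keys into the new part's delete bitmap before publishing."""
        bitmap = {n for n, (key, _) in enumerate(merged) if key in self.delete_buffer}
        return merged, bitmap


parts = {0: [("a", "a1"), ("b", "b1")], 1: [("c", "c1")]}
bitmaps = {0: set(), 1: set()}

task = MergeTask(parts, bitmaps)
merged = task.run()

# A foreground upsert of "b" lands while the merge is still in flight.
bitmaps[0].add(1)
task.on_concurrent_delete("b")

new_part, new_bitmap = task.commit(merged)
visible = [row for n, row in enumerate(new_part) if n not in new_bitmap]
```

Without the buffer, the merged part would publish `("b", "b1")` as a live row next to the newer version written by the upsert.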

Performance Evaluation

YCSB benchmarks compare UniqueMergeTree with ReplacingMergeTree and the classic MergeTree. UniqueMergeTree’s write throughput drops by 40‑50% relative to ReplacingMergeTree, but query latency improves by an order of magnitude, matching the performance of the standard MergeTree. Gains stem from parallelized merges, in‑memory delete‑bitmap snapshots, direct skip of marked rows, and combined pre‑where and delete filters.

Conclusion and Future Plans

Since its launch in early 2020, UniqueMergeTree has been adopted by over 1,000 tables in production. Key decisions include sacrificing some write performance for substantially better read speed and avoiding strict data‑size limits on indexes. Future work will focus on partial‑column updates and further write‑throughput optimizations, such as finer‑grained table locks and disk‑based key indexes.

Written by

ByteDance Data Platform

The ByteDance Data Platform team empowers all ByteDance business lines by lowering data‑application barriers, aiming to build data‑driven intelligent enterprises, enable digital transformation across industries, and create greater social value. Internally it supports most ByteDance units; externally it delivers data‑intelligence products under the Volcano Engine brand to enterprise customers.
