Unique Engine Design and Implementation in ClickHouse for Bilibili Live Guild Data
Bilibili migrated its live‑guild analytics from MySQL to ClickHouse, creating a custom ReplicatedUniqueMergeTree engine that uses delete‑on‑insert, min‑max and hash‑bucketed indexes with delete bitmaps to achieve 10‑20× faster queries and scalable near‑real‑time reporting despite higher write latency.
This article describes the business background, data requirements, and technical solution for Bilibili's live guild system, focusing on the migration from MySQL to ClickHouse and the design of a custom Unique Engine table engine.
Business background
The live guild platform manages millions of streamers, providing lifecycle management, revenue analysis, viewership data, and monitoring. Guild operators need near‑real‑time data at various granularities (daily, weekly, monthly, T+1 tasks, etc.) and are sensitive to data update frequency.
Data volume estimation
As of December 2023, the system stores per-streamer daily data on the order of millions of rows. The estimated number of rows involved at each aggregation level is:
Streamer‑level yearly aggregation: 365 rows
Guild‑level daily: 200,000 rows
Guild‑level weekly: 1,400,000 rows
Guild‑level monthly: 6,000,000 rows
Guild‑level yearly: 73,000,000 rows
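The guild-level figures above are consistent with the daily figure scaled by the length of each window. A quick arithmetic check, assuming a 7-day week, a 30-day month, and a 365-day year (the window lengths are assumptions, the 200,000 daily figure comes from the list):

```python
# Rows touched by a guild-level aggregation, assuming ~200,000 rows per day
# (figure from the list above) and fixed window lengths (assumed).
DAILY_ROWS = 200_000
WINDOW_DAYS = {"daily": 1, "weekly": 7, "monthly": 30, "yearly": 365}

rows_per_window = {name: DAILY_ROWS * days for name, days in WINDOW_DAYS.items()}

for name, rows in rows_per_window.items():
    print(f"{name:8s} {rows:>12,d}")
```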
MySQL performance bottlenecks
Scanning millions of rows per query on a 4‑core, 8 GB MySQL instance leads to average query times >20 s, far exceeding the SLA. Storage pressure is also high (≈365 M rows for a year), and sharding cannot satisfy flexible OLAP queries.
ClickHouse selection
Given the need for high-throughput, near-real-time analytics, ClickHouse was chosen. The initial implementation used ReplacingMergeTree, but it showed several limitations:
Single‑threaded merge‑on‑read updates cause high latency.
Deduplication key must match the primary key, limiting index flexibility.
Use of the FINAL modifier disables skip‑index and PREWHERE optimizations.
Unique Engine design
To overcome these issues, a new UniqueEngine (implemented as ReplicatedUniqueMergeTree) was introduced. It follows a Delete‑On‑Insert approach, marking old rows for deletion during write rather than merging at read time.
Key components added to each ClickHouse data part:
Unique key min‑max index for fast part pruning.
Unique key hash‑bucketed index for efficient look‑ups.
Delete bitmap to record rows that should be ignored.
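During a write, the engine scans the historical parts, uses the min-max index to skip parts whose key range cannot overlap the new batch, looks up the remaining keys in the hash-bucketed index, and records every matching old row in that part's delete bitmap. At query time the delete bitmap is combined with the PREWHERE filter column, if there is one, so deleted rows are excluded while PREWHERE and skip-index optimizations remain usable.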
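The write and read paths described above can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions, not the engine's actual code: `Part`, `Table`, `NUM_BUCKETS`, and `unique_key` are hypothetical names, a Python set stands in for the delete bitmap, and any version column is ignored so that the later insert simply wins.

```python
NUM_BUCKETS = 4  # hypothetical bucket count for the hash-bucketed index


def unique_key(row):
    # Mirrors UNIQUE KEY (record_date, uid) from the table example.
    return (row["record_date"], row["uid"])


class Part:
    """One data part with its unique-key indexes and delete bitmap (toy model)."""

    def __init__(self, rows):
        self.rows = list(rows)  # assumed non-empty
        self.deleted = set()    # stand-in for the delete bitmap (row offsets)
        self.buckets = [dict() for _ in range(NUM_BUCKETS)]  # key -> row offset
        keys = [unique_key(r) for r in self.rows]
        self.min_key, self.max_key = min(keys), max(keys)    # min-max index
        for offset, key in enumerate(keys):
            bucket = self.buckets[hash(key) % NUM_BUCKETS]
            if key in bucket:   # duplicate inside the batch: keep the later row
                self.deleted.add(bucket[key])
            bucket[key] = offset

    def find(self, key):
        """Return the row offset holding `key`, or None."""
        if not (self.min_key <= key <= self.max_key):
            return None
        return self.buckets[hash(key) % NUM_BUCKETS].get(key)


class Table:
    def __init__(self):
        self.parts = []

    def insert(self, rows):
        new_part = Part(rows)
        new_keys = {unique_key(r) for r in new_part.rows}
        for part in self.parts:
            # Part-level pruning: skip parts whose key range cannot overlap.
            if part.max_key < new_part.min_key or part.min_key > new_part.max_key:
                continue
            for key in new_keys:
                offset = part.find(key)
                if offset is not None:
                    part.deleted.add(offset)  # delete-on-insert
        self.parts.append(new_part)

    def select(self, predicate=lambda row: True):
        # A row is returned only if it is not deleted AND passes the filter,
        # i.e. the delete bitmap is ANDed with the query's filter mask.
        return [row
                for part in self.parts
                for offset, row in enumerate(part.rows)
                if offset not in part.deleted and predicate(row)]


table = Table()
table.insert([{"record_date": "20231201", "uid": 1, "income": 10},
              {"record_date": "20231201", "uid": 2, "income": 20}])
table.insert([{"record_date": "20231201", "uid": 1, "income": 15}])  # newer row for uid 1
result = table.select()
```

After the second insert, the old row for `uid` 1 is marked in the first part's delete set, so a plain scan returns exactly one row per unique key without any read-time merge.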
Table definition example
CREATE TABLE bili_live.ads_guild (
`id` Int64,
`uid` Int64,
`guild_id` Int64,
`record_date` String,
`mtime` DateTime,
...
) ENGINE = ReplicatedUniqueMergeTree('/clickhouse/tables/{layer}-{shard}/bili_live/ads_guild', '{replica}', mtime)
PARTITION BY substring(record_date, 1, 6)
ORDER BY (record_date, guild_id)
UNIQUE KEY (record_date, uid)
TTL ...
SETTINGS index_granularity = 8192, storage_policy = 'hot_and_cold', enable_unique_key_bucket = 1, unique_key_deduplicate_level = 1, unique_key_index_type = 1;
Handling write‑merge conflicts
Concurrent writes and merges can cause duplicate rows. The solution uses CAS‑style atomic part‑state changes and a commit lock, recording delete marks both at the part level and the table level to guarantee idempotent deduplication.
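One way to picture the conflict: a merge copies live rows out of its source parts, and a write that lands before the merge commits marks a row in a source part as deleted, so the merged part would still carry the stale row. The sketch below is a hypothetical illustration of the ideas named above (a commit lock, a compare-and-set on part state, and a table-level log of delete marks that is replayed at merge commit). All class and method names are assumptions, and keys stand in for row offsets for brevity.

```python
import threading

ACTIVE, OUTDATED = "active", "outdated"


class Part:
    def __init__(self, rows):
        self.rows = dict(rows)  # unique key -> value
        self.deleted = set()    # part-level delete marks
        self.state = ACTIVE
        self._state_lock = threading.Lock()

    def compare_and_set_state(self, expected, new):
        """Atomically switch state only if it still equals `expected`."""
        with self._state_lock:
            if self.state != expected:
                return False
            self.state = new
            return True

    def live_rows(self):
        return {k: v for k, v in self.rows.items() if k not in self.deleted}


class Table:
    def __init__(self):
        self.parts = []
        self.commit_lock = threading.Lock()
        self.delete_log = []  # table-level delete marks, in commit order

    def insert(self, rows):
        new_part = Part(rows)
        with self.commit_lock:
            for part in self.parts:
                part.deleted.update(k for k in new_part.rows if k in part.rows)
            self.delete_log.extend(new_part.rows)  # visible to in-flight merges
            self.parts.append(new_part)

    def begin_merge(self, sources):
        with self.commit_lock:  # consistent snapshot of the source parts
            log_position = len(self.delete_log)
            merged_rows = {}
            for part in sources:
                merged_rows.update(part.live_rows())
        return sources, Part(merged_rows), log_position

    def commit_merge(self, sources, merged, log_position):
        with self.commit_lock:
            claimed = []
            for part in sources:
                if part.compare_and_set_state(ACTIVE, OUTDATED):
                    claimed.append(part)
                else:  # another merge already retired this part: roll back
                    for p in claimed:
                        p.compare_and_set_state(OUTDATED, ACTIVE)
                    return False
            # Replay delete marks committed while the merge was running.
            # Set updates are idempotent, so replaying twice is harmless.
            merged.deleted.update(
                k for k in self.delete_log[log_position:] if k in merged.rows)
            self.parts = [p for p in self.parts if p not in sources] + [merged]
            return True

    def select(self):
        return [item for part in self.parts for item in part.live_rows().items()]


table = Table()
table.insert({"a": 1, "b": 2})
table.insert({"c": 3})
job = table.begin_merge(list(table.parts))  # merge starts
table.insert({"a": 10})                     # a write lands before the merge commits
committed = table.commit_merge(*job)
rows = sorted(table.select())
```

In this run the merged part initially contains the stale row for key `a`. Replaying the table-level log at commit marks it deleted, so the query sees only the newer value.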
Performance results
Query latency improved by 10 to 20×, p90 latency dropped to roughly one fifth, and the daily volume of scanned data fell from over 60 TB to a fraction of that. Write latency with UniqueEngine is about 10× higher than with ReplacingMergeTree, but remains sub-second at the typical volume of tens of millions of rows per day.
Write‑performance optimizations
Two main bottlenecks were identified: loading the unique-key index and comparing keys. Persisting the unique-key index in LevelDB and merging ordered iterators lets the engine skip irrelevant keys, which reduces both I/O and CPU work. With this optimization, write latency grows roughly linearly with the size of the new batch rather than with the total volume of historical data.
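The ordered-merge idea can be sketched as follows. This is a minimal illustration, not the engine's implementation: a sorted Python list stands in for the ordered on-disk index, `bisect_left` plays the role of an iterator seek, and `find_existing_keys` is a hypothetical helper name.

```python
from bisect import bisect_left


def find_existing_keys(index_keys, batch_keys):
    """Return the batch keys already present in a sorted unique-key index.

    Both sides are walked in key order, and each lookup seeks forward from
    the previous position, so index keys between two batch keys are skipped
    rather than compared one by one.
    """
    hits = []
    pos = 0
    for key in sorted(set(batch_keys)):
        # Seek to the first index key >= key, never moving backwards.
        pos = bisect_left(index_keys, key, pos)
        if pos == len(index_keys):
            break  # index exhausted: no later batch key can match
        if index_keys[pos] == key:
            hits.append(key)
    return hits


index = list(range(0, 1_000_000, 2))  # 500,000 historical keys (even numbers)
batch = [10, 11, 999_998, 5]          # small new batch
existing = find_existing_keys(index, batch)
print(existing)
```

The work is one seek per batch key, which is why the cost tracks the batch size much more closely than the size of the historical index.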
Future work
Full migration to LevelDB‑backed unique‑key index.
Simplify the MySQL→ClickHouse ingestion pipeline using Flink CDC for a one‑stop sync solution.
Bilibili Tech
Provides introductions and tutorials on Bilibili-related technologies.
