
Bilibili's Iceberg‑Based Lakehouse Platform: Technical Practices for Sub‑Second Query Response

This article details Bilibili's Iceberg‑based lakehouse platform, which unifies storage and analytics. The platform addresses Hive's performance and latency problems through multidimensional sorting, several kinds of file‑level indexes, cube pre‑aggregation, star‑tree structures, and an automated service (Magnus) for intelligent optimization, achieving near‑second query response.

DataFunSummit

To overcome Hive's shortcomings—insufficient interactive query performance, complex data pipelines, data silos, and hour‑level latency—Bilibili built a lakehouse platform on Apache Iceberg, aiming for interoperable, high‑performance, and user‑friendly analytics.

The architecture stores Iceberg tables on HDFS, with data ingested via Flink, Spark, or a Java API. A service called Magnus continuously optimizes Iceberg data and metadata, Alluxio provides caching, and Trino serves as the interactive query engine. Some workloads still fall back to ClickHouse or Elasticsearch for millisecond‑level latency.

Iceberg manages file‑level metadata using snapshots and manifests, offering an open storage format that facilitates future extensions.
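To make the snapshot/manifest hierarchy concrete, here is a minimal, hypothetical Python model (the class and field names are illustrative, not Iceberg's actual format): a snapshot points at manifests, each manifest lists data files with per‑column min/max statistics, and a query planner uses those statistics to prune files before reading any data.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class DataFile:
    path: str
    record_count: int
    min_id: int  # per-column stats, reduced to a single column here
    max_id: int

@dataclass
class Manifest:
    data_files: List[DataFile]

@dataclass
class Snapshot:
    snapshot_id: int
    manifests: List[Manifest]

def plan_scan(snapshot: Snapshot, lo: int, hi: int) -> List[str]:
    """Keep only files whose [min_id, max_id] range overlaps the predicate."""
    return [
        f.path
        for m in snapshot.manifests
        for f in m.data_files
        if f.max_id >= lo and f.min_id <= hi
    ]

snap = Snapshot(1, [Manifest([
    DataFile("part-0.parquet", 100, 0, 49),
    DataFile("part-1.parquet", 100, 50, 99),
])])
print(plan_scan(snap, 60, 70))  # → ['part-1.parquet']
```

The point of the sketch is that pruning happens entirely in metadata: only `part-1.parquet` is touched for a predicate like `id BETWEEN 60 AND 70`.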

For query acceleration, the team applies multidimensional sorting (preferring the Hilbert curve over Z‑order, since Hilbert ordering preserves spatial locality better) and a Boundary Index to map non‑integer columns onto sortable integer values. They limit sorting to at most four fields to keep the clustering effective.
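The two ideas above can be sketched in a few lines of Python. This is a simplified illustration, not Bilibili's implementation: `z_order_key` interleaves the bits of two integer columns to produce a Z‑order (Morton) sort key, and `boundary_ranks` shows the Boundary‑Index idea of replacing non‑integer values (e.g. strings) with dense integer ranks so they can participate in such curves.

```python
def z_order_key(x: int, y: int, bits: int = 16) -> int:
    """Interleave the bits of two column values into a Z-order (Morton) key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)       # x occupies even bit positions
        key |= ((y >> i) & 1) << (2 * i + 1)   # y occupies odd bit positions
    return key

def boundary_ranks(values):
    """Boundary-Index idea: map orderable non-integer values to dense ranks."""
    order = {v: i for i, v in enumerate(sorted(set(values)))}
    return [order[v] for v in values]

# Sorting rows by the interleaved key clusters rows close in BOTH dimensions.
rows = [(3, 5), (1, 1), (2, 7), (0, 0)]
rows.sort(key=lambda r: z_order_key(*r))
# rows is now [(0, 0), (1, 1), (3, 5), (2, 7)]
```

A Hilbert‑curve key serves the same purpose but with better locality; Z‑order is shown here only because its bit‑interleaving is compact enough to fit in a sketch.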

Supported file‑level indexes include BloomFilter (simple, space‑efficient, equality only), Bitmap (supports equality and range, larger footprint), BloomRF (multi‑segment hash, similar to BloomFilter), and token‑based indexes (TokenBloomFilter, TokenBitmap, Ngram variants) for log data.
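As a rough illustration of the simplest of these, here is a minimal Bloom filter in pure Python (an invented toy, not the platform's index format): it answers equality probes with no false negatives and a tunable false‑positive rate, which is why it is compact but limited to equality predicates.

```python
import hashlib

class BloomFilter:
    """Toy Bloom filter: equality probes only; no false negatives."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3):
        self.size = size_bits
        self.k = num_hashes
        self.bits = 0  # the bit array, stored as one big int

    def _positions(self, value: str):
        # Derive k independent bit positions by salting the hash input.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{value}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.size

    def add(self, value: str) -> None:
        for p in self._positions(value):
            self.bits |= 1 << p

    def might_contain(self, value: str) -> bool:
        # False means "definitely absent"; True means "probably present".
        return all((self.bits >> p) & 1 for p in self._positions(value))

bf = BloomFilter()
bf.add("uid_42")
print(bf.might_contain("uid_42"))  # → True
```

A Bitmap index, by contrast, stores one bitmap per distinct value, which is why it can also answer range predicates at the cost of a larger footprint.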

Pre‑aggregation is realized through Cubes (or AggIndex) that store aggregated results at the file level, supporting both single‑table and star‑schema queries. When Cube data is incomplete, the engine merges existing Cubes with on‑the‑fly aggregation. A star‑tree index is built on top of Cubes to balance storage cost and query coverage.
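The merge behavior described above can be sketched as follows (function names and the `(sum, count)` partial format are invented for illustration): files that already carry a Cube contribute pre‑aggregated partials, files without one are aggregated on the fly, and the two are merged into the final result.

```python
from collections import defaultdict

def aggregate(rows):
    """Roll raw (dim, value) rows up into {dim: (sum, count)} partials."""
    acc = defaultdict(lambda: [0, 0])
    for dim, value in rows:
        acc[dim][0] += value
        acc[dim][1] += 1
    return {d: tuple(sc) for d, sc in acc.items()}

def merge(*partials):
    """Merge per-file partial aggregates into one final aggregate."""
    acc = defaultdict(lambda: [0, 0])
    for part in partials:
        for d, (s, c) in part.items():
            acc[d][0] += s
            acc[d][1] += c
    return {d: tuple(sc) for d, sc in acc.items()}

cube_file_a = aggregate([("cn", 10), ("jp", 5)])  # pre-built Cube partial
raw_file_b = [("cn", 2), ("us", 7)]               # file with no Cube yet
result = merge(cube_file_a, aggregate(raw_file_b))
print(result)  # → {'cn': (12, 2), 'jp': (5, 1), 'us': (7, 1)}
```

Storing `(sum, count)` rather than an average is what makes the partials mergeable; the same decomposition underlies rollups along a star‑tree, where each node holds partials for one combination of dimensions.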

The Magnus service automates backend optimizations: it listens to Iceberg write events and triggers Spark jobs for sorting, indexing, and Cube generation; it visualizes table metadata for troubleshooting; and it analyzes Trino query logs to recommend schema or index adjustments.
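The event‑driven loop can be caricatured as a rule table that maps commit events to maintenance jobs. Everything here is invented (thresholds, field names, job names); it only illustrates the shape of the automation, in which Magnus decides what Spark work to schedule rather than users tuning tables by hand.

```python
def plan_optimization(event: dict) -> list:
    """Map a table-commit event to backend jobs (all names illustrative)."""
    jobs = []
    if event.get("small_files", 0) > 100:
        jobs.append("compact_and_sort")       # rewrite + multidimensional sort
    if event.get("new_rows", 0) > 1_000_000:
        jobs.append("rebuild_file_indexes")   # refresh file-level indexes
    if event.get("cube_stale", False):
        jobs.append("refresh_cube")           # regenerate pre-aggregations
    return jobs

event = {"small_files": 250, "new_rows": 10, "cube_stale": True}
print(plan_optimization(event))  # → ['compact_and_sort', 'refresh_cube']
```

In the real service these decisions are additionally informed by Trino query logs, so recommendations reflect actual access patterns rather than fixed thresholds.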

In production, the platform manages roughly 5 PB of Iceberg tables with a daily growth of 75 TB, handles about 200 k Trino queries per day, and achieves a P95 response time of 5 seconds, targeting sub‑second to 10‑second latency for most workloads.

The presentation concludes with an invitation to follow Bilibili's technical community for further updates.

Tags: big data, OLAP, Iceberg, query acceleration, lakehouse, data indexing
Written by

DataFunSummit

Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.
