
Practical Experience of Using Flink + Iceberg 0.11 on Qunar Data Platform

This article presents Qunar's practical experience with Flink and Iceberg 0.11: the background challenges (Kafka data loss and mounting pressure on Hive metadata), an overview of Iceberg's architecture and query planning, and concrete solutions covering real‑time ingestion, small‑file handling, and sorting, with code examples for a smooth migration.

Big Data Technology Architecture

Qunar's data platform faced several challenges when using Flink for real‑time data warehousing, including frequent Kafka data loss and increasing pressure on Hive metadata caused by near‑real‑time reads.

The original architecture stored real‑time streams in Kafka and processed them with Flink SQL or DataStream jobs, while less time‑critical data was written to Hive partitions. Over time, the growing number of Hive partitions and metadata caused query planning to slow down and put additional load on the Hive metastore database.

Iceberg 0.11 was introduced to address these issues. Iceberg stores data files (typically Parquet) in a distributed file system and maintains manifest files that describe each data file’s location, partition, and column statistics. Snapshots represent the table state at a specific point in time, linking to sets of manifest files.
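The snapshot → manifest → data file hierarchy described above can be sketched as plain Java classes. This is a hypothetical, simplified model for illustration only; the class and field names are not Iceberg's real API.

```java
import java.util.List;

// Hypothetical, simplified model of Iceberg's metadata layers: a snapshot
// points to manifest files, and each manifest lists data files together
// with their partition and column statistics. Names are illustrative.
public class MetadataSketch {
    static class DataFile {
        final String path; final String partition; final long minTs; final long maxTs;
        DataFile(String path, String partition, long minTs, long maxTs) {
            this.path = path; this.partition = partition; this.minTs = minTs; this.maxTs = maxTs;
        }
    }
    static class Manifest {
        final String path; final List<DataFile> dataFiles;
        Manifest(String path, List<DataFile> dataFiles) { this.path = path; this.dataFiles = dataFiles; }
    }
    static class Snapshot {
        final long snapshotId; final List<Manifest> manifests;
        Snapshot(long snapshotId, List<Manifest> manifests) { this.snapshotId = snapshotId; this.manifests = manifests; }

        // Total number of data files reachable from this snapshot.
        int dataFileCount() {
            return manifests.stream().mapToInt(m -> m.dataFiles.size()).sum();
        }
    }

    public static void main(String[] args) {
        DataFile f = new DataFile("data/00000.parquet", "dt=2021-03-01", 1614556800L, 1614643199L);
        Snapshot s = new Snapshot(1L, List.of(new Manifest("metadata/m0.avro", List.of(f))));
        System.out.println("snapshot " + s.snapshotId + " -> " + s.dataFileCount() + " data file(s)");
    }
}
```

Because each snapshot is an immutable pointer into this tree, time travel and incremental reads reduce to choosing which snapshot to start from.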

During query planning, Iceberg leverages metadata filtering: predicates are pushed down to partition data and column statistics, allowing the engine to prune irrelevant files early. Snapshot IDs map to groups of manifest files, and manifest files contain min/max statistics for each column, enabling efficient file selection without scanning Hive metadata.
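The min/max pruning idea can be shown in a few lines. The sketch below is not Iceberg's actual planner; it only demonstrates the principle that a file whose statistics range cannot overlap the predicate is skipped without ever being opened.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of planning-time file pruning: each data file carries
// min/max statistics for a column (recorded in its manifest), and only
// files whose [min, max] range overlaps the query predicate are scanned.
public class StatsPruning {
    // files[i] = {min, max} of the filter column for file i.
    static List<Integer> prune(long[][] files, long lo, long hi) {
        List<Integer> keep = new ArrayList<>();
        for (int i = 0; i < files.length; i++) {
            // Keep file i only if its value range overlaps [lo, hi].
            if (files[i][1] >= lo && files[i][0] <= hi) {
                keep.add(i);
            }
        }
        return keep;
    }

    public static void main(String[] args) {
        long[][] files = { {0, 99}, {100, 199}, {200, 299} };
        // Predicate: col BETWEEN 150 AND 250 -> the first file is pruned.
        System.out.println(prune(files, 150, 250)); // prints [1, 2]
    }
}
```

This is also why the sorting optimization discussed later pays off: sorted writes produce tight, non-overlapping min/max ranges, so more files can be eliminated at planning time.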

Pain point 1 – Kafka data loss: less time‑critical data is moved out of Kafka into Iceberg, which supports near‑real‑time reads while retaining the full history on durable storage. This relieves pressure on Kafka, and data that would otherwise age out of Kafka before being consumed is no longer lost.

Pain point 2 – Flink + Hive near‑real‑time slowdown: Migrating the workload from Hive to Iceberg eliminates the centralised metastore bottleneck because Iceberg stores metadata in a scalable distributed file system, handling large numbers of partitions more efficiently.

Optimization practices: Iceberg 0.11 adds streaming small‑file merging and sorting capabilities. Small files can be merged in‑stream using the hash distribution mode, and sorting improves scan performance by allowing predicate push‑down on ordered columns.

Example code for catalog creation:

CREATE CATALOG Iceberg_catalog WITH (
  'type'='iceberg',
  'catalog-type'='hive',
  'uri'='thrift://localhost:9083'
);

Inserting data from Kafka into Iceberg:

INSERT INTO Iceberg_catalog.Iceberg_db.tbl1 SELECT * FROM Kafka_tbl;

Streaming write with options:

INSERT INTO Iceberg_catalog.Iceberg_db.tbl2 SELECT * FROM Iceberg_catalog.Iceberg_db.tbl1 /*+ OPTIONS('streaming'='true', 'monitor-interval'='10s', 'start-snapshot-id'='3821550127947089987') */;

Small‑file rewrite example:

// Compact small files by rewriting them toward a target size
// (org.apache.iceberg.flink.actions.Actions, Iceberg 0.11).
Table table = findTable(options, conf);
Actions.forTable(table)
    .rewriteDataFiles()
    .targetSizeInBytes(10 * 1024) // 10 KB, a demo value; production targets are far larger
    .execute();

Creating a table with hash distribution for real‑time merging:

CREATE TABLE city_table (
    province BIGINT,
    city STRING
) PARTITIONED BY (province, city) WITH (
    'write.distribution-mode'='hash'
);
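The effect of hash distribution can be illustrated with a small routing sketch. This is not Iceberg's implementation, only the idea: rows for the same partition hash to the same writer subtask, so each partition is written by one task and produces fewer, larger files instead of one small file per task per partition.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch of the idea behind 'write.distribution-mode'='hash':
// shuffle rows by partition key before the writers, so all rows of one
// partition land on a single writer task.
public class HashDistribution {
    // Pick the writer subtask for a partition key (non-negative, stable).
    static int writerFor(String partitionKey, int parallelism) {
        return Math.floorMod(partitionKey.hashCode(), parallelism);
    }

    public static void main(String[] args) {
        int parallelism = 4;
        List<String> partitionKeys = List.of("beijing", "shanghai", "beijing", "shenzhen", "beijing");
        Map<Integer, Integer> rowsPerWriter = new TreeMap<>();
        for (String key : partitionKeys) {
            rowsPerWriter.merge(writerFor(key, parallelism), 1, Integer::sum);
        }
        // All "beijing" rows route to one writer, regardless of how many
        // upstream tasks produced them, so they end up in shared files.
        System.out.println(rowsPerWriter);
    }
}
```

Without this shuffle, every upstream task may open a file in every partition it sees, which is exactly the small-file explosion the merge step has to clean up afterwards.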

Sorting demo:

INSERT INTO Iceberg_table SELECT days, province_id FROM Kafka_tbl ORDER BY days, province_id;


In summary, adopting Iceberg 0.11 on top of Flink resolves Kafka data loss, reduces Hive metadata pressure, enables real‑time ingestion with small‑file handling and sorting, and provides a more scalable and efficient data lake solution for Qunar's streaming workloads.

Tags: real-time processing, Flink, SQL, Iceberg
Written by: Big Data Technology Architecture (Exploring Open Source Big Data and AI Technologies)