Big Data 15 min read

Achieving Exactly-Once Writes from Flink to ClickHouse: Architecture and Performance

This article explains how Flink and ClickHouse can be combined to build a real-time data warehouse with end-to-end Exactly-Once guarantees, detailing the underlying write mechanisms, transaction state machine, connector implementation, and performance test results, while also outlining future enhancements for distributed transactions.

Alibaba Cloud Developer

Nov 22, 2021

Achieving Exactly-Once Writes from Flink to ClickHouse: Architecture and Performance

Background

Flink and ClickHouse are leaders in real-time stream processing and OLAP respectively. Many internet, advertising, and gaming customers combine them to build user profiling, real-time BI reports, monitoring, etc., forming a real-time data warehouse that requires end-to-end Exactly-Once guarantees.

Mechanism Overview

1 ClickHouse Write Mechanism

ClickHouse is an MPP columnar OLAP system. Data is stored in the smallest unit called a data part. Incoming data blocks are split by partition into one or more data parts, which are later merged into larger parts to reduce storage and read overhead.

When writing to a local table, ClickHouse first creates a temporary data part that is invisible to clients, then renames it to a permanent data part, making the data visible. Unrenamed temporary parts are eventually cleaned up.

This temporary-to-permanent workflow can be adapted to a two-phase commit protocol to provide transactional consistency.

2 Flink Write Mechanism

Flink provides a transactional sink that guarantees Exactly-Once writes using XA-compliant JDBC. The checkpoint mechanism periodically snapshots operator state. A coordinator triggers checkpoints and notifies operators when all snapshots complete.

Operator execution phases: initial, writeData, snapshot, commit, close.

initial phase

Retrieve the last persisted xid from the snapshot.

Rollback unfinished snapshots and retry commits for completed snapshots.

If any operation fails, the task aborts and enters the close phase.

Create a new unique xid, record it in the snapshot, and call the JDBC start() interface.

writeData phase

Operator writes data in batches using preparedStatement addBatch() and executeBatch(), attaching the current xid.

Data is flushed to ClickHouse when batch size is reached, a periodic timer triggers, or before snapshot.

snapshot phase

Call end() and prepare() to wait for commit and update snapshot state.

Start a new transaction for the next xid and persist the snapshot.

complete phase

After all operators finish their snapshots, the coordinator triggers commit() on each operator’s JDBC connection.

close phase

If the transaction has not reached the snapshot stage, rollback.

Release all resources.

The checkpoint and transaction mechanisms ensure that each batch is fully written before commit; failures cause rollback to the last successful checkpoint.

Technical Solution

Overall Design

The transaction flow from Flink to ClickHouse is illustrated in a sequence diagram (Fig‑3). Because writes target ClickHouse local tables, only a subset of two-phase commit interfaces need to be implemented on the ClickHouse side.

ClickHouse Server

State Machine

Begin – start a transaction.

Write Data – write within a transaction.

Commit – commit the transaction.

Rollback – rollback an uncommitted transaction.

Transaction states: Unknown, Initialized, Committing, Committed, Aborting, Aborted. All operations are idempotent.

Transaction Processing

Clients use HTTP RESTful API to interact with ClickHouse. A typical transaction involves Begin, Write (temporary data part), and Commit, with the server converting temporary parts to permanent ones upon commit. The current implementation supports only single-node transactions; distributed transactions are not yet supported.

ClickHouse-JDBC

An XAResource implementation is provided via an adapter pattern, exposing only the necessary XA interfaces to Flink while hiding unsupported ones. XADataSource is also adapted, with retry parameters tuned for Exactly-Once.

Flink-Connector-ClickHouse

The connector adds XADataSource configuration options to enable Exactly-Once semantics, and supports load-balancing by configuring multiple IPs and a random write mode.

Test Results

1 ClickHouse Transaction Performance

Throughput scales with client concurrency. Enabling transactions improves write performance because temporary data parts are not immediately merged, reducing disk I/O. However, more batches increase CPU pressure during merges, slightly degrading performance.

2 Flink Writing ClickHouse Performance

Checkpoint interval has little impact when Exactly-Once is disabled. With Exactly-Once enabled, shorter intervals increase transaction overhead, while longer intervals delay the final commit, resulting in a U-shaped performance curve.

Future Plans

The current EMR ClickHouse transaction support is limited to single-node transactions. Future work includes designing a ClickHouse MetaServer to enable distributed transactions and remove the dependency on ZooKeeper.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Flink transaction Performance Testing ClickHouse

Written by

Alibaba Cloud Developer

Alibaba's official tech channel, featuring all of its technology innovations.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.