Big Data 9 min read

Two-Phase Commit (2PC) in Flink: Mechanism, Implementation, and Kafka Integration

This article explains the fundamentals of the two‑phase commit protocol, details its two stages (prepare and commit), discusses its advantages and drawbacks, and shows how Apache Flink implements 2PC for exactly‑once semantics with Kafka using the TwoPhaseCommitSinkFunction and related code examples.

Big Data Technology & Architecture
Big Data Technology & Architecture
Big Data Technology & Architecture
Two-Phase Commit (2PC) in Flink: Mechanism, Implementation, and Kafka Integration

Two‑phase commit (2PC) is a basic distributed consistency protocol widely used in systems that require atomic transactions across multiple nodes. The article first introduces the coordinator and participant roles, then describes the two stages of 2PC: the prepare (voting) phase and the commit (execution) phase.

Prepare (Voting) Phase

Coordinator sends a prepare request with transaction details to all participants.

Each participant executes the operations, writes undo and redo logs, but does not commit.

Participants reply with yes if successful, otherwise no.

Commit (Execution) Phase

If all participants return yes, the coordinator sends a commit request, participants finalize the transaction, release resources, and acknowledge. If any participant returns no or times out, the coordinator sends a rollback request, and participants revert using the undo log.

The article also lists 2PC’s pros (simple principle, easy to implement) and cons (single‑point coordinator, synchronous blocking, residual inconsistency risk).

Flink’s Transactional Write Using 2PC

Flink leverages 2PC to provide exactly‑once guarantees for sinks. The abstract class TwoPhaseCommitSinkFunction defines four abstract methods that concrete sinks must implement:

protected abstract TXN beginTransaction() throws Exception;
protected abstract void preCommit(TXN transaction) throws Exception;
protected abstract void commit(TXN transaction);
protected abstract void abort(TXN transaction);

Flink’s FlinkKafkaProducer011 implements these methods to interact with Kafka’s transactional API (Kafka 0.11+). The beginTransaction() method creates a transactional producer, preCommit() flushes pending records, commit() calls producer.commitTransaction(), and abort() calls producer.abortTransaction().

During a checkpoint, Flink’s snapshotState() method acts as the pre‑commit step, storing the transaction handle. When the checkpoint completes, notifyCheckpointComplete() iterates over pending transactions, verifies checkpoint IDs, and invokes commit(). If any commit fails, the first exception is re‑thrown; otherwise, the transaction is considered successfully committed.

Images in the original article illustrate the successful and failed commit flows, as well as the integration diagram between Flink and Kafka.

Overall, the article provides a comprehensive walkthrough of 2PC theory, its practical trade‑offs, and a concrete implementation within Flink for reliable, exactly‑once streaming writes.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

Distributed SystemsFlinkStreamingKafkatwo-phase commit
Big Data Technology & Architecture
Written by

Big Data Technology & Architecture

Wang Zhiwu, a big data expert, dedicated to sharing big data technology.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.