
Understanding Flink Checkpoint and Unaligned Checkpoint Mechanisms

This article explains Flink's fundamental checkpoint mechanism, its coupling with backpressure, and how the introduction of Unaligned Checkpoint in Flink 1.11 decouples checkpointing from backpressure to improve latency and resource utilization in high‑backpressure streaming jobs.

Big Data Technology & Architecture

Apr 19, 2022


As the most basic and critical fault‑tolerance mechanism in Flink, the checkpoint snapshot mechanism ensures data accuracy when a Flink application recovers from an abnormal state.

Checkpoint‑related metrics are also among the most useful indicators for diagnosing the health of a Flink job: checkpoints that complete successfully and quickly usually mean the job is running without anomalies or significant backpressure.

However, because checkpointing is coupled with backpressure, backpressure can adversely affect checkpoints, leading to various checkpoint problems.

To address this, Flink 1.11 introduced Unaligned Checkpoint to decouple the checkpoint mechanism from backpressure and improve checkpoint performance under high backpressure.

Current Checkpoint Mechanism Overview

Many readers are already familiar with Flink checkpoints based on the Chandy‑Lamport distributed snapshot algorithm; this section briefly reviews the algorithm’s basic logic.

The Chandy‑Lamport algorithm abstracts a distributed system as a DAG (ignoring cycles), where nodes represent processes and edges represent communication channels. The goal of a distributed snapshot is to record the entire system state, which includes both process states and channel states (in‑flight data). By dividing the input message stream into short sub‑sequences, each node or channel can take a local snapshot at sub‑sequence boundaries, and the collection of these local snapshots forms a consistent global snapshot.

In Flink, a special element called a Barrier is periodically injected into the data stream at the source, dividing the continuous stream into finite sequences that correspond to checkpoint intervals. When a Barrier is received, operators perform a local checkpoint, asynchronously upload the snapshot, and broadcast the Barrier downstream. When all Barriers for a checkpoint reach the end of the DAG and all operators have completed their snapshots, the global snapshot is considered successful.
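As a rough illustration of how Barriers partition a stream, the sketch below injects a numbered Barrier after every N records. This is a simplification: Flink injects Barriers at the sources on a periodic, time‑based trigger rather than by record count, and the class and method names here are hypothetical, not Flink APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: a source that injects numbered barriers into a record stream. */
final class BarrierInjectionSketch {

    /**
     * Copies the records to the output and appends "BARRIER(n)" after every
     * recordsPerCheckpoint records. Count-based injection is an assumption made
     * to keep the example deterministic.
     */
    static List<String> inject(List<String> records, int recordsPerCheckpoint) {
        List<String> out = new ArrayList<>();
        long checkpointId = 1;
        for (int i = 0; i < records.size(); i++) {
            out.add(records.get(i));
            if ((i + 1) % recordsPerCheckpoint == 0) {
                out.add("BARRIER(" + checkpointId + ")");
                checkpointId++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [a, b, BARRIER(1), c, d, BARRIER(2), e]
        System.out.println(inject(Arrays.asList("a", "b", "c", "d", "e"), 2));
    }
}
```

Everything before BARRIER(1) belongs to checkpoint 1, everything between BARRIER(1) and BARRIER(2) to checkpoint 2, and so on.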

When an operator has multiple input channels, it waits for the Barrier from every channel before starting its local snapshot (Barrier alignment). During alignment the operator keeps processing data from channels whose Barrier has not yet arrived, while data arriving on channels that have already delivered their Barrier is held in buffers. Once those buffers fill, the channel blocks and backpressure propagates upstream.
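Barrier alignment can be sketched with a toy operator that sums integers from several input channels. This is a minimal model under stated assumptions, not Flink's implementation: channels are in‑memory queues polled round‑robin, a sentinel value stands in for the Barrier, and all names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Illustrative only: a summing operator that aligns barriers across its inputs. */
final class AlignedBarrierSketch {
    static final int BARRIER = Integer.MIN_VALUE; // sentinel standing in for a barrier

    long sum;           // operator state
    long snapshot = -1; // state recorded once all barriers have arrived
    int blockedSteps;   // polls skipped because a channel was blocked with data waiting

    /** Builds one input channel from a sequence of events. */
    static Deque<Integer> channel(Integer... events) {
        return new ArrayDeque<>(Arrays.asList(events));
    }

    void run(List<Deque<Integer>> channels) {
        boolean[] blocked = new boolean[channels.size()];
        int barriersSeen = 0;
        boolean progressed = true;
        while (progressed) {
            progressed = false;
            for (int i = 0; i < channels.size(); i++) {
                Deque<Integer> ch = channels.get(i);
                if (ch.isEmpty()) continue;
                if (blocked[i]) { blockedSteps++; continue; } // data waits behind the barrier
                int event = ch.poll();
                progressed = true;
                if (event == BARRIER) {
                    blocked[i] = true;
                    if (++barriersSeen == channels.size()) {
                        snapshot = sum;              // snapshot on the LAST barrier
                        Arrays.fill(blocked, false); // alignment done, unblock everything
                        barriersSeen = 0;
                    }
                } else {
                    sum += event;
                }
            }
        }
    }

    public static void main(String[] args) {
        AlignedBarrierSketch op = new AlignedBarrierSketch();
        op.run(Arrays.asList(channel(1, BARRIER, 2, 3), channel(10, 20, 30, BARRIER, 40)));
        // prints snapshot=61 sum=106 blockedSteps=2
        System.out.println("snapshot=" + op.snapshot + " sum=" + op.sum + " blockedSteps=" + op.blockedSteps);
    }
}
```

The snapshot (61) contains exactly the pre‑Barrier records of both channels (1 and 10 + 20 + 30), but the first channel sat blocked for two polls while the second channel's Barrier was still on its way.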

The algorithm’s advantage is that, combined with Copy‑On‑Write state snapshots, it avoids a "Stop‑The‑World" pause. Because alignment leaves no in‑flight data to record, only operator state needs to be persisted, which keeps snapshots small.

Coupling of Checkpoint and Backpressure

In most cases the current checkpoint algorithm works well, but when a job experiences backpressure, the blocking Barrier alignment can exacerbate backpressure and destabilize the job.

First, a Chandy‑Lamport snapshot completes only as fast as its markers flow. Barriers travel in‑band with the data, so backpressure slows them down, which lengthens checkpoint duration or causes timeouts. The latest completed checkpoint then lags well behind the data the job has actually processed, so that progress is not persisted and must be redone if the job restarts.

Second, Barrier alignment itself can become a source of backpressure, affecting upstream operators unnecessarily, especially when multiple sources share common downstream operators.

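For example, suppose a job computes metrics for two business lines, A and B, and also computes a weekly metric in a window aggregate shared by both. If traffic in B spikes, B's path becomes backpressured and its Barrier arrives late. The shared window aggregate, having already received A's Barrier, blocks A's channel while it waits for B's, so A's processing stalls even though A itself is healthy.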
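The cost of this coupling can be estimated with simple arithmetic: a channel stays blocked from the moment its own Barrier arrives until the slowest Barrier arrives. The sketch below uses made‑up arrival times and hypothetical names, and it ignores buffering (in practice a blocked channel only stalls its upstream once the buffers are full).

```java
import java.util.Arrays;

/** Illustrative only: how long each input channel is blocked during alignment. */
final class AlignmentStallSketch {

    /** Blocked time per channel = latest barrier arrival minus that channel's own arrival. */
    static long[] blockedMillis(long[] barrierArrivalMillis) {
        long last = Arrays.stream(barrierArrivalMillis).max().orElse(0L);
        return Arrays.stream(barrierArrivalMillis).map(t -> last - t).toArray();
    }

    public static void main(String[] args) {
        // Hypothetical numbers: A's barrier arrives after 1 s, B's after 31 s.
        long[] blocked = blockedMillis(new long[] {1_000L, 31_000L});
        // prints [30000, 0]
        System.out.println(Arrays.toString(blocked));
    }
}
```

With these numbers, A's channel is blocked for 30 seconds purely because of B's delay.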

Although splitting the job can mitigate this, it adds development and maintenance overhead.

Unaligned Checkpoint

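To solve the above issue, Flink 1.11 introduced Unaligned Checkpoint. Understanding its principle requires revisiting the marker‑receiving rule in the original Chandy‑Lamport paper. When a process q receives a marker along a channel c: if q has not recorded its state, q records its state and records the state of c as the empty sequence; otherwise, q records the state of c as the sequence of messages received along c after q's state was recorded and before q received the marker along c.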

In Aligned Checkpoint, the condition "if q has not recorded its state" is effectively always true: blocking each channel after its Barrier defers the local snapshot until all Barriers have arrived, so no channel ever carries messages that must be recorded. This avoids snapshotting the operator’s input queues, but at the cost of unpredictable checkpoint duration and reduced throughput.

Unaligned Checkpoint changes the trigger to the first Barrier and removes the blocking of channels, aligning more closely with the original Chandy‑Lamport algorithm while adding Flink‑specific improvements.

Key differences:

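When the local snapshot is triggered: Unaligned Checkpoint triggers it at the first Barrier, Aligned Checkpoint at the last.

Whether channels that have already delivered a Barrier are blocked: Aligned Checkpoint blocks them until alignment completes, Unaligned Checkpoint does not.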

Consequently, Unaligned Checkpoint records the state as soon as the first Barrier arrives and allows channels to continue processing, reducing overall checkpoint latency.
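The unaligned behaviour can be sketched with a toy summing operator that follows the Chandy‑Lamport marker rule for a single checkpoint: it snapshots its state at the first Barrier, never blocks a channel, and records as channel state the records that arrive on other channels before their own Barrier. This is a simplified model with hypothetical names; it leaves out details of Flink's actual implementation such as output buffers and Barriers overtaking queued data.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Illustrative only: a summing operator taking one unaligned snapshot. */
final class UnalignedBarrierSketch {
    static final int BARRIER = Integer.MIN_VALUE; // sentinel standing in for a barrier

    long sum;                                             // operator state
    long stateSnapshot = -1;                              // state recorded at the FIRST barrier
    final List<Integer> channelState = new ArrayList<>(); // in-flight records persisted with it

    static Deque<Integer> channel(Integer... events) {
        return new ArrayDeque<>(Arrays.asList(events));
    }

    void run(List<Deque<Integer>> channels) {
        boolean[] barrierSeen = new boolean[channels.size()];
        boolean snapshotTaken = false;
        boolean progressed = true;
        while (progressed) {
            progressed = false;
            for (int i = 0; i < channels.size(); i++) {
                Integer event = channels.get(i).poll(); // no channel is ever blocked
                if (event == null) continue;
                progressed = true;
                if (event == BARRIER) {
                    if (!snapshotTaken) {
                        stateSnapshot = sum; // q has not recorded its state: record it now
                        snapshotTaken = true;
                    }
                    barrierSeen[i] = true;   // this channel's recorded state is now closed
                } else {
                    if (snapshotTaken && !barrierSeen[i]) {
                        channelState.add(event); // arrived after snapshot, before own barrier
                    }
                    sum += event;
                }
            }
        }
    }

    /** State after restoring the snapshot and replaying the recorded in-flight records. */
    long restoredSum() {
        long restored = stateSnapshot;
        for (int v : channelState) restored += v;
        return restored;
    }

    public static void main(String[] args) {
        UnalignedBarrierSketch op = new UnalignedBarrierSketch();
        op.run(Arrays.asList(channel(1, BARRIER, 2, 3), channel(10, 20, 30, BARRIER, 40)));
        // prints stateSnapshot=11 channelState=[20, 30] restoredSum=61
        System.out.println("stateSnapshot=" + op.stateSnapshot
                + " channelState=" + op.channelState + " restoredSum=" + op.restoredSum());
    }
}
```

The operator state is captured earlier (11), and the two in‑flight records (20 and 30) are stored alongside it. Replaying them on recovery yields 61, which is the sum of all pre‑Barrier records, so consistency is preserved without blocking any channel. The price is the extra channel state that has to be persisted.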

However, this approach has known drawbacks:

Checkpoint size grows because the in‑flight data buffered in channels must be persisted along with operator state, increasing disk load.

Larger state size can lengthen job recovery time and raise operational difficulty.

Unaligned Checkpoint is best suited for complex jobs with high backpressure; simpler ETL jobs may still prefer Aligned Checkpoint.
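For jobs that do fit this profile, enabling the feature is a small configuration change. The fragment below is a sketch that assumes Flink 1.11 or later on the classpath; the interval and timeout values and the class name are placeholders, not recommendations.

```java
import org.apache.flink.streaming.api.environment.CheckpointConfig;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class UnalignedCheckpointConfigSketch {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Trigger a checkpoint every 60 s (placeholder value).
        env.enableCheckpointing(60_000L);

        CheckpointConfig config = env.getCheckpointConfig();
        // Give up on a checkpoint that has not completed after 10 min (placeholder value).
        config.setCheckpointTimeout(10 * 60_000L);
        // Switch from aligned to unaligned checkpoints.
        config.enableUnalignedCheckpoints();

        // Define sources, transformations, and sinks here, then call env.execute(...).
    }
}
```

Leaving out the `enableUnalignedCheckpoints()` call keeps the default aligned behaviour, which may remain the better choice for simple ETL jobs.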

Summary

Flink 1.11's Unaligned Checkpoint primarily solves the difficulty of completing checkpoints under high backpressure by sacrificing disk resources to avoid blocking, thereby improving resource utilization. As stream processing becomes more prevalent, Unaligned Checkpoint may eventually become the default checkpoint strategy in Flink.


Written by

Big Data Technology & Architecture

Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
