Understanding Flink's Asynchronous Barrier Snapshotting (ABS) Checkpoint Algorithm
This article explains the Asynchronous Barrier Snapshotting algorithm used by Apache Flink for checkpointing, detailing its origins from the Chandy‑Lamport algorithm, its operation in both acyclic and cyclic dataflow graphs, barrier alignment, and the fault‑recovery process.
Introduction
The article revisits the Chandy‑Lamport distributed snapshot algorithm and introduces the Asynchronous Barrier Snapshotting (ABS) algorithm that Flink uses for checkpointing, highlighting its inspiration from Chandy‑Lamport and the optimizations it adds.
Asynchronous Barrier Snapshotting
ABS originates from the paper "Lightweight Asynchronous Snapshots for Distributed Dataflows" and provides two main contributions: (1) an asynchronous snapshot algorithm that achieves minimal state snapshots on acyclic graphs, and (2) an implementation for cyclic execution graphs.
Stream Processing System Model
Flink jobs are represented as a directed acyclic graph (DAG) where nodes are operators (Source, Transformation, Sink) and edges are data streams, mirroring the process‑channel model of Chandy‑Lamport.
ABS Execution in Acyclic Graphs
Assumptions: channels are reliable, FIFO, can be blocked/unblocked; barrier messages are injected at sources and broadcast downstream; barriers are written to a special Nil input channel.
The central coordinator periodically injects a Barrier into all Source operators; each Source snapshots its state and broadcasts the Barrier downstream.
Downstream operators block the input channel that received the Barrier, wait for Barriers on all other inputs, then snapshot their state, broadcast the Barrier further, and finally unblock the channels.
When all Sink operators have completed their snapshots, the global checkpoint is finished.
Key differences from Chandy‑Lamport: ABS does not record channel state because a node’s snapshot only occurs after all its input and output data for the current checkpoint have been processed, reducing snapshot size and latency.
Barrier alignment ensures exactly‑once semantics by guaranteeing that only data belonging to the current checkpoint interval is processed; disabling alignment yields at‑least‑once semantics but may cause duplicate processing after recovery.
The "asynchronous" aspect means that after a Barrier is injected, operators continue processing downstream data while the snapshot is being persisted and acknowledgments are sent.
ABS Execution in Cyclic Graphs
When the execution graph contains back‑edges, the algorithm proceeds as follows:
After receiving Barriers on all non‑back‑edge inputs, a node locally saves its state.
It then records data arriving from back‑edges until a Barrier is received on those edges.
The final snapshot consists of the saved state plus the recorded back‑edge data.
Failure Recovery
During recovery, each task restores its state from the persisted snapshot, replays any logged records, and resumes processing from its input channels.
Conclusion
The ABS algorithm enables Flink to provide exactly‑once processing guarantees with efficient, low‑overhead checkpoints, handling both acyclic and cyclic dataflows.
政采云技术
ZCY Technology Team (Zero), based in Hangzhou, is a growth-oriented team passionate about technology and craftsmanship. With around 500 members, we are building comprehensive engineering, project management, and talent development systems. We are committed to innovation and creating a cloud service ecosystem for government and enterprise procurement. We look forward to your joining us.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.