Understanding the Chandy‑Lamport Distributed Snapshot Algorithm for Fault‑Tolerant Stream Processing
This article explains the Chandy‑Lamport distributed snapshot algorithm, its role in creating consistent global states for fault‑tolerant stream processing systems such as Spark Structured Streaming and Flink, detailing the algorithm’s workflow, examples, and its relevance to modern data‑flow architectures.
The Chandy‑Lamport algorithm provides a method to obtain a consistent snapshot of the global state of a distributed system, which is essential when there is no global clock or shared memory.
The algorithm was introduced in 1985 by K. Chandy and L. Lamport in the paper “Distributed Snapshots: Determining Global States of a Distributed System,” building on Lamport’s earlier work on time, clocks, and event ordering.
In a distributed system modeled as a set of processes connected by FIFO channels, the global snapshot consists of each process’s local state together with the contents of all channels; recording this state enables failure recovery and other fault‑tolerance mechanisms.
The algorithm proceeds in three phases: (1) Initiating a snapshot – any process records its local state and sends a special marker on all outgoing channels; (2) Propagating the snapshot – when a process receives a marker for the first time it records its state, empties the incoming channel, and forwards the marker, while later markers cause the process to record any messages that arrived before the marker; (3) Terminating the snapshot – the snapshot completes when every process has recorded its state and the state of its channels.
An illustrative example with two processes (P1 and P2) shows how markers and ordinary messages are interleaved, how each process records the appropriate messages before and after the marker, and how the combined local snapshots form a consistent global snapshot (diagrams omitted).
The Chandy‑Lamport algorithm is simple yet powerful for fault‑tolerant distributed computations, assuming reliable, ordered channels. Spark Structured Streaming employs a variant of this algorithm for failover, while Flink uses a barrier‑based approach described in the 2015 paper “Lightweight asynchronous snapshots for distributed dataflows,” which provides exactly‑once semantics and is more suitable for engineering implementations.
References:
Chandy K M, Lamport L. Distributed snapshots: Determining global states of distributed systems. ACM Transactions on Computer Systems, 1985.
Carbone P, Fóra G, Ewen S, et al. Lightweight asynchronous snapshots for distributed dataflows. arXiv preprint arXiv:1506.08603, 2015.
Lamport, “Time, Clocks and the Ordering of Events in a Distributed System.”
Leslie Lamport Homepage.
Various lecture notes and PDFs on distributed snapshots.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Big Data Technology & Architecture
Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
