Understanding Flink Restart Strategies: Configuration and Code Examples

This article explains Flink's restart strategies (fixed-delay, failure-rate, and no-restart), shows how to configure them globally in flink-conf.yaml or per job in code, and gives Java examples of each approach.

Big Data Technology & Architecture

Overview: Flink supports several restart strategies that determine whether and how a job is restarted after a failure. The cluster-wide default is defined in flink-conf.yaml and applies to every job that does not set its own strategy.

Restart Strategy Types:

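Fixed-delay: restarts the job up to a fixed number of times, waiting a constant delay between attempts; once the attempts are exhausted, the job fails.

Failure-rate: restarts the job after each failure, with a delay between attempts, as long as the number of failures within a time interval stays within a configured maximum; once the failure rate exceeds that limit, the job is considered failed.

No-restart: disables automatic restarts, so the job fails on the first failure.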
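The difference between the fixed-delay and failure-rate strategies can be sketched in plain Java. This is a simplified model of the decisions described above, not Flink's implementation: the class and method names (RestartDecisionSketch, FixedDelay, FailureRate, onFailure) are hypothetical, timestamps are passed in as milliseconds, and the exact boundary handling of the time window is an assumption.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Simplified model of two restart decisions; names are hypothetical, not Flink classes. */
final class RestartDecisionSketch {

    /** Fixed-delay: a restart is allowed while the attempt budget is not used up. */
    static final class FixedDelay {
        private final int maxAttempts;
        private int attemptsUsed = 0;

        FixedDelay(int maxAttempts) {
            this.maxAttempts = maxAttempts;
        }

        /** Records one failure and reports whether the job may be restarted. */
        boolean onFailure() {
            if (attemptsUsed >= maxAttempts) {
                return false; // budget exhausted: the job fails for good
            }
            attemptsUsed++;
            return true;
        }
    }

    /** Failure-rate: a restart is allowed while failures in the sliding window stay within the limit. */
    static final class FailureRate {
        private final int maxFailuresPerInterval;
        private final long intervalMillis;
        private final Deque<Long> failureTimes = new ArrayDeque<>();

        FailureRate(int maxFailuresPerInterval, long intervalMillis) {
            this.maxFailuresPerInterval = maxFailuresPerInterval;
            this.intervalMillis = intervalMillis;
        }

        /** Records a failure at nowMillis and reports whether the job may be restarted. */
        boolean onFailure(long nowMillis) {
            // Forget failures that have fallen out of the window (assumed boundary rule).
            while (!failureTimes.isEmpty() && nowMillis - failureTimes.peekFirst() > intervalMillis) {
                failureTimes.pollFirst();
            }
            failureTimes.addLast(nowMillis);
            return failureTimes.size() <= maxFailuresPerInterval;
        }
    }
}
```

With a limit of 3, both sketches allow three restarts. The fixed-delay one then refuses on the fourth failure no matter when it happens, while the failure-rate one refuses only if the fourth failure falls inside the same window as the previous three.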

If no restart strategy is configured, the default depends on checkpointing: with checkpointing disabled, Flink uses the no-restart strategy; with checkpointing enabled, it applies the fixed-delay strategy with Integer.MAX_VALUE restart attempts.
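The fallback rule above, combined with the precedence of a job-level setting over the flink-conf.yaml default, can be sketched as a small lookup. The class and method names here are hypothetical and the strategy names are plain strings for illustration; this only mirrors the rules stated in this article and is not how Flink resolves the strategy internally.

```java
/** Hypothetical helper that mirrors the precedence rules described in the text. */
final class RestartStrategyResolutionSketch {

    /**
     * @param jobLevel     strategy set in code for this job, or null if none
     * @param clusterLevel strategy set in flink-conf.yaml, or null if none
     * @param checkpointingEnabled whether the job has checkpointing enabled
     * @return the name of the strategy that would apply
     */
    static String resolve(String jobLevel, String clusterLevel, boolean checkpointingEnabled) {
        if (jobLevel != null) {
            return jobLevel; // a job-specific strategy wins
        }
        if (clusterLevel != null) {
            return clusterLevel; // otherwise the configured default applies
        }
        // Nothing configured anywhere: fall back based on checkpointing.
        return checkpointingEnabled ? "fixed-delay" : "none";
    }
}
```

For example, resolve(null, null, true) yields "fixed-delay", while resolve(null, null, false) yields "none".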

Configuration via flink-conf.yaml:

restart-strategy: fixed-delay
restart-strategy.fixed-delay.attempts: 3
restart-strategy.fixed-delay.delay: 10 s

or for failure‑rate:

restart-strategy: failure-rate
restart-strategy.failure-rate.max-failures-per-interval: 3
restart-strategy.failure-rate.failure-rate-interval: 5 min
restart-strategy.failure-rate.delay: 10 s

or to disable restarts:

restart-strategy: none

Programmatic Configuration (Java):

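// Fixed-delay: 3 attempts, 10 seconds between attempts
env.setRestartStrategy(RestartStrategies.fixedDelayRestart(3, Time.seconds(10)));

// Failure-rate: at most 3 failures per 5 minutes, 10 seconds between attempts
env.setRestartStrategy(RestartStrategies.failureRateRestart(
    3, Time.of(5, TimeUnit.MINUTES), Time.of(10, TimeUnit.SECONDS)));

// No restart
env.setRestartStrategy(RestartStrategies.noRestart());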

Full Example:

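import java.util.concurrent.TimeUnit;

import org.apache.flink.api.common.restartstrategy.RestartStrategies;
import org.apache.flink.api.common.time.Time;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RestartTest {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Enable checkpointing every 1000 ms
        env.enableCheckpointing(1000);
        // Each setRestartStrategy call replaces the previous one, so only the last
        // call below takes effect. In a real job, keep exactly one of them.
        // Fixed-delay: 3 attempts, 10 seconds between attempts
        env.setRestartStrategy(RestartStrategies.fixedDelayRestart(3, Time.seconds(10)));
        // Failure-rate: at most 3 failures per 5 minutes, 10 seconds between attempts
        env.setRestartStrategy(RestartStrategies.failureRateRestart(
            3, Time.of(5, TimeUnit.MINUTES), Time.of(10, TimeUnit.SECONDS)));
        // No restart
        env.setRestartStrategy(RestartStrategies.noRestart());
        // Define the job's sources and operators here, then call env.execute(...).
    }
}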

The article also provides links to a series of Flink tutorials and points readers to the original GitHub repository for the full source.

Written by

Big Data Technology & Architecture

Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
