
How to Install Linux Watchdog and Integrate It with Spring Boot for Automatic Recovery

Learn why a Linux watchdog is essential for Spring Boot services, how to install and configure the watchdog daemon on various Linux distributions, load the appropriate kernel driver, set up systemd, and integrate a Java-based feeding task to ensure automatic recovery from hangs or crashes.

Java Tech Enthusiast

1. Why Is a Watchdog Needed?

Many Spring Boot applications encounter issues in production where they appear alive but are actually unusable.

Appears healthy but unusable: /actuator/health still returns 200 while thread pools or GC are completely stuck.

Monitoring delay: Prometheus only catches the anomaly after several minutes.

High manual intervention cost: recovery requires someone to SSH into the host and run kill -9 or restart the service by hand.

Solution Idea

Application-level self‑check: Actuator, thread‑pool monitoring, self‑healing scripts.

System-level fallback: Linux Watchdog.

This article focuses on how to install and configure the Linux watchdog and integrate it with Spring Boot.

2. Introduction to Linux Watchdog

Linux Watchdog is essentially a kernel driver that exposes the device /dev/watchdog to user space.

Requires a user‑space process to periodically "feed" the watchdog by writing data.

If not fed within the configured timeout, the watchdog triggers actions such as system reboot, power‑off, or NMI.

Acts as a final fallback that recovers the machine when an application hang or deadlock cannot be resolved any other way.
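The feed-or-expire contract described above can be illustrated with a toy in-memory model. This is only a sketch of the idea, not a kernel interface: the class name WatchdogTimerModel and its methods are made up for illustration, and time is passed in explicitly so the behavior is easy to follow.

```java
import java.time.Duration;

/** Toy model of the feed/timeout contract; it does not talk to /dev/watchdog. */
final class WatchdogTimerModel {
    private final long timeoutNanos;
    private long lastFedNanos;

    WatchdogTimerModel(Duration timeout, long nowNanos) {
        this.timeoutNanos = timeout.toNanos();
        this.lastFedNanos = nowNanos;
    }

    /** Feeding restarts the countdown from the given instant. */
    void feed(long nowNanos) {
        lastFedNanos = nowNanos;
    }

    /** True once a full timeout has elapsed without a feed; a real watchdog would act here. */
    boolean hasExpired(long nowNanos) {
        return nowNanos - lastFedNanos >= timeoutNanos;
    }
}
```

With a 60-second timeout, a feed at second 59 moves the expiry to second 119; a process that stops feeding is caught one timeout later at the latest.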

3. Installing and Configuring Watchdog

3.1 Install the watchdog service

On common Linux distributions, watchdog is not installed by default and must be installed manually.

Debian/Ubuntu:

sudo apt-get update
sudo apt-get install watchdog

CentOS/RHEL/UOS:

sudo yum install watchdog

After installation you will have:

Configuration file: /etc/watchdog.conf

systemd service: watchdog.service

3.2 Load the kernel driver

Common watchdog drivers include:

Intel platform: iTCO_wdt

AMD platform: sp5100_tco

Virtualized environments (QEMU/KVM): softdog

Check support:

ls /lib/modules/$(uname -r)/kernel/drivers/watchdog/

Load a driver, for example:

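sudo modprobe iTCO_wdt    # Intel chip

# Software watchdog if hardware is absent. soft_panic=1 turns a timeout into a
# kernel panic; the machine then reboots only if the kernel.panic sysctl is > 0.
sudo modprobe softdog soft_panic=1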

After loading, you should see the device file:

ls /dev/watchdog*
# Output example:
# /dev/watchdog  /dev/watchdog0
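The same check can be done from Java before the application tries to feed anything. The sketch below is a minimal example under assumptions: the class name WatchdogDeviceCheck and the status strings are invented for illustration, and "ready" only means the node exists and is writable, not that no other process currently holds it open.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class WatchdogDeviceCheck {
    /** Returns a short status for the given device path. */
    static String describe(Path device) {
        if (!Files.exists(device)) {
            return "missing";      // no driver loaded, or a different device node
        }
        if (!Files.isWritable(device)) {
            return "not writable"; // typically the process is not running as root
        }
        return "ready";
    }

    public static void main(String[] args) {
        System.out.println("/dev/watchdog: " + describe(Paths.get("/dev/watchdog")));
    }
}
```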

3.3 Configure the watchdog service

Edit /etc/watchdog.conf and set core options:

watchdog-device = /dev/watchdog
interval = 5
max-load-1 = 24
min-memory = 1
realtime = yes
priority = 1

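Explanation:

watchdog-device – selects which watchdog device to use.

interval – interval in seconds between two writes (feeds) to the device; it must be shorter than the device timeout.

max-load-1 – triggers a reboot when the 1‑minute load average exceeds this value.

min-memory – triggers a reboot when free memory falls below this threshold, measured in pages.

realtime – when set to yes, locks the daemon into memory so it cannot be swapped out.

priority – scheduling priority used in realtime mode.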
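To make the key = value format concrete, here is a small Java sketch that reads such lines and checks that the feeding interval is shorter than the device timeout. It is an illustration only: WatchdogConfSketch is a hypothetical helper, it ignores options that may legitimately repeat, and the fallback of 1 second for a missing interval is an assumption about the daemon's default.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class WatchdogConfSketch {
    /** Parses "key = value" lines, skipping blank lines and '#' comments. */
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> options = new LinkedHashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            int eq = line.indexOf('=');
            if (eq < 0) {
                continue; // not an option line
            }
            options.put(line.substring(0, eq).trim(), line.substring(eq + 1).trim());
        }
        return options;
    }

    /** The daemon has to write more often than the device times out. */
    static boolean intervalIsSafe(Map<String, String> options, int deviceTimeoutSeconds) {
        // Assumption: fall back to 1 second when "interval" is not set.
        int interval = Integer.parseInt(options.getOrDefault("interval", "1"));
        return interval < deviceTimeoutSeconds;
    }
}
```

For the configuration shown above, interval = 5 is safe against a 60-second timeout but would not be against a 5-second one.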

3.4 Start and verify the service

sudo systemctl enable watchdog
sudo systemctl start watchdog
sudo systemctl status watchdog

Verify that the watchdog is running:

dmesg | grep -i -E 'watchdog|softdog|wdt'
# or
journalctl -u watchdog

4. Integrating Watchdog with Spring Boot

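With the driver loaded, Spring Boot can take over the feeding itself. Note that /dev/watchdog can be held open by only one process at a time: while the watchdog daemon from section 3 is running, an attempt by the application to open the device fails with a "Device or resource busy" error. The application can therefore write to the device directly only when the daemon is not holding it.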

4.1 Watchdog feeding task (Java)

import org.springframework.boot.CommandLineRunner;
import org.springframework.stereotype.Component;

import java.io.FileOutputStream;
import java.io.IOException;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

@Component
public class WatchdogService implements CommandLineRunner {

    private FileOutputStream watchdog;

    @Override
    public void run(String... args) {
        // Requires root. /dev/watchdog accepts only one opener, so the watchdog
        // daemon is stopped first to release the device, then softdog is reloaded.
        String[] commands = {
                "systemctl stop watchdog",
                "modprobe -r softdog",
                "modprobe softdog soft_panic=1"
        };
        for (String cmd : commands) {
            try {
                Process process = new ProcessBuilder("/bin/sh", "-c", cmd)
                        .inheritIO()
                        .start();
                int exitCode = process.waitFor();
                if (exitCode != 0) {
                    System.err.println("Command failed: " + cmd);
                }
            } catch (IOException | InterruptedException e) {
                System.err.println("Error executing command: " + cmd + ", cause: " + e.getMessage());
            }
        }

        // Open /dev/watchdog for feeding
        try {
            watchdog = new FileOutputStream("/dev/watchdog");
        } catch (IOException e) {
            System.err.println("Cannot open /dev/watchdog, check driver and permissions");
            return;
        }

        Executors.newSingleThreadScheduledExecutor()
                .scheduleAtFixedRate(this::feedDog, 0, 5, TimeUnit.SECONDS);

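        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            try {
                if (watchdog != null) {
                    // Magic close: writing 'V' right before close() asks the driver
                    // to disarm the timer. A plain close() can leave it armed, and
                    // a driver loaded with nowayout ignores the request entirely.
                    watchdog.write('V');
                    watchdog.close();
                }
            } catch (IOException ignored) {}
        }));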
    }

    private void feedDog() {
        try {
            watchdog.write("1".getBytes());
            watchdog.flush();
        } catch (IOException e) {
            System.err.println("Feeding watchdog failed: " + e.getMessage());
        }
    }
}

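How it works:

Feeds the watchdog every 5 seconds.

If the whole JVM stalls (for example a long stop‑the‑world GC pause, or the process crashes), the feeder thread stops writing and the watchdog resets the system once its timeout expires. Because the feeder runs on its own scheduler thread, a deadlock confined to other threads does not interrupt feeding by itself.

During graceful shutdown the device must be released with the magic close sequence (write 'V', then close); a plain close can leave the timer armed and cause a spurious reboot.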
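Because the feeder in section 4.1 runs on a dedicated scheduler thread, it can keep writing while the rest of the application is stuck. One way to tie feeding to real application health is to gate each write on a check, as in the sketch below. The class HealthGatedFeeder and its method names are hypothetical, and what counts as "healthy" is left to the caller; the stream is passed in so the logic can be exercised without a real device.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.function.BooleanSupplier;

final class HealthGatedFeeder {
    private final OutputStream device;
    private final BooleanSupplier healthy;

    HealthGatedFeeder(OutputStream device, BooleanSupplier healthy) {
        this.device = device;
        this.healthy = healthy;
    }

    /** Writes one keep-alive byte only when the health check passes; returns whether it fed. */
    boolean feedIfHealthy() {
        boolean ok;
        try {
            ok = healthy.getAsBoolean();
        } catch (RuntimeException e) {
            ok = false; // a check that throws counts as unhealthy
        }
        if (!ok) {
            return false; // skip the write so the watchdog timer keeps counting down
        }
        try {
            device.write('1');
            device.flush();
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

In the service above, the scheduled task could call feedIfHealthy() instead of writing unconditionally, with the supplier checking, for instance, that a worker pool completed a probe task recently. A check that blocks would also stop feeding, which is the intended effect here.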

5. Real‑World Scenarios

Thread‑pool deadlock: the feeding task (or the health check it depends on) stops → watchdog timeout → system reboot.

Full GC pause: JVM hangs → watchdog not fed → automatic recovery.

Process deadlock: All threads stuck → watchdog provides fallback protection.
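For the deadlock scenarios, the JVM can report Java-level deadlocks itself, which gives the application a signal it can act on before the watchdog has to. The sketch below uses the standard ThreadMXBean API; the class name DeadlockProbe is made up, and how the result is used (for example, to stop feeding) is an assumption left to the application.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

final class DeadlockProbe {
    /** Number of threads currently deadlocked on monitors or ownable synchronizers. */
    static int deadlockedThreadCount() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // Returns null when no deadlock is found. Some JVMs may not support
        // synchronizer monitoring and throw UnsupportedOperationException.
        long[] ids = threads.findDeadlockedThreads();
        return ids == null ? 0 : ids.length;
    }
}
```

This only detects lock cycles; threads that are merely blocked on slow I/O or an exhausted pool will not show up here.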

6. Best Practices: Dual‑Layer Protection

Application‑level self‑check:

Use Spring Boot Actuator to monitor thread pools, GC, response times.

When anomalies are detected, attempt self‑healing (release locks, degrade, restart thread pools).

System‑level watchdog: Acts as a safety net ensuring the system restarts even if the application is completely frozen.

This dual‑insurance model is especially suitable for unattended or remote‑operation scenarios such as edge devices, IoT gateways, and financial terminals.
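The two layers can also be connected without the application touching /dev/watchdog at all: the watchdog daemon keeps the device and watches a heartbeat file that the application refreshes. This relies on the daemon's file and change options in /etc/watchdog.conf (a monitored file path and the maximum age of its last modification in seconds); treat the exact semantics as something to confirm against watchdog.conf(5) on your distribution. The class name HeartbeatFile is hypothetical, as is any path such as /run/myapp/heartbeat.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.time.Instant;

final class HeartbeatFile {
    /** Creates the file if needed and sets its modification time; returns false on I/O failure. */
    static boolean touch(Path file, Instant now) {
        try {
            if (Files.notExists(file)) {
                Files.createFile(file);
            }
            Files.setLastModifiedTime(file, FileTime.from(now));
            return true;
        } catch (IOException e) {
            return false; // the file goes stale and the daemon is expected to react
        }
    }
}
```

A scheduled task would call touch(path, Instant.now()) after its health check passes. With this design the daemon's load and memory checks from section 3.3 stay active alongside the application heartbeat.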

7. Summary

1. Linux Watchdog is not enabled by default; you need to install the package, load the driver, and configure the systemd service.

2. Spring Boot can easily integrate a watchdog feeding mechanism.

3. Combining the watchdog with application‑level monitoring yields a resilient, self‑healing system.
