
How to Build Real-Time File Sync and Dual-Node Failover with DRBD and Keepalived

Learn step-by-step how to install and configure DRBD for real-time block device replication, set up keepalived for virtual IP failover, and manage primary/secondary roles, including troubleshooting split-brain, to achieve seamless file synchronization and high availability between two Linux servers.

MaGe Linux Operations

Installation of DRBD

Prerequisites: a spare data disk or partition on each node. Update the system:

yum update -y

Disable the firewall:

systemctl stop firewalld
systemctl disable firewalld

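Edit /etc/hosts on both nodes to add the primary and secondary IPs. The names kylin-01 and kylin-02 must match each node's uname -n output, because the DRBD resource file refers to the nodes by those names.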

vim /etc/hosts
192.168.1.240 Primary kylin-01
192.168.1.241 Secondary kylin-02

Disable SELinux (the setting takes effect after a reboot):

vim /etc/sysconfig/selinux
SELINUX=disabled

Install dependencies:

yum install gcc libxslt-devel libxslt perl keyutils-libs-devel net-tools -y

Download and compile DRBD

Download source tarballs for drbd and drbd-utils, extract, and compile.

wget https://pkg.linbit.com//downloads/drbd/9/drbd-9.2.8.tar.gz
tar -zxvf drbd-9.2.8.tar.gz
cd drbd-9.2.8
make && make install

wget https://pkg.linbit.com//downloads/drbd/utils/drbd-utils-9.27.0.tar.gz
tar -zxvf drbd-utils-9.27.0.tar.gz
cd drbd-utils-9.27.0
./configure --prefix=/usr/local/drbd --without-83support --with-udev --with-initscripttype=systemd --without-manual
make && make install

After installation, the binaries are /usr/sbin/drbdsetup, /usr/sbin/drbdmeta, and /usr/sbin/drbdadm, and the configuration directory is /usr/local/drbd/etc/drbd.d.

Configure DRBD

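Create a partition on the data disk (do not format it) and note the partition device, e.g., /dev/sdb1 on disk /dev/sdb. The resource configuration below uses /dev/sdb1 as the backing device.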

fdisk /dev/sdb

Global configuration (global_common.conf)

# DRBD is the result of over a decade of development by LINBIT.
global {
    usage-count yes;
}
common {
    handlers {
        pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
    }
    startup {}
    options {}
    disk {
        on-io-error detach;
    }
    net {}
}

Resource configuration (drbd.res)

resource r1 {
    protocol C;
    on kylin-01 {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 192.168.1.240:7789;
        meta-disk internal;
    }
    on kylin-02 {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 192.168.1.241:7789;
        meta-disk internal;
    }
}
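Because DRBD selects the matching "on" section by host name, a quick check before starting the service can catch a mismatch early. The following is a minimal sketch, not part of the original guide: the function names drbd_on_hosts and drbd_host_declared are hypothetical helpers, and the resource file path in the usage note is an assumption based on the configuration directory mentioned earlier.

```shell
# List the host names declared in "on <host> {" sections of a DRBD
# resource file read from stdin (hypothetical helper).
drbd_on_hosts() {
    sed -n 's/^[[:space:]]*on[[:space:]]\{1,\}\([^[:space:]{]\{1,\}\).*/\1/p'
}

# Succeed if host name $1 is declared in an "on" section of file $2.
drbd_host_declared() {
    drbd_on_hosts < "$2" | grep -qxF -- "$1"
}
```

On each node this could be called as drbd_host_declared "$(uname -n)" /usr/local/drbd/etc/drbd.d/drbd.res; a non-zero exit status suggests that the host name and the resource file disagree.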

Initialize metadata and start DRBD on both nodes:

drbdadm create-md r1
systemctl start drbd
systemctl enable drbd
systemctl status drbd

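Check each node's role with drbdadm role r1. On the node that should become primary (and only on that node), run drbdadm primary --force r1 to promote it and start the initial synchronization. Monitor progress with drbdadm status r1; with DRBD 9, /proc/drbd shows only version information.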
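To script the "is the initial sync finished?" check, the output of drbdadm status can be inspected for the disk states. This is a minimal sketch under the assumption that the DRBD 9 status output contains disk:UpToDate for the local disk and peer-disk:UpToDate for the peer; drbd_in_sync is a hypothetical helper name, not a DRBD command.

```shell
# Succeed if the `drbdadm status` text on stdin shows both the local
# disk and the peer disk as UpToDate (hypothetical helper).
drbd_in_sync() {
    status=$(cat)
    # Local disk: "disk:UpToDate" at line start or after whitespace,
    # so that "peer-disk:UpToDate" does not count as the local disk.
    printf '%s\n' "$status" | grep -Eq '(^|[[:space:]])disk:UpToDate' &&
        printf '%s\n' "$status" | grep -q 'peer-disk:UpToDate'
}
```

A possible use is drbdadm status r1 | drbd_in_sync && echo "r1 is in sync", for example in a loop that waits before formatting or failing over.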

DRBD usage test

Format the DRBD device on the primary node, mount it, create a test directory, then switch primary/secondary roles to confirm data replication.

# on the primary node
mkfs.ext4 /dev/drbd0
mkdir /data
mount /dev/drbd0 /data
mkdir /data/test
umount /data
drbdadm secondary r1
drbdadm role r1   # should be Secondary
# on the secondary node
mkdir -p /data
drbdadm primary r1
mount /dev/drbd0 /data
ls /data   # the test directory should appear

Troubleshooting

If umount reports “device is busy”, run fuser -m /data to find the PID and kill it.

Split‑brain situation: when both nodes become Primary after a network outage, DRBD reports “Split‑Brain detected, dropping connection!”. Manual recovery means choosing the node whose changes will be discarded, disconnecting it, demoting it to Secondary, and reconnecting it with its local changes discarded; the other node then reconnects and resynchronizes its data to it.
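The manual recovery can be sketched as two short command sequences, one per node. The drbdadm subcommands and the --discard-my-data option are standard DRBD commands; the two wrapper functions are hypothetical and only print the commands so they can be reviewed first. The sketch assumes the file system on the discarded node is already unmounted and that r1 is the affected resource.

```shell
# Print the commands for the node whose changes will be discarded
# (the split-brain "victim"). The resource name defaults to r1.
split_brain_victim_cmds() {
    res=${1:-r1}
    printf '%s\n' \
        "drbdadm disconnect $res" \
        "drbdadm secondary $res" \
        "drbdadm connect --discard-my-data $res"
}

# Print the commands for the node whose data is kept (the "survivor").
# Reconnecting is only needed if that node is also in StandAlone state.
split_brain_survivor_cmds() {
    res=${1:-r1}
    printf '%s\n' \
        "drbdadm disconnect $res" \
        "drbdadm connect $res"
}
```

After checking the printed commands, they could be executed with, for example, split_brain_victim_cmds r1 | sh -x on the node to be overwritten. Choosing the victim is a data-loss decision, so it is worth confirming which node holds the newer data before running anything.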

Installation of keepalived

Install keepalived with yum:

yum install -y keepalived

Configure keepalived with a virtual IP (192.168.1.239) and password authentication. The primary node has priority 100 and the backup 99. The notification scripts run DRBD commands to promote the node and start the PostgreSQL container.

# /etc/keepalived/keepalived.conf (primary)
global_defs {
    router_id kylin-02
    vrrp_skip_check_adv_addr
    vrrp_strict
    vrrp_garp_interval 0
    vrrp_gna_interval 0
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens32
    virtual_router_id 51
    priority 100
    mcast_src_ip 192.168.1.240
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.1.239
    }
    notify_master "/etc/keepalived/notify.sh"
    notify_backup "/etc/keepalived/notify_back.sh"
    notify_stop "/etc/keepalived/notify_back.sh"
}

The backup node's configuration is identical except for priority 99 and its own mcast_src_ip (192.168.1.241).
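Since the two files differ only in those two values, the backup file can be derived from the primary one instead of being edited by hand. This is a minimal sketch: make_backup_conf is a hypothetical helper, and the addresses are the ones used in this article.

```shell
# Rewrite the primary node's keepalived.conf (stdin) into the backup
# node's version (stdout): lower priority, backup node's source address.
make_backup_conf() {
    sed -e 's/priority 100$/priority 99/' \
        -e 's/mcast_src_ip 192\.168\.1\.240$/mcast_src_ip 192.168.1.241/'
}
```

For example, make_backup_conf < keepalived.conf > keepalived.backup.conf on the primary, then copy the result to /etc/keepalived/keepalived.conf on the backup node. The output file name here is only illustrative.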

The sample notification script (notify.sh) promotes DRBD to Primary, mounts the device, and starts the PostgreSQL docker‑compose stack. The companion stop script (referenced as notify_back.sh in the configuration above) stops the container, unmounts the device, and demotes DRBD.

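#!/bin/bash
# Promote this node, then wait until DRBD reports the Primary role.
drbdadm primary r1
while true; do
    role=$(drbdadm role r1)
    echo "drbd role is $role"
    if [[ "$role" == "Primary" ]]; then
        break
    fi
    # Promotion has not taken effect yet: retry after a short pause.
    drbdadm primary r1
    sleep 3
done
# Mount the replicated device and start the PostgreSQL stack.
mount /dev/drbd0 /data
docker-compose -f /opt/pgsql/docker-compose.yml up -d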
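The article does not show the script that releases the node, so the following is only a sketch of what it might look like, assuming the same paths and resource name as notify.sh. The run and release_node functions and the DRY_RUN variable are hypothetical; the order (stop the container, unmount, demote) follows from the fact that a mounted device cannot be demoted and a file system in use cannot be unmounted.

```shell
# Execute a command, or only print it when DRY_RUN=1 (for a rehearsal).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Release this node: stop the stack, unmount the device, demote DRBD.
release_node() {
    run docker-compose -f /opt/pgsql/docker-compose.yml stop
    run umount /data
    run drbdadm secondary r1
}
```

A script built on this sketch would end with a call to release_node. Running DRY_RUN=1 release_node first prints the three commands without executing them. On a node that never held the Primary role, the umount step will simply fail and the script continues.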

Enable and start keepalived on both nodes:

systemctl start keepalived
systemctl enable keepalived

Validate by creating and deleting data in PostgreSQL, then failing over the virtual IP; the data should remain synchronized.
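During such a failover test it helps to confirm which node currently holds the virtual IP. The check below is a minimal sketch that parses the output of ip -o -4 addr show; holds_vip is a hypothetical helper, and the interface name ens32 and the address 192.168.1.239 are taken from the configuration above.

```shell
# Succeed if the `ip -o -4 addr show` text on stdin lists address $1.
# The trailing "/" anchors the match so 192.168.1.23 does not match .239.
holds_vip() {
    grep -qF -- "inet $1/"
}
```

A possible use on either node is ip -o -4 addr show dev ens32 | holds_vip 192.168.1.239 && echo "this node holds the VIP". After stopping keepalived on the primary, the same command should succeed on the backup node instead.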

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
