How to Build Real-Time File Sync and Dual-Node Failover with DRBD and Keepalived
Learn step-by-step how to install and configure DRBD for real-time block device replication, set up keepalived for virtual IP failover, and manage primary/secondary roles, including troubleshooting split-brain, to achieve seamless file synchronization and high availability between two Linux servers.
Installation of DRBD
Prerequisites: a data disk or an extra, unused partition on each node. Update the system first:
yum update -y
Disable the firewall:
systemctl stop firewalld
systemctl disable firewalld
Edit /etc/hosts on both nodes to add the primary and secondary IPs.
vim /etc/hosts
192.168.1.240 Primary kylin-01
192.168.1.241 Secondary kylin-02
Disable SELinux:
vim /etc/sysconfig/selinux
SELINUX=disabled
Install dependencies:
yum install gcc libxslt-devel libxslt perl keyutils-libs-devel net-tools -y
Download and compile DRBD
Download source tarballs for drbd and drbd-utils, extract, and compile.
wget https://pkg.linbit.com//downloads/drbd/9/drbd-9.2.8.tar.gz
tar -zxvf drbd-9.2.8.tar.gz
cd drbd-9.2.8
make && make install
wget https://pkg.linbit.com//downloads/drbd/utils/drbd-utils-9.27.0.tar.gz
tar -zxvf drbd-utils-9.27.0.tar.gz
cd drbd-utils-9.27.0
./configure --prefix=/usr/local/drbd --without-83support --with-udev --with-initscripttype=systemd --without-manual
make && make install
The installed binaries are /usr/sbin/drbdsetup, /usr/sbin/drbdmeta, and /usr/sbin/drbdadm; the configuration directory is /usr/local/drbd/etc/drbd.d.
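As a quick sanity check after the build, a small helper can report which of the expected commands are not yet reachable. This is only a sketch: the function name missing_tools is hypothetical, and it assumes the tools are expected on PATH (with a custom --prefix, PATH may need adjusting first).

```shell
# Print the name of every listed command that cannot be found on PATH.
# Prints nothing when all of them are available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example, to be run on each node after installation:
#   missing_tools drbdadm drbdsetup drbdmeta
```

An empty result means all three tools were found; any name printed points at a step that needs to be repeated or a PATH entry that is missing.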
Configure DRBD
Create a partition on the data disk (do not format it) and note the partition device, e.g., /dev/sdb1 on disk /dev/sdb.
fdisk /dev/sdb
Global configuration (global_common.conf)
# DRBD is the result of over a decade of development by LINBIT.
global {
    usage-count yes;
}
common {
    handlers {
        pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
    }
    startup {}
    options {}
    disk {
        on-io-error detach;
    }
    net {}
}
Resource configuration (drbd.res)
resource r1 {
    protocol C;
    on kylin-01 {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 192.168.1.240:7789;
        meta-disk internal;
    }
    on kylin-02 {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 192.168.1.241:7789;
        meta-disk internal;
    }
}
Initialize metadata and start DRBD on both nodes:
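drbdadm create-md r1
systemctl start drbd
systemctl enable drbd
systemctl status drbd
Check the role with drbdadm role r1. On the node that should become primary (one node only), run drbdadm primary --force r1 to promote it and start the initial sync. Verify synchronization with drbdadm status r1; with DRBD 9, cat /proc/drbd reports only the module version.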
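To wait for the initial sync from a script, the status text can be checked for UpToDate on both sides. The sketch below assumes the DRBD 9 layout of drbdadm status output (a disk: line for the local node and a peer-disk: field for the peer); the function name is_synced is hypothetical.

```shell
# Read `drbdadm status <resource>` output on stdin.
# Exit 0 only if the local disk and the peer disk are both UpToDate.
is_synced() {
    text=$(cat)
    # The local line starts with "disk:"; the anchor keeps it from
    # matching the "peer-disk:" field further down.
    printf '%s\n' "$text" | grep -q '^ *disk:UpToDate' &&
    printf '%s\n' "$text" | grep -q 'peer-disk:UpToDate'
}

# Example: poll every 5 seconds until the resource is fully synced.
#   until drbdadm status r1 | is_synced; do sleep 5; done
```

Reading from stdin keeps the check separate from the DRBD call, so it can be tried against saved status output before it is used on a live node.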
DRBD usage test
Format the DRBD device, mount it, create a test directory, then switch primary/secondary roles to confirm data replication.
# on the primary node
mkfs.ext4 /dev/drbd0
mkdir -p /data
mount /dev/drbd0 /data
mkdir /data/test
umount /data
drbdadm secondary r1
drbdadm role r1 # should be Secondary
# on the secondary node
mkdir -p /data
drbdadm primary r1
mount /dev/drbd0 /data
ls /data # the test directory should appear
Troubleshooting
If umount reports “device is busy”, run fuser -m /data to find the PID and kill it.
Split-brain: when both nodes become Primary during a network outage, DRBD reports “Split-Brain detected, dropping connection!” and stops replicating. Manual recovery involves choosing the node whose changes will be discarded, disconnecting and demoting it, and then reconnecting it with its local changes discarded so that it resynchronizes from the other node.
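The manual split-brain recovery can be sketched with the standard drbdadm subcommands. The helper names (run, recover_victim, recover_survivor) and the DRY_RUN switch are hypothetical, added so the sequence can be rehearsed without touching DRBD. The sketch assumes the device is already unmounted on the node whose data is discarded.

```shell
# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# On the node whose changes will be DISCARDED (the split-brain victim).
recover_victim() {
    run drbdadm disconnect "$1" &&
    run drbdadm secondary "$1" &&
    run drbdadm connect --discard-my-data "$1"
}

# On the node whose data is kept, if it also dropped to StandAlone.
recover_survivor() {
    run drbdadm disconnect "$1" &&
    run drbdadm connect "$1"
}

# Example rehearsal:
#   DRY_RUN=1; recover_victim r1
```

Choosing the victim is the one irreversible decision here: everything written on that node since the split is lost, so it is worth confirming which side holds the data you want before running the commands for real.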
Installation of keepalived
Install via yum:
yum install -y keepalived
Configure keepalived with a virtual IP (192.168.1.239) and authentication. The primary node has priority 100, the backup 99. Notification scripts invoke DRBD commands to promote the node and start the PostgreSQL container.
# /etc/keepalived/keepalived.conf (primary)
global_defs {
    router_id kylin-01
    vrrp_skip_check_adv_addr
    vrrp_strict
    vrrp_garp_interval 0
    vrrp_gna_interval 0
}
vrrp_instance VI_1 {
    state BACKUP
    interface ens32
    virtual_router_id 51
    priority 100
    mcast_src_ip 192.168.1.240
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.1.239
    }
    notify_master "/etc/keepalived/notify.sh"
    notify_backup "/etc/keepalived/notify_back.sh"
    notify_stop "/etc/keepalived/notify_back.sh"
}
The backup node's configuration is identical except for priority 99, its own mcast_src_ip, and its own router_id.
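Because the two files differ in only a few values, the backup node's file can be derived from the primary's with sed instead of being edited by hand. This is a sketch, not part of the original procedure: make_backup_conf is a hypothetical helper, and rewriting router_id as well assumes each node should carry its own identifier.

```shell
# Read the primary keepalived.conf on stdin, write the backup version to stdout.
# $1 = backup priority, $2 = backup node's own IP, $3 = backup router_id
# Values must not contain "/" because it is used as the sed delimiter.
make_backup_conf() {
    sed -e "s/^\([[:space:]]*priority[[:space:]]\{1,\}\).*/\1$1/" \
        -e "s/^\([[:space:]]*mcast_src_ip[[:space:]]\{1,\}\).*/\1$2/" \
        -e "s/^\([[:space:]]*router_id[[:space:]]\{1,\}\).*/\1$3/"
}

# Example:
#   make_backup_conf 99 192.168.1.241 kylin-02 \
#       < keepalived.conf.primary > keepalived.conf.backup
```

Each pattern is anchored at the start of the line, so virtual_router_id, which must stay the same on both nodes, is left untouched.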
The sample notification script (notify.sh) promotes DRBD to Primary, mounts the device, and starts the PostgreSQL docker-compose stack; its counterpart (referenced as notify_back.sh in the configuration above) unmounts the device, stops the container, and demotes DRBD.
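#!/bin/bash
# notify.sh: run by keepalived when this node becomes MASTER.
# Promote DRBD, retrying every 3 seconds until the role is Primary.
drbdadm primary r1
while true; do
    role=$(drbdadm role r1)
    echo "drbd status is $role"
    if [[ "$role" == "Primary" ]]; then
        break
    else
        drbdadm primary r1
        sleep 3
    fi
done
# Mount the replicated device and start PostgreSQL.
mount /dev/drbd0 /data
docker-compose -f /opt/pgsql/docker-compose.yml up -d
Enable and start keepalived on both nodes: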
systemctl start keepalived
systemctl enable keepalived
Validate by creating and deleting data in PostgreSQL, then failing over the virtual IP; the data should remain synchronized.
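The failover test also exercises the release path on the node that gives up the virtual IP. The article describes that script (stop the container, unmount, demote DRBD) but does not list it, so the following is only a sketch of what it might look like. The names run and release_node and the DRY_RUN switch are hypothetical, and the ordering (container first, then unmount, then demote) is an assumption chosen so that nothing holds /data open at unmount time.

```shell
COMPOSE_FILE=/opt/pgsql/docker-compose.yml
MOUNT_POINT=/data
RESOURCE=r1

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Release this node: stop PostgreSQL, unmount the DRBD device, demote DRBD.
release_node() {
    run docker-compose -f "$COMPOSE_FILE" stop
    # The device may not be mounted on a node that was never master,
    # so a failed umount is tolerated here.
    run umount "$MOUNT_POINT" || true
    run drbdadm secondary "$RESOURCE"
}

# Example rehearsal:
#   DRY_RUN=1; release_node
```

If the mount is genuinely busy, the final drbdadm secondary step is expected to fail because the device is still held open, so the function's exit status still reflects whether the node was actually demoted.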
MaGe Linux Operations
