Mastering Keepalived: Complete Guide to High‑Availability Load Balancing

This tutorial explains Keepalived's VRRP-based failover, IPVS rule generation, health checking, script integration, installation methods, configuration files, notification handling, logging, split-brain prevention, and VRRP scripting for building robust high-availability clusters on Linux.

Raymond Ops

1. Introduction

Keepalived provides VRRP-based address failover, generates IPVS rules for virtual IPs, performs health checks on real servers, and can run user-defined scripts so that failover also follows the state of services such as Nginx or HAProxy.

2. Architecture

Core user‑space components include:

vrrp stack – implements VRRP and announces the VIP

checkers – health-check the real servers

system call – invokes external scripts on state changes

SMTP – sends email alerts

IPVS wrapper – pushes the generated rules into the kernel IPVS table

Netlink reflector – adds and removes virtual IPs on interfaces

WatchDog – monitors the VRRP and checker child processes

Control component – parses keepalived.conf

I/O multiplexer – thread abstraction optimized for network I/O

Memory manager – generic memory allocation helpers

3. Installation

yum install keepalived -y

3.1 Compile from source

# Build dependencies (package names as used on CentOS/RHEL)
yum install gcc make curl openssl-devel libnl3-devel net-snmp-devel -y
wget https://keepalived.org/software/keepalived-2.2.2.tar.gz
tar xf keepalived-2.2.2.tar.gz
cd keepalived-2.2.2
./configure --prefix=/usr/local/keepalived
make && make install
# Keepalived reads /etc/keepalived/keepalived.conf by default
mkdir -p /etc/keepalived
cp /usr/local/keepalived/etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf
# Replace the sample interface name with this host's NIC (ens33 here)
sed -i 's/eth0/ens33/g' /etc/keepalived/keepalived.conf

4. Keepalived configuration files

Package name: keepalived

Main binary: /usr/sbin/keepalived

Main config: /etc/keepalived/keepalived.conf

Example configs: /usr/share/doc/keepalived/

Systemd unit: /lib/systemd/system/keepalived.service

Sysconfig: /etc/sysconfig/keepalived (CentOS)

4.1 Global configuration

global_defs {
    notification_email {
        root@localhost
    }
    notification_email_from keepalived@localhost
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id LVS01
    vrrp_skip_check_adv_addr
    vrrp_strict    # strict VRRP compliance; disallows unicast peers and ignores authentication
    vrrp_garp_interval 0
    vrrp_gna_interval 0
    vrrp_mcast_group4 225.0.0.18
    vrrp_iptables
}

4.2 VRRP instance configuration

vrrp_instance VI_1 {
    state MASTER    # use BACKUP on the standby node
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass <PASSWORD>
    }
    virtual_ipaddress {
        192.168.200.100
        192.168.200.101/24 dev eth1
        192.168.200.102/24 dev eth2 label eth2:1
    }
    track_interface {
        eth0
        eth1
    }
}

4.3 Notification scripts

Inside a vrrp_instance block, scripts can be called on state transitions:

notify_master   "/opt/keepalive.sh master"
notify_backup   "/opt/keepalive.sh backup"
notify_fault    "/opt/keepalive.sh fault"
notify_stop     "/opt/keepalive.sh stop"
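The four directives above pass the new state as the first argument to the same script. As a minimal sketch of what /opt/keepalive.sh might contain (the function name, the log wording, and the log path /tmp/keepalived-notify.log are assumptions, not taken from the source):

```shell
#!/bin/bash
# Hypothetical body for /opt/keepalive.sh; names and messages are illustrative.

# Map the state argument passed by Keepalived to a log message.
state_message() {
    case "$1" in
        master) echo "transition to MASTER: this node now holds the VIP" ;;
        backup) echo "transition to BACKUP: this node does not hold the VIP" ;;
        fault)  echo "transition to FAULT: a tracked interface or script failed" ;;
        stop)   echo "keepalived is stopping" ;;
        *)      echo "unknown state: $1"; return 1 ;;
    esac
}

# When called with an argument, append a timestamped record of the transition.
if [ "$#" -gt 0 ]; then
    echo "$(date '+%F %T') $(state_message "$1")" >> /tmp/keepalived-notify.log
fi
```

The same place is where a mail command or a service restart could be hooked in for each state.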

4.4 Logging

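Enable dedicated logging by starting Keepalived with the -S 6 option (syslog facility local6) and adding a matching rule to the rsyslog configuration: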

local6.*    /var/log/keepalived.log
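The local6 rule only receives messages if Keepalived itself logs to that facility. A minimal sketch, assuming the CentOS sysconfig file listed in section 4 and a unit file that passes KEEPALIVED_OPTIONS to the daemon:

```shell
# /etc/sysconfig/keepalived (CentOS)
# -D enables detailed log messages, -S 6 selects syslog facility local6
KEEPALIVED_OPTIONS="-D -S 6"
```

Restart rsyslog and keepalived afterwards so that both pick up the change.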

5. Practical operation – LVS + Keepalived HA cluster

Typical environment:

Master keepalived: 192.168.91.100 (LVS)
Backup keepalived: 192.168.91.101 (LVS)
Web1: 192.168.91.102
Web2: 192.168.91.103
VIP: 192.168.91.188

Key steps include disabling firewalld, installing ipvsadm, configuring sysctl, creating keepalived.conf with appropriate global_defs, vrrp_instance, and virtual_server sections, and synchronising the config to the backup node.
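The sysctl step is the one most often missed. In DR mode each real server also carries the VIP, usually on the loopback interface, and must not answer ARP requests for it. The sketch below only prints the settings and the command so they can be reviewed before being applied as root; the function names and the lo:0 label are assumptions:

```shell
#!/bin/bash
# Print the ARP-related sysctl settings commonly applied on each real server
# in LVS DR mode, so the real server does not answer ARP requests for the VIP.
dr_realserver_sysctl() {
    cat <<'EOF'
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2
net.ipv4.conf.lo.arp_ignore = 1
net.ipv4.conf.lo.arp_announce = 2
EOF
}

# Print the command that binds the VIP to the loopback interface as a /32.
dr_realserver_vip_cmd() {
    local vip="$1"
    echo "ip addr add ${vip}/32 dev lo label lo:0"
}

dr_realserver_sysctl
dr_realserver_vip_cmd 192.168.91.188
```

On Web1 and Web2 the printed sysctl lines would typically go into /etc/sysctl.conf and be loaded with sysctl -p.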

5.1 Example virtual_server block

virtual_server 192.168.91.188 80 {
    delay_loop 6
    lb_algo rr
    lb_kind DR
    persistence_timeout 0
    protocol TCP
    real_server 192.168.91.102 80 {
        weight 1
        TCP_CHECK {
            connect_port 80
            connect_timeout 3
            retry 3
            delay_before_retry 3
        }
    }
    real_server 192.168.91.103 80 {
        weight 1
        TCP_CHECK {
            connect_port 80
            connect_timeout 3
            retry 3
            delay_before_retry 3
        }
    }
}
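It can help to read a virtual_server block as the ipvsadm commands it stands for. The following sketch prints roughly equivalent commands for a round-robin, direct-routing service; the function name is an assumption, the weight is fixed at 1 as in the block above, and the real-server addresses are Web1 and Web2 from the environment list:

```shell
#!/bin/bash
# Print ipvsadm commands roughly equivalent to a virtual_server block.
# Arguments: VIP:port, scheduler, then one or more real-server address:port pairs.
ipvs_equivalent() {
    local vs="$1" sched="$2"
    shift 2
    # -A adds the virtual service, -t marks it as TCP, -s sets the scheduler (lb_algo)
    echo "ipvsadm -A -t ${vs} -s ${sched}"
    local rs
    for rs in "$@"; do
        # -g selects direct routing (lb_kind DR); -w 1 matches "weight 1"
        echo "ipvsadm -a -t ${vs} -r ${rs} -g -w 1"
    done
}

ipvs_equivalent 192.168.91.188:80 rr 192.168.91.102:80 192.168.91.103:80
```

On a running director, ipvsadm -Ln lists the table that Keepalived actually installed, which can be compared against this output.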

6. Split-brain explanation and mitigation

When the heartbeat link between two nodes fails, each node may assume the other is down and claim the VIP, so both hold the same address and clients see address conflicts or service interruption. Causes include broken cables, NIC failures, firewall rules that block VRRP traffic, or mis-configured multicast groups.

Mitigation strategies:

Use dual heartbeat links (serial + Ethernet).

Employ fencing devices (STONITH) to power off a failed node.

Allow heartbeat traffic through firewalls.

Configure distinct multicast groups or switch to unicast.

Monitor and alert on split‑brain conditions.
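The monitoring item can be approximated with a small check on the backup node. This is a heuristic sketch under stated assumptions: the function names are hypothetical, the VIP test parses the output of ip -o -4 addr show, and "peer still reachable" is only a hint, since the peer host can answer pings while its Keepalived is stopped:

```shell
#!/bin/bash
# Return 0 when the given VIP appears in `ip -o -4 addr show` output read from stdin.
holds_vip() {
    local vip="$1"
    grep -qF " inet ${vip}/"
}

# Flag a suspected split-brain: this backup node holds the VIP (has_vip=yes)
# although the master still answers on the network (peer_alive=yes).
split_brain_suspected() {
    local has_vip="$1" peer_alive="$2"
    [ "$has_vip" = "yes" ] && [ "$peer_alive" = "yes" ]
}
```

On a real node the inputs might come from ip -o -4 addr show | holds_vip 192.168.91.188 and from a ping to the master at 192.168.91.100, with an alert sent when split_brain_suspected succeeds.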

7. VRRP script integration for custom health checks

Define a script block:

vrrp_script check_nginx {
    script "/etc/keepalived/ng.sh"
    interval 5
    weight -30
    fall 2
    rise 2
}

Call it inside a VRRP instance:

vrrp_instance VI_1 {
    state MASTER
    interface ens33
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress { 192.168.91.188 }
    track_script { check_nginx }
}
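The weight value decides whether a failed check actually moves the VIP. A short worked example, assuming the check script reports failure through a non-zero exit status and that the backup node runs with a hypothetical priority of 90 (the source does not state one):

```shell
#!/bin/bash
# Compute the effective VRRP priority after a tracked script result.
# Arguments: base priority, script weight, script status (ok|fail).
effective_priority() {
    local base="$1" weight="$2" status="$3"
    if [ "$weight" -lt 0 ] && [ "$status" = "fail" ]; then
        echo $(( base + weight ))   # negative weight: subtracted while the script fails
    elif [ "$weight" -gt 0 ] && [ "$status" = "ok" ]; then
        echo $(( base + weight ))   # positive weight: added while the script succeeds
    else
        echo "$base"
    fi
}

effective_priority 100 -30 ok     # prints 100
effective_priority 100 -30 fail   # prints 70
```

With priority 100 and weight -30, a failing check leaves the master at 70, which is below the assumed backup priority of 90, so the backup wins the next election. A weight smaller in magnitude than the priority gap between the nodes would never trigger a failover.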

Example ng.sh script:

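#!/bin/bash
# Count nginx processes, excluding the grep itself and this script's own PID.
ng=$(ps -elf | grep nginx | grep -Evc "grep|$$")
if [ "$ng" -eq 0 ]; then
    # No nginx left: stop Keepalived so the backup node takes over the VIP.
    systemctl stop keepalived
fi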

8. Additional notes

Keepalived can be combined with Nginx or HAProxy for reverse-proxy load balancing, and with LVS for layer-4 distribution. In LVS DR mode, the real servers also need ARP sysctl tuning (arp_ignore and arp_announce) so that they do not answer ARP requests for the VIP; without it, traffic can bypass the director.
