How Keepalived Enables High-Availability Load Balancing with VRRP
Keepalived, originally designed for LVS load balancing, provides VRRP-based high‑availability by managing LVS nodes, performing health checks, and offering failover for services like Nginx, HAProxy, and MySQL, while also addressing split‑brain scenarios and non‑preemptive configurations.
What Is Keepalived?
Keepalived was originally written to manage and monitor LVS (Linux Virtual Server) clusters and to track the state of the service nodes behind them. It later gained VRRP (Virtual Router Redundancy Protocol) support, which lets it provide high availability for services such as Nginx, HAProxy, and MySQL. In short, Keepalived maintains the LVS (IPVS) rules that would otherwise be managed by hand with ipvsadm, performs health checks on LVS nodes, and can act as a generic HA solution for network services.
Three Key Functions of Keepalived Service
1) Manage the LVS load balancer (the IPVS rules otherwise maintained with ipvsadm)
2) Perform health checks on LVS cluster nodes
3) Provide high availability (failover) for network services
How Keepalived Works
[Figure: Keepalived operation diagram]
Working Mechanism
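Keepalived nodes use the VRRP protocol to elect which of them acts as the virtual router. All protocol messages are sent via IP multicast (address 224.0.0.18, IP protocol number 112). A virtual router consists of a VRID (1-255) and a set of IP addresses, and under the VRRP specification it presents a well-known MAC address (00-00-5E-00-01-{VRID}) to the outside. Regardless of which node is MASTER, the virtual IP stays the same, which makes the transition transparent to client hosts. (Keepalived only uses the virtual MAC when use_vmac is configured; otherwise it keeps the interface's own MAC and announces the move with gratuitous ARP.) The MASTER continuously sends VRRP multicast advertisements to indicate it is alive. A BACKUP node takes over only if those advertisements stop or, with preemption enabled, if its own priority is higher than the current MASTER's. VRRP packets are not encrypted. VRRPv2 defines optional authentication, and Keepalived supports auth_type PASS (a plain-text password) and AH; its documentation recommends PASS.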
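The MAC address pattern above can be illustrated with a small sketch. The function name vrrp_virtual_mac is hypothetical, and the sketch assumes valid VRIDs run from 1 to 255 as in the VRRP specification.

```python
VRRP_MULTICAST_GROUP = "224.0.0.18"  # IPv4 multicast group for VRRP advertisements


def vrrp_virtual_mac(vrid: int) -> str:
    """Return the IPv4 VRRP virtual MAC for a VRID, as 00-00-5E-00-01-XX."""
    if not 1 <= vrid <= 255:
        raise ValueError(f"VRID must be in 1..255, got {vrid}")
    # The last octet is simply the VRID written as two hex digits.
    return f"00-00-5E-00-01-{vrid:02X}"


if __name__ == "__main__":
    print(vrrp_virtual_mac(51))
```

Because the address depends only on the VRID, two unrelated VRRP groups that share a VRID on the same network segment would also share this MAC, which is one reason VRIDs need to be unique per segment.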
Failover Transfer Principle of Keepalived HA
During normal operation, the MASTER node repeatedly sends heartbeat multicast packets. If the MASTER fails and stops sending, the BACKUP nodes detect the loss and the one with the highest priority takes over, acquiring the virtual IP and the associated services. When the original MASTER recovers, it preempts by default: the node that took over releases the resources and returns to the BACKUP role.
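How long a BACKUP waits before declaring the MASTER dead can be sketched with the VRRPv2 timer formulas (RFC 3768). This is an illustration of the protocol arithmetic only; the function names are hypothetical, and the timers Keepalived actually applies may differ by VRRP version and configuration.

```python
def skew_time(priority: int) -> float:
    """VRRPv2 skew time in seconds: higher priority means a shorter skew."""
    if not 1 <= priority <= 254:
        raise ValueError("backup priority must be in 1..254")
    return (256 - priority) / 256


def master_down_interval(priority: int, advert_int: float = 1.0) -> float:
    """Seconds of silence after which a BACKUP considers the MASTER down."""
    return 3 * advert_int + skew_time(priority)


if __name__ == "__main__":
    for prio in (100, 150):
        print(prio, master_down_interval(prio))
```

The skew term is what lets the highest-priority BACKUP time out first and claim the MASTER role before lower-priority peers do.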
Keepalived Dual‑Master Mode
Keepalived No‑Preempt Mechanism (nopreempt)
By default, a recovered node with the higher priority preempts and becomes MASTER again, causing a second transition. To avoid this, the nopreempt option can be set so that a recovered node remains BACKUP for as long as the current MASTER is healthy. nopreempt only takes effect when the instance's initial state is BACKUP, so both nodes are configured with state BACKUP and the priority values decide which one becomes MASTER first.
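A non-preemptive pair might be configured roughly as follows. The sketch renders a keepalived.conf vrrp_instance stanza from Python; the helper render_vrrp_instance, the instance name VI_1, the interface eth0, VRID 51, the priorities, and the address 192.0.2.10 are all illustrative assumptions, not values from the original article.

```python
def render_vrrp_instance(name: str, interface: str, vrid: int,
                         priority: int, vip: str,
                         nopreempt: bool = True) -> str:
    """Render a minimal vrrp_instance stanza with initial state BACKUP."""
    lines = [
        f"vrrp_instance {name} {{",
        "    state BACKUP",  # nopreempt requires the initial state to be BACKUP
        f"    interface {interface}",
        f"    virtual_router_id {vrid}",  # must match on both nodes
        f"    priority {priority}",  # higher value wins the first election
        "    advert_int 1",
    ]
    if nopreempt:
        lines.append("    nopreempt")
    lines += [
        "    virtual_ipaddress {",
        f"        {vip}",
        "    }",
        "}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Same stanza on both nodes except for the priority.
    print(render_vrrp_instance("VI_1", "eth0", 51, 150, "192.0.2.10"))
    print(render_vrrp_instance("VI_1", "eth0", 51, 100, "192.0.2.10"))
```

With this layout the node with priority 150 becomes MASTER at first start, but if it fails and later returns it stays BACKUP until the other node fails.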
Keepalived Split‑Brain
When the heartbeat link between two HA nodes fails, the VRRP group splits into two independent entities, each believing the other is down. Both may assume MASTER status, leading to resource contention and potential data corruption. Common mitigation strategies include adding redundant heartbeat links, using disk locks, or implementing a quorum/arbiter mechanism (e.g., a reference IP) to decide which node should remain active.
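The reference-IP idea can be sketched as a small decision function. This is one possible interpretation, not Keepalived's built-in behaviour: the Action names and the decide function are hypothetical, and the reachability flags are assumed to come from external checks (for example, pinging a gateway).

```python
from enum import Enum


class Action(Enum):
    FOLLOW_VRRP = "follow_vrrp"    # heartbeat fine, let the normal election decide
    TAKE_OVER = "take_over"        # peer silent, but we can still reach the network
    RELEASE_VIP = "release_vip"    # we look isolated, so give up the virtual IP


def decide(peer_heartbeat_ok: bool, arbiter_reachable: bool) -> Action:
    """Pick an action for this node from heartbeat and reference-IP checks."""
    if peer_heartbeat_ok:
        return Action.FOLLOW_VRRP
    if arbiter_reachable:
        # The peer is silent and the reference IP answers: the peer is
        # probably down, so holding the virtual IP is reasonably safe.
        return Action.TAKE_OVER
    # Neither peer nor reference IP answers: this node is probably the
    # one that is cut off, so it should not claim the virtual IP.
    return Action.RELEASE_VIP


if __name__ == "__main__":
    print(decide(peer_heartbeat_ok=False, arbiter_reachable=False))
```

A single reference IP does not cover every case: if both nodes can reach it but not each other, both would still take over, which is why it is usually combined with redundant heartbeat links or fencing.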
Causes of Split‑Brain
Typical causes include heartbeat link failure, network card or driver issues, switch failures, misconfigured ARP settings, firewall rules that block VRRP traffic between the nodes, or virtual_router_id values that do not match between the nodes of the same VRRP instance.
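A mismatch in virtual_router_id is easy to check mechanically. The following sketch compares two keepalived.conf texts; it is a deliberately simple line-based parser written for illustration, and the function names are hypothetical.

```python
import re


def vrids_by_instance(config_text: str) -> dict[str, int]:
    """Map each vrrp_instance name to its virtual_router_id."""
    vrids: dict[str, int] = {}
    current = None
    for raw in config_text.splitlines():
        # Keepalived treats text after '#' or '!' as a comment.
        line = re.split(r"[#!]", raw, maxsplit=1)[0].strip()
        match = re.match(r"vrrp_instance\s+([^\s{]+)", line)
        if match:
            current = match.group(1)
            continue
        match = re.match(r"virtual_router_id\s+(\d+)", line)
        if match and current is not None:
            vrids[current] = int(match.group(1))
    return vrids


def mismatched_vrids(config_a: str, config_b: str) -> dict[str, tuple[int, int]]:
    """Return instances present in both configs whose VRIDs differ."""
    a, b = vrids_by_instance(config_a), vrids_by_instance(config_b)
    return {name: (a[name], b[name])
            for name in a.keys() & b.keys() if a[name] != b[name]}


if __name__ == "__main__":
    node1 = "vrrp_instance VI_1 {\n    virtual_router_id 51\n}\n"
    node2 = "vrrp_instance VI_1 {\n    virtual_router_id 52\n}\n"
    print(mismatched_vrids(node1, node2))
```

An empty result means the shared instances agree on their VRID; it says nothing about the other causes listed above, such as firewall rules.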
Common Solutions for Split‑Brain
1) Use dual heartbeat connections (serial and Ethernet) for redundancy.
2) Deploy fencing devices (e.g., STONITH) to power off a non-responsive node.
3) Implement monitoring and alerting to enable rapid manual or automated arbitration when split-brain is detected.
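For the monitoring approach, one simple signal is that the virtual IP appears on more than one node at the same time. The sketch below assumes a monitoring system has already collected each node's configured addresses; the function names, node names, and addresses are hypothetical.

```python
def nodes_holding_vip(addresses_by_node: dict[str, list[str]], vip: str) -> list[str]:
    """Return the sorted names of nodes that currently have the VIP configured."""
    return sorted(node for node, addrs in addresses_by_node.items() if vip in addrs)


def is_split_brain(addresses_by_node: dict[str, list[str]], vip: str) -> bool:
    """True when more than one node holds the VIP at once."""
    return len(nodes_holding_vip(addresses_by_node, vip)) > 1


if __name__ == "__main__":
    snapshot = {
        "lb01": ["192.0.2.11", "192.0.2.10"],
        "lb02": ["192.0.2.12", "192.0.2.10"],
    }
    if is_split_brain(snapshot, "192.0.2.10"):
        print("ALERT: VIP held by", nodes_holding_vip(snapshot, "192.0.2.10"))
```

An alert like this only reports the condition; deciding which node should give up the address is still left to an operator or to a fencing or arbitration mechanism.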
MaGe Linux Operations
