How Keepalived Enables High-Availability Load Balancing with VRRP

Keepalived, originally designed to manage LVS load balancers, provides VRRP-based high availability: it manages LVS nodes, performs health checks, and fails over services such as Nginx, HAProxy, and MySQL. This article also covers split-brain scenarios and non-preemptive configuration.

MaGe Linux Operations

What Is Keepalived?

Keepalived was initially created to manage and monitor LVS (Linux Virtual Server) clusters and to track the state of their service nodes. It later added VRRP (Virtual Router Redundancy Protocol) support, which lets it provide high availability for services such as Nginx, HAProxy, MySQL, and others. So besides managing LVS (the work otherwise done by hand with ipvsadm), Keepalived performs health checks on LVS nodes and can act as a generic HA solution for network services.

Three Key Functions of Keepalived Service

1) Manage the LVS load-balancing software (ipvsadm)
2) Perform health checks on LVS cluster nodes
3) Provide high availability for system network services

How Keepalived Works

Keepalived Operation Diagram

Working Mechanism

Keepalived uses VRRP to elect which node owns a virtual router. All protocol messages are sent via IP multicast to 224.0.0.18. A virtual router consists of a VRID (1-255) and a set of IP addresses, and the VRRP specification assigns it a well-known MAC address (00-00-5E-00-01-{VRID}). Whichever node is MASTER, the virtual IP stays the same, so the transition is transparent to client hosts. Note that Keepalived by default answers for the virtual IP with the interface's own MAC and sends gratuitous ARP after a failover; the virtual MAC is used only when use_vmac is configured. The MASTER continuously sends VRRP advertisements to show it is alive; a BACKUP node takes over only when those advertisements stop, or when its own priority is higher than the current MASTER's and preemption is allowed. VRRP packets are not encrypted. Keepalived supports two authentication types, PASS (a plain-text password) and AH, and its documentation recommends PASS.
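The addressing rules above can be made concrete with a small sketch. The helper name vrrp_virtual_mac is hypothetical and not part of Keepalived; it only applies the 00-00-5E-00-01-{VRID} pattern for IPv4 and rejects VRIDs outside 1-255, the range VRRP allows.

```python
VRRP_MULTICAST_GROUP = "224.0.0.18"  # IPv4 multicast group for VRRP advertisements
VRRP_IP_PROTOCOL = 112               # IP protocol number assigned to VRRP


def vrrp_virtual_mac(vrid: int) -> str:
    """Return the IPv4 VRRP virtual MAC for a VRID (hypothetical helper)."""
    if not 1 <= vrid <= 255:
        raise ValueError(f"VRID must be in 1..255, got {vrid}")
    # The last octet of the MAC is simply the VRID in hexadecimal.
    return f"00-00-5E-00-01-{vrid:02X}"


if __name__ == "__main__":
    print(vrrp_virtual_mac(51))
```

Because the last octet is the VRID itself, two unrelated VRRP groups on the same network segment must not share a VRID.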

Failover Transfer Principle of Keepalived HA

During normal operation, the MASTER node repeatedly sends heartbeat multicast packets. If the MASTER fails and stops sending, the BACKUP nodes detect the loss and trigger a takeover, acquiring the virtual IP and the associated services. When the original MASTER recovers, the node that took over releases those resources and returns to the BACKUP role, and the original MASTER resumes ownership.
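How quickly a BACKUP notices the loss depends on the advertisement interval and on its own priority. The following is a minimal sketch of the VRRPv2 timer arithmetic from RFC 3768, assuming a 1-second advertisement interval unless stated otherwise. The function names are hypothetical, and Keepalived's actual timers may differ in detail (VRRPv3, for instance, scales the skew by the advertisement interval).

```python
def skew_time(priority: int) -> float:
    """VRRPv2 skew time in seconds: (256 - priority) / 256."""
    if not 1 <= priority <= 255:
        raise ValueError(f"priority must be in 1..255, got {priority}")
    return (256 - priority) / 256


def master_down_interval(priority: int, advert_int: float = 1.0) -> float:
    """Seconds of silence after which a BACKUP declares the MASTER down."""
    return 3 * advert_int + skew_time(priority)


if __name__ == "__main__":
    for prio in (90, 100, 150):
        print(prio, master_down_interval(prio))
```

The skew term means a higher-priority BACKUP times out slightly earlier than a lower-priority one, so it is the first to claim the MASTER role.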

Keepalived Dual‑Master Mode

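In dual-master mode, two VRRP instances are configured across the same pair of nodes, each with its own virtual IP: each node is MASTER for one instance and BACKUP for the other, so both nodes carry traffic and each can take over the other's virtual IP on failure.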

Keepalived No‑Preempt Mechanism (nopreempt)

By default, a recovered higher-priority node preempts the current MASTER and takes the role back, causing a second transition. To avoid this, set the nopreempt option, which keeps a recovered node in the BACKUP role even if its priority is higher. nopreempt only takes effect when the instance's initial state is BACKUP, so both nodes are configured with state BACKUP and priority alone decides which one becomes MASTER first.
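A non-preemptive pair can be sketched by rendering the two vrrp_instance blocks side by side. The helper render_vrrp_instance is hypothetical, and the instance name, interface, VRID, priorities, and virtual IP are placeholder values chosen for illustration; only the keepalived.conf keywords themselves are real.

```python
def render_vrrp_instance(name, interface, vrid, priority, vip, nopreempt=True):
    """Render a keepalived.conf vrrp_instance block (illustrative values only)."""
    lines = [
        f"vrrp_instance {name} {{",
        "    state BACKUP",  # nopreempt requires the initial state to be BACKUP
        f"    interface {interface}",
        f"    virtual_router_id {vrid}",
        f"    priority {priority}",
        "    advert_int 1",
    ]
    if nopreempt:
        lines.append("    nopreempt")
    lines += [
        "    virtual_ipaddress {",
        f"        {vip}",
        "    }",
        "}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Both nodes share the VRID and the virtual IP; they differ only in priority.
    print(render_vrrp_instance("VI_1", "eth0", 51, 100, "192.0.2.10/24"))
    print()
    print(render_vrrp_instance("VI_1", "eth0", 51, 90, "192.0.2.10/24"))
```

With this layout the priority-100 node wins the first election, but if it fails and later returns, it stays BACKUP until the other node fails in turn.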

Keepalived Split‑Brain

When the heartbeat link between two HA nodes fails, the VRRP group splits into two independent entities, each believing the other is down. Both may assume MASTER status, leading to resource contention and potential data corruption. Common mitigation strategies include adding redundant heartbeat links, using disk locks, or implementing a quorum/arbiter mechanism (e.g., a reference IP) to decide which node should remain active.

Causes of Split‑Brain

Typical causes include heartbeat link failure, network card or driver issues, switch failures, firewall rules that block VRRP advertisements, misconfigured heartbeat interface addresses, and virtual_router_id values that do not match on the two nodes of the same VRRP instance.

Common Solutions for Split‑Brain

1) Use dual heartbeat connections (serial and Ethernet) for redundancy.
2) Deploy fencing devices (e.g., STONITH) to power off a non-responsive node.
3) Implement monitoring and alerting to enable rapid manual or automated arbitration when split-brain is detected.
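The arbitration idea can be sketched as a pure decision function that a monitoring script might call. Everything here is an assumption for illustration: the names, the inputs, and the policy itself (a node keeps the virtual IP only while it can reach a reference IP such as the gateway). How reachability and the peer's state are probed, and how the resulting action is carried out, is deliberately left out.

```python
from enum import Enum


class Action(Enum):
    NO_ACTION = "no_action"
    RELEASE_VIP = "release_vip"
    ALERT_SPLIT_BRAIN = "alert_split_brain"


def arbitrate(holds_vip: bool, reference_reachable: bool, peer_holds_vip: bool) -> Action:
    """Decide what a node should do, given what its monitoring has observed."""
    if not holds_vip:
        # A node without the virtual IP cannot be part of a conflict.
        return Action.NO_ACTION
    if not reference_reachable:
        # This node is probably the isolated side, so it should step down.
        return Action.RELEASE_VIP
    if peer_holds_vip:
        # Both nodes claim the virtual IP: split-brain, escalate for arbitration.
        return Action.ALERT_SPLIT_BRAIN
    return Action.NO_ACTION


if __name__ == "__main__":
    print(arbitrate(holds_vip=True, reference_reachable=True, peer_holds_vip=True))
```

Keeping the decision separate from the probing makes the policy easy to test before it is wired to real checks and alerts.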

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
