
Redis Sentinel Deep Dive: High‑Availability Architecture & Automatic Failover

This article explains Redis Sentinel’s role as the official high‑availability solution, detailing its monitoring, notification, automatic failover mechanisms, discovery processes, connection types, down‑state classifications, failover steps, leader election, master selection rules, and data consistency guarantees.

MaGe Linux Operations

What is Sentinel

Redis Sentinel is the official high-availability (HA) solution for Redis. In a plain master-slave setup, Redis does not switch roles on its own when the master fails. Sentinel runs as an independent process that can monitor multiple master-slave groups and perform automatic failover for them.

Sentinel Architecture

Sentinel Functions

1) Monitoring : Sentinel continuously checks whether your master and slave servers are operating correctly.

2) Notification : When a monitored Redis server encounters a problem, Sentinel can send notifications to administrators or other applications via its API.

3) Automatic Failover : If a master stops working, Sentinel starts a failover: it promotes one of that master's slaves to be the new master and reconfigures the remaining slaves to replicate from it. Sentinel does not proxy or redirect traffic itself; clients ask Sentinel for the current master address and connect to the new master once Sentinel reports it.

Sentinel Discovery and Connections

Sentinel discovers master servers using a user‑provided configuration file.

Sentinel creates two network connections to each monitored master:

Command connection: used to send commands to the master.

Subscription connection: used to subscribe to a channel and discover other Sentinels monitoring the same master.

How Sentinel Discovers Slaves

Sentinel sends the INFO command to the master to automatically obtain the addresses of all slaves.

For each discovered slave, Sentinel creates a command connection and a subscription connection, similar to the master.
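
As a rough illustration of the discovery step above, the sketch below parses the text of an INFO replication reply and extracts the slave addresses. The function name parse_slaves and the sample reply are assumptions made for this example (the 172.16.1.53 address is invented); only the slaveN:ip=...,port=... line layout comes from Redis itself.

```python
def parse_slaves(info_text):
    """Extract (ip, port, state) for each slave listed in an INFO replication reply."""
    slaves = []
    for line in info_text.splitlines():
        key, sep, value = line.strip().partition(":")
        # Slave entries use the keys slave0, slave1, ...; skip everything else.
        if not sep or not key.startswith("slave") or not key[5:].isdigit():
            continue
        fields = dict(item.split("=", 1) for item in value.split(","))
        slaves.append((fields["ip"], int(fields["port"]), fields.get("state")))
    return slaves


# Hypothetical reply text, shaped like the output of INFO replication on a master.
SAMPLE_INFO = """\
# Replication
role:master
connected_slaves:2
slave0:ip=172.16.1.52,port=6379,state=online,offset=1204,lag=0
slave1:ip=172.16.1.53,port=6379,state=online,offset=1204,lag=1
master_repl_offset:1204
"""

for ip, port, state in parse_slaves(SAMPLE_INFO):
    print(ip, port, state)
```

A real Sentinel repeats this query periodically, so slaves added later are picked up without any configuration change.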

How Sentinel Discovers Other Sentinels

Sentinel announces its presence by publishing a "HELLO" message (containing its IP, port, and ID) over the command connection to the __sentinel__:hello channel of each monitored master and slave. It also receives the "HELLO" messages of other Sentinels over the subscription connection, and in this way discovers the peers that monitor the same master.

Key points:

A Sentinel can connect to multiple other Sentinels, checking each other's availability and exchanging information.

Sentinels automatically discover peers via the __sentinel__:hello channel without manual configuration.

Each Sentinel publishes its own state (IP, port, run‑id) every two seconds.
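
To make the hello exchange more concrete, here is a minimal sketch that parses one hello payload. The eight comma-separated fields reflect my reading of the Redis Sentinel source and should be treated as an assumption rather than a stable contract; the Hello type, the parse_hello function, and the sample values are hypothetical.

```python
from collections import namedtuple

# Assumed field order of a message on the __sentinel__:hello channel.
Hello = namedtuple(
    "Hello",
    "sentinel_ip sentinel_port sentinel_runid current_epoch "
    "master_name master_ip master_port master_config_epoch",
)


def parse_hello(payload):
    """Split a hello payload into typed fields; reject malformed messages."""
    parts = payload.split(",")
    if len(parts) != 8:
        raise ValueError("expected 8 comma-separated fields, got %d" % len(parts))
    return Hello(
        parts[0], int(parts[1]), parts[2], int(parts[3]),
        parts[4], parts[5], int(parts[6]), int(parts[7]),
    )


# Hypothetical payload: a Sentinel on 172.16.1.51:26380 describing master "zls".
sample = "172.16.1.51,26380,3c4d5e6f7a8b9c0d1e2f3a4b5c6d7e8f9a0b1c2d,1,zls,172.16.1.52,6379,1"
hello = parse_hello(sample)
print(hello.sentinel_ip, hello.sentinel_port, hello.master_name)
```

Because the payload also describes the master, a receiving Sentinel can learn both who its peers are and which configuration they currently hold.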

Connections Between Sentinels

Sentinels only create command connections with each other for communication; subscription connections are not needed because the master‑slave servers act as intermediaries for HELLO messages.

Down-State Classifications

Subjectively Down (SDOWN): a single Sentinel's own judgment that a server is down.

Objectively Down (ODOWN): at least the configured quorum of Sentinels have marked the master SDOWN, which they confirm by exchanging SENTINEL is-master-down-by-addr commands. ODOWN applies only to masters.
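
The step from SDOWN to ODOWN can be sketched as a simple count. This is a simplified model, not Sentinel's actual code: is_objectively_down and its arguments are names invented for the example, and quorum stands for the last argument of the sentinel monitor directive.

```python
def is_objectively_down(local_sdown, peer_reports, quorum):
    """Return True when enough Sentinels agree that the master is down.

    local_sdown  -- this Sentinel's own SDOWN judgment
    peer_reports -- dict of peer Sentinel id -> True if that peer reports the master down
    quorum       -- number of agreeing Sentinels required
    """
    if not local_sdown:
        # A Sentinel only asks its peers once it already considers the master down.
        return False
    agreeing = 1 + sum(1 for down in peer_reports.values() if down)
    return agreeing >= quorum


# Three Sentinels, quorum 2: one agreeing peer is enough.
print(is_objectively_down(True, {"s2": True, "s3": False}, quorum=2))   # True
print(is_objectively_down(True, {"s2": False, "s3": False}, quorum=2))  # False
```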

Sentinel Down Detection

Sentinel uses PING to check server status. Valid PING replies are +PONG, -LOADING, and -MASTERDOWN; any other reply, or no reply at all, is considered invalid.

A server is marked SDOWN only when it has returned no valid reply for the entire down-after-milliseconds period.
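
The timing rule above can be modeled with a small tracker. This is a sketch under stated assumptions: PingTracker is a hypothetical class, time is passed in explicitly as milliseconds so the example is deterministic, and down_after_ms plays the role of the down-after-milliseconds setting.

```python
# Reply prefixes that count as valid answers to PING.
VALID_PREFIXES = ("+PONG", "-LOADING", "-MASTERDOWN")


class PingTracker:
    """Tracks the last valid PING reply from one monitored server."""

    def __init__(self, down_after_ms, now_ms=0):
        self.down_after_ms = down_after_ms
        self.last_valid_ms = now_ms

    def record_reply(self, reply, now_ms):
        # Only valid replies reset the timer; other replies are ignored.
        if reply.startswith(VALID_PREFIXES):
            self.last_valid_ms = now_ms

    def is_sdown(self, now_ms):
        # SDOWN once no valid reply has arrived for the whole period.
        return now_ms - self.last_valid_ms > self.down_after_ms


tracker = PingTracker(down_after_ms=5000)
tracker.record_reply("+PONG", now_ms=1000)
tracker.record_reply("-ERR unexpected", now_ms=3000)  # invalid, timer not reset
print(tracker.is_sdown(now_ms=5999))  # False: 4999 ms since the last valid reply
print(tracker.is_sdown(now_ms=6001))  # True: 5001 ms since the last valid reply
```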

Sentinel Failover Procedure

Detect that the master has entered ODOWN.

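Elect a leader Sentinel for the failover, using a vote modeled on Raft's leader election.

If a Sentinel fails to win the election, it waits twice the failover-timeout before retrying the failover for the same master.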

Select a slave and promote it to master.

Send SLAVEOF NO ONE to the promoted slave.

Publish the new configuration to all other Sentinels via Pub/Sub.

Send SLAVEOF to the former master’s slaves so they replicate from the new master.

When all slaves start replicating the new master, the leader Sentinel ends the failover process.

After reconfiguration, Sentinel sends a CONFIG REWRITE command to the affected instance to persist the new settings.
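
The election step in the procedure above can be sketched as a vote tally for a single epoch. This is a simplified model and not Sentinel's implementation: elect_leader and its inputs are hypothetical, and the winning condition (a majority of all known Sentinels, and at least the quorum) follows the Redis Sentinel documentation as I understand it.

```python
def elect_leader(vote_requests, sentinel_ids, quorum):
    """Tally one epoch's votes and return the leader id, or None if nobody wins.

    vote_requests -- list of (voter_id, candidate_id) in arrival order
    sentinel_ids  -- set of all known Sentinel ids
    quorum        -- quorum configured for the master
    """
    votes = {}
    for voter, candidate in vote_requests:
        # Each Sentinel grants one vote per epoch, to the first candidate that asks.
        if voter in sentinel_ids and voter not in votes:
            votes[voter] = candidate

    tally = {}
    for candidate in votes.values():
        tally[candidate] = tally.get(candidate, 0) + 1
    if not tally:
        return None

    winner, count = max(tally.items(), key=lambda kv: kv[1])
    majority = len(sentinel_ids) // 2 + 1
    return winner if count >= majority and count >= quorum else None


sentinels = {"s1", "s2", "s3"}
# s2 votes for s1 first, so its later vote for s3 is ignored.
requests = [("s1", "s1"), ("s2", "s1"), ("s3", "s3"), ("s2", "s3")]
print(elect_leader(requests, sentinels, quorum=2))  # s1
# A three-way split produces no leader; the election is retried in a later epoch.
print(elect_leader([("s1", "s1"), ("s2", "s2"), ("s3", "s3")], sentinels, quorum=2))  # None
```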

Sentinel Master Selection Rules

Sentinel selects a new master using the following criteria:

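Discard slaves that are SDOWN, disconnected, or whose last valid PING reply is older than five seconds.

Discard slaves whose link to the failed master has been down for longer than ten times the down-after-milliseconds value.

Discard slaves whose priority (slave-priority, renamed replica-priority in Redis 5) is 0; such slaves are never promoted.

From the remaining slaves, prefer the one with the lowest priority value. If priorities are equal, choose the one with the highest replication offset; if offsets are also equal or unavailable, select the slave with the lexicographically smallest run-id.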
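
The selection rules can be expressed as a filter followed by a sort key. The sketch below is an approximation: Replica and select_new_master are hypothetical names, the link-down-time filter is left out for brevity, and the priority field reflects the replica-priority setting, which Redis documents as being compared before the replication offset (a priority of 0 excludes the slave entirely).

```python
from dataclasses import dataclass


@dataclass
class Replica:
    run_id: str
    priority: int            # replica-priority; 0 means "never promote"
    repl_offset: int
    sdown: bool = False
    disconnected: bool = False
    ms_since_last_ping_reply: int = 0


def select_new_master(replicas, max_ping_age_ms=5000):
    """Pick the promotion candidate, or None if no slave qualifies."""
    candidates = [
        r for r in replicas
        if not r.sdown
        and not r.disconnected
        and r.ms_since_last_ping_reply <= max_ping_age_ms
        and r.priority != 0
    ]
    if not candidates:
        return None
    # Lowest priority first, then highest offset, then smallest run id.
    return min(candidates, key=lambda r: (r.priority, -r.repl_offset, r.run_id))


replicas = [
    Replica("c3", 100, 1200),
    Replica("a1", 100, 900),
    Replica("b2", 100, 1200),
    Replica("d4", 100, 5000, sdown=True),  # most data, but unreachable
]
print(select_new_master(replicas).run_id)  # b2
```

Here d4 is filtered out despite its offset, and b2 beats c3 only on the run-id tiebreak.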

Sentinel Data Consistency

Sentinel's automatic failover elects its leader with a Raft-like vote in which each Sentinel grants at most one vote per epoch, so at most one leader can be elected in any given epoch. Every new configuration carries a configuration epoch, and the configuration with the highest epoch wins, so the newest configuration eventually propagates to all Sentinels.

During a network partition, a Sentinel holding an older configuration updates itself once it receives a newer one from its peers. Sentinel cannot stop a partitioned old master from accepting writes that are later lost, so to bound that loss, set min-slaves-to-write (together with min-slaves-max-lag) so that the master stops accepting writes when too few slaves are connected, and run Sentinel on every Redis node.
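
How a Sentinel decides whether to adopt a configuration it hears from a peer can be sketched as a comparison of version numbers. This is a minimal model, assuming each configuration is tagged with an integer configuration epoch; merge_config and the dictionary layout are invented for the example.

```python
def merge_config(local, received):
    """Return the configuration a Sentinel keeps after hearing `received`.

    Each configuration is a dict: {"master": (ip, port), "config_epoch": int}.
    """
    if received["config_epoch"] > local["config_epoch"]:
        # A strictly newer configuration replaces the local one.
        return dict(received)
    # Equal or older configurations are ignored.
    return local


stale = {"master": ("172.16.1.51", 6379), "config_epoch": 1}
fresh = {"master": ("172.16.1.52", 6379), "config_epoch": 2}

print(merge_config(stale, fresh)["master"])  # ('172.16.1.52', 6379)
print(merge_config(fresh, stale)["master"])  # ('172.16.1.52', 6379)
```

Both calls end on the same master, which is the convergence property the paragraph above relies on once the partition heals.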

Sentinel Practical Usage

Environment Preparation

# 1. Create Sentinel configuration directory
mkdir -p /etc/redis-sentinel/26380

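# 2. Create the configuration file with the contents below
vim /etc/redis-sentinel/26380/sentinel.conf
# --- contents of sentinel.conf ---
port 26380
dir "/etc/redis-sentinel/26380"
# Monitor the master named "zls"; the final 1 is the quorum
sentinel monitor zls 172.16.1.52 6379 1
# Mark the master SDOWN after 5000 ms without a valid reply
sentinel down-after-milliseconds zls 5000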

# 3. Start Sentinel
redis-sentinel /etc/redis-sentinel/26380/sentinel.conf &

Sentinel Related Commands

# Check that this Sentinel itself is alive
127.0.0.1:26380> ping
PONG

# List monitored masters
127.0.0.1:26380> SENTINEL MASTERS

# List slaves of a master
127.0.0.1:26380> SENTINEL slaves zls

# Get master address by name
127.0.0.1:26380> SENTINEL get-master-addr-by-name zls
1) "172.16.1.51"
2) "6379"

# Manual failover
127.0.0.1:26380> SENTINEL FAILOVER zls

# Reset Sentinel's state for masters matching the name (clears the known slaves and Sentinels)
127.0.0.1:26380> SENTINEL reset zls