How Redis Sentinel Ensures Automatic Failover and High Availability
Redis Sentinel is a high‑availability mechanism for Redis that provides automatic monitoring, fault detection, and failover for master‑slave deployments. When the master becomes unavailable, the sentinels use sdown/odown states, quorum voting, and pub/sub communication to promote a slave to master, keeping the service running with minimal downtime.
Overall Introduction
The master‑slave architecture improves Redis availability through fault isolation, recovery from a replica, and read/write separation. It also forms the basis for both Sentinel and Cluster modes.
If a slave fails, the operations it was serving are routed to the master.
If the master fails, a slave is promoted to master, and only the writes that had not yet been replicated are lost.
However, manual failover can be slow, leading to high MTTR and RTO. Sentinel addresses this by automating detection and promotion.
What is Sentinel Mode
Sentinel extends the master‑slave model with health checks, election, and automatic switch‑over. It runs sentinel processes that monitor Redis instances and, upon master failure, elect a new master.
Sentinel Responsibilities
Cluster monitoring: Periodically checks the health of the master and slaves.
Fault detection and notification: Detects failures and alerts other sentinels.
Automatic failover: Promotes a healthy slave to master and updates configuration.
Cluster Monitoring
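Each Sentinel sends a PING to every instance it monitors about once per second, and it periodically issues INFO to the master and slaves (every 10 seconds by default) to retrieve the current state and topology of the nodes.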
Monitoring and Communication Logic
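Sentinel ↔ Master: Sends PING and INFO commands; the master's INFO output lists its connected slaves.
Sentinel ↔ Slave: Learns the slave list from the master's INFO output, then connects to each slave and pings it.
Sentinel ↔ Sentinel: Each Sentinel publishes a message to the Redis pub/sub channel __sentinel__:hello on the monitored instances every 2 seconds; the other Sentinels subscribe to that channel to discover one another and exchange configuration.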
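The slave-discovery step can be illustrated with a small parser for the replication section of a master's INFO output. This is only a sketch: the function name parse_replicas and the sample addresses are made up for illustration, and the sample text assumes the usual slaveN:ip=...,port=... line format that a master reports.

```python
from typing import Dict, List


def parse_replicas(info_text: str) -> List[Dict[str, str]]:
    """Extract the slaveN entries from the text of INFO replication."""
    replicas = []
    for line in info_text.splitlines():
        line = line.strip()
        # Entries look like: slave0:ip=10.0.0.2,port=6380,state=online,offset=1234,lag=0
        if not line.startswith("slave") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        if not key[5:].isdigit():
            continue  # skip other fields such as slave_repl_offset
        fields = dict(pair.split("=", 1) for pair in value.split(",") if "=" in pair)
        replicas.append(fields)
    return replicas


# Hypothetical sample of what a master might return for INFO replication.
SAMPLE_INFO = """\
# Replication
role:master
connected_slaves:2
slave0:ip=10.0.0.2,port=6380,state=online,offset=1234,lag=0
slave1:ip=10.0.0.3,port=6381,state=online,offset=1200,lag=1
master_repl_offset:1234
"""

replicas = parse_replicas(SAMPLE_INFO)
print([(r["ip"], r["port"]) for r in replicas])
```

A monitoring process could feed each discovered address into its own PING loop; a real Sentinel does this internally, so the sketch is for understanding rather than for production use.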
Marking Nodes Offline
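If a slave does not return a valid reply to PING within the configured timeout (down-after-milliseconds), Sentinel marks it as subjectively down (sdown).
If the master fails to respond within the same timeout, it is also marked sdown, which is the first step toward automatic switch‑over.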
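The timeout check itself is simple enough to sketch. The snippet below assumes the timeout corresponds to Sentinel's down-after-milliseconds setting and that timestamps are plain millisecond counters; the function name is hypothetical, and a real Sentinel tracks more state than this.

```python
def is_subjectively_down(last_valid_reply_ms: int, now_ms: int, down_after_ms: int) -> bool:
    """Return True when no valid PING reply has arrived for longer than the timeout."""
    return now_ms - last_valid_reply_ms > down_after_ms


# Illustrative values: a 30-second timeout, last valid reply at t = 0 ms.
DOWN_AFTER_MS = 30_000
print(is_subjectively_down(0, 10_000, DOWN_AFTER_MS))  # still within the timeout
print(is_subjectively_down(0, 31_000, DOWN_AFTER_MS))  # timeout exceeded
```

Note that this is only one sentinel's local opinion of the node; it does not by itself trigger a failover.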
Master‑Slave Dynamic Switch (Failover)
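A Sentinel detects that the master is unresponsive and marks it sdown (subjectively down).
That Sentinel sends SENTINEL is-master-down-by-addr to the other sentinels to ask whether they also consider the master down.
Each of the other sentinels replies with its own view of the master; a reply confirming that the master is down counts as a vote.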
When a quorum of sentinels agrees, the master is marked odown (objectively down).
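The sentinels elect a leader among themselves, and the leader selects the new master: it excludes slaves that are unresponsive, then ranks the rest by configured replica priority, replication offset (the slave whose slave_repl_offset is closest to the master_repl_offset wins), and finally runID as a tie‑breaker.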
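The quorum check and the choice of a new master can be sketched as follows. The Replica class, its field names, and the sample values are hypothetical. The ordering used here (lower replica priority first, then higher replication offset, then lexicographically smaller run ID, with priority 0 meaning "never promote") reflects the documented Sentinel behaviour, and the reachable flag is a simplification of the responsiveness checks a real Sentinel performs.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    run_id: str
    priority: int      # replica-priority; 0 means the replica is never promoted
    repl_offset: int   # how much of the master's replication stream it has processed
    reachable: bool    # simplified stand-in for Sentinel's health checks


def is_objectively_down(down_votes: int, quorum: int) -> bool:
    """The master becomes odown once at least `quorum` sentinels report it down."""
    return down_votes >= quorum


def pick_new_master(replicas: List[Replica]) -> Optional[Replica]:
    """Choose the best promotion candidate, or None if no replica qualifies."""
    candidates = [r for r in replicas if r.reachable and r.priority != 0]
    if not candidates:
        return None
    # Lower priority wins, then higher offset, then smaller run ID.
    return min(candidates, key=lambda r: (r.priority, -r.repl_offset, r.run_id))


replicas = [
    Replica("ccc", priority=100, repl_offset=900, reachable=True),
    Replica("bbb", priority=100, repl_offset=1000, reachable=True),
    Replica("aaa", priority=100, repl_offset=1000, reachable=True),
    Replica("ddd", priority=0, repl_offset=2000, reachable=True),   # excluded: priority 0
    Replica("eee", priority=1, repl_offset=1000, reachable=False),  # excluded: unreachable
]

if is_objectively_down(down_votes=2, quorum=2):
    chosen = pick_new_master(replicas)
    print(chosen.run_id if chosen else "no candidate")
```

In this sample, the two replicas with the highest offset tie, so the run ID decides between them.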
Information Notification
After a new master is elected, Sentinel updates the configuration and notifies clients so that all write operations are directed to the new master. The remaining slaves are reconfigured to replicate from it and resynchronize using the new runID and slave_repl_offset.
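On the client side, the notification typically arrives as a +switch-master event on Sentinel's pub/sub interface, whose payload carries the master name followed by the old and new address. The parser below is a minimal sketch under that assumption; the SwitchMaster type, the function name, and the sample addresses are illustrative only.

```python
from typing import NamedTuple


class SwitchMaster(NamedTuple):
    master_name: str
    old_host: str
    old_port: int
    new_host: str
    new_port: int


def parse_switch_master(payload: str) -> SwitchMaster:
    """Parse '<master-name> <old-ip> <old-port> <new-ip> <new-port>'."""
    parts = payload.split()
    if len(parts) != 5:
        raise ValueError(f"unexpected +switch-master payload: {payload!r}")
    name, old_host, old_port, new_host, new_port = parts
    return SwitchMaster(name, old_host, int(old_port), new_host, int(new_port))


event = parse_switch_master("mymaster 10.0.0.1 6379 10.0.0.2 6380")
print(f"writes for {event.master_name} now go to {event.new_host}:{event.new_port}")
```

A client that receives such an event would drop its connections to the old address and reconnect to the new one; most Sentinel-aware client libraries handle this without application code.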
Summary
Redis Sentinel provides four core capabilities: cluster monitoring, fault detection and notification, automatic failover, and configuration update with client notification, ensuring high availability for Redis deployments.
Architecture & Thinking
🍭 Frontline tech director and chief architect at top-tier companies 🥝 Years of deep experience in internet, e‑commerce, social, and finance sectors 🌾 Committed to publishing high‑quality articles covering core technologies of leading internet firms, application architecture, and AI breakthroughs.