How Redis Sentinel Ensures Automatic Failover and High Availability

Redis Sentinel provides automatic monitoring, fault detection, and failover for Redis master‑slave clusters, enabling high availability by electing a new master when the original fails, using sdown/odown states, quorum voting, and pub/sub communication to keep services running with minimal downtime.

Architecture & Thinking

Apr 10, 2024

Redis Sentinel is a high‑availability mechanism for Redis that monitors master‑slave clusters and automatically performs failover when the master node becomes unavailable.

Overall Introduction

The master‑slave architecture improves Redis availability through fault isolation, recovery, and read/write separation, and it forms the basis for both Sentinel and Cluster modes.

If a slave fails, the reads it was serving can be routed to the master or to the remaining slaves.

If the master fails, a slave can be promoted to master; only the writes that had not yet been replicated to that slave are lost.

However, manual failover is slow, which raises the mean time to repair (MTTR) and makes a tight recovery time objective (RTO) hard to meet. Sentinel addresses this by automating detection and promotion.

What is Sentinel Mode

Sentinel extends the master‑slave model with health checks, election, and automatic switch‑over. It runs sentinel processes that monitor Redis instances and, upon master failure, elect a new master.

Sentinel Responsibilities

Cluster monitoring: Periodically checks the health of the master and slaves.

Fault detection and notification: Detects failures and alerts other sentinels.

Automatic failover: Promotes a healthy slave to master and updates configuration.

Cluster Monitoring

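Sentinel sends a PING to the master, its slaves, and the other sentinels once per second, and issues INFO to the master and slaves every 10 seconds to retrieve the replication topology and the state of all nodes.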
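
The timing of these checks can be sketched as a small schedule. The intervals follow Sentinel's documented defaults (PING every second, a hello message every 2 seconds, INFO every 10 seconds); the function name and the tick-based simulation are illustrative assumptions, not Sentinel's actual implementation.

```python
# Illustrative simulation of a sentinel's periodic tasks; not Sentinel's real code.
PING_PERIOD_S = 1    # PING to every monitored instance
HELLO_PERIOD_S = 2   # publish on the __sentinel__:hello channel
INFO_PERIOD_S = 10   # INFO to the master and its slaves


def commands_due(elapsed_s):
    """Return the commands a sentinel would send at the given whole second."""
    due = []
    if elapsed_s % PING_PERIOD_S == 0:
        due.append("PING")
    if elapsed_s % HELLO_PERIOD_S == 0:
        due.append("PUBLISH __sentinel__:hello")
    if elapsed_s % INFO_PERIOD_S == 0:
        due.append("INFO")
    return due


if __name__ == "__main__":
    # Walk through the first ten seconds of monitoring.
    for second in range(1, 11):
        print(second, commands_due(second))
```

Most ticks carry only a PING; the heavier INFO call happens once per ten ticks, which keeps the monitoring overhead on each instance low.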

Monitoring and Communication Logic

Sentinel ↔ Master: Sends PING and INFO commands; the master's INFO reply lists its connected slaves.

Sentinel ↔ Slave: Retrieves the slave list from the master's INFO reply, then connects to and pings each slave.

Sentinel ↔ Sentinel: Uses the Redis pub/sub channel __sentinel__:hello on the monitored master and slaves to discover other sentinels and exchange configuration.
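
Slave discovery through INFO can be illustrated with a small parser. The sketch below assumes the `slaveN:ip=...,port=...,state=...,offset=...,lag=...` lines that a master prints in the replication section of its INFO reply; the function name and the sample addresses are hypothetical.

```python
# Minimal sketch: extract the slave list from a master's INFO replication text.
SAMPLE_INFO = """\
# Replication
role:master
connected_slaves:2
slave0:ip=10.0.0.2,port=6379,state=online,offset=1500,lag=0
slave1:ip=10.0.0.3,port=6379,state=online,offset=1420,lag=1
master_repl_offset:1500
"""


def parse_replicas(info_text):
    """Return one dict per slaveN line found in the INFO text."""
    replicas = []
    for line in info_text.splitlines():
        key, sep, value = line.strip().partition(":")
        # Keep only keys such as slave0, slave1, ...; skip slave_repl_offset etc.
        if not sep or not key.startswith("slave") or not key[5:].isdigit():
            continue
        fields = dict(item.split("=", 1) for item in value.split(","))
        replicas.append({
            "ip": fields["ip"],
            "port": int(fields["port"]),
            "state": fields["state"],
            "offset": int(fields["offset"]),
        })
    return replicas


if __name__ == "__main__":
    for replica in parse_replicas(SAMPLE_INFO):
        print(replica)
```

With the addresses in hand, a sentinel can open its own connections to each slave and include them in the PING cycle.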

Marking Nodes Offline

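If a slave does not return a valid reply to PING within the configured timeout (down-after-milliseconds), Sentinel marks it as subjectively down (sdown).

If the master fails to respond within the same timeout, it is also marked sdown, which is the first step toward automatic switch‑over.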
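
The timeout check can be sketched as follows. The `down_after_ms` value mirrors Sentinel's down-after-milliseconds setting (30000 ms by default); the class and method names are hypothetical, and timestamps are passed in explicitly so the logic is easy to follow.

```python
# Minimal sketch of per-instance "offline" detection; names are illustrative.
import time


class InstanceHealth:
    def __init__(self, name, down_after_ms, now=None):
        self.name = name
        self.down_after_ms = down_after_ms
        # Time (in seconds) of the last valid PING reply.
        self.last_valid_reply = time.monotonic() if now is None else now

    def record_reply(self, now=None):
        """Call whenever the instance answers PING with a valid reply."""
        self.last_valid_reply = time.monotonic() if now is None else now

    def is_down(self, now=None):
        """True once no valid reply has arrived for longer than the timeout."""
        now = time.monotonic() if now is None else now
        silent_ms = (now - self.last_valid_reply) * 1000
        return silent_ms > self.down_after_ms


if __name__ == "__main__":
    master = InstanceHealth("mymaster", down_after_ms=30000, now=0.0)
    print(master.is_down(now=29.0))  # still within the timeout
    print(master.is_down(now=31.0))  # silent for longer than the timeout
```

This is one sentinel's local opinion only; it says nothing yet about whether other sentinels can still reach the instance.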

Master‑Slave Dynamic Switch (Failover)

Sentinel detects master unresponsiveness and marks it sdown.

Sentinel sends SENTINEL is-master-down-by-addr to the other sentinels, asking whether they also consider the master down.

Each of those sentinels replies with its own view; the ones that have also lost contact with the master report it as down.

When a quorum of sentinels agrees, the master is marked odown (objectively down).

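Once the master is odown, the sentinels elect a leader, which requires the votes of a majority of sentinels. The leader then selects the new master: it discards slaves that are unhealthy or have been disconnected for too long, then ranks the rest by priority (replica-priority, lower wins), replication offset (the slave whose slave_repl_offset is closest to the old master's master_repl_offset wins), and finally runID (lexicographically smallest wins).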
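
The quorum check and the choice of a new master can be sketched together. The ordering used here is the one described in the Redis Sentinel documentation (replica priority, then replication offset, then run ID); the `Replica` class, the `reachable` flag (a simplification of Sentinel's health filtering), and the function names are assumptions made for illustration.

```python
# Minimal sketch of odown agreement and slave selection; not Sentinel's real code.
from dataclasses import dataclass


@dataclass
class Replica:
    run_id: str
    priority: int          # replica-priority; 0 means "never promote"
    repl_offset: int       # how much of the master's stream it has processed
    reachable: bool = True


def is_odown(down_votes, quorum):
    """The master is objectively down once enough sentinels agree."""
    return down_votes >= quorum


def select_new_master(replicas):
    """Pick the best candidate, or None if no slave is eligible."""
    candidates = [r for r in replicas if r.reachable and r.priority > 0]
    if not candidates:
        return None
    # Lower priority first, then larger offset, then smaller run ID.
    return min(candidates, key=lambda r: (r.priority, -r.repl_offset, r.run_id))


if __name__ == "__main__":
    pool = [
        Replica("bbb", 100, 1500),
        Replica("aaa", 100, 1500),
        Replica("ccc", 100, 900),
    ]
    if is_odown(down_votes=2, quorum=2):
        print(select_new_master(pool))
```

Ranking by offset before run ID means the slave with the most complete copy of the data wins, and the run ID only breaks ties between otherwise equal candidates.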

Information Notification

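After the new master is promoted, Sentinel rewrites its own configuration, reconfigures the remaining slaves to replicate from the new master, and publishes a +switch-master event so that subscribed clients direct all write operations to the new address. The slaves then resynchronize against the new master's runID, continuing from their slave_repl_offset where a partial resynchronization is possible.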
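
On the client side, the notification can be handled by parsing the payload of Sentinel's +switch-master pub/sub event, which has the form `<master name> <old ip> <old port> <new ip> <new port>`. The sketch below covers only the parsing step; the function name and the sample addresses are hypothetical, and subscribing to a live sentinel is left out.

```python
# Minimal sketch: turn a +switch-master payload into a structured event.
def parse_switch_master(payload):
    """Parse '<name> <old ip> <old port> <new ip> <new port>'."""
    parts = payload.split()
    if len(parts) != 5:
        raise ValueError("unexpected +switch-master payload: %r" % payload)
    name, old_ip, old_port, new_ip, new_port = parts
    return {
        "master_name": name,
        "old": (old_ip, int(old_port)),
        "new": (new_ip, int(new_port)),
    }


if __name__ == "__main__":
    event = parse_switch_master("mymaster 10.0.0.1 6379 10.0.0.2 6379")
    # A client would now reconnect its write connection to event["new"].
    print(event["new"])
```

A client that keeps the current master address in one place only has to update that value when such an event arrives, rather than rediscovering the topology.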

Summary

Redis Sentinel provides four core capabilities: cluster monitoring, fault detection and notification, automatic failover, and configuration update with client notification, ensuring high availability for Redis deployments.
