Understanding Redis Sentinel: High‑Availability Mechanism and Failover Process

The article explains how Redis Sentinel provides high availability by monitoring master‑slave instances, detecting failures through periodic pings, distinguishing subjective and objective down states, performing quorum arbitration, and automatically promoting a slave to master to ensure continuous service.

Practical DevOps Architecture

Jun 28, 2022

In a master‑slave setup, the master handles writes while slaves can serve reads. If the master fails, an operator has to promote a slave by hand and repoint clients, which is slow and error‑prone. Redis Sentinel automates this promotion to provide high availability.

Sentinel nodes are Redis processes running in a special mode: they do not serve client reads or writes but continuously monitor other Redis instances. A client first asks Sentinel for the address of the current master and then connects to that master directly. When the master fails, Sentinel detects the outage, promotes one of the slaves to master, and announces the change so that clients can reconnect to the new master. Failover is automatic, but writes are interrupted until the promotion completes and clients have reconnected.
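The discovery step can be sketched as follows. This is a minimal illustration, not a client implementation: `ask` is a hypothetical hook standing in for the network call that would send `SENTINEL get-master-addr-by-name <name>` to one Sentinel, and `fake_ask`, the addresses, and the master name `mymaster` are made up for the example.

```python
from typing import Callable, Iterable, Optional, Tuple

Address = Tuple[str, int]

# Hypothetical transport hook: given a Sentinel address and a master name, it
# would send SENTINEL get-master-addr-by-name and return (host, port) as
# strings, or None if that Sentinel does not know the master.
AskFn = Callable[[Address, str], Optional[Tuple[str, str]]]


def discover_master(
    sentinels: Iterable[Address], master_name: str, ask: AskFn
) -> Optional[Address]:
    """Return the master address reported by the first Sentinel that answers."""
    for sentinel in sentinels:
        try:
            reply = ask(sentinel, master_name)
        except OSError:
            # This Sentinel is unreachable; try the next one in the list.
            continue
        if reply is not None:
            host, port = reply
            return host, int(port)
    return None  # no reachable Sentinel knew the master


def fake_ask(sentinel: Address, master_name: str) -> Optional[Tuple[str, str]]:
    """Stand-in for a real network call, used only to exercise the sketch."""
    if sentinel[1] == 26379:
        raise ConnectionRefusedError("sentinel down")
    return ("10.0.0.5", "6379") if master_name == "mymaster" else None


if __name__ == "__main__":
    print(discover_master([("s1", 26379), ("s2", 26380)], "mymaster", fake_ask))
```

Trying several Sentinels in turn matters because any single Sentinel may itself be down; after a failover the client would repeat the same lookup to find the new master.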

Each Sentinel runs three periodic tasks: every 1 second it sends PING to the master, the slaves, and the other Sentinels; every 2 seconds it publishes its own view of the master to a publish/subscribe channel (__sentinel__:hello) on the monitored Redis nodes, which is how Sentinels discover each other and exchange state; every 10 seconds it sends INFO to the master and the slaves to learn the current replication topology.
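To make the three cadences concrete, here is a toy scheduler that reports which tasks fall due at a given whole second. The task names and the `due_tasks` helper are illustrative assumptions; Sentinel's real timer logic is internal and more involved.

```python
from typing import List

# Intervals in seconds, as described in the paragraph above.
TASK_INTERVALS = {"ping": 1, "hello": 2, "info": 10}


def due_tasks(elapsed_seconds: int) -> List[str]:
    """Return the tasks whose interval evenly divides the elapsed seconds."""
    return [
        name
        for name, every in TASK_INTERVALS.items()
        if elapsed_seconds % every == 0
    ]


if __name__ == "__main__":
    for second in range(1, 11):
        print(second, due_tasks(second))
```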

Subjective down (SDOWN) is the judgment of a single Sentinel that a server is down. Objective down (ODOWN) applies only to masters and is reached when at least quorum Sentinels report the master as SDOWN; at that point the Sentinels coordinate a failover.
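The two states can be expressed as two small predicates. This is a sketch under stated assumptions: timestamps are plain millisecond integers, `sdown_reports` is a hypothetical map from Sentinel name to that Sentinel's own SDOWN verdict, and the configured quorum is taken as the agreement threshold.

```python
from typing import Dict


def is_sdown(now_ms: int, last_valid_reply_ms: int, down_after_ms: int) -> bool:
    """One Sentinel's local view: no valid reply for longer than the limit."""
    return now_ms - last_valid_reply_ms > down_after_ms


def is_odown(sdown_reports: Dict[str, bool], quorum: int) -> bool:
    """Collective view: enough Sentinels independently report SDOWN."""
    return sum(sdown_reports.values()) >= quorum


if __name__ == "__main__":
    reports = {"sentinel-1": True, "sentinel-2": True, "sentinel-3": False}
    print(is_odown(reports, quorum=2))
```

Keeping the two checks separate mirrors the design: a single Sentinel with a bad network link can reach SDOWN on its own, but it cannot trigger a failover without agreement from others.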

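Arbitration is controlled by the quorum parameter of the sentinel monitor directive in the Sentinel configuration. It is typically set to floor(N/2) + 1 for N Sentinels (e.g., 2 for a 3‑Sentinel deployment), so that a minority of Sentinels cannot declare ODOWN on their own.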
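The arithmetic is easy to check in code. The helper names below are made up for illustration. The second function relies on a detail not spelled out above: besides the quorum needed for ODOWN, Redis Sentinel also requires a majority of all Sentinels to authorize the failover itself, so the larger of the two thresholds decides how many Sentinels may be lost.

```python
def recommended_quorum(sentinel_count: int) -> int:
    """Half the number of Sentinels, rounded down, plus one."""
    return sentinel_count // 2 + 1


def tolerated_sentinel_failures(sentinel_count: int, quorum: int) -> int:
    """How many Sentinels can be lost while a failover is still possible.

    Assumes both conditions must hold: `quorum` Sentinels to declare ODOWN
    and a majority of all Sentinels to authorize the failover.
    """
    majority = sentinel_count // 2 + 1
    return sentinel_count - max(quorum, majority)


if __name__ == "__main__":
    for n in (2, 3, 5):
        q = recommended_quorum(n)
        print(n, q, tolerated_sentinel_failures(n, q))
```

Under these assumptions a 2‑Sentinel deployment tolerates no Sentinel failure at all, which is why three is the usual minimum.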

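The Sentinel failover workflow is: (1) each Sentinel pings every monitored instance once per second; (2) a Sentinel marks an instance as SDOWN if it receives no valid reply within down-after-milliseconds; (3) that Sentinel asks the other Sentinels whether they also consider the master down; (4) once quorum Sentinels agree, the master is declared ODOWN; (5) the Sentinels vote to elect a leader, which promotes one slave to master and reconfigures the remaining slaves to replicate from it; (6) while the master is ODOWN, the interval of INFO commands sent to its slaves drops from 10 seconds to 1 second so that role changes are detected faster. If the master resumes sending valid replies, its SDOWN flag is cleared.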
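Step (5) leaves open which slave gets promoted. The sketch below follows the ordering described in the Redis Sentinel documentation: lowest non-zero replica priority first, then the largest replication offset, then the lexicographically smallest run ID. The `Replica` class and `pick_new_master` function are hypothetical names for this example, and the sketch leaves out Sentinel's additional filtering of slaves that have been disconnected from the master for too long.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    run_id: str
    priority: int      # replica-priority; 0 means "never promote"
    repl_offset: int   # how much of the master's stream was processed


def pick_new_master(replicas: List[Replica]) -> Optional[Replica]:
    """Choose the promotion candidate, or None if no replica is eligible."""
    candidates = [r for r in replicas if r.priority != 0]
    if not candidates:
        return None
    # Lower priority wins, then higher offset, then smaller run ID.
    return min(candidates, key=lambda r: (r.priority, -r.repl_offset, r.run_id))


if __name__ == "__main__":
    pool = [
        Replica("bbb", priority=100, repl_offset=500),
        Replica("ccc", priority=100, repl_offset=900),
        Replica("aaa", priority=0, repl_offset=9999),
    ]
    print(pick_new_master(pool))
```

Preferring the largest replication offset keeps data loss small, since that slave has received the most writes from the failed master.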
