
Designing a Highly Available Redis Service with Sentinel and Multi‑Sentinel Architecture

This article explains how to define high availability for Redis, enumerates typical failure scenarios, compares four deployment patterns—from a single instance to a three‑sentinel setup—and provides practical steps, diagrams, and tips for achieving reliable Redis service using Sentinel and virtual IP failover.

ITPUB

High‑availability definition for Redis

To be considered highly available, a Redis service must keep serving requests (or recover within a short time) when any one of the following single-point failures occurs:

Failure 1: a Redis process on a node crashes.

Failure 2: an entire node (all processes on the server) becomes unavailable.

Failure 3: the network link between two nodes is broken, causing a partition.

The design goal is to tolerate any one of these events; the probability of two independent failures happening simultaneously is assumed negligible.

Evolution of the deployment architecture

Solution 1 – Single‑node Redis (no Sentinel)

A single Redis instance provides the simplest setup but is a classic single‑point‑of‑failure. If the process or the host crashes, the service stops and any in‑memory data is lost unless persistence is enabled.

Solution 2 – Master/Slave with one Sentinel

Introduce a slave Redis and a single Sentinel process that monitors both instances. Clients query the Sentinel to discover the current master. The Sentinel can promote the slave when the master fails, but the Sentinel itself becomes a single point of failure, so true HA is not achieved.

Solution 3 – Master/Slave with two Sentinels

Deploy two Sentinel instances so that a client can still obtain master information if one Sentinel dies. However, when an entire node fails (e.g., server 1), only one Sentinel remains reachable. Redis Sentinel requires a strict majority (> 50 %) of Sentinels to be online to perform a failover, so the remaining single Sentinel cannot promote the slave and the service becomes unavailable.
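The majority rule can be checked with a little arithmetic. The sketch below is only an illustration; the helper names are made up for this example and are not part of Redis.

```python
def majority(total_sentinels: int) -> int:
    """Smallest number of Sentinels that is strictly more than half."""
    return total_sentinels // 2 + 1


def tolerated_sentinel_losses(total_sentinels: int) -> int:
    """How many Sentinels can be lost while a majority stays reachable."""
    return total_sentinels - majority(total_sentinels)


for n in (1, 2, 3):
    print(f"{n} Sentinel(s): majority={majority(n)}, "
          f"tolerated losses={tolerated_sentinel_losses(n)}")
```

With two Sentinels the majority is two, so losing either one blocks failover; two Sentinels tolerate no more losses than one does.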

Solution 4 – Master/Slave with three Sentinels (final architecture)

Adding a third Sentinel on a separate server brings the total to three, so a majority is two. After any single process crash or single machine outage, at least two Sentinels remain online, and a single broken network link still leaves at least two Sentinels able to reach each other. In each case a majority can authorize automatic failover and the service continues.

Optionally a fourth server can host an additional slave, forming a 1‑master + 2‑slave topology for extra data redundancy (at the cost of higher replication traffic).
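To compare the two-Sentinel and three-Sentinel layouts, the following sketch enumerates single-host outages and reports whether the service would survive each one. It is a simplified model, not Sentinel's actual logic: the host names (server1 to server3) and the placement of processes on them are assumptions made for illustration, and it only covers whole-host failures.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Topology:
    name: str
    master_host: str
    slave_hosts: Tuple[str, ...]
    sentinel_hosts: Tuple[str, ...]  # one entry per Sentinel process


def survives_host_outage(topo: Topology, failed_host: str) -> bool:
    """True if clients can still reach a master after `failed_host` goes down."""
    sentinels_left = [h for h in topo.sentinel_hosts if h != failed_host]
    if not sentinels_left:
        return False  # no Sentinel left for clients to ask
    if failed_host != topo.master_host:
        return True  # master untouched, no failover needed
    slaves_left = [h for h in topo.slave_hosts if h != failed_host]
    # Failover needs a surviving slave and a strict majority of all Sentinels.
    has_majority = len(sentinels_left) > len(topo.sentinel_hosts) / 2
    return bool(slaves_left) and has_majority


two_sentinels = Topology("Solution 3", "server1", ("server2",),
                         ("server1", "server2"))
three_sentinels = Topology("Solution 4", "server1", ("server2",),
                           ("server1", "server2", "server3"))

for topo in (two_sentinels, three_sentinels):
    hosts = sorted({topo.master_host, *topo.slave_hosts, *topo.sentinel_hosts})
    print(topo.name, {h: survives_host_outage(topo, h) for h in hosts})
```

In this model, Solution 3 fails when the master's host goes down, while Solution 4 survives the loss of any one host.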

Key configuration details

Typical Redis instance start command:

redis-server /etc/redis/redis.conf

Typical Sentinel configuration (e.g., /etc/redis/sentinel.conf) includes:

port 26379
# sentinel monitor <name> <master IP> <master port> <quorum>
sentinel monitor mymaster 10.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 60000
sentinel parallel-syncs mymaster 1

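Start each Sentinel with:

redis-sentinel /etc/redis/sentinel.conf

With three Sentinels running on three different hosts, a quorum of 2 means two Sentinels must agree that the master is unreachable before a failover is started. The failover itself must then be authorized by a majority of all Sentinels (2 of 3). Any single failure therefore still leaves enough Sentinels both to detect the outage and to complete the failover, satisfying the >50 % rule.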
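Quorum and majority are easy to confuse, so here is a minimal sketch of how the two thresholds interact. The function names are hypothetical, and the model ignores timing and leader election details; it only encodes the rule that detection needs quorum agreement while authorization needs the larger of quorum and majority.

```python
def is_objectively_down(agree_down: int, quorum: int) -> bool:
    """The master is flagged objectively down once `quorum` Sentinels agree."""
    return agree_down >= quorum


def failover_authorized(votes: int, total_sentinels: int, quorum: int) -> bool:
    """The failover leader needs votes from a majority of all Sentinels,
    or from `quorum` Sentinels when quorum is larger than the majority."""
    majority = total_sentinels // 2 + 1
    return votes >= max(majority, quorum)


# Three Sentinels, quorum 2, one host lost: the two survivors suffice.
print(is_objectively_down(2, quorum=2), failover_authorized(2, 3, quorum=2))

# Five Sentinels, quorum 2, only two reachable: detected but not authorized.
print(is_objectively_down(2, quorum=2), failover_authorized(2, 5, quorum=2))
```

The second case shows why a low quorum alone does not make failover possible: the outage is detected, but without a majority no Sentinel may act on it.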

Client integration

Clients must use a Redis library that supports Sentinel discovery (e.g., ioredis for Node.js, go-redis/redis for Go, Jedis for Java, Predis for PHP). The library is given a list of Sentinel addresses and the name of the monitored master (mymaster in the configuration above); it contacts the first reachable Sentinel, asks for the current master address, and then connects directly to that master.
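The discovery loop that such libraries perform can be sketched roughly as follows. This is not any library's real code: the `ask` callable is a hypothetical stand-in for sending SENTINEL get-master-addr-by-name to one Sentinel, and the addresses other than 10.0.0.1 are invented for the demo.

```python
from typing import Callable, Iterable, Optional, Tuple

Address = Tuple[str, int]


def discover_master(
    sentinels: Iterable[Address],
    master_name: str,
    ask: Callable[[Address, str], Optional[Address]],
) -> Address:
    """Return the master address reported by the first Sentinel that answers.

    `ask` should raise OSError when a Sentinel is unreachable and return
    None when the Sentinel does not know `master_name`.
    """
    for sentinel in sentinels:
        try:
            master = ask(sentinel, master_name)
        except OSError:
            continue  # Sentinel unreachable: try the next one
        if master is not None:
            return master
    raise ConnectionError(f"no Sentinel could report a master for {master_name!r}")


# Demo with a fake transport: the first Sentinel is down, the second answers.
def fake_ask(sentinel: Address, master_name: str) -> Optional[Address]:
    if sentinel == ("10.0.0.1", 26379):
        raise OSError("connection refused")
    return ("10.0.0.2", 6379)


sentinel_addrs = [("10.0.0.1", 26379), ("10.0.0.2", 26379), ("10.0.0.3", 26379)]
print(discover_master(sentinel_addrs, "mymaster", fake_ask))
```

In practice the chosen client library handles this loop, and typically also reconnects when the master changes, so application code only supplies the Sentinel list and the master name.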

Providing a single endpoint with a Virtual IP

To keep the client experience identical to a single-node deployment, a virtual IP (VIP) can be bound to the host of the current master. When a failover occurs, keepalived or a similar script moves the VIP to the host of the newly promoted master. Clients continue to use the same IP and port, unaware of the underlying topology change.
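One possible shape for such a script is a hook registered through Sentinel's client-reconfig-script directive, which Sentinel calls with the arguments master-name, role, state, from-ip, from-port, to-ip and to-port. The sketch below only builds the command a host would run and does not execute it. The VIP, the interface name and the local addresses are all assumptions, and it presumes the hook runs on hosts that run both a Sentinel and a Redis instance.

```python
from typing import List

VIP_CIDR = "10.0.0.100/24"  # hypothetical virtual IP
INTERFACE = "eth0"          # hypothetical network interface


def vip_command(local_ip: str, new_master_ip: str) -> List[str]:
    """Build the `ip` command this host should run after a failover."""
    # The host that now holds the master claims the VIP; others release it.
    action = "add" if local_ip == new_master_ip else "del"
    return ["ip", "addr", action, VIP_CIDR, "dev", INTERFACE]


def handle_failover(argv: List[str], local_ip: str) -> List[str]:
    """Parse the seven arguments passed by Sentinel and pick the VIP action."""
    _name, _role, _state, _from_ip, _from_port, to_ip, _to_port = argv
    return vip_command(local_ip, to_ip)


# Role and state values below are placeholders; this sketch ignores them.
example_args = ["mymaster", "leader", "start", "10.0.0.1", "6379", "10.0.0.2", "6379"]
print(handle_failover(example_args, local_ip="10.0.0.2"))
print(handle_failover(example_args, local_ip="10.0.0.1"))
```

A real deployment would also need to run the command with sufficient privileges and refresh neighbours' ARP caches, which is part of what keepalived takes care of.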

Operational considerations

Run a process supervisor (e.g., supervisord) to automatically restart Redis or Sentinel processes after crashes.

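Configure min-slaves-to-write and min-slaves-max-lag on the master (named min-replicas-to-write and min-replicas-max-lag since Redis 5) so that it stops accepting writes when fewer than the required number of slaves are connected and within the allowed lag. This reduces the risk of data loss during network partitions.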
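The effect of these two settings can be modelled in a few lines. This is a simplified sketch of the rule, not Redis source code, and the threshold values in the demo are illustrative only.

```python
from typing import Iterable


def master_accepts_writes(
    slave_lags_seconds: Iterable[float],
    min_slaves_to_write: int,
    min_slaves_max_lag: float,
) -> bool:
    """Writes are accepted only when at least `min_slaves_to_write`
    connected slaves have a lag of at most `min_slaves_max_lag` seconds."""
    if min_slaves_to_write == 0 or min_slaves_max_lag == 0:
        return True  # setting either value to 0 disables the check
    healthy = sum(1 for lag in slave_lags_seconds if lag <= min_slaves_max_lag)
    return healthy >= min_slaves_to_write


print(master_accepts_writes([1.0], 1, 10))   # one slave, small lag
print(master_accepts_writes([30.0], 1, 10))  # slave too far behind
print(master_accepts_writes([], 1, 10))      # master cut off from its slave
```

The last case is the one that matters during a partition: an isolated old master refuses writes instead of accepting data that would be discarded once it rejoins as a slave.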

If resources are limited, a Sentinel can be placed on a client machine, but this mixes responsibilities and may cause operational friction between service‑provider and consumer teams.

Summary

Building a usable Redis service is trivial; achieving high availability requires:

At least two Redis instances (master + slave).

Three Sentinel processes on three distinct hosts, so that a majority of Sentinels survives any single failure.

Client libraries that understand Sentinel discovery.

An optional virtual IP (managed by keepalived) to present a single endpoint.

Process supervision to auto‑restart crashed components.

This combination tolerates any single‑point failure while keeping the service continuously reachable.
