Building a Highly Available Redis Service with Sentinel and Virtual IP

This article explains how to design and deploy a fault‑tolerant Redis architecture using master‑slave replication, multiple Sentinel instances, and a virtual IP so that the service remains available despite process crashes, server outages, or network partitions.

ITFLY8 Architecture Home

Redis, an in-memory key-value database, is one of the most widely used data stores in web development. It is commonly used for session storage, caching hot data, simple message queues (LPUSH/BRPOP), and publish/subscribe systems. Large internet companies typically expose Redis as a foundational service to internal teams.

Teams that provide Redis as a service are frequently asked whether their offering is highly available. This article describes a small-scale HA Redis deployment that the author built for a recent project.

High availability for Redis means the service continues to operate during anomalies, or recovers within a very short time. Typical anomalies include a single Redis process crashing, an entire node going down, or a network partition between nodes.

These events are low‑probability, and the core HA principle is that the system should tolerate any single‑point failure, since the chance of multiple independent failures occurring simultaneously is negligible.
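To make the "negligible" claim concrete, here is a small arithmetic sketch. It assumes server failures are independent and uses a hypothetical per-server unavailability of 0.1%; neither figure comes from the original article.

```python
from math import comb


def prob_at_least_two_down(p: float, n: int = 3) -> float:
    """Probability that two or more of n independent servers are down at once."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(2, n + 1))


p = 0.001  # hypothetical probability that any one server is down at a given moment
print(f"one given server down:        {p:.6f}")
print(f"two or more of three down:    {prob_at_least_two_down(p):.9f}")
```

Under these assumptions a double failure is roughly three hundred times less likely than a single failure, which is why the designs that follow focus on surviving one fault at a time.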

Common HA solutions such as Keepalived, Codis, Twemproxy, and Redis Sentinel exist; for modest data volumes the author chose the official Redis Sentinel solution.

Redis Sentinel monitors Redis servers and, when the current master fails, automatically promotes a slave to master. Clients ask Sentinel for the address of the current master, so they can follow a failover without manual reconfiguration.

Scheme 1: Single Redis server without Sentinel

This simple setup works for personal projects but suffers from a single point of failure: if the Redis process or its host crashes, the service becomes unavailable and any non‑persistent data is lost.

Scheme 2: Master‑slave replication with a single Sentinel

Adding a slave removes the master as a single point of failure, and a Sentinel process watches the two Redis instances so that it can promote the slave when the master fails. However, the Sentinel itself is now a single point of failure: if it crashes, clients cannot discover the current master.
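The discovery problem can be sketched as a loop over known Sentinel addresses. This is a simplified illustration, not the code of any real client: the `ask` callback stands in for a network call that would send `SENTINEL get-master-addr-by-name <master-name>` to a Sentinel, and all addresses are hypothetical. Client libraries such as redis-py (`redis.sentinel.Sentinel`) provide a real implementation of this idea.

```python
from typing import Callable, Iterable, Optional, Tuple

Address = Tuple[str, int]


def discover_master(
    sentinels: Iterable[Address],
    ask: Callable[[Address], Address],
) -> Optional[Address]:
    """Return the master address from the first Sentinel that answers."""
    for sentinel in sentinels:
        try:
            return ask(sentinel)
        except OSError:
            continue  # this Sentinel is unreachable; try the next one
    return None  # no Sentinel answered, so the master cannot be discovered


def fake_ask(sentinel: Address) -> Address:
    """Stand-in for a network query; pretends the first Sentinel has crashed."""
    if sentinel == ("10.0.0.1", 26379):
        raise ConnectionRefusedError("sentinel is down")
    return ("10.0.0.2", 6379)


# With a single Sentinel that has crashed, discovery fails.
print(discover_master([("10.0.0.1", 26379)], fake_ask))
# With a second Sentinel in the list, the client still finds the master.
print(discover_master([("10.0.0.1", 26379), ("10.0.0.2", 26379)], fake_ask))
```

The first call returns `None` and the second returns the master address, which shows why a lone Sentinel leaves clients stranded when it goes down.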

Scheme 3: Master‑slave replication with two Sentinel instances

Running two Sentinels allows a client to contact either one, but a failover can only proceed when a majority (more than 50%) of all Sentinels can reach each other and agree. If one server goes down along with its Sentinel, only one of the two Sentinels remains. That is exactly 50%, not a majority, so the surviving Sentinel cannot promote the slave automatically.

Redis enforces this majority rule to avoid split-brain scenarios, in which two masters accept writes simultaneously and the data becomes inconsistent.
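The majority rule comes down to one comparison. The sketch below is only an illustration of that arithmetic (the function name is made up for this example), showing what happens when one Sentinel is lost from deployments of different sizes.

```python
def has_majority(total_sentinels: int, reachable: int) -> bool:
    """True when strictly more than half of all Sentinels are reachable."""
    return 2 * reachable > total_sentinels


for total in (1, 2, 3):
    survivors = total - 1  # one Sentinel is lost
    verdict = "failover possible" if has_majority(total, survivors) else "no failover"
    print(f"{total} Sentinel(s), {survivors} left after one failure: {verdict}")
```

Only the three-Sentinel case keeps a majority after losing one member, because two out of three is more than half while one out of two is not.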

Scheme 4: Master‑slave replication with three Sentinel instances

By adding a third server and a third Sentinel, the architecture can survive a single process failure, an entire machine failure, or a network partition between any two machines, because a majority of Sentinels (two out of three) will always be reachable to elect a new master.

If desired, a Redis instance can also be deployed on the third server, creating a 1-master + 2-slave topology for additional redundancy. The trade-off is that each additional slave adds replication traffic on the master, which can increase replication lag.

When the server hosting the master loses network connectivity, the remaining Sentinels promote the surviving slave to master. For a brief interval two masters may exist, and any writes accepted by the isolated old master are lost when it rejoins as a slave of the new master. Configuring min-slaves-to-write and min-slaves-max-lag mitigates this: the master refuses writes when fewer than the configured number of slaves have acknowledged replication within the configured lag, in seconds.
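The write guard can be modelled in a few lines. This is a simplified model of the behaviour, not Redis source code; the function and parameter names simply mirror the two settings, and newer Redis versions name them min-replicas-to-write and min-replicas-max-lag.

```python
from typing import Sequence


def master_accepts_writes(
    slave_lags_seconds: Sequence[float],
    min_slaves_to_write: int,
    min_slaves_max_lag: float,
) -> bool:
    """Accept writes only if enough slaves acknowledged replication recently.

    slave_lags_seconds holds, per connected slave, the seconds elapsed since
    the master last received an acknowledgement from that slave.
    """
    healthy = sum(1 for lag in slave_lags_seconds if lag <= min_slaves_max_lag)
    return healthy >= min_slaves_to_write


# Normal operation: one slave, last acknowledgement 1 second ago.
print(master_accepts_writes([1.0], min_slaves_to_write=1, min_slaves_max_lag=10))
# Isolated master: the slave has not acknowledged for 30 seconds.
print(master_accepts_writes([30.0], min_slaves_to_write=1, min_slaves_max_lag=10))
```

With both settings at 1 and 10, an isolated master stops accepting writes once its slave has been silent for more than 10 seconds, which bounds the amount of data that can be lost during a partition.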

Although placing a Sentinel on the client side could reduce the number of servers needed, organizational boundaries often make this impractical, so the three‑server, three‑Sentinel design is preferred.

Usability: Making Sentinel appear like a single‑node Redis

Clients typically prefer to connect to a single IP and port. By introducing a Virtual IP (VIP) that always points to the current master, and moving the VIP via a failover script when a master‑slave switch occurs, the client experience remains identical to a standalone Redis instance.
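One way to move the VIP is through Sentinel's `sentinel client-reconfig-script <master-name> <script-path>` directive, which runs a script after a failover with the arguments `<master-name> <role> <state> <from-ip> <from-port> <to-ip> <to-port>`. The sketch below is not the author's script. It assumes one copy is deployed on each Redis host, and the VIP, interface name, and local address are hypothetical values that would differ per host.

```python
import subprocess
import sys
from typing import List

VIP = "10.0.0.100/24"  # hypothetical virtual IP with prefix length
INTERFACE = "eth0"     # hypothetical network interface
MY_IP = "10.0.0.2"     # hypothetical address of the host running this script


def vip_command(argv: List[str], my_ip: str = MY_IP) -> List[str]:
    """Build the `ip` command that adds or removes the VIP on this host.

    argv follows the order Sentinel passes to a client-reconfig-script:
    <master-name> <role> <state> <from-ip> <from-port> <to-ip> <to-port>
    """
    master_name, role, state, from_ip, from_port, to_ip, to_port = argv
    # Claim the VIP if this host is the new master, otherwise release it.
    action = "add" if to_ip == my_ip else "del"
    return ["ip", "addr", action, VIP, "dev", INTERFACE]


if __name__ == "__main__" and len(sys.argv) == 8:
    # Either command can fail harmlessly when the VIP is already in the
    # desired state, so the exit code is deliberately not checked.
    subprocess.run(vip_command(sys.argv[1:]), check=False)
```

A production script would usually also refresh neighbouring hosts' ARP caches after claiming the VIP so that traffic reaches the new master promptly; that step is omitted here.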

Conclusion

Deploying a basic Redis service is straightforward, but achieving high availability introduces complexity: additional servers, multiple Sentinel processes, and a slave instance are required to survive low‑probability failures. In practice, a supervisor process monitors Redis and Sentinel, automatically restarting them if they exit unexpectedly.
