Redis High Concurrency, High Availability, Replication, and Sentinel Deep Dive

This article explains how to scale Redis caching for high throughput and high availability: master‑slave replication with read‑write separation, cluster scaling, disk‑less sync, incremental replication, and Sentinel‑based automatic failover, with the goal of 99.99% availability and hundreds of thousands of QPS.

Big Data Technology & Architecture

Jul 22, 2019

If you use Redis as a cache, you must consider scaling it across multiple machines, ensuring high concurrency, and preventing a single point of failure.

High concurrency : In a typical master‑slave setup (one master, many slaves), a single master handles tens of thousands of QPS, and serving reads from several slaves raises the total to around 100k QPS. When the data set grows to tens or hundreds of gigabytes, it no longer fits comfortably on one master, so a Redis cluster with multiple masters is required; such a cluster can provide hundreds of thousands of QPS.

High availability : Adding Sentinel to a master‑slave deployment enables automatic failover: when the master crashes, a slave is promoted to master without manual intervention.

Read‑write separation : Configure one master for writes and multiple slaves for reads; this increases overall request capacity.
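
A minimal sketch of read‑write separation on the client side, assuming one master and two slaves. The addresses, the WRITE_COMMANDS set, and the route helper are hypothetical and only illustrate the routing idea; a real client library may do this differently.

```python
from itertools import cycle

# Hypothetical addresses, for illustration only.
MASTER = "10.0.0.1:6379"
SLAVES = ["10.0.0.2:6379", "10.0.0.3:6379"]

# A small, deliberately incomplete set of write commands.
WRITE_COMMANDS = {"SET", "DEL", "INCR", "EXPIRE", "LPUSH", "HSET"}

_slave_cycle = cycle(SLAVES)


def route(command: str) -> str:
    """Return the address of the node that should serve this command."""
    if command.upper() in WRITE_COMMANDS:
        return MASTER
    # Reads are spread round-robin across the slaves.
    return next(_slave_cycle)
```

Because replication is asynchronous, a read routed to a slave can briefly return older data than the master holds.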

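Replication fundamentals : The master asynchronously copies data to slaves. When a slave connects, it sends a PSYNC command to the master. If the slave is reconnecting and the master still holds the data it missed, only that missing data is transferred; otherwise a full resynchronization occurs. In a full resynchronization the master creates an RDB snapshot, sends it to the slave, and buffers the writes that arrive in the meantime in memory so it can send them after the snapshot.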

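Partial resynchronization : Since Redis 2.8, the master keeps a backlog of recent write commands in memory. When a slave reconnects, it reports its replica offset and the master resumes from that point, avoiding a full sync. A full sync is still needed if the offset has already been overwritten in the backlog or if the slave was following a different master.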
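
The decision between a partial and a full resynchronization can be sketched as follows. This is a simplified model, not Redis source code: the Backlog class, its fields, and sync_mode are hypothetical names, and the real checks involve more state.

```python
from dataclasses import dataclass


@dataclass
class Backlog:
    """Simplified model of the master's replication backlog."""
    size: int            # backlog capacity in bytes
    master_offset: int   # total bytes the master has written to the stream

    @property
    def first_offset(self) -> int:
        # Oldest offset still available in the backlog.
        return max(0, self.master_offset - self.size)


def sync_mode(backlog: Backlog, master_id: str,
              slave_master_id: str, slave_offset: int) -> str:
    """Return 'partial' if the backlog can serve the slave, else 'full'."""
    if slave_master_id != master_id:
        return "full"  # the slave was following a different master
    if slave_offset < backlog.first_offset:
        return "full"  # the missing data was already overwritten
    if slave_offset > backlog.master_offset:
        return "full"  # the offset is ahead of the master, so it is invalid
    return "partial"
```

A larger backlog tolerates longer disconnections before a full sync becomes necessary, at the cost of more memory on the master.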

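Disk‑less replication : With repl-diskless-sync enabled, the master generates the RDB snapshot in memory and streams it directly to slaves over the socket without writing it to disk. repl-diskless-sync-delay sets how many seconds the master waits before starting the transfer, so that slaves arriving in that window can share the same stream.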

Expired‑key handling : Slaves do not expire keys themselves; they rely on the master to send DEL commands when a key expires or is evicted.
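
A minimal sketch of this behavior, assuming a toy in-memory model: only the master checks expiry times, and the slave removes a key only when it receives the resulting DEL. ExpiringMaster and apply_on_slave are hypothetical names used for illustration.

```python
class ExpiringMaster:
    """Toy master that owns all expiry decisions."""

    def __init__(self):
        self.data = {}      # key -> value
        self.expires = {}   # key -> absolute expiry time in seconds

    def set(self, key, value, expire_at=None):
        self.data[key] = value
        if expire_at is None:
            self.expires.pop(key, None)
        else:
            self.expires[key] = expire_at

    def expire_due(self, now):
        """Delete expired keys and return the DEL commands to replicate."""
        due = [k for k, t in self.expires.items() if t <= now]
        for k in due:
            del self.data[k]
            del self.expires[k]
        return [("DEL", k) for k in due]


def apply_on_slave(slave_data, commands):
    """The slave never checks TTLs; it only applies what the master sends."""
    for op, key in commands:
        if op == "DEL":
            slave_data.pop(key, None)
```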

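Full replication flow : The master runs BGSAVE to produce an RDB snapshot, sends the RDB file to the slave, then sends the write commands it buffered while the snapshot was being created and transferred. The client-output-buffer-limit setting for slaves caps how much memory that buffer may use; if the limit is exceeded, the master closes the slave's connection and the sync has to start again.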
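
The flow above can be sketched with plain dictionaries. This is an illustrative model under simplifying assumptions (a single slave, SET-style writes only); SyncMaster and SyncSlave are hypothetical names, and the snapshot copy stands in for BGSAVE.

```python
class SyncMaster:
    def __init__(self):
        self.data = {}
        self.sync_buffer = None  # holds writes only while a sync is running

    def write(self, key, value):
        self.data[key] = value
        if self.sync_buffer is not None:
            self.sync_buffer.append((key, value))

    def start_full_sync(self):
        """Take a point-in-time snapshot (stand-in for BGSAVE)."""
        self.sync_buffer = []
        return dict(self.data)

    def finish_full_sync(self):
        """Return the writes buffered since the snapshot was taken."""
        buffered, self.sync_buffer = self.sync_buffer, None
        return buffered


class SyncSlave:
    def __init__(self):
        self.data = {}

    def load_snapshot(self, snapshot):
        self.data = dict(snapshot)  # previous contents are discarded

    def apply(self, commands):
        for key, value in commands:
            self.data[key] = value


master, slave = SyncMaster(), SyncSlave()
master.write("a", 1)
snapshot = master.start_full_sync()    # step 1: snapshot
master.write("b", 2)                   # a write arrives during the transfer
slave.load_snapshot(snapshot)          # step 2: slave loads the snapshot
slave.apply(master.finish_full_sync()) # step 3: buffered writes catch it up
```

After the third step the slave holds the same keys as the master, including the write that arrived while the snapshot was in flight.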

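Incremental replication : If the master‑slave connection breaks, including in the middle of a full sync, the slave reconnects and reports its replica offset, and the master sends only the missing data from the backlog. If that offset is no longer in the backlog, a full resynchronization runs instead.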

Heartbeat : Master and slaves exchange heartbeats so that each side can detect a dead link. By default the master pings its slaves every 10 seconds, and each slave reports to the master every second.

Achieving 99.99% availability : Ensure each Redis instance has backups, use Sentinel for monitoring, notification, and automatic failover, and configure quorum and majority correctly (e.g., three Sentinel nodes with quorum = 2).
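
To make the 99.99% target concrete, the arithmetic below converts an availability figure into a yearly downtime budget. It assumes a 365-day year; the function name is illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_budget_minutes(availability: float) -> float:
    """Maximum minutes of downtime per year for a given availability."""
    return MINUTES_PER_YEAR * (1.0 - availability)


print(round(downtime_budget_minutes(0.9999), 2))  # 99.99% -> 52.56 minutes
```

With under an hour of allowed downtime per year, recovering from a master crash by hand is rarely fast enough, which is why automatic failover matters.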

Sentinel basics : Sentinel monitors master/slave health, sends alerts, performs failover, and updates client configuration. It operates as a distributed cluster, requiring at least three instances for robustness.

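Sentinel states : sdown (subjectively down) is the view of a single Sentinel: it marks a master sdown when the master has not answered its pings correctly for longer than down-after-milliseconds. odown (objectively down) is reached when at least a quorum of Sentinels consider the master sdown.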
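
The two states can be expressed as two small predicates. This is a sketch of the rule as described above, not Sentinel's implementation; the function and parameter names are assumptions, apart from down_after_ms, which mirrors the down-after-milliseconds setting.

```python
from typing import Iterable


def is_sdown(ms_since_last_valid_reply: int, down_after_ms: int) -> bool:
    """One Sentinel's local opinion about the master."""
    return ms_since_last_valid_reply > down_after_ms


def is_odown(sdown_votes: Iterable[bool], quorum: int) -> bool:
    """True when at least `quorum` Sentinels report the master as sdown."""
    return sum(1 for vote in sdown_votes if vote) >= quorum
```

For example, with three Sentinels and quorum = 2, one Sentinel losing its own network link produces a single sdown vote and does not trigger a failover.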

Discovery mechanism : Sentinels use the __sentinel__:hello Pub/Sub channel to announce themselves and exchange monitoring configurations.

Slave self‑correction : Sentinels ensure slaves replicate the correct master and, after a failover, re‑attach slaves to the new master.

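Election algorithm : When a master is odown, Sentinel selects a slave to promote. Slaves that have been disconnected from the master for too long are excluded. The remaining slaves are ranked by priority (a lower slave priority value ranks higher), then by replica offset (the slave that has replicated more data ranks higher), and finally by run‑id (the smaller run‑id wins).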
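
The selection order can be sketched as a filter followed by a sort key. This is a simplified reading of the rule, not Sentinel's code: SlaveInfo and pick_slave are hypothetical, and the disconnection cutoff (ten times down-after-milliseconds plus the time the master has been sdown) and the convention that priority 0 means "never promote" are taken from the Sentinel documentation as I understand it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SlaveInfo:
    run_id: str
    priority: int             # lower wins; 0 means never promote
    repl_offset: int          # larger means more data replicated
    master_link_down_ms: int  # time the slave has been cut off from the master


def pick_slave(slaves: List[SlaveInfo], down_after_ms: int,
               master_sdown_ms: int) -> Optional[SlaveInfo]:
    """Return the slave to promote, or None if no slave qualifies."""
    max_down = down_after_ms * 10 + master_sdown_ms
    candidates = [
        s for s in slaves
        if s.priority != 0 and s.master_link_down_ms <= max_down
    ]
    if not candidates:
        return None
    # Rank by priority, then by larger offset, then by smaller run id.
    return min(candidates, key=lambda s: (s.priority, -s.repl_offset, s.run_id))
```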

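Quorum and majority : Failover has two separate conditions. First, at least quorum Sentinels (a configured number) must consider the master down for it to become odown. Second, a majority of all Sentinels (more than half of the total) must authorize one Sentinel to carry out the failover. If quorum is larger than the majority, quorum Sentinels must authorize it. With two Sentinels the majority is 2, so losing one Sentinel blocks failover; with three Sentinels and quorum = 2, failover still works after one Sentinel is lost.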
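
A small check of the quorum and majority rule, under the assumption that the same set of reachable Sentinels both agrees the master is down and votes on the failover. The function names are illustrative.

```python
def majority(total_sentinels: int) -> int:
    """Smallest number that is more than half of all Sentinels."""
    return total_sentinels // 2 + 1


def can_failover(total: int, reachable: int, quorum: int) -> bool:
    """reachable = Sentinels that are up and agree the master is down."""
    return reachable >= quorum and reachable >= majority(total)


print(can_failover(total=2, reachable=1, quorum=1))  # False: majority of 2 is 2
print(can_failover(total=3, reachable=2, quorum=2))  # True
print(can_failover(total=5, reachable=2, quorum=2))  # False: majority of 5 is 3
```

This is why at least three Sentinel instances are recommended: a two-Sentinel deployment cannot fail over once either Sentinel is lost.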

Configuration epoch : The failover Sentinel obtains a unique configuration epoch (version) for the new master, which is propagated to other Sentinels via Pub/Sub to keep configurations consistent.
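
The way a version number keeps configurations consistent can be sketched as "the higher epoch wins". This is a deliberately reduced model: MasterConfig and merge are hypothetical names, and the addresses are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MasterConfig:
    address: str       # address of the node currently acting as master
    config_epoch: int  # version assigned by the Sentinel that ran the failover


def merge(local: MasterConfig, received: MasterConfig) -> MasterConfig:
    """Keep whichever configuration carries the higher epoch."""
    return received if received.config_epoch > local.config_epoch else local


old = MasterConfig("10.0.0.1:6379", config_epoch=3)
new = MasterConfig("10.0.0.2:6379", config_epoch=4)
current = merge(old, new)  # a Sentinel holding `old` adopts `new`
```

A Sentinel that was unreachable during the failover converges on the new master as soon as it receives a configuration with a higher epoch than its own.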

Written by

Big Data Technology & Architecture

Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
