
How to Build a Stable, High‑Performance Redis Cluster from Scratch

This guide walks through the evolution of a Redis deployment—from a single‑node cache, through data persistence options, master‑slave replication, Sentinel automatic failover, and finally sharding with Redis Cluster—explaining each technique, its trade‑offs, and practical implementation steps.

dbaplus Community

1. Single‑Node Redis

Start with the simplest setup: a single Redis instance used as an in‑memory cache. On a cache miss, the application reads the data from MySQL and writes it to Redis; later reads of the same key are served from Redis. If the instance crashes, all traffic falls back to MySQL, which can become a performance bottleneck. With persistence disabled, the instance also restarts empty, so anything held only in Redis is lost.
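The read path described above is often called cache-aside. The following is a minimal sketch of it, assuming plain Python dictionaries as stand-ins for MySQL and Redis; the class name CacheAsideStore and its attributes are hypothetical and not part of any Redis client.

```python
from typing import Dict, Optional


class CacheAsideStore:
    """Toy cache-aside reader; dicts stand in for MySQL and Redis."""

    def __init__(self, database: Dict[str, str]) -> None:
        self.database = database          # stand-in for MySQL
        self.cache: Dict[str, str] = {}   # stand-in for Redis
        self.db_reads = 0                 # how often the slow path ran

    def get(self, key: str) -> Optional[str]:
        if key in self.cache:
            return self.cache[key]        # fast path: cache hit
        self.db_reads += 1
        value = self.database.get(key)    # slow path: query the database
        if value is not None:
            self.cache[key] = value       # populate the cache for later reads
        return value


store = CacheAsideStore({"user:1": "alice"})
store.get("user:1")    # miss: reads the database, then fills the cache
store.get("user:1")    # hit: served from the cache
print(store.db_reads)  # prints 1

store.cache.clear()    # simulate a crash of a cache without persistence
store.get("user:1")    # the request falls through to the database again
print(store.db_reads)  # prints 2
```

Clearing the cache mimics a restart without persistence: every key has to be fetched from the database again, which is the load spike the section describes.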

2. Data Persistence

Redis offers two main persistence mechanisms:

RDB (snapshot): Periodically writes a compressed binary snapshot of the dataset to disk. Advantages are small file size and low write frequency; the downside is that writes made after the last snapshot are lost if the instance crashes.

AOF (Append‑Only File): Logs every write command. How much data can be lost depends on the fsync policy, and the file can grow large. AOF supports three policies: appendfsync always (fsync after every write; safest but slowest), appendfsync everysec (a background thread fsyncs once per second, so roughly one second of writes can be lost), and appendfsync no (the OS decides when to flush).
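To make the trade-off between the three fsync policies concrete, here is a toy append-only log in Python. It is a sketch under stated assumptions, not how Redis implements AOF: the class AppendOnlyLog is hypothetical, and the everysec branch checks elapsed time on each append instead of using a background thread.

```python
import os
import tempfile
import time


class AppendOnlyLog:
    """Toy append-only log illustrating always / everysec / no."""

    def __init__(self, path: str, policy: str = "everysec") -> None:
        if policy not in ("always", "everysec", "no"):
            raise ValueError(f"unknown policy: {policy}")
        self.policy = policy
        self.file = open(path, "ab")
        self.last_sync = time.monotonic()
        self.sync_count = 0  # number of explicit fsync calls made

    def append(self, command: str) -> None:
        self.file.write(command.encode() + b"\n")
        if self.policy == "always":
            self._sync()  # durable before returning, at the cost of latency
        elif self.policy == "everysec":
            # Simplification: Redis uses a background thread for this.
            if time.monotonic() - self.last_sync >= 1.0:
                self._sync()
        # policy "no": the operating system decides when to flush

    def _sync(self) -> None:
        self.file.flush()
        os.fsync(self.file.fileno())
        self.last_sync = time.monotonic()
        self.sync_count += 1

    def close(self) -> None:
        self.file.close()


with tempfile.TemporaryDirectory() as directory:
    log = AppendOnlyLog(os.path.join(directory, "demo.aof"), policy="always")
    for i in range(3):
        log.append(f"SET key{i} value{i}")
    print(log.sync_count)  # prints 3: one fsync per write
    log.close()
```

With the always policy every write pays for a disk sync, which is why everysec is a common middle ground.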

To mitigate AOF growth, Redis performs an AOF rewrite, creating a new compact file that contains only the commands needed to rebuild the current state of each key.
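The effect of a rewrite can be illustrated with a small Python sketch. It assumes a log of SET and DEL commands held as tuples, and the helper names replay and rewrite are hypothetical. Redis itself builds the new file from the in-memory dataset rather than by re-reading the old log; the sketch only shows why the result is smaller yet equivalent.

```python
from typing import Dict, List, Tuple

Command = Tuple[str, ...]


def replay(log: List[Command]) -> Dict[str, str]:
    """Rebuild the dataset by applying every logged command in order."""
    state: Dict[str, str] = {}
    for command in log:
        if command[0] == "SET":
            state[command[1]] = command[2]
        elif command[0] == "DEL":
            state.pop(command[1], None)
    return state


def rewrite(log: List[Command]) -> List[Command]:
    """Return a compact log holding one SET per surviving key."""
    return [("SET", key, value) for key, value in replay(log).items()]


log: List[Command] = [
    ("SET", "a", "1"),
    ("SET", "a", "2"),   # overwrites the previous value of "a"
    ("SET", "b", "1"),
    ("DEL", "b"),        # "b" no longer exists afterwards
]
compact = rewrite(log)
print(compact)                         # prints [('SET', 'a', '2')]
print(replay(compact) == replay(log))  # prints True
```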

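Redis 4.0+ also supports hybrid persistence (the aof-use-rdb-preamble option): an AOF rewrite writes the dataset as an RDB snapshot at the head of the new AOF file, and subsequent writes are appended as ordinary commands. The file is smaller and loads faster on restart, while durability is still governed by the AOF fsync policy.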

3. Master‑Slave Replication (Multiple Replicas)

Deploy several Redis instances: one master handling writes and one or more slaves that continuously replicate the master’s data. Benefits include reduced downtime (a slave can be manually promoted to master when the master fails) and improved read throughput by distributing read requests across slaves. Replication is asynchronous, so a slave may briefly serve stale data.
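The sketch below models asynchronous replication with a toy master that keeps an ordered list of writes and a slave that applies them from its own offset. The classes Master and Replica are hypothetical simplifications; real Redis replication uses a replication stream, a backlog buffer, and full or partial resynchronization.

```python
from typing import Dict, List, Optional, Tuple


class Master:
    """Accepts writes and records them in order for replicas to consume."""

    def __init__(self) -> None:
        self.data: Dict[str, str] = {}
        self.backlog: List[Tuple[str, str]] = []

    def set(self, key: str, value: str) -> None:
        self.data[key] = value
        self.backlog.append((key, value))


class Replica:
    """Applies the master's writes starting from its own offset."""

    def __init__(self, master: Master) -> None:
        self.master = master
        self.data: Dict[str, str] = {}
        self.offset = 0  # number of backlog entries already applied

    def get(self, key: str) -> Optional[str]:
        return self.data.get(key)

    def lag(self) -> int:
        return len(self.master.backlog) - self.offset

    def sync(self) -> int:
        pending = self.master.backlog[self.offset:]
        for key, value in pending:
            self.data[key] = value
        self.offset += len(pending)
        return len(pending)


master = Master()
replica = Replica(master)
master.set("a", "1")
print(replica.get("a"))  # prints None: the write has not been replicated yet
replica.sync()
print(replica.get("a"))  # prints 1
```

The window between the write and the sync is the replication lag; reads routed to a slave during that window see older data.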

4. Sentinel Automatic Failover

Sentinel processes monitor the master’s health. If a master stops responding, sentinels coordinate a failover:

Each sentinel periodically pings the master.

If at least a configured quorum of sentinels agree that the master is down, they elect a leader by majority vote (using a Raft‑like election) to perform the promotion.

The elected sentinel promotes a slave to master and notifies clients.

This design reduces manual intervention. A single sentinel can still misjudge the master as down when its own network link is the problem, which is why several sentinels are deployed and a quorum must agree before a failover starts.
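The decision logic can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Sentinel's implementation: the names ReplicaInfo, is_objectively_down, has_majority and pick_replica are hypothetical. The selection order (lower priority value first, then the larger replication offset, then the smaller ID, with priority 0 meaning "never promote") follows the rules Sentinel documents.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ReplicaInfo:
    run_id: str    # stand-in for the replica's run ID
    priority: int  # 0 means the replica must never be promoted
    offset: int    # how much of the master's stream was processed


def is_objectively_down(reports: Dict[str, bool], quorum: int) -> bool:
    """reports maps a sentinel name to True if it sees the master as down."""
    return sum(reports.values()) >= quorum


def has_majority(votes: int, total_sentinels: int) -> bool:
    """A leader needs votes from more than half of all known sentinels."""
    return votes > total_sentinels // 2


def pick_replica(replicas: List[ReplicaInfo]) -> Optional[ReplicaInfo]:
    candidates = [r for r in replicas if r.priority > 0]
    if not candidates:
        return None
    # Lower priority wins, then higher offset, then smaller run ID.
    return min(candidates, key=lambda r: (r.priority, -r.offset, r.run_id))


reports = {"sentinel-1": True, "sentinel-2": True, "sentinel-3": False}
print(is_objectively_down(reports, quorum=2))  # prints True

replicas = [
    ReplicaInfo("b", priority=100, offset=900),
    ReplicaInfo("a", priority=100, offset=1000),
    ReplicaInfo("c", priority=0, offset=2000),
]
print(pick_replica(replicas).run_id)  # prints a
```

Replica "c" is skipped despite its higher offset because priority 0 excludes it; between the other two, the one with more replicated data is chosen.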

5. Sharding (Cluster)

When write traffic exceeds a single master’s capacity, split the dataset across multiple nodes:

Each node stores a distinct subset of keys.

A routing rule maps a given key to a specific node.
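A very simple routing rule is to hash the key and take the remainder by the number of nodes. The sketch below assumes CRC32 as the hash and uses made-up node addresses; the function names route and count_moved are hypothetical. It also counts how many keys change owner when a node is added, which is the main weakness of plain modulo routing.

```python
import zlib
from typing import List

# Hypothetical node addresses, used only for illustration.
THREE_NODES = ["redis-a:6379", "redis-b:6379", "redis-c:6379"]
FOUR_NODES = THREE_NODES + ["redis-d:6379"]


def route(key: str, nodes: List[str]) -> str:
    """Map a key to a node with hash(key) modulo the node count."""
    return nodes[zlib.crc32(key.encode()) % len(nodes)]


def count_moved(keys: List[str], before: List[str], after: List[str]) -> int:
    """Count keys whose owning node changes between two node lists."""
    return sum(1 for key in keys if route(key, before) != route(key, after))


keys = [f"user:{i}" for i in range(1000)]
print(route("user:42", THREE_NODES))
print(count_moved(keys, THREE_NODES, FOUR_NODES), "of", len(keys), "keys moved")
```

Because most keys change owner whenever the node count changes, production systems usually prefer consistent hashing or a fixed number of slots over plain modulo.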

Two common approaches:

Client‑side sharding: The application, or a proxy layer in front of Redis, determines the target node. With open‑source proxies such as Twemproxy and Codis, the routing logic lives in the proxy, so clients connect to it as if it were a single Redis instance and the backend can be scaled without changing client code.

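Official Redis Cluster: The keyspace is divided into 16384 hash slots, and each master owns a subset of them. Nodes exchange state over a gossip protocol to detect failures and promote replicas. Moving slots between nodes (resharding) is triggered by an operator rather than happening automatically. Cluster‑aware client libraries handle the key‑to‑node mapping, eliminating the need for external proxies.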
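Redis Cluster assigns each key to one of 16384 hash slots by taking the CRC16 (XMODEM variant) of the key modulo 16384. If the key contains a non-empty {...} section, only that part is hashed, which lets related keys share a slot. The Python sketch below reproduces that rule with the standard library's binascii.crc_hqx; the function name key_slot is hypothetical, and a real client would also need the slot-to-node table from the cluster.

```python
import binascii

SLOT_COUNT = 16384


def key_slot(key: bytes) -> int:
    """Compute the Redis Cluster hash slot for a key."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Use the hash tag only if something sits between the braces.
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    # crc_hqx with an initial value of 0 is CRC16/XMODEM.
    return binascii.crc_hqx(key, 0) % SLOT_COUNT


print(key_slot(b"foo"))  # prints 12182
same = key_slot(b"{user1000}.following") == key_slot(b"{user1000}.followers")
print(same)              # prints True: both keys hash only "user1000"
```

Keys that share a hash tag land in the same slot, which is what makes multi-key operations on them possible in a cluster.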

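The two approaches handle high availability differently: proxy‑based setups typically pair each shard with Sentinel for failover, while Redis Cluster relies on its own built‑in failure detection and replica promotion and does not need Sentinel.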

Summary

Starting from a single‑node cache, we explored persistence (RDB, AOF, hybrid), replication with manual failover, Sentinel’s automatic failover using consensus, and finally sharding via client‑side proxies or the official Redis Cluster. Each step improves data durability, reduces downtime, and scales read/write capacity, resulting in a stable, high‑performance Redis service.

