
Redis Evolution: From Single Node to High‑Performance Cluster with Persistence and Sharding

This article walks through Redis's architectural journey, covering single‑node basics, data persistence strategies (RDB and AOF), master‑slave replication, automatic failover with Sentinel, consensus algorithms, and both client‑side and server‑side sharding to achieve high performance and high availability.

Efficient Ops

Hello, I'm Kaito. This article discusses the architectural evolution of Redis, from a simple single‑node deployment to a stable, high‑performance cluster.

From the simplest: Single‑node Redis

Assume you have an application that needs caching. Deploying a single Redis instance lets the app write data to memory, achieving fast reads. This works well for small workloads, but as data grows, a single instance becomes a bottleneck and a single point of failure.

When the instance crashes, everything it held in memory is lost and all traffic falls back to the backend database, causing severe load spikes.
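The effect of a cache crash can be sketched with a small read path. This is only an illustration: a plain dictionary stands in for the Redis instance, and CacheAsideReader and fake_db_lookup are hypothetical names, not part of any Redis client.

```python
from typing import Callable, Dict


class CacheAsideReader:
    """Check the cache first and fall back to the database on a miss."""

    def __init__(self, load_from_db: Callable[[str], str]) -> None:
        self._cache: Dict[str, str] = {}  # stand-in for the Redis instance
        self._load_from_db = load_from_db
        self.db_reads = 0  # how many requests reached the database

    def get(self, key: str) -> str:
        if key in self._cache:
            return self._cache[key]
        # Cache miss: read from the database and populate the cache.
        self.db_reads += 1
        value = self._load_from_db(key)
        self._cache[key] = value
        return value

    def simulate_cache_crash(self) -> None:
        # A restart without persistence leaves the cache empty.
        self._cache.clear()


def fake_db_lookup(key: str) -> str:
    # Hypothetical database query, used only for this illustration.
    return f"row-for-{key}"


reader = CacheAsideReader(fake_db_lookup)
for _ in range(3):
    reader.get("user:1")
reads_before_crash = reader.db_reads  # only the first read missed the cache

reader.simulate_cache_crash()
reader.get("user:1")
reads_after_crash = reader.db_reads  # the same key hits the database again
```

With many hot keys, every first read after the crash reaches the database at roughly the same time, which is the load spike described above.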

Data Persistence: Safeguarding Data

To avoid data loss, Redis can persist memory data to disk. The simplest approach writes every write operation to both memory and disk, but this hurts performance because disk I/O is slower.

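Redis instead decouples the two steps: the main thread updates memory and replies to the client, while a background thread flushes the data to disk. The client no longer waits on disk I/O, at the cost that a crash can lose the writes that had not yet been flushed.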

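Another option is snapshotting (RDB), where Redis periodically writes a compressed binary snapshot of the dataset to disk. Snapshot files are compact and fast to load, but any writes made after the most recent snapshot are lost if the instance crashes.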
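The snapshot idea can be sketched as follows. This is a simplified illustration, not Redis's implementation: Redis produces the snapshot in a forked child process and uses its own binary format, whereas this sketch uses JSON and the hypothetical helpers save_snapshot and load_snapshot. What it does share with Redis is writing to a temporary file first and renaming it into place.

```python
import json
import os
import tempfile
from typing import Dict


def save_snapshot(data: Dict[str, str], path: str) -> None:
    """Write the whole dataset to a temp file, then atomically swap it in."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix="temp-", suffix=".snapshot")
    with os.fdopen(fd, "w", encoding="utf-8") as tmp:
        json.dump(data, tmp)
        tmp.flush()
        os.fsync(tmp.fileno())  # force the bytes to disk before the rename
    os.replace(tmp_path, path)  # readers see either the old or the new file


def load_snapshot(path: str) -> Dict[str, str]:
    """Rebuild the dataset from the most recent snapshot file."""
    with open(path, "r", encoding="utf-8") as snapshot:
        return json.load(snapshot)


with tempfile.TemporaryDirectory() as workdir:
    snapshot_path = os.path.join(workdir, "dump.snapshot")
    dataset = {"a": "1"}
    save_snapshot(dataset, snapshot_path)
    dataset["b"] = "2"  # written after the snapshot, so it is not on disk
    restored = load_snapshot(snapshot_path)
```

The restored dataset contains only what existed when the snapshot was taken, which is the durability gap a snapshot-only setup accepts.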

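Redis also offers AOF (Append‑Only File), which logs every write command. Replaying the log recovers nearly all data (with the appendfsync everysec policy, roughly the last second of writes is at risk), but AOF files are larger than snapshots and slower to load on restart.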
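A minimal sketch of the append-and-replay idea behind AOF follows. The log format here (one space-separated command per line, values without spaces) is a toy stand-in, not the format Redis writes, and append_command and replay are hypothetical helpers.

```python
import os
import tempfile
from typing import Dict, List


def append_command(log_path: str, command: List[str], fsync: bool = True) -> None:
    """Append one write command to the log, optionally forcing it to disk."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(" ".join(command) + "\n")
        log.flush()
        if fsync:
            os.fsync(log.fileno())  # durable, but slower than buffering


def replay(log_path: str) -> Dict[str, str]:
    """Rebuild the dataset by re-executing every logged command in order."""
    data: Dict[str, str] = {}
    with open(log_path, "r", encoding="utf-8") as log:
        for line in log:
            parts = line.split()
            if not parts:
                continue
            if parts[0] == "SET":
                data[parts[1]] = parts[2]
            elif parts[0] == "DEL":
                data.pop(parts[1], None)
    return data


with tempfile.TemporaryDirectory() as workdir:
    log_path = os.path.join(workdir, "appendonly.log")
    append_command(log_path, ["SET", "a", "1"])
    append_command(log_path, ["SET", "b", "2"])
    append_command(log_path, ["DEL", "a"])
    recovered = replay(log_path)
```

Because recovery re-executes the full history, restart time grows with the length of the log rather than with the size of the final dataset.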

Choosing between RDB and AOF depends on how much data loss you can tolerate: use RDB when losing the writes made since the last snapshot is acceptable, and AOF when you need to recover as much data as possible.

To keep AOF files from growing indefinitely, Redis can rewrite AOF files, discarding obsolete commands. Since Redis 4.0, hybrid persistence combines an RDB snapshot with subsequent AOF logs, reducing file size while retaining completeness.
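The effect of a rewrite can be illustrated with a toy command log. This is a sketch under simplifying assumptions: Redis generates the rewritten file from its current in-memory dataset, whereas this example rebuilds the state from the history so that it stays self-contained. apply_commands and rewrite_log are hypothetical names.

```python
from typing import Dict, List, Tuple

Command = Tuple[str, ...]


def apply_commands(commands: List[Command]) -> Dict[str, str]:
    """Execute a toy subset of commands (SET, INCR, DEL) in order."""
    state: Dict[str, str] = {}
    for command in commands:
        op, key = command[0], command[1]
        if op == "SET":
            state[key] = command[2]
        elif op == "INCR":
            state[key] = str(int(state.get(key, "0")) + 1)
        elif op == "DEL":
            state.pop(key, None)
        else:
            raise ValueError(f"unsupported command: {op}")
    return state


def rewrite_log(commands: List[Command]) -> List[Command]:
    """Return one SET per surviving key, enough to rebuild the final state."""
    final_state = apply_commands(commands)
    return [("SET", key, value) for key, value in final_state.items()]


history: List[Command] = [
    ("SET", "counter", "0"),
    ("INCR", "counter"),
    ("INCR", "counter"),
    ("SET", "tmp", "x"),
    ("DEL", "tmp"),
]
compacted = rewrite_log(history)
```

Five logged commands collapse into a single SET for counter, and the deleted key disappears from the log entirely.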

Master‑Slave Replication: Multiple Copies

Deploy multiple Redis instances: one master handling writes and one or more slaves that continuously replicate its data. Replication is asynchronous, so a slave can lag slightly behind the master. Slaves can serve read traffic, reducing load on the master and providing redundancy.
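The consequence of asynchronous replication can be sketched with two toy classes. This is an illustration only: in Redis the master streams writes to its replicas, while here the replica pulls from a list for simplicity. Master, Replica, backlog, and offset are hypothetical names for this sketch.

```python
from typing import Dict, List, Optional, Tuple


class Master:
    def __init__(self) -> None:
        self.data: Dict[str, str] = {}
        self.backlog: List[Tuple[str, str]] = []  # ordered stream of writes

    def set(self, key: str, value: str) -> None:
        # The write is acknowledged without waiting for any replica.
        self.data[key] = value
        self.backlog.append((key, value))


class Replica:
    def __init__(self, master: Master) -> None:
        self.master = master
        self.data: Dict[str, str] = {}
        self.offset = 0  # number of backlog entries already applied

    def get(self, key: str) -> Optional[str]:
        return self.data.get(key)

    def sync(self) -> int:
        """Apply every write not yet seen and return how many were applied."""
        pending = self.master.backlog[self.offset:]
        for key, value in pending:
            self.data[key] = value
        self.offset += len(pending)
        return len(pending)


master = Master()
replica = Replica(master)

master.set("greeting", "hello")
stale_read = replica.get("greeting")  # replica has not caught up yet
applied = replica.sync()
fresh_read = replica.get("greeting")
```

A read served by a replica between the write and the next sync returns stale data, which is the trade-off to keep in mind when offloading reads to slaves.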

When the master fails, a manual promotion of a slave restores service, but this manual step still introduces downtime.

Sentinel: Automatic Failover

Sentinel processes continuously probe the master. If a timeout occurs, they coordinate to confirm the failure. Multiple Sentinels vote; only when a quorum agrees does a failover occur.
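The quorum check described above reduces to counting reports. The sketch below assumes each Sentinel's opinion is available as a boolean; is_objectively_down and the Sentinel names are hypothetical and not part of the Sentinel API.

```python
from typing import Dict


def is_objectively_down(reports: Dict[str, bool], quorum: int) -> bool:
    """True when at least `quorum` Sentinels report the master unreachable."""
    down_votes = sum(1 for is_down in reports.values() if is_down)
    return down_votes >= quorum


# Two of three Sentinels timed out while probing the master.
reports = {"sentinel-1": True, "sentinel-2": True, "sentinel-3": False}

failover_with_quorum_2 = is_objectively_down(reports, quorum=2)
failover_with_quorum_3 = is_objectively_down(reports, quorum=3)
```

Requiring agreement from several Sentinels guards against a single Sentinel with a bad network link triggering a needless failover.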

The Sentinels then elect a leader among themselves, and that leader alone promotes a slave to master, so only one Sentinel drives the failover.

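This election is a consensus procedure modeled on Raft: a Sentinel becomes leader only with votes from a majority of Sentinels. Deploy at least three Sentinels, preferably an odd number, so that a majority can still be formed when one of them fails.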
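The arithmetic behind majority voting is short enough to spell out. The helper names below are hypothetical; the rule they encode is that a leader needs votes from more than half of all Sentinels.

```python
from typing import Dict


def majority(sentinel_count: int) -> int:
    """Smallest number of votes that is more than half of all Sentinels."""
    return sentinel_count // 2 + 1


def tolerated_failures(sentinel_count: int) -> int:
    """How many Sentinels can fail while a majority is still reachable."""
    return sentinel_count - majority(sentinel_count)


# Map each deployment size to the number of Sentinel failures it survives.
fault_tolerance: Dict[int, int] = {n: tolerated_failures(n) for n in range(1, 6)}
```

Here fault_tolerance is {1: 0, 2: 0, 3: 1, 4: 1, 5: 2}: going from three to four Sentinels adds a process without adding any fault tolerance.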

Sharding Cluster: Horizontal Scaling

When write traffic outgrows a single master, shard the dataset across many Redis nodes. Each node stores a subset of keys, and a routing rule maps keys to the appropriate node.
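One of the simplest routing rules is hash-then-modulo. The sketch below uses CRC32 from the Python standard library as the hash; the node addresses and the pick_node helper are made up for illustration, and real deployments often prefer schemes that move fewer keys when the node count changes.

```python
import zlib
from typing import List


def pick_node(key: str, nodes: List[str]) -> str:
    """Map a key to a node with hash(key) modulo the number of nodes."""
    return nodes[zlib.crc32(key.encode("utf-8")) % len(nodes)]


three_nodes = ["redis-a:6379", "redis-b:6379", "redis-c:6379"]
four_nodes = three_nodes + ["redis-d:6379"]

keys = [f"key:{i}" for i in range(1000)]

# Count the keys whose owner changes once a fourth node joins.
moved = sum(1 for key in keys if pick_node(key, three_nodes) != pick_node(key, four_nodes))
```

The same key always maps to the same node while the node list is unchanged, but moved shows that adding a node reassigns keys under plain modulo routing.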

Two sharding models exist:

Client‑side sharding: the application (or a library) determines the target node.

Server‑side sharding (proxy): a proxy layer handles routing, keeping the client unaware of the cluster topology.

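Redis Cluster follows the client‑side model: keys are mapped to 16384 hash slots distributed across the nodes, and a cluster‑aware client library routes each command to the node that owns the key's slot. Open‑source solutions such as Twemproxy and Codis provide proxy‑based sharding.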
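For Redis Cluster specifically, the key-to-slot mapping is CRC16 of the key modulo 16384, with a hash-tag rule that hashes only the text between the first pair of braces. The sketch below reimplements that calculation in plain Python for illustration; the function names are my own, and production code should rely on a cluster-aware client.

```python
def crc16_xmodem(data: bytes) -> int:
    """Bitwise CRC16 (polynomial 0x1021, initial value 0), the XMODEM variant."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Return the hash slot (0..16383) for a key, honoring hash tags."""
    raw = key.encode("utf-8")
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        # Hash only the tag when the braces enclose at least one byte.
        if end != -1 and end != start + 1:
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384


slot_for_foo = key_slot("foo")
same_slot = key_slot("{user1000}.following") == key_slot("{user1000}.followers")
```

Keys that share a hash tag, such as the two {user1000} keys above, land in the same slot, which is what allows multi-key operations on them within a cluster.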

Redis Cluster has failure detection and automatic failover built in, playing the role that Sentinel plays for a master‑slave deployment, so a separate Sentinel deployment is unnecessary.

Summary

Starting from a single‑node Redis, we introduced data persistence (RDB, AOF, hybrid), master‑slave replication, automatic failover with Sentinel (using consensus), and finally sharding clusters (client‑side and proxy‑based) to achieve high performance, high availability, and easy horizontal scalability.

These principles apply to other data systems as well: identify bottlenecks, add persistence, add replicas, automate failover, and scale out with sharding.

Written by

Efficient Ops

This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.
