From Single Node to Tank: 20 Diagrams of Redis Architecture Evolution
This article walks through Redis's architectural journey—from a lone instance to a high‑availability, high‑performance cluster—covering persistence (RDB, AOF, hybrid), master‑slave replication, Sentinel automatic failover, sharding strategies, and the modern Redis Cluster design.
Single‑node Redis
Starting with the simplest deployment, a single Redis instance serves as an in‑memory cache for a MySQL‑backed application. When the instance crashes, all traffic falls back to MySQL, causing a massive load spike, and any data not persisted to disk is lost.
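The failure mode above can be illustrated with a minimal sketch. It assumes two in-memory dicts as stand-ins for MySQL and the single Redis instance; the class and method names are hypothetical and exist only for this illustration.

```python
class CacheAsideStore:
    """Toy cache-aside reader: try the cache first, fall back to the database."""

    def __init__(self, database):
        self.database = database   # stand-in for MySQL (a plain dict here)
        self.cache = {}            # stand-in for the single Redis instance
        self.cache_up = True
        self.db_reads = 0          # how many reads reached the database

    def get(self, key):
        if self.cache_up and key in self.cache:
            return self.cache[key]
        self.db_reads += 1
        value = self.database.get(key)
        if self.cache_up and value is not None:
            self.cache[key] = value
        return value

    def crash_cache(self):
        # Without persistence, everything held in memory is gone.
        self.cache_up = False
        self.cache.clear()


store = CacheAsideStore({"user:1": "alice"})
store.get("user:1")                 # miss: reads the database, fills the cache
store.get("user:1")                 # hit: served from memory
reads_before_crash = store.db_reads
store.crash_cache()
store.get("user:1")                 # cache is down: every read hits the database
store.get("user:1")
reads_after_crash = store.db_reads
```

Before the crash only the first read touches the database; afterwards every read does, which is the load spike described above.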
Data Persistence
To avoid data loss, Redis can write memory contents to disk. The naïve approach writes each command to both memory and disk, but disk I/O is far slower than memory writes and degrades performance.
Redis separates the write path into two steps: (1) writing to the OS page cache (write) and (2) flushing the cache to disk (fsync).
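The two steps can be seen directly with Python's os module. This is a sketch of the system calls involved, not Redis's own code; the file name demo.aof and the helper durable_append are made up for the example.

```python
import os
import tempfile


def durable_append(path, payload):
    """Append payload to path, then force it to stable storage."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_APPEND | getattr(os, "O_BINARY", 0)
    fd = os.open(path, flags, 0o644)
    try:
        os.write(fd, payload)   # step 1: data reaches the OS page cache
        os.fsync(fd)            # step 2: the page cache is flushed to disk
    finally:
        os.close(fd)


log_path = os.path.join(tempfile.mkdtemp(), "demo.aof")  # hypothetical file name
durable_append(log_path, b"SET k1 v1\n")
durable_append(log_path, b"SET k2 v2\n")
with open(log_path, "rb") as f:
    contents = f.read()
```

Step 1 returns quickly because it only copies bytes into kernel memory; step 2 is the slow part, since it waits for the storage device.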
Redis solves the performance problem by letting the main thread return to the client after the memory write, while a background thread performs the disk write. This is the basis of the Append‑Only File (AOF) mechanism, which offers three fsync policies:
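appendfsync always – the main thread calls fsync after every write; safest, but slowest
appendfsync no – Redis never calls fsync and leaves flushing to the OS; fastest, but the amount of data at risk depends on the OS
appendfsync everysec – a background thread calls fsync once per second; a crash loses at most about one second of writes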
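To compare the three policies, here is a toy simulation that only counts fsync calls. It is a sketch under simplifying assumptions: the class AofAppender and the injected clock are hypothetical, and the everysec check runs inline on each append, whereas Redis performs it in a background thread.

```python
class AofAppender:
    """Toy appender that counts how often each policy would call fsync."""

    def __init__(self, policy, clock):
        if policy not in ("always", "everysec", "no"):
            raise ValueError(f"unknown policy: {policy}")
        self.policy = policy
        self.clock = clock        # injected callable returning seconds
        self.last_sync = clock()
        self.fsync_calls = 0
        self.unsynced = 0         # commands written but not yet fsynced

    def append(self, command):
        self.unsynced += 1        # corresponds to write(): page cache only
        if self.policy == "always":
            self._fsync()
        elif self.policy == "everysec" and self.clock() - self.last_sync >= 1.0:
            self._fsync()
        # policy "no": nothing to do here, the OS flushes when it chooses

    def _fsync(self):
        self.fsync_calls += 1
        self.unsynced = 0
        self.last_sync = self.clock()


def simulate(policy, commands=10, gap=0.25):
    """Append `commands` writes spaced `gap` seconds apart on a fake clock."""
    now = [0.0]
    appender = AofAppender(policy, clock=lambda: now[0])
    for i in range(commands):
        now[0] += gap
        appender.append(f"SET key{i} {i}")
    return appender.fsync_calls, appender.unsynced


results = {p: simulate(p) for p in ("always", "everysec", "no")}
```

Each entry in results is a pair (fsync calls, commands still at risk), which makes the trade-off between durability and I/O cost visible for the same workload.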
AOF files grow over time; Redis mitigates this with AOF rewrite, which compacts the log by keeping only the latest state of each key.
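The idea of compaction can be sketched as follows. Note the assumption: this toy version replays an old command log to find the final state, while Redis builds the rewritten file from the dataset currently in memory. The function rewrite_aof and the tuple-based command format are hypothetical.

```python
def rewrite_aof(commands):
    """Return a minimal command list that rebuilds the same final state."""
    state = {}
    for op, key, *args in commands:
        if op == "SET":
            state[key] = args[0]
        elif op == "INCR":
            state[key] = str(int(state.get(key, "0")) + 1)
        elif op == "DEL":
            state.pop(key, None)
    # One SET per surviving key is enough to reproduce the final state.
    return [("SET", key, value) for key, value in state.items()]


log = [
    ("SET", "views", "0"),
    ("INCR", "views"),
    ("INCR", "views"),
    ("SET", "name", "a"),
    ("SET", "name", "b"),
    ("SET", "tmp", "x"),
    ("DEL", "tmp"),
]
compacted = rewrite_aof(log)
```

Seven logged commands shrink to two, because overwritten values and deleted keys no longer need to be replayed.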
For cache‑only workloads where occasional data loss is acceptable, Redis can take periodic snapshots (RDB). Snapshots write the entire dataset to a binary, compressed file at configured intervals, resulting in small files and low write frequency.
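Hybrid persistence (available from Redis 4.0) combines the two: during AOF rewrite, Redis first writes an RDB-style snapshot into the AOF file, then appends subsequent commands in AOF format. This yields the durability of AOF with a much smaller file and faster loading on restart. Because hybrid persistence is an optimization of AOF rewrite, it only takes effect when AOF and AOF rewrite are enabled.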
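A rough sketch of the hybrid layout: a compressed snapshot at the front of the file, followed by a plain command tail. The framing used here (a 4-byte length header, JSON plus zlib) is invented for the illustration and is not Redis's on-disk format; write_hybrid and load_hybrid are hypothetical helpers.

```python
import json
import zlib


def write_hybrid(state, later_commands):
    """Build a file image: compressed snapshot preamble + plain command tail."""
    snapshot = zlib.compress(json.dumps(state).encode())
    header = len(snapshot).to_bytes(4, "big")   # toy framing, not Redis's format
    tail = "".join(" ".join(cmd) + "\n" for cmd in later_commands).encode()
    return header + snapshot + tail


def load_hybrid(blob):
    """Recover state: load the snapshot first, then replay the command tail."""
    size = int.from_bytes(blob[:4], "big")
    state = json.loads(zlib.decompress(blob[4:4 + size]))
    for line in blob[4 + size:].decode().splitlines():
        op, key, *args = line.split(" ")
        if op == "SET":
            state[key] = args[0]
        elif op == "DEL":
            state.pop(key, None)
    return state


blob = write_hybrid({"a": "1", "b": "2"}, [("SET", "c", "3"), ("DEL", "a")])
recovered = load_hybrid(blob)
```

Recovery loads the bulk of the data in one step from the snapshot and replays only the commands that arrived after it.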
Master‑Slave Replication (Multiple Replicas)
Deploying several Redis instances creates a master-slave topology. The master handles writes, while slaves replicate its data asynchronously, in near real time, and can serve read traffic. This reduces load on the master and shortens downtime when the master fails, because a slave already holds the data and can be promoted to master. The promotion is still a manual step.
The manual failover introduces human reaction time, which can still impact the application.
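The topology and the manual promotion step can be modeled in a few lines. This is a simplified sketch: propagation happens synchronously inside write(), whereas real replication is asynchronous, and the classes Node and ReplicaSet are hypothetical names.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.replicas = []

    def write(self, key, value):
        self.data[key] = value
        for replica in self.replicas:   # real Redis streams this asynchronously
            replica.data[key] = value


class ReplicaSet:
    def __init__(self, master, replicas):
        self.master = master
        self.replicas = list(replicas)
        master.replicas = self.replicas
        self._next = 0

    def set(self, key, value):
        self.master.write(key, value)   # all writes go to the master

    def get(self, key):
        # Round-robin reads across replicas to take load off the master.
        replica = self.replicas[self._next % len(self.replicas)]
        self._next += 1
        return replica.data.get(key)

    def promote(self, replica):
        """Manual failover: an operator picks a replica to become the master."""
        self.replicas.remove(replica)
        self.master = replica
        replica.replicas = self.replicas


cluster = ReplicaSet(Node("m"), [Node("r1"), Node("r2")])
cluster.set("k", "v1")
read_before = cluster.get("k")
cluster.promote(cluster.replicas[0])    # the old master "m" is considered dead
cluster.set("k", "v2")
read_after = cluster.get("k")
```

Nothing in this model calls promote() on its own; someone has to notice the failure and trigger it, which is where the human reaction time comes from.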
Sentinel Automatic Failover
Sentinel continuously pings the master. If the master does not respond within a configured timeout, that Sentinel considers it unhealthy. To reduce false positives caused by network glitches, a failover starts only when enough Sentinels (a configurable quorum) agree that the master is down. The Sentinels then elect a leader (using a Raft-style consensus algorithm) to perform the promotion.
Sentinel's leader election follows three steps: (1) each Sentinel that sees the master as down asks its peers for a vote; (2) each Sentinel votes only once per round, for the first request it receives; (3) the candidate that gathers a majority of votes becomes the leader and performs the master-slave switch.
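The three election steps can be sketched as a single function. This is a simplified model, not Sentinel's implementation: message arrival order is passed in as data, a candidate is assumed to see its own request first, and elect_leader is a hypothetical name.

```python
def elect_leader(request_order, sentinels):
    """Pick a leader from one voting round.

    request_order maps each voter to the candidates in the order their vote
    requests arrived; each voter grants its single vote to the first one.
    """
    votes = {}
    for voter in sentinels:
        arrivals = request_order.get(voter, [])
        if arrivals:
            first = arrivals[0]
            votes[first] = votes.get(first, 0) + 1
    majority = len(sentinels) // 2 + 1
    for candidate, count in votes.items():
        if count >= majority:
            return candidate
    return None   # no majority: the round fails and must be retried


sentinels = ["s1", "s2", "s3"]
# s1 and s2 both stand for election; s3 hears from s1 first.
winner = elect_leader(
    {"s1": ["s1", "s2"], "s2": ["s2", "s1"], "s3": ["s1", "s2"]},
    sentinels,
)
# Every Sentinel votes for itself: nobody reaches a majority.
split = elect_leader({"s1": ["s1"], "s2": ["s2"], "s3": ["s3"]}, sentinels)
```

The majority requirement is why Sentinel deployments use an odd number of instances, typically three or more.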
Sharding (Horizontal Scaling)
When write traffic exceeds a single master’s capacity, data can be split across many instances. Two common approaches exist:
Client‑side sharding – the application routes keys to specific nodes based on a hash function. This couples routing logic to business code.
Proxy‑based sharding – a middle‑layer (e.g., Twemproxy, Codis) holds the routing rules, keeping client code simple.
Open‑source proxy solutions let the client talk to a single endpoint; the proxy forwards requests to the appropriate Redis node and can add new nodes transparently.
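Whether it lives in the client or in a proxy, the routing rule is usually a hash of the key. A minimal sketch, assuming CRC32 modulo the node count as the rule (an arbitrary choice for this example) and hypothetical node addresses:

```python
import zlib


def pick_node(key, nodes):
    """Route a key to a node by hashing it (crc32 is an arbitrary choice)."""
    return nodes[zlib.crc32(key.encode()) % len(nodes)]


nodes = ["redis-a:6379", "redis-b:6379", "redis-c:6379"]  # hypothetical addresses
placement = {k: pick_node(k, nodes) for k in ("user:1", "user:2", "order:9")}
```

With client-side sharding this function sits in application code; with a proxy the same logic runs inside the proxy, and the client only knows the proxy's address. One caveat of plain modulo routing is that changing the number of nodes remaps many keys, which is part of what proxies have to handle when they add nodes.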
Redis 3.0 introduced an official Redis Cluster that uses a gossip protocol for health checks and automatic failover, eliminating the need for Sentinel. Clients use a compatible SDK that handles key‑to‑node routing and adapts to topology changes.
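For reference, the key-to-node routing in Redis Cluster is based on hash slots: per the Redis Cluster specification (a detail not covered in this article), a key maps to slot CRC16(key) mod 16384, and each node owns a range of slots. A sketch of that calculation, using Python's binascii.crc_hqx as the CRC16 implementation; key_slot is a hypothetical helper name:

```python
import binascii

SLOT_COUNT = 16384


def key_slot(key: bytes) -> int:
    """Compute the hash slot for a key, honoring {hash tags}."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end != -1 and end != start + 1:   # only a non-empty tag counts
            key = key[start + 1:end]
    # crc_hqx with initial value 0 is the CRC16 (XMODEM) variant.
    return binascii.crc_hqx(key, 0) % SLOT_COUNT


same_slot = key_slot(b"{user1000}.following") == key_slot(b"{user1000}.followers")
```

Keys that share a hash tag land in the same slot, which is how related keys can be kept on one node. A cluster-aware SDK performs this calculation and keeps a slot-to-node table that it refreshes when the topology changes.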
Because upgrading legacy clients to the new SDK can be costly, many companies build a proxy in front of the cluster. The proxy hides topology changes from the client, allowing a simple address switch to migrate to Redis Cluster.
Summary
Data loss risk → use persistence (RDB/AOF).
Long recovery time → master‑slave replicas for quick failover.
Manual failover latency → Sentinel automatic failover.
Read pressure → add read replicas (read‑write separation).
Write pressure / capacity bottleneck → sharding cluster.
Community sharding solutions (Twemproxy, Codis) rely on external proxies and Sentinel.
Official Redis Cluster (gossip protocol) removes Sentinel, supports horizontal scaling.
Proxy + Redis Cluster reduces client‑side changes for legacy applications.
Following this 0‑to‑1 then 1‑to‑N roadmap yields a stable, high‑performance Redis deployment capable of handling large‑scale traffic.
Programmer XiaoFu
xiaofucode.com – a programmer learning guide driven by the pursuit of profit
