Understanding Redis Master‑Slave Architecture: Principles, Topologies, and Replication Mechanisms
This article explains the fundamentals of Redis master‑slave architecture, covering why high‑availability replication is needed, the three topology patterns, the three replication modes (continuous, full, partial), the replication backlog buffer, PSYNC command handling, heartbeat mechanisms, and the practical considerations for operators and architects.
Redis master‑slave architecture provides high availability by keeping one primary node and one or more replica nodes that store identical data, ensuring service continuity when a node fails.
Why use master‑slave? A single Redis instance risks cascading failures ("service avalanche") when it goes down under high load, along with slow failover and lengthy data recovery; replication mitigates these problems.
What is the architecture? Three common topologies exist: one‑master‑one‑slave, one‑master‑multiple‑slaves, and tree‑structured master‑slave where an intermediate slave also acts as a master for downstream slaves.
Configuration is as simple as adding slaveof <Master IP> <Master Port> to the replica's config file (since Redis 5.0 the preferred directive is replicaof; slaveof remains as a legacy alias).
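A minimal sketch of the replica-side configuration, assuming a master at a placeholder address; the auxiliary directives shown are common companions rather than requirements:

```conf
# redis.conf on the replica. Since Redis 5.0 the directive is
# replicaof; slaveof is kept as a legacy alias. IP and port below
# are placeholders.
replicaof 192.168.1.10 6379

# Common companion settings:
masterauth yourMasterPassword   # only if the master requires AUTH
replica-read-only yes           # the default: replicas reject writes
```

The same effect can be achieved at runtime with the REPLICAOF command, which takes effect without a restart.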
Replication modes include:
Continuous replication: the master propagates every write command to all slaves and also records it in a replication backlog buffer.
Full replication: used for the initial sync; the master creates an RDB snapshot (via bgsave), sends it to the slave, then streams the commands buffered during the snapshot.
Partial replication: after a disconnection, the master sends only the missing commands from the backlog buffer, based on the slave's offset.
The replication backlog buffer is a fixed‑size circular queue (default 1 MB) that records recent write commands with byte‑level offsets, enabling efficient partial sync.
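The backlog described above can be sketched as a fixed-size circular byte buffer indexed by a monotonically growing master offset. The class and method names below are illustrative, not Redis internals:

```python
class ReplicationBacklog:
    """Toy model of the replication backlog buffer."""

    def __init__(self, size=1 * 1024 * 1024):  # 1 MB, like the repl-backlog-size default
        self.size = size
        self.buf = bytearray(size)
        self.master_offset = 0   # total bytes ever written by the master
        self.histlen = 0         # valid bytes currently held (<= size)

    def feed(self, data: bytes):
        # Write each byte at its offset modulo the buffer size,
        # silently overwriting the oldest data once the buffer is full.
        for b in data:
            self.buf[self.master_offset % self.size] = b
            self.master_offset += 1
        self.histlen = min(self.histlen + len(data), self.size)

    def can_partial_sync(self, slave_offset: int) -> bool:
        # Partial sync is possible only while the slave's offset is still
        # inside the window [master_offset - histlen, master_offset].
        return self.master_offset - self.histlen <= slave_offset <= self.master_offset

    def read_from(self, slave_offset: int) -> bytes:
        # Replay the bytes the slave is missing.
        assert self.can_partial_sync(slave_offset)
        return bytes(self.buf[i % self.size]
                     for i in range(slave_offset, self.master_offset))
```

Once a disconnected slave's offset falls out of the window, the only option left is a full resync, which is why sizing repl-backlog-size to cover expected disconnection windows matters.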
Synchronization starts with the slave sending psync {runId} {offset}. Depending on whether the runId matches the master's and the offset is still present in the backlog, the master replies with FULLRESYNC (full sync) or CONTINUE (partial sync).
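The master's dispatch logic can be sketched as a small function; this is a simplification (real Redis additionally tracks a secondary replication ID to allow partial syncs across a failover), and all names here are illustrative:

```python
FULL, PARTIAL = "FULLRESYNC", "CONTINUE"

def handle_psync(master_runid, backlog_start, backlog_end,
                 slave_runid, slave_offset):
    """Return the reply a master would send to psync {runId} {offset}.

    backlog_start / backlog_end delimit the byte offsets still held
    in the replication backlog buffer.
    """
    if slave_runid == "?" or slave_runid != master_runid:
        return FULL       # first sync, or the slave followed a different master
    if backlog_start <= slave_offset <= backlog_end:
        return PARTIAL    # the missing commands are still buffered
    return FULL           # offset fell out of the backlog window

# A brand-new slave sends psync ? -1 and always gets FULLRESYNC.
```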
Heartbeat mechanisms keep the connection alive: the master pings slaves every 10 seconds (configurable via repl-ping-slave-period, renamed repl-ping-replica-period in Redis 5.0), while slaves acknowledge with REPLCONF ACK {offset} every second, allowing both sides to detect failures and measure replication lag.
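Building on that heartbeat, a master can judge each slave's health from two signals: how stale the last ACK is, and how many bytes of writes remain unacknowledged. A hedged sketch, where the thresholds are made-up parameters rather than Redis defaults:

```python
import time

def slave_status(master_offset, slave_ack_offset, last_ack_time,
                 now=None, max_lag_bytes=1024, timeout_s=60):
    """Classify a slave from its REPLCONF ACK history (illustrative only)."""
    now = time.time() if now is None else now
    if now - last_ack_time > timeout_s:
        return "disconnected"   # no ACK within the timeout window
    if master_offset - slave_ack_offset > max_lag_bytes:
        return "lagging"        # too many unacknowledged bytes
    return "healthy"
```

Real Redis exposes the equivalent raw numbers (per-slave offset and last-ACK age) in the INFO replication output, and uses them for features such as min-replicas-to-write.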
Who should care? Operations engineers, architects, and developers who need reliable caching or data synchronization across services.
When to use it? For small‑to‑medium systems requiring read‑write separation; large‑scale deployments typically adopt Redis Cluster with sharding and Sentinel for automatic failover.
Potential issues include single‑point master failure, increased maintenance complexity, the need for manual client routing in read‑write separation, and the overhead of full syncs for large datasets.
In summary, the article provides both a macro view (who, when, why) and a micro view (topologies, replication types, backlog buffer, PSYNC workflow) of Redis master‑slave architecture, preparing readers for the next steps of deployment and performance testing.
Wukong Talks Architecture
Explaining distributed systems and architecture through stories. Author of the "JVM Performance Tuning in Practice" column, open-source author of "Spring Cloud in Practice PassJava", and independently developed a PMP practice quiz mini-program.