Master‑Slave Replication in Redis: How It Works and How to Prevent Data Loss
This article explains why a single‑instance Redis can cause outages, introduces the master‑slave architecture, details the full and incremental synchronization processes, shows how to configure replication, addresses multi‑slave scaling, network interruptions, and automatic failover with Sentinel.
Hello, I’m Tom.
A reader recently told me his company's Redis crashed, causing service disruption and an angry boss, and he feared being fired.
The outage hit the business hard because their architecture used only a single Redis instance with no replica to fall back on; when that node went down, both data and service were lost.
What is Master‑Slave?
Master‑slave (also called primary‑replica) replication means deploying multiple Redis instances: the master handles all write operations and can also serve reads, while each replica continuously syncs data from the master and serves read requests.
Why Can’t Replicas Write?
Allowing writes on replicas would require a global lock to maintain consistency across the cluster, which is costly and complex. Therefore, the common design keeps writes on the master and reads on the replicas.
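The read/write split can be illustrated with a small client-side router. This is only a sketch: the ReadWriteRouter class, its write-command list, and the addresses are hypothetical, and a real client library would cover far more commands.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the master and spread reads across replicas."""

    # Hypothetical, deliberately tiny set of write commands for illustration.
    WRITE_COMMANDS = {"SET", "DEL", "INCR", "EXPIRE", "LPUSH"}

    def __init__(self, master, replicas):
        self.master = master
        # Fall back to the master for reads when no replica is configured.
        self._read_cycle = itertools.cycle(replicas or [master])

    def node_for(self, command):
        if command.upper() in self.WRITE_COMMANDS:
            return self.master
        # Round-robin over the replicas for everything else.
        return next(self._read_cycle)


router = ReadWriteRouter("192.168.0.1:6379", ["192.168.0.2:6379"])
print(router.node_for("SET"))
print(router.node_for("GET"))
```

Keeping the routing decision in one place means the application never has to know which node is writable.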
Two Data Synchronization Methods
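Redis replication synchronizes data in two ways:
Full synchronization – the master sends a complete RDB snapshot of its dataset.
Incremental synchronization – the master propagates the write commands executed after the snapshot, and the replica replays them. These commands come from the replication stream and backlog, not from the AOF file, which is a separate persistence mechanism.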
Establishing a Master‑Slave Relationship
Start two Redis instances with IPs 192.168.0.1 and 192.168.0.2. On the replica host (192.168.0.2), run:
replicaof 192.168.0.1 6379
After the command, 192.168.0.2 becomes a replica of 192.168.0.1.
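One way to confirm the relationship is to read the output of INFO replication on the replica. The sketch below parses such output; the sample text is illustrative rather than captured from a real server, and parse_info and replication_is_healthy are hypothetical helper names.

```python
def parse_info(text):
    """Parse `key:value` lines from INFO output, skipping blanks and `# Section` headers."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def replication_is_healthy(info):
    """A replica is considered healthy here when its link to the master is up."""
    return info.get("role") == "slave" and info.get("master_link_status") == "up"


# Illustrative sample of what a replica might report (values are made up).
sample = """\
# Replication
role:slave
master_host:192.168.0.1
master_port:6379
master_link_status:up
slave_repl_offset:1024
"""

info = parse_info(sample)
print(info["master_host"], replication_is_healthy(info))
```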
Master‑Slave Synchronization Steps
1. Initial PSYNC request: The replica sends psync with the master’s run ID and its own replication offset. On the first connection it knows neither, so it sends ? and -1.
Each Redis instance generates a random run ID at startup.
The master replies with FULLRESYNC plus its run ID and current offset, indicating that a full copy is needed.
2. Full sync:
The master forks a child process and runs bgsave to create an RDB file.
The RDB file is transferred to the replica.
The replica clears its database and loads the RDB.
The master’s main thread is blocked only briefly while it forks the child. The child then generates the RDB in the background, so the master keeps serving requests. Writes that arrive during the full sync are stored in the replication buffer and sent to the replica once it has loaded the RDB.
3. Incremental sync: The master streams new write commands to the replica, which replays them to stay up‑to‑date.
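The three steps can be walked through with a toy in-memory model. It is a sketch under simplifying assumptions: ToyMaster and ToyReplica are hypothetical classes, a dictionary copy stands in for the RDB file, and the offset counts commands, whereas real Redis counts bytes.

```python
class ToyMaster:
    def __init__(self):
        self.data = {}
        self.offset = 0                # commands written so far (real Redis counts bytes)
        self.sync_in_progress = False
        self.replication_buffer = []   # writes that arrive while a snapshot is in flight

    def write(self, key, value):
        # The master keeps accepting writes even while a full sync is running.
        self.data[key] = value
        self.offset += 1
        if self.sync_in_progress:
            self.replication_buffer.append((key, value))

    def begin_full_sync(self):
        # Stand-in for fork + bgsave: a point-in-time copy of the dataset.
        self.sync_in_progress = True
        return dict(self.data), self.offset

    def end_full_sync(self):
        # Hand over the writes buffered during the snapshot transfer.
        pending, self.replication_buffer = self.replication_buffer, []
        self.sync_in_progress = False
        return pending


class ToyReplica:
    def __init__(self):
        self.data = {"stale": "value"}
        self.offset = -1               # "no offset yet", as in the first psync

    def load_snapshot(self, snapshot, offset):
        # Clearing the old database and loading the snapshot in one step.
        self.data = dict(snapshot)
        self.offset = offset

    def apply(self, writes):
        for key, value in writes:
            self.data[key] = value
            self.offset += 1


master, replica = ToyMaster(), ToyReplica()
master.write("a", "1")
snapshot, snapshot_offset = master.begin_full_sync()   # step 2 starts
master.write("b", "2")                                 # arrives during the transfer
replica.load_snapshot(snapshot, snapshot_offset)
replica.apply(master.end_full_sync())                  # drain the replication buffer
master.write("c", "3")                                 # step 3: normal streaming
replica.apply([("c", "3")])
print(replica.data == master.data, replica.offset == master.offset)
```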
Scaling to Multiple Replicas
When many replicas exist, each full sync requires the master to fork a child process, generate an RDB, and transfer it, causing high CPU and network load. A “master‑replica‑replica” (cascading) topology reduces this pressure: one replica acts as an intermediate sync source, and other replicas sync from it instead of from the master.
Impact of Network Interruptions
Redis uses a circular repl_backlog_buffer to store recent write commands. The master’s write offset is master_repl_offset; the replica’s read offset is slave_repl_offset. While a replica is disconnected or lagging, the master keeps writing; if it writes more than the buffer can hold before the replica catches up, the buffer wraps around and overwrites commands the replica has not yet read.
The circular buffer design allows space reuse, similar to a dashcam overwriting old footage.
When the replica’s requested offset is no longer in the buffer, a full resynchronization is triggered.
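The wrap-around behavior and the resulting choice between a partial and a full resync can be sketched with a toy ring buffer. ReplBacklog and psync below are hypothetical names, offsets start at 0 for simplicity, and the reply strings only mirror the FULLRESYNC and CONTINUE replies; the real implementation is more involved.

```python
class ReplBacklog:
    """Toy circular backlog that keeps only the most recent `size` bytes."""

    def __init__(self, size):
        self.size = size
        self.buf = bytearray(size)
        self.master_repl_offset = 0    # total bytes ever written

    def write(self, data):
        for byte in data:
            # Old bytes are overwritten once the buffer has wrapped around.
            self.buf[self.master_repl_offset % self.size] = byte
            self.master_repl_offset += 1

    def oldest_offset(self):
        return max(0, self.master_repl_offset - self.size)

    def read_from(self, offset):
        """Return the bytes from `offset` onward, or None if they were overwritten."""
        if offset < self.oldest_offset() or offset > self.master_repl_offset:
            return None
        return bytes(self.buf[i % self.size]
                     for i in range(offset, self.master_repl_offset))


def psync(backlog, replica_offset):
    """Decide between a partial resync and a full resync."""
    data = backlog.read_from(replica_offset)
    if data is None:
        return ("FULLRESYNC", None)
    return ("CONTINUE", data)


backlog = ReplBacklog(size=8)
backlog.write(b"abcdef")
print(psync(backlog, 4))   # replica is 2 bytes behind, still in the buffer
backlog.write(b"ghijkl")   # buffer wraps; offsets 0..3 are overwritten
print(psync(backlog, 2))   # offset 2 is gone, so a full resync is needed
```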
Solutions to Buffer Overrun
1. Increase repl-backlog-size, the configuration parameter that sets the size of repl_backlog_buffer, to accommodate higher write rates.
2. Force a full resync, which copies the entire dataset again, filling the gap.
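A rough way to choose the backlog size is to multiply the master's write throughput by the longest disconnection you want to survive, then add headroom. The function below is a rule-of-thumb sketch; the traffic figures and the safety factor of 2 are assumptions, not measured values.

```python
def recommended_backlog_bytes(write_bytes_per_sec, max_disconnect_sec, safety_factor=2.0):
    """Estimate a backlog size that covers writes made during a disconnection."""
    return int(write_bytes_per_sec * max_disconnect_sec * safety_factor)


# Assumed workload: about 200 KB of replicated writes per second,
# with the goal of surviving a 60-second network interruption.
size = recommended_backlog_bytes(200_000, 60)
print(f"repl-backlog-size {size}")   # prints: repl-backlog-size 24000000
```

The printed line has the shape of a redis.conf directive, so the result can be pasted into the configuration after checking it against available memory.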
Automatic Failover with Sentinel
If the master crashes, manual promotion is slow and unsuitable for production. Sentinel automates failover through three stages:
Monitoring: Sentinels ping masters and replicas; a node that fails to respond in time is marked as down.
Election: Sentinels choose a new master from the replicas based on replica priority, replication offset, and run ID.
Notification: The new master is announced, and the remaining replicas execute replicaof against it to re‑establish replication.
Using multiple Sentinel instances reduces false positives caused by network jitter.
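The selection order and the voting idea can be sketched as follows. This is a simplified model, not Sentinel's actual code: Replica, pick_new_master, and is_objectively_down are hypothetical names, and the real process also filters out replicas that have been disconnected for too long. It assumes lower replica-priority wins, a priority of 0 means never promote, a larger offset wins next, and the lexicographically smaller run ID breaks remaining ties.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    priority: int      # replica-priority: lower is preferred, 0 means never promote
    repl_offset: int   # how much of the master's stream this replica has received
    run_id: str


def pick_new_master(replicas):
    """Rank by priority, then by largest offset, then by smallest run ID."""
    candidates = [r for r in replicas if r.priority != 0]
    if not candidates:
        return None
    return min(candidates, key=lambda r: (r.priority, -r.repl_offset, r.run_id))


def is_objectively_down(down_votes, quorum):
    """One Sentinel's opinion is not enough; enough Sentinels must agree."""
    return down_votes >= quorum


replicas = [
    Replica("r1", priority=100, repl_offset=500, run_id="bbb"),
    Replica("r2", priority=100, repl_offset=900, run_id="ccc"),
    Replica("r3", priority=0, repl_offset=999, run_id="aaa"),   # excluded
    Replica("r4", priority=100, repl_offset=900, run_id="aab"),
]
print(pick_new_master(replicas).name)   # prints: r4
print(is_objectively_down(down_votes=2, quorum=2))
```

Requiring a quorum is what lets several Sentinels ride out a network hiccup that only one of them observed.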
NiuNiu MaTe
Joined Tencent (nicknamed "Goose Factory") through campus recruitment at a second‑tier university. Career path: Tencent → foreign firm → ByteDance → Tencent. Started as an interviewer at the foreign firm and hopes to help others.
