Mastering Redis Replication: From Sync Process to Partial and Full Copy
This article provides a comprehensive walkthrough of Redis replication, detailing the step‑by‑step sync process, data synchronization mechanisms, full and partial copy workflows, heartbeat design, and asynchronous replication, while highlighting key commands like PSYNC and their practical implications.
1. Replication Process
Step 1: The slave node executes the SLAVEOF command.
Step 2: The slave records the master information from the command but does not start replication immediately.
Step 3: A periodic task on the slave detects the master info and opens a socket connection.
Step 4: After a successful connection, the slave sends a PING and expects a PONG; otherwise it retries.
Step 5: If the master requires authentication, the slave must pass it; failure aborts replication.
Step 6: Once authenticated, the master sends all data to the slave – the longest part of the process.
Step 7: After the master finishes sending data, the replication link is established and the master continuously forwards write commands to keep data consistent.
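The seven steps above can be read as a small state machine on the slave side. The following is a minimal sketch under that reading; the state and event names are hypothetical labels chosen for illustration and are not Redis's internal constants.

```python
from enum import Enum, auto


class ReplState(Enum):
    NONE = auto()        # no master configured
    CONNECT = auto()     # master recorded, waiting for the periodic task
    CONNECTING = auto()  # socket open, PING sent
    AUTH = auto()        # PONG received, authenticating
    TRANSFER = auto()    # receiving the full data set
    CONNECTED = auto()   # link established, write commands streaming


# (current state, event) -> next state; event names are illustrative only.
TRANSITIONS = {
    (ReplState.NONE, "slaveof"): ReplState.CONNECT,
    (ReplState.CONNECT, "cron_tick"): ReplState.CONNECTING,
    (ReplState.CONNECTING, "pong"): ReplState.AUTH,
    (ReplState.CONNECTING, "timeout"): ReplState.CONNECT,  # retry on next tick
    (ReplState.AUTH, "auth_ok"): ReplState.TRANSFER,
    (ReplState.AUTH, "auth_failed"): ReplState.NONE,       # replication aborted
    (ReplState.TRANSFER, "data_loaded"): ReplState.CONNECTED,
}


def next_state(state: ReplState, event: str) -> ReplState:
    """Return the state that follows `event`; unknown events change nothing."""
    return TRANSITIONS.get((state, event), state)


def run(events) -> ReplState:
    """Feed a sequence of events to a slave that starts with no master."""
    state = ReplState.NONE
    for event in events:
        state = next_state(state, event)
    return state
```

Feeding the events slaveof, cron_tick, pong, auth_ok, data_loaded in order ends in CONNECTED, while a timeout after the PING drops back to CONNECT so that the next periodic tick retries.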
2. Data Synchronization
The synchronization step discussed above is referred to as “data synchronization”. Redis uses two commands for this: SYNC (pre‑2.8) and PSYNC (introduced in 2.8). The focus here is on PSYNC.
2.1 Components Required for PSYNC
Replication offsets of both master and slave.
Master's replication backlog buffer.
Master's run ID.
2.2 Master and Slave Replication Offsets
Both nodes maintain their own replication offsets.
The master records the cumulative byte length of the write commands it has processed in master_repl_offset, reported by INFO replication.
The slave reports its offset to the master every second; the master also stores the slave’s offset.
The slave updates its own offset when it receives commands from the master.
Comparing offsets determines data consistency.
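To make the offset comparison concrete, here is a minimal sketch that parses the text of INFO replication from a master and computes how many bytes each slave is behind. The sample text is invented for illustration, and the helper names are hypothetical; only the key:value layout and the slaveN line format follow what Redis prints.

```python
def parse_info_replication(text: str) -> dict:
    """Parse the `key:value` lines of INFO replication into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the "# Replication" header
        key, _, value = line.partition(":")
        info[key] = value
    return info


def slave_lag_bytes(info: dict) -> dict:
    """Return {slaveN: master_repl_offset - slave offset} for each slave line."""
    master_offset = int(info["master_repl_offset"])
    lags = {}
    for key, value in info.items():
        if key.startswith("slave") and "offset=" in value:
            fields = dict(item.split("=", 1) for item in value.split(","))
            lags[key] = master_offset - int(fields["offset"])
    return lags


# Invented sample output, shaped like what a master prints.
SAMPLE = """\
# Replication
role:master
connected_slaves:1
slave0:ip=127.0.0.1,port=6380,state=online,offset=1055130,lag=1
master_repl_offset:1055214
"""

if __name__ == "__main__":
    print(slave_lag_bytes(parse_info_replication(SAMPLE)))
```

A difference of zero means the slave has received everything the master has written; a growing difference suggests network delay or a blocked slave.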
2.3 Master Replication Backlog Buffer
The backlog is a fixed‑size FIFO queue (default 1 MB) on the master.
It is created when a slave connects; the master writes commands to both the slave and the backlog.
The backlog helps with partial replication and recovery of lost commands; its status is visible via INFO REPLICATION.
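The behaviour of the backlog can be sketched as a bounded byte buffer addressed by replication offset. This is a simplified model, not Redis's implementation: Redis uses a circular buffer, whereas the sketch trims a bytearray for brevity, and the class and method names are hypothetical. Here an offset means the number of bytes the slave has already received.

```python
class ReplBacklog:
    """Fixed-size FIFO byte buffer indexed by replication offset."""

    def __init__(self, size: int):
        self.size = size
        self.buf = bytearray()
        self.master_repl_offset = 0  # total bytes ever written

    def feed(self, data: bytes) -> None:
        """Append propagated command bytes, evicting the oldest on overflow."""
        self.buf += data
        self.master_repl_offset += len(data)
        if len(self.buf) > self.size:
            del self.buf[: len(self.buf) - self.size]

    @property
    def first_offset(self) -> int:
        """Offset of the oldest byte still held in the buffer."""
        return self.master_repl_offset - len(self.buf)

    def read_from(self, offset: int):
        """Return the bytes after `offset`, or None if they were evicted."""
        if offset < self.first_offset or offset > self.master_repl_offset:
            return None
        return bytes(self.buf[offset - self.first_offset:])
```

With a 10-byte backlog that has been fed 12 bytes, a slave at offset 8 can still fetch the last 4 bytes, while a slave at offset 1 gets None because those bytes have been overwritten, which is the case that forces a full sync.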
2.4 Master Run ID
Each Redis instance generates a 40‑character run ID at startup.
The run ID uniquely identifies a node. If the master restarts and its run ID changes, slaves will trigger a full sync.
2.5 Keeping Run ID on Restart
Using DEBUG RELOAD can reload the RDB while preserving the run ID, avoiding unnecessary full syncs.
However, DEBUG RELOAD blocks the main thread, so it should be used cautiously.
2.6 PSYNC Command Usage
Format:
PSYNC {runId} {offset}
runId: the master’s run ID.
offset: the slave’s current replication offset.
2.7 PSYNC Execution Flow
The slave sends PSYNC to the master, and the master answers with one of three replies:
+FULLRESYNC {runId} {offset}: triggers a full sync.
+CONTINUE: triggers a partial sync.
-ERR: the master does not support PSYNC (version below 2.8); the slave falls back to SYNC for a full sync.
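The decision the master makes, and the way the slave reads the reply, can be sketched as two small functions. This is an illustrative model under stated assumptions: the function names and the backlog_first_offset parameter are hypothetical, the offset is treated as the number of bytes the slave already holds, and the error case is matched on the `-` prefix that error replies carry in the Redis protocol.

```python
def handle_psync(master_run_id: str, master_offset: int,
                 backlog_first_offset: int,
                 req_run_id: str, req_offset: int) -> str:
    """Return the reply line a master would send for PSYNC {runId} {offset}."""
    same_master = req_run_id == master_run_id
    in_backlog = backlog_first_offset <= req_offset <= master_offset
    if same_master and in_backlog:
        return "+CONTINUE"
    # Unknown run ID or data no longer in the backlog: start from scratch.
    return f"+FULLRESYNC {master_run_id} {master_offset}"


def parse_psync_reply(reply: str):
    """Slave side: classify the reply as (mode, run_id, offset)."""
    if reply.startswith("+FULLRESYNC"):
        _, run_id, offset = reply.split()
        return ("full", run_id, int(offset))
    if reply.startswith("+CONTINUE"):
        return ("partial", None, None)
    if reply.startswith("-ERR"):
        return ("fallback_sync", None, None)
    raise ValueError(f"unexpected reply: {reply!r}")
```

A slave that has never replicated has no run ID or offset to send, so it sends placeholders (`?` and `-1`), which this model answers with +FULLRESYNC.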
3. Full Synchronization
Full sync is the original replication method and is required the first time a master‑slave relationship is established. It is triggered by SYNC (before 2.8) or PSYNC (2.8 and later). The steps are:
Slave sends PSYNC (or SYNC).
Master replies with FULLRESYNC.
Slave records the master’s ID and offset.
Master performs BGSAVE and creates an RDB file.
Master sends the RDB file to the slave.
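While the RDB is being generated and transferred, the master keeps serving writes and buffers them in the slave’s client output buffer (not the replication backlog), then sends them once the transfer completes.
After receiving the complete RDB, the slave flushes its old data.
The slave loads the RDB into memory. For large files this takes time; whether the slave keeps answering reads with possibly stale data while a full sync is in progress is controlled by slave-serve-stale-data (enabled by default).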
If AOF is enabled, the slave immediately runs BGREWRITEAOF.
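The main costs of a full sync are the BGSAVE on the master, the RDB transfer over the network, the flush and RDB load on the slave, and the optional AOF rewrite.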
A gigabit link moves roughly 100 MB/s at best, so an RDB of 6 GB or more needs at least about 60 s to transfer. That can run into the default 60 s repl-timeout and fail the sync; raise repl-timeout for large data sets.
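Disk‑less replication (streaming the RDB directly over the network instead of writing it to disk first) was experimental when it was introduced in Redis 2.8.18; check the documentation for your version before relying on it.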
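The 6 GB estimate above is simple arithmetic, and the same calculation works as a rough planning check for other data sizes. The sketch below assumes a sustained 100 MB/s, which is an optimistic figure for a gigabit link with no competing traffic, and follows the article's rule of thumb of comparing the transfer time against repl-timeout. The function names are hypothetical.

```python
MB = 1024 ** 2
GB = 1024 ** 3


def rdb_transfer_seconds(rdb_bytes: int, bandwidth_bytes_per_s: float) -> float:
    """Lower bound on the time needed to ship an RDB file of the given size."""
    return rdb_bytes / bandwidth_bytes_per_s


def at_risk_of_timeout(rdb_bytes: int, bandwidth_bytes_per_s: float,
                       repl_timeout_s: float = 60.0) -> bool:
    """Rule of thumb: flag transfers that take at least repl-timeout seconds."""
    return rdb_transfer_seconds(rdb_bytes, bandwidth_bytes_per_s) >= repl_timeout_s


if __name__ == "__main__":
    seconds = rdb_transfer_seconds(6 * GB, 100 * MB)
    print(f"6 GB at 100 MB/s: {seconds:.2f} s")
    print("at risk with default timeout:", at_risk_of_timeout(6 * GB, 100 * MB))
```

Real transfers are slower than this bound because BGSAVE, disk I/O, and other traffic all take their share, so it is reasonable to leave generous headroom when choosing repl-timeout.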
4. Partial Synchronization
When a network glitch occurs during replication, the master can resend missing commands from the backlog, which is far cheaper than a full sync.
If the slave is disconnected longer than repl-timeout, the master aborts the connection.
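While the slave is disconnected, the master keeps serving writes and appends them to the replication backlog (1 MB by default).
When the slave reconnects, it sends PSYNC with the master’s run ID and its own replication offset.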
If the needed data is still in the backlog, the master replies +CONTINUE for partial sync.
The master then streams the buffered data to the slave.
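Whether a reconnecting slave can use partial sync comes down to whether the writes made during the outage still fit in the backlog. A rough sizing check, assuming a steady write rate (a simplification, since real traffic is bursty), might look like this; the function names are hypothetical.

```python
import math


def min_backlog_bytes(write_bytes_per_s: float, max_disconnect_s: float) -> int:
    """Smallest backlog that still holds every write made during the outage."""
    return math.ceil(write_bytes_per_s * max_disconnect_s)


def can_partial_sync(backlog_bytes: int, write_bytes_per_s: float,
                     disconnect_s: float) -> bool:
    """True if the writes made while disconnected still fit in the backlog."""
    return write_bytes_per_s * disconnect_s <= backlog_bytes


if __name__ == "__main__":
    default_backlog = 1024 * 1024  # 1 MB default
    rate = 50 * 1024               # assumed 50 KB/s of write traffic
    print(can_partial_sync(default_backlog, rate, 20))  # short glitch
    print(can_partial_sync(default_backlog, rate, 30))  # longer outage
    print(min_backlog_bytes(rate, 60))                  # size for a 60 s outage
```

At an assumed 50 KB/s, the default 1 MB backlog covers an outage of about 20 seconds; covering a full minute would take roughly 3 MB, which would be set through repl-backlog-size.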
5. Heartbeat Mechanism
After replication is established, master and slave maintain a long‑living connection and exchange heartbeat commands.
Each side treats the other as an ordinary client: in CLIENT LIST, the master’s connection appears on the slave with flags=M, and the slave’s connection appears on the master with flags=S.
Master sends PING every 10 seconds (configurable via repl-ping-slave-period).
Slave sends REPLCONF ACK {offset} every second to report its offset.
If the master does not receive a heartbeat within repl-timeout (default 60 s), it marks the slave as offline.
Deploying master and slave in the same data center reduces latency and heartbeat interruptions.
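The master-side timeout check described above can be sketched as follows. This is a simplified model, not Redis's code: the SlaveLink record and its field names are hypothetical, and timestamps are plain seconds passed in by the caller so the logic stays deterministic.

```python
from dataclasses import dataclass


@dataclass
class SlaveLink:
    last_ack_time: float  # when the last REPLCONF ACK arrived, in seconds
    ack_offset: int       # offset reported in that ACK


def lag_seconds(link: SlaveLink, now: float) -> float:
    """Seconds since the slave last reported in (normally 0 or 1)."""
    return now - link.last_ack_time


def is_offline(link: SlaveLink, now: float, repl_timeout: float = 60.0) -> bool:
    """True once no heartbeat has arrived for longer than repl-timeout."""
    return lag_seconds(link, now) > repl_timeout


def on_replconf_ack(link: SlaveLink, offset: int, now: float) -> None:
    """Record a heartbeat: refresh the timestamp and the reported offset."""
    link.last_ack_time = now
    link.ack_offset = offset
```

Because the slave sends REPLCONF ACK every second, a healthy link shows a lag of 0 or 1; a steadily rising value is an early sign of network trouble well before repl-timeout is reached.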
6. Asynchronous Replication
The master processes write commands and immediately returns to the client, while asynchronously forwarding the write to slaves.
Master receives and processes the command.
Master returns the response to the client.
For write commands, the master asynchronously sends them to the slave, which executes them in its main thread.
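The consequence of this ordering is that a client can receive OK before any slave has seen the write. The toy model below illustrates that window; it is an in-memory simulation with hypothetical class names, supports only SET and GET, and stands in for the network with an explicit propagate step.

```python
from collections import deque


class Master:
    def __init__(self):
        self.data = {}
        self.outbox = deque()  # writes not yet delivered to the slave

    def execute(self, cmd: str, key: str, value=None):
        """Apply the command locally and reply without waiting for the slave."""
        if cmd == "SET":
            self.data[key] = value
            self.outbox.append((cmd, key, value))  # queued, not awaited
            return "OK"
        if cmd == "GET":
            return self.data.get(key)
        raise ValueError(f"unsupported command: {cmd}")


class Slave:
    def __init__(self):
        self.data = {}

    def apply(self, cmd: str, key: str, value) -> None:
        if cmd == "SET":
            self.data[key] = value


def propagate(master: Master, slave: Slave) -> None:
    """Deliver queued writes in order, as the replication link would."""
    while master.outbox:
        slave.apply(*master.outbox.popleft())
```

Between execute returning OK and propagate running, a read on the slave misses the key. That gap is the replication delay, and a master failure inside it loses the acknowledged write.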
7. Summary
The article analyzed Redis replication principles, covering the replication process, data synchronization, full and partial copy workflows, heartbeat design, and asynchronous replication. It emphasized that RDB synchronization can be time‑consuming and highlighted the importance of the PSYNC command and backlog size for efficient incremental replication.
21CTO
21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.
