How Redis 2.8 Introduces Partial Replication to Avoid Full Sync on Network Glitches
This article explains Redis’s master‑slave replication mechanism, compares the full‑copy process of Redis 2.4.16 with the partial‑copy improvements introduced in Redis 2.8, details the state machine, replication‑cron workflow, repl_backlog buffer, run‑id handling, and configuration options for optimizing partial synchronization during network interruptions.
Redis is an open‑source, BSD‑licensed key/value cache known for its speed and rich data structures. In large‑scale deployments such as JD.com, the default master‑slave setup with failover is common, but full synchronization on network glitches can be costly.
Full Replication in Redis 2.4.16
When a slave receives the SLAVEOF ip port command, it enters REDIS_REPL_CONNECT and begins periodic checks via serverCron() and replicationCron(). Once connected, the slave moves through the states REDIS_REPL_CONNECTING and REDIS_REPL_TRANSFER, and finally receives a full RDB snapshot from the master.
The master, upon detecting a new slave, may start a background save (BGSAVE) to generate the RDB file. While the snapshot is being created, any write commands are buffered for the slave. After the snapshot finishes, the master sends the RDB file (REDIS_REPL_SEND_BULK) and then streams the buffered commands (REDIS_REPL_ONLINE).
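The slave-side progression above can be pictured as a small state machine. The sketch below is illustrative only: the first three state names come from the text, while REDIS_REPL_CONNECTED, the numeric values, the event names, and slave_next_state() are assumptions made for this example and are not the actual Redis code.

```c
#include <assert.h>

/* Slave-side replication states. Numeric values are arbitrary here. */
typedef enum {
    REDIS_REPL_CONNECT,     /* SLAVEOF received, must connect to the master */
    REDIS_REPL_CONNECTING,  /* connection attempt in progress */
    REDIS_REPL_TRANSFER,    /* receiving the RDB snapshot */
    REDIS_REPL_CONNECTED    /* assumed name: snapshot loaded, link is live */
} slave_repl_state;

/* Hypothetical events that drive the sketch. */
typedef enum {
    EV_CRON_TICK,     /* replicationCron() ran */
    EV_SOCKET_READY,  /* connection to the master established */
    EV_RDB_LOADED,    /* full snapshot received and loaded */
    EV_LINK_LOST      /* network glitch */
} slave_event;

/* Returns the next state for (state, event). In the 2.4-style flow a lost
 * link always sends the slave back to the start, which is why every
 * disconnection costs a full RDB transfer. */
slave_repl_state slave_next_state(slave_repl_state s, slave_event ev) {
    assert(s <= REDIS_REPL_CONNECTED);
    if (ev == EV_LINK_LOST) return REDIS_REPL_CONNECT;
    switch (s) {
    case REDIS_REPL_CONNECT:
        return ev == EV_CRON_TICK ? REDIS_REPL_CONNECTING : s;
    case REDIS_REPL_CONNECTING:
        return ev == EV_SOCKET_READY ? REDIS_REPL_TRANSFER : s;
    case REDIS_REPL_TRANSFER:
        return ev == EV_RDB_LOADED ? REDIS_REPL_CONNECTED : s;
    case REDIS_REPL_CONNECTED:
        return s;
    }
    return s;
}
```

Feeding EV_LINK_LOST to a slave in any state returns REDIS_REPL_CONNECT, so the whole sequence, including the snapshot transfer, has to run again.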
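To make the master-side buffering concrete, here is a minimal sketch of how writes might be accounted for while one slave waits for its snapshot. REDIS_REPL_SEND_BULK and REDIS_REPL_ONLINE are the names used in the text; REDIS_REPL_WAIT_BGSAVE_END, the slave_link struct, and the link_* helpers are assumptions for illustration and only count bytes instead of storing real commands.

```c
#include <assert.h>
#include <stddef.h>

/* Master-side view of a single slave. */
typedef enum {
    REDIS_REPL_WAIT_BGSAVE_END, /* assumed name: snapshot still being written */
    REDIS_REPL_SEND_BULK,       /* RDB file is being sent */
    REDIS_REPL_ONLINE           /* slave receives the live command stream */
} master_side_state;

typedef struct {
    master_side_state state;
    size_t buffered_bytes;  /* writes queued until the slave is online */
    size_t streamed_bytes;  /* writes already forwarded to the slave */
} slave_link;

void link_init(slave_link *l) {
    assert(l != NULL);
    l->state = REDIS_REPL_WAIT_BGSAVE_END;
    l->buffered_bytes = 0;
    l->streamed_bytes = 0;
}

/* A write command of len bytes was executed on the master. */
void link_on_write(slave_link *l, size_t len) {
    assert(l != NULL);
    if (l->state == REDIS_REPL_ONLINE) l->streamed_bytes += len;
    else l->buffered_bytes += len;
}

/* BGSAVE finished: start sending the RDB file. */
void link_on_bgsave_done(slave_link *l) {
    assert(l != NULL);
    if (l->state == REDIS_REPL_WAIT_BGSAVE_END) l->state = REDIS_REPL_SEND_BULK;
}

/* RDB file fully sent: flush the queued writes and go online. */
void link_on_bulk_sent(slave_link *l) {
    assert(l != NULL);
    if (l->state != REDIS_REPL_SEND_BULK) return;
    l->streamed_bytes += l->buffered_bytes;
    l->buffered_bytes = 0;
    l->state = REDIS_REPL_ONLINE;
}
```

The point of the sketch is the ordering: nothing written during the snapshot or the bulk transfer is lost, it is simply delivered after the RDB file.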
If the network disconnects, the slave must restart the whole process, causing a full data transfer even for small updates.
Partial Replication in Redis 2.8
Redis 2.8 adds a runid field (generated by getRandomHexChars()) to uniquely identify each server instance. It also introduces several new replication state variables, such as master_repl_offset and the repl_backlog fields, enabling the master to send only the commands missed during a brief disconnection.
char runid[REDIS_RUN_ID_SIZE+1]; /* ID always different at every exec. */
The master keeps a circular buffer called repl_backlog (default 1 MiB) that holds the most recent bytes of the replication stream. When a slave reconnects, it sends the run ID of the master it was previously attached to, together with the replication offset it wants to resume from (its cached reploff plus one). If the run ID matches the master's own and the requested offset still lies inside the backlog, that is, between repl_backlog_off and repl_backlog_off + repl_backlog_histlen, the master streams only the missing commands and avoids a full RDB transfer. Otherwise it falls back to a full sync.
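The backlog bookkeeping and the "is the requested offset still available" check can be sketched as follows. This is a simplified model under stated assumptions, not the Redis source: the backlog struct, its field names, the tiny 16-byte capacity, and the helper functions are all hypothetical, and offsets are 1-based byte positions in the replication stream.

```c
#include <assert.h>
#include <string.h>

#define BACKLOG_SIZE 16  /* tiny on purpose; the default in the text is 1 MiB */

typedef struct {
    char buf[BACKLOG_SIZE];
    long long size;        /* capacity in bytes */
    long long idx;         /* next write position inside buf */
    long long histlen;     /* number of valid bytes currently held */
    long long master_off;  /* total bytes ever fed to the backlog */
    long long first_off;   /* stream offset of the oldest byte still held */
} backlog;

void backlog_init(backlog *b) {
    assert(b != NULL);
    memset(b, 0, sizeof(*b));
    b->size = BACKLOG_SIZE;
    b->first_off = 1;
}

/* Append len bytes, overwriting the oldest data once the buffer wraps. */
void backlog_feed(backlog *b, const char *p, long long len) {
    assert(b != NULL && p != NULL && len >= 0);
    b->master_off += len;
    while (len > 0) {
        long long chunk = b->size - b->idx;  /* room before the wrap point */
        if (chunk > len) chunk = len;
        memcpy(b->buf + b->idx, p, (size_t)chunk);
        b->idx = (b->idx + chunk) % b->size;
        p += chunk;
        len -= chunk;
        b->histlen += chunk;
    }
    if (b->histlen > b->size) b->histlen = b->size;
    b->first_off = b->master_off - b->histlen + 1;
}

/* Nonzero if the byte at psync_off (and everything after it) is still held.
 * psync_off == first_off + histlen means the slave is already up to date. */
int backlog_can_serve(const backlog *b, long long psync_off) {
    return psync_off >= b->first_off && psync_off <= b->first_off + b->histlen;
}

/* Partial resync also requires that the slave names this master's run ID. */
int can_partial_resync(const backlog *b, const char *my_runid,
                       const char *asked_runid, long long psync_off) {
    if (strcmp(my_runid, asked_runid) != 0) return 0;
    return backlog_can_serve(b, psync_off);
}
```

With a 16-byte backlog, feeding 20 bytes leaves only offsets 5 through 20 available, so a slave asking for offset 4 would be sent a full sync while one asking for offset 5 would get a partial one.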
State Machine Enhancements
Redis 2.8 adds the REDIS_REPL_RECEIVE_PONG state, in which the slave sends a PING to the master and waits for the reply before attempting partial sync. If the slave has no cached master info, it sends a placeholder runid "?" and offset "-1", prompting the master to fall back to a full sync.
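A rough sketch of how the slave might choose its request arguments is shown below. The cached_master struct and format_psync() are hypothetical names for this example. The sketch assumes the request takes the form PSYNC runid offset and that the offset sent is the cached reploff plus one (the first byte the slave is missing). It prints the command as plain text for readability and does not model the actual wire encoding.

```c
#include <assert.h>
#include <stdio.h>

/* What a slave remembers about the master it was last attached to. */
typedef struct {
    const char *runid;   /* run ID of that master */
    long long reploff;   /* offset of the last byte received from it */
} cached_master;

/* Writes the request into out and returns the number of characters that
 * snprintf reports. Pass cm == NULL when nothing is cached; the "?" and -1
 * placeholders then tell the master that only a full sync is possible. */
int format_psync(char *out, size_t outlen, const cached_master *cm) {
    assert(out != NULL && outlen > 0);
    if (cm == NULL)
        return snprintf(out, outlen, "PSYNC ? -1");
    return snprintf(out, outlen, "PSYNC %s %lld", cm->runid, cm->reploff + 1);
}
```

For a cached master with run ID abc and reploff 99, the function produces "PSYNC abc 100"; with no cached master it produces "PSYNC ? -1".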
Configuration Tweaks
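The default 1 MiB repl_backlog may be insufficient for high‑traffic workloads: once more data than the backlog can hold has been written during an outage, the slave's offset falls out of the buffer and a full sync is required anyway. Administrators can adjust its size with the repl-backlog-size setting, and use repl-backlog-ttl to set how many seconds the master keeps the backlog after the last slave disconnects.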
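A back-of-the-envelope way to size the backlog is to multiply the master's write rate by the longest disconnection you want to survive without a full sync. The helpers below are a hypothetical sketch of that arithmetic; the function names and the example figures are assumptions, not measured values.

```c
#include <assert.h>

#define DEFAULT_BACKLOG_BYTES (1024LL * 1024LL)  /* 1 MiB, as in the text */

/* Bytes of replication stream produced during an outage of outage_secs
 * seconds at an average write rate of write_bytes_per_sec. */
long long backlog_bytes_needed(long long write_bytes_per_sec,
                               long long outage_secs) {
    assert(write_bytes_per_sec >= 0 && outage_secs >= 0);
    return write_bytes_per_sec * outage_secs;
}

/* Nonzero if the default backlog would cover that outage. */
int default_backlog_sufficient(long long write_bytes_per_sec,
                               long long outage_secs) {
    return backlog_bytes_needed(write_bytes_per_sec, outage_secs)
           <= DEFAULT_BACKLOG_BYTES;
}
```

For example, at roughly 200 KiB/s of writes and a 30 second glitch, about 6,144,000 bytes accumulate, which is well beyond 1 MiB; in that scenario a repl-backlog-size of around 8mb would leave some headroom.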
Conclusion
By introducing runid tracking and a circular replication backlog, Redis 2.8 significantly reduces the overhead of master‑slave synchronization after transient network failures, allowing partial replication whenever possible. Properly sizing repl_backlog and tuning its TTL ensures the feature works effectively in production environments.
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.
