
Why Redis Replication Can Lose Data and How to Fix It

This article explains the causes of partial resynchronization failures, master‑slave data inconsistency, latency, dirty data, and data‑safety risks in Redis replication, and provides concrete configuration commands and mitigation strategies to ensure reliable data consistency.

JavaEdge

1. Partial Resynchronization After Failover

Since Redis 4.0, when a replica is promoted to master after a failover, it can still perform a partial resynchronization with the replicas of the old master. The promoted instance remembers the old replication ID and offset of its former master, so it can serve part of its backlog to replicas that connect asking for the old replication ID.

After promotion, the new master also gets a new replication ID, because from that point on it represents a different data-set history. The old master may come back and keep accepting writes for a while, so reusing the old ID would violate the rule that a replication ID/offset pair identifies exactly one data set.
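The rule above can be sketched as a small decision function. This is a simplified model, not Redis's actual code: the class MasterState, its field names, and the function can_partial_resync are hypothetical names chosen for illustration, and the model assumes the promoted instance keeps one previous replication ID together with the last offset that is valid under it.

```python
from dataclasses import dataclass


@dataclass
class MasterState:
    """Simplified replication state of the instance acting as master."""
    replid: str                # current replication ID
    replid2: str               # previous replication ID, kept after a promotion
    second_replid_offset: int  # highest offset still valid under replid2
    backlog_start: int         # first offset still held in the backlog
    backlog_end: int           # offset just past the newest data in the backlog


def can_partial_resync(master: MasterState, req_replid: str, req_offset: int) -> bool:
    """Return True if a replica asking for (req_replid, req_offset) can be
    served from the backlog instead of needing a full resynchronization."""
    same_history = req_replid == master.replid
    old_history = (
        req_replid == master.replid2
        and req_offset <= master.second_replid_offset
    )
    if not (same_history or old_history):
        return False
    # The requested offset must still be inside the backlog window.
    return master.backlog_start <= req_offset <= master.backlog_end


# A replica promoted at offset 1000: it has a new ID but remembers the old one.
promoted = MasterState(
    replid="new-id", replid2="old-id", second_replid_offset=1000,
    backlog_start=800, backlog_end=1200,
)
print(can_partial_resync(promoted, "old-id", 950))   # old history, before promotion
print(can_partial_resync(promoted, "old-id", 1100))  # old ID, but past the promotion point
```

In this model a replica of the old master that stopped at offset 950 can continue incrementally, while one claiming offset 1100 under the old ID cannot, because that offset belongs to a history the promoted instance never shared.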

When a replica is shut down gracefully and restarted, it stores the information needed to resync with its master in its RDB file, which is useful during upgrades. Use the SHUTDOWN command so that the replica saves and exits cleanly.

2. Master‑Slave Data Inconsistency

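2.1 Master Has More Data Than Replicas

The replica is behind the master, typically after a dropped connection. On reconnecting, it requests a partial resynchronization by sending PSYNC <replication ID> <offset> (PSYNC <master_run_id> <offset> before Redis 4.0), and the master sends only the missing part of the stream if that offset is still in its replication backlog.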

2.2 Replicas Have More Data Than Master

This happens when a replica is writable (replica-read-only no) and clients write to it directly. The remedy is to make the replica read-only, or to delete the replica’s data and perform a full resynchronization from the master.

3. Data Latency

It is advisable to write an external program that monitors the replication offset of master and slave nodes; when the latency exceeds a threshold, the program should raise an alarm or switch client reads to the master or another node.
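A minimal sketch of such a monitor's core logic follows. It assumes the text of INFO replication has already been fetched from the master by some client; the function names, the sample addresses, and the 1024-byte threshold are hypothetical. Fetching the text, raising the alarm, and redirecting reads are left out.

```python
def parse_info_replication(info_text: str) -> dict:
    """Parse the text of INFO replication into a flat key/value dict."""
    fields = {}
    for line in info_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        fields[key] = value
    return fields


def replica_lags(info_text: str) -> dict:
    """Return {"ip:port": bytes_behind} for every replica listed by the master."""
    fields = parse_info_replication(info_text)
    master_offset = int(fields["master_repl_offset"])
    lags = {}
    for key, value in fields.items():
        # Replica entries look like: slave0:ip=...,port=...,state=...,offset=...,lag=...
        if key.startswith("slave") and key[5:].isdigit():
            attrs = dict(item.split("=", 1) for item in value.split(","))
            name = f"{attrs['ip']}:{attrs['port']}"
            lags[name] = master_offset - int(attrs["offset"])
    return lags


def lagging_replicas(info_text: str, max_lag_bytes: int) -> list:
    """Names of replicas whose offset trails the master by more than the threshold."""
    lags = replica_lags(info_text)
    return sorted(name for name, lag in lags.items() if lag > max_lag_bytes)


# Hypothetical sample of what a master might report.
SAMPLE_INFO = """\
# Replication
role:master
connected_slaves:2
slave0:ip=10.0.0.2,port=6379,state=online,offset=9800,lag=0
slave1:ip=10.0.0.3,port=6379,state=online,offset=4000,lag=3
master_repl_offset:10000
"""

print(lagging_replicas(SAMPLE_INFO, max_lag_bytes=1024))  # ['10.0.0.3:6379']
```

The lag here is measured in bytes of replication stream, not in seconds, so a sensible threshold depends on the write volume of the deployment.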

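Set replica-serve-stale-data no (slave-serve-stale-data no before Redis 5) so that a replica that has lost its link to the master, or is still synchronizing, rejects most commands with an error. Depending on the Redis version, the error reads “SYNC with master in progress” or begins with MASTERDOWN.

When replica-serve-stale-data is yes (the default), the replica keeps answering client requests in that state, possibly with stale data, or with an empty data set if this is its first synchronization.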

When set to no, only the following commands are allowed: INFO, REPLICAOF, AUTH, PING, SHUTDOWN, REPLCONF, ROLE, CONFIG, SUBSCRIBE, UNSUBSCRIBE, PSUBSCRIBE, PUNSUBSCRIBE, PUBLISH, PUBSUB, COMMAND, POST, HOST, LATENCY.
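On the client side, a rejected read can be retried against the master. The sketch below is one way to do that, written without any Redis client library: read_from_replica and read_from_master are hypothetical callables supplied by the application, and the two marker strings are assumptions about the error text that should be checked against the Redis version in use.

```python
# Assumed fragments of the error a non-synced replica returns.
STALE_REPLICA_MARKERS = ("MASTERDOWN", "SYNC with master in progress")


def is_stale_replica_error(message: str) -> bool:
    """True if an error reply looks like a replica refusing to serve stale data."""
    return any(marker in message for marker in STALE_REPLICA_MARKERS)


def read_with_fallback(key, read_from_replica, read_from_master):
    """Try the replica first; fall back to the master if the replica refuses
    because it is not in sync. Any other error is re-raised unchanged."""
    try:
        return read_from_replica(key)
    except Exception as exc:
        if is_stale_replica_error(str(exc)):
            return read_from_master(key)
        raise


# Stand-ins for real client calls, for demonstration only.
def fake_replica_read(key):
    raise RuntimeError("MASTERDOWN Link with MASTER is down")


def fake_master_read(key):
    return f"value-of-{key}"


print(read_with_fallback("user:1", fake_replica_read, fake_master_read))
```

Falling back to the master trades read scalability for correctness, so it is best reserved for reads that cannot tolerate an error.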

4. Dirty Data

4.1 Causes

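Replicas do not expire keys on their own; they wait for the master to delete an expired key and replicate the DEL. A key that has already expired but has not yet been deleted on the master can therefore still be read from a replica, resulting in dirty data. Three deletion mechanisms on the master are involved: lazy deletion, periodic deletion, and active eviction.

Lazy deletion: each read on the master checks the key’s expiration; if the key has expired, the master deletes it and asynchronously replicates a DEL command to the replicas.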

Periodic deletion: the master periodically samples a number of keys, deletes those that are expired, and replicates the DEL to each replica.

Active eviction: when used memory exceeds maxmemory, Redis evicts keys according to its eviction policy.

4.2 Solutions

Ignore the inconsistency for read‑only scenarios where occasional stale reads are acceptable.

Route critical reads to the master to guarantee data correctness.

Configure replicas as read‑only to prevent them from writing dirty data.

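Rely on Redis’s own optimization by running Redis 3.2 or later: from that version on, a replica checks a key’s expiration when it is read and reports an expired key as missing, even though the key is only deleted once the master’s DEL arrives.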
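The difference between the two read behaviours can be illustrated with a toy model. ToyReplica and its methods are hypothetical and only mimic the idea described above; they are not how Redis is implemented.

```python
class ToyReplica:
    """Toy replica: it never deletes expired keys on its own and waits for
    the master's DEL instead."""

    def __init__(self):
        self.values = {}
        self.expire_at = {}  # key -> absolute expiry time in seconds

    def apply_set(self, key, value, expire_at=None):
        """Apply a replicated write, with an optional expiry time."""
        self.values[key] = value
        if expire_at is not None:
            self.expire_at[key] = expire_at
        else:
            self.expire_at.pop(key, None)

    def apply_del(self, key):
        """Apply a DEL replicated from the master."""
        self.values.pop(key, None)
        self.expire_at.pop(key, None)

    def get_without_check(self, key, now):
        """Older behaviour: return whatever is stored, expired or not."""
        return self.values.get(key)

    def get_with_check(self, key, now):
        """Newer behaviour: hide a logically expired key without deleting it."""
        deadline = self.expire_at.get(key)
        if deadline is not None and now > deadline:
            return None
        return self.values.get(key)


replica = ToyReplica()
replica.apply_set("session", "abc", expire_at=100)

# At time 105 the key has expired, but the master's DEL has not arrived yet.
print(replica.get_without_check("session", now=105))  # abc (dirty read)
print(replica.get_with_check("session", now=105))     # None

replica.apply_del("session")  # the DEL finally arrives from the master
print(replica.get_without_check("session", now=106))  # None
```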

5. Data Safety

5.1 Disabling Master Persistence

Turning off the master’s persistence (e.g., disabling RDB snapshots via bgsave) and leaving persistence to the replicas can improve the master’s performance, but it introduces a risk: if the master restarts, it comes back with an empty data set, and that empty data set is then replicated, causing data loss.

The recommended practice is to keep persistence enabled on both master and replicas; if persistence must be disabled due to slow disks, also disable the master’s automatic restart mechanism.

5.2 Risk Scenario

Master persistence is disabled; replicas copy data from the master, while the master holds only in‑memory data.

The master crashes and restarts; because persistence is off, its dataset is empty.

The restarted master has a new run ID, so the replicas reconnect and perform a full synchronization with the empty master, which wipes the data they had stored.
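The three steps above can be replayed with a toy simulation. ToyNode and full_sync are hypothetical helpers that model only what matters here: a restart keeps persisted data alone, and a full synchronization replaces the replica's data set with the master's.

```python
class ToyNode:
    """Toy Redis node with an in-memory data set and optional persistence."""

    def __init__(self, persistence: bool):
        self.persistence = persistence
        self.memory = {}
        self.disk = {}

    def write(self, key, value):
        self.memory[key] = value
        if self.persistence:
            self.disk[key] = value

    def restart(self):
        """A restart keeps only what was persisted."""
        self.memory = dict(self.disk)


def full_sync(master: ToyNode, replica: ToyNode):
    """Full synchronization: the replica discards its data and loads the master's."""
    replica.memory = dict(master.memory)
    if replica.persistence:
        replica.disk = dict(master.memory)


master = ToyNode(persistence=False)
replica = ToyNode(persistence=True)

master.write("order:1", "paid")
full_sync(master, replica)
print(replica.memory)  # {'order:1': 'paid'}

master.restart()            # persistence is off, so the master comes back empty
full_sync(master, replica)  # the replica resyncs with the empty master
print(replica.memory)       # {}
print(replica.disk)         # {}
```

Even though the replica persisted the data, the full synchronization overwrites both its memory and its on-disk copy, which is why an automatically restarting master without persistence is dangerous.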

5.3 Mitigation

Re‑enable master persistence, accepting the performance overhead.

If persistence must remain disabled, prevent automatic restart of the master (e.g., avoid Docker or script‑based auto‑restart mechanisms).
