
Why Redis Cluster Can Lose Data and How to Mitigate It

Redis Cluster does not guarantee strong consistency. With asynchronous replication or a network partition, data can be lost even after the client has received an acknowledgment. The WAIT command and a well-chosen node timeout reduce the risk but cannot eliminate it.

Java High-Performance Architecture

Nov 29, 2019


Redis Cluster does not guarantee strong consistency; in certain special scenarios, even if the client receives a write acknowledgment, data may still be lost.

Scenario 1: Asynchronous Replication

1. The client writes to master B.
2. Master B replies OK to the client.
3. Master B propagates the write to its slaves B1, B2, B3.

Master B replies to the client without waiting for confirmations from B1, B2, B3. If the master crashes before the slaves finish syncing, one of the slaves may be elected master and the previously written data is lost.
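The sequence above can be reproduced with a toy in-memory model. This is only a sketch: `Node` and `ToyShard` are hypothetical names invented for illustration, there is no real Redis involved, and the failover simply promotes the first slave.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy in-memory node; not a real Redis client or server."""
    name: str
    data: dict = field(default_factory=dict)


class ToyShard:
    """One master with slaves, replicating asynchronously."""

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = slaves
        self.backlog = []  # writes acknowledged but not yet replicated

    def write(self, key, value):
        # The master applies the write and acknowledges immediately.
        self.master.data[key] = value
        self.backlog.append((key, value))
        return "OK"

    def replicate(self):
        # Runs later, independently of the client acknowledgment.
        for key, value in self.backlog:
            for slave in self.slaves:
                slave.data[key] = value
        self.backlog.clear()

    def fail_over(self):
        # The master crashes; its unreplicated backlog dies with it.
        self.backlog.clear()
        self.master = self.slaves.pop(0)
        return self.master


shard = ToyShard(Node("B"), [Node("B1"), Node("B2"), Node("B3")])
shard.write("order:1", "paid")
shard.replicate()                      # this write reaches the slaves
ack = shard.write("order:2", "paid")   # acknowledged, not yet replicated
new_master = shard.fail_over()         # B crashes before replicating

print(ack)                             # OK
print("order:1" in new_master.data)    # True
print("order:2" in new_master.data)    # False: the acknowledged write is lost
```

The client saw OK for both writes, yet only the first one survives the failover.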

The WAIT command can improve data safety in this scenario. WAIT numreplicas timeout blocks the current client until its previous writes have been acknowledged by at least the specified number of slaves or the timeout (in milliseconds) expires, and it returns the number of slaves that acknowledged.

WAIT improves safety but still does not provide strong consistency: under more complex failure scenarios, a slave that never received the write can still be elected master.
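A minimal sketch of how a client might pair a write with WAIT follows. The helper `write_with_wait` and the `FakeConn` class are hypothetical; `conn` is assumed to expose `set(key, value)` and `wait(num_replicas, timeout)` in the style of redis-py, and the fake stands in for a live server so the example runs offline.

```python
def write_with_wait(conn, key, value, min_replicas=1, timeout_ms=100):
    """SET then WAIT on the same connection; report whether enough slaves acked.

    Assumption: `conn` exposes set(key, value) and wait(num_replicas, timeout).
    Both calls must travel over the same connection to the master that owns
    the key, because WAIT only covers earlier writes on that connection.
    """
    conn.set(key, value)
    acked = conn.wait(min_replicas, timeout_ms)  # number of slaves that acknowledged
    return acked >= min_replicas, acked


class FakeConn:
    """Hypothetical stand-in for a live connection so the sketch runs offline."""

    def __init__(self, replicas_that_ack):
        self.replicas_that_ack = replicas_that_ack
        self.store = {}

    def set(self, key, value):
        self.store[key] = value
        return True

    def wait(self, num_replicas, timeout):
        # A real server returns once num_replicas have acked or timeout (ms) expires.
        return self.replicas_that_ack


print(write_with_wait(FakeConn(replicas_that_ack=2), "order:2", "paid", min_replicas=2))
# (True, 2)
print(write_with_wait(FakeConn(replicas_that_ack=0), "order:2", "paid", min_replicas=1))
# (False, 0)
```

A False result does not undo the write: the master has already applied it. The caller only learns that the write is not yet as durable as requested and has to decide whether to retry, alert, or accept the risk.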

Scenario 2: Network Partition

Six nodes A, B, C, A1, B1, C1 (three masters and three slaves) and a client Z1.

A network partition splits the cluster into two sides: the majority side with A, C, A1, B1, C1, and the minority side with B and Z1.

Client Z1 can still write to B. If the partition heals quickly, the cluster resumes normal operation. If it lasts long enough for B1 to be promoted to master on the majority side, the writes Z1 sent to B in the meantime are lost.

There is a maximum time window during which Z1 can keep writing to B, and this bounds how much data can be lost.

After a certain period, the majority side of the partition will hold an election, a slave becomes master, and the minority side's master will refuse to accept write requests.

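This amount of time is a very important setting called the node timeout (cluster-node-timeout in the configuration; it is sometimes rendered as node expiration time).

Once the node timeout has elapsed, a master that cannot be reached is considered failed and can be replaced by one of its slaves. Likewise, a master that has been unable to reach the majority of the other masters for the node timeout enters an error state and stops accepting writes.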
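To get a feel for how the timeout bounds the loss, here is a toy timeline, not a model of the real failure detector. The function name, the write rate, and the partition length are made up for illustration; 15000 ms is the default value of cluster-node-timeout.

```python
def simulate_minority_master(node_timeout_ms, write_interval_ms, partition_ms):
    """Count writes a partitioned master accepts before it stops taking writes.

    Toy model: the partition starts at t=0, the client writes every
    write_interval_ms, and the master rejects writes once node_timeout_ms
    has elapsed without contact with the majority of masters.
    """
    accepted = rejected = 0
    for t in range(0, partition_ms, write_interval_ms):
        if t < node_timeout_ms:
            accepted += 1   # at risk: lost if a slave is promoted on the majority side
        else:
            rejected += 1   # master is in an error state, the client sees a failure
    return accepted, rejected


# Default timeout, one write every 100 ms, a 30 second partition.
at_risk, refused = simulate_minority_master(
    node_timeout_ms=15000, write_interval_ms=100, partition_ms=30000
)
print(at_risk, refused)   # 150 150

# Lowering the timeout shrinks the window of writes that can be lost.
print(simulate_minority_master(5000, 100, 30000))   # (50, 250)
```

A shorter timeout reduces the number of writes at risk, but it presumably also makes the cluster quicker to declare a node failed during a brief network hiccup, so the value is a trade-off rather than something to minimize blindly.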

Summary

Redis Cluster does not guarantee strong consistency and has data‑loss scenarios:

Asynchronous replication: the master acknowledges a write, then crashes before its slaves have received it; a slave is promoted and the write is lost. WAIT makes the client block until a given number of slaves have acknowledged the write, which approximates synchronous replication, but it still cannot fully rule out data loss and it costs performance.

Network partition: a master on the minority side keeps accepting writes; if a slave is promoted on the majority side, the old master rejoins as a slave when the partition heals, and the writes it accepted during the partition are lost. The node timeout (cluster-node-timeout) caps how long such a master keeps accepting writes, which limits the loss.
