
How a Redis Memory Upgrade Triggered Data Loss: Sentinel Failover Lessons

A Redis deployment lost data during a memory expansion: after a manual master-slave switch, the new master evicted a large share of its keys and became read-only. This write-up looks at Sentinel behavior, maxmemory settings, and the replica-ignore-maxmemory option to help prevent similar failures.

dbaplus Community

Background

A Redis service with a single‑master, single‑slave topology needed a memory upgrade. The upgrade required host restarts, and during a manual master‑slave switch the new master lost a large portion of data and entered a read‑only state for several minutes.

Redis High‑Availability Architecture

The cluster uses an odd number of Sentinels (2n+1 nodes) to monitor the master and its replica. Each Sentinel polls every Redis instance and marks an unresponsive node as subjectively down; once a majority of Sentinels agree, the node is considered objectively down and the following actions are taken:

If the down node is the master, a healthy replica is promoted to master.

If the down node is a replica, it is removed from the replica set.

When an objectively down node recovers, a majority of Sentinels reintegrate it as a replica. Clients discover the current master by querying Sentinel instead of hard‑coding addresses.
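The majority rule described above can be sketched as a toy model. This is only an illustration, not Sentinel's actual implementation: the function names are hypothetical, and in a real deployment the threshold for marking a master objectively down is the quorum value given in the `sentinel monitor` directive, which may differ from a strict majority.

```python
def majority(total_sentinels: int) -> int:
    """Smallest number of Sentinels that forms a majority."""
    return total_sentinels // 2 + 1


def is_objectively_down(sdown_votes: int, threshold: int) -> bool:
    """True when enough Sentinels report the node as subjectively down.

    `threshold` stands in for the agreement level; here we pass a
    majority, as the text describes (an assumption of this sketch).
    """
    return sdown_votes >= threshold


# With 3 Sentinels (n = 1), two votes are enough; one is not.
three_node_threshold = majority(3)
print(is_objectively_down(2, three_node_threshold))
print(is_objectively_down(1, three_node_threshold))
```

Running an odd number of Sentinels means a vote cannot split evenly, which is presumably why the 2n+1 sizing is used.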

Redis Sentinel architecture

Memory Expansion Procedure

1. Upgrade the memory of host B (the initial replica) and restart the host.

2. Verify that the Redis instance on B resynchronizes with the master:

   - Check the Redis log for errors.

   - Run INFO KEYSPACE on both nodes and confirm identical key counts per DB.

   - Write a test key on the master and read it back from the replica.

3. Execute SENTINEL FAILOVER <master-name> to promote B to master and demote A to replica.

4. Upgrade the memory of host A (the original master), restart it, and repeat the verification steps for the new master/replica pair.

If all checks succeed, the memory upgrade is considered complete.
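The key-count comparison in the verification step could be automated along the following lines. This is a minimal sketch that works on the raw text of INFO KEYSPACE, however it was collected; the function names and the sample numbers are made up for illustration.

```python
def parse_keyspace(info_text: str) -> dict:
    """Map each DB name (e.g. 'db0') to its key count from INFO KEYSPACE text."""
    counts = {}
    for line in info_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the '# Keyspace' section header
        db, _, fields = line.partition(":")
        for field in fields.split(","):
            name, _, value = field.partition("=")
            if name == "keys":
                counts[db] = int(value)
    return counts


def keyspace_mismatches(master_info: str, replica_info: str) -> dict:
    """Return {db: (master_keys, replica_keys)} for every DB whose counts differ."""
    master = parse_keyspace(master_info)
    replica = parse_keyspace(replica_info)
    return {
        db: (master.get(db, 0), replica.get(db, 0))
        for db in sorted(set(master) | set(replica))
        if master.get(db, 0) != replica.get(db, 0)
    }


# Hypothetical sample output from two nodes.
master_sample = "# Keyspace\ndb0:keys=5000000,expires=1200000,avg_ttl=3600000\n"
replica_sample = "# Keyspace\ndb0:keys=4999990,expires=1200000,avg_ttl=3600000\n"
print(keyspace_mismatches(master_sample, replica_sample))
```

Matching counts only confirm that replication has caught up. They say nothing about the replica's own maxmemory, so this check alone would not have flagged the problem described in the next section.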

Data Loss After Failover

Immediately after the failover in step 3, the key count dropped from several million to a few hundred thousand, and the new master became read‑only for about ten minutes. The root cause was an unexpectedly low maxmemory setting on the newly promoted master.

Replica B had restarted with maxmemory 3GB because an earlier CONFIG SET maxmemory change was never persisted to redis.conf. While B operated as a replica, Redis ignored the limit (the default behavior of replica-ignore-maxmemory), allowing it to hold more than 6 GB of data. After promotion the limit became active, triggering eviction and the read‑only state.
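A pre-failover check for this situation might look like the sketch below. It assumes the replica's used_memory (from INFO memory) and maxmemory (from CONFIG GET maxmemory) have already been read as byte counts; the function name and message wording are hypothetical.

```python
from typing import Optional

GIB = 1024 ** 3


def promotion_risk(used_memory: int, maxmemory: int) -> Optional[str]:
    """Describe the risk of promoting a replica, or return None if none is found.

    Both arguments are byte counts. maxmemory == 0 means "no limit".
    """
    if maxmemory == 0:
        return None
    if used_memory > maxmemory:
        excess = used_memory - maxmemory
        return (
            f"dataset exceeds maxmemory by {excess / GIB:.1f} GiB; "
            "promotion would activate the limit"
        )
    return None


# Roughly the situation in this incident: about 6 GiB of data, 3 GiB limit.
print(promotion_risk(6 * GIB, 3 * GIB))
```

Running such a check against the replica before SENTINEL FAILOVER would surface the mismatch while the limit is still being ignored.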

Root‑Cause Analysis

The runtime CONFIG SET maxmemory change was not written back to the static configuration file; after a restart the old 3 GB limit was reloaded.

Since Redis 5.0, the replica-ignore-maxmemory option defaults to yes, causing replicas to ignore maxmemory and rely on the master for eviction.

When the replica was promoted, replica-ignore-maxmemory no longer applied, so the 3 GB limit caused immediate eviction of keys, resulting in data loss.

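The eviction policy was volatile‑lru, which evicts only keys that have an expiration time. Keys without a TTL stayed in memory, so usage remained above the limit even after eviction, and the master rejected writes with out‑of‑memory errors. This is the read‑only state that was observed.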
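The interaction between the limit and volatile-lru can be illustrated with a toy model. This is a deliberately simplified sketch, not Redis's algorithm: real Redis approximates LRU by sampling keys, whereas this model assumes an exact LRU ordering, and the key names and sizes are invented.

```python
from collections import OrderedDict


def evict_volatile_lru(keys: OrderedDict, maxmemory: int):
    """Toy volatile-lru pass.

    `keys` maps name -> (size_bytes, has_ttl), ordered least recently
    used first. Only keys with a TTL are eligible for eviction.
    Returns (surviving keys, evicted names, writes_rejected).
    """
    used = sum(size for size, _ in keys.values())
    survivors = OrderedDict(keys)
    evicted = []
    for name, (size, has_ttl) in keys.items():
        if used <= maxmemory:
            break  # back under the limit, nothing more to evict
        if has_ttl:
            del survivors[name]
            evicted.append(name)
            used -= size
    # If only non-expiring keys remain and usage is still too high,
    # nothing else can be evicted and writes fail.
    return survivors, evicted, used > maxmemory


keys = OrderedDict(
    [("a", (2, True)), ("b", (3, False)), ("c", (2, True)), ("d", (3, False))]
)
survivors, evicted, writes_rejected = evict_volatile_lru(keys, maxmemory=5)
print(list(survivors), evicted, writes_rejected)
```

In this example every key with a TTL is evicted, yet the remaining keys still exceed the limit, which mirrors both symptoms of the incident: lost keys and rejected writes.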

Recommendations

When changing maxmemory at runtime, also persist the change to the configuration file (e.g., run CONFIG REWRITE) to keep both configurations synchronized.

Consider setting maxmemory 0 (no limit) if the workload permits, or explicitly set replica-ignore-maxmemory no on replicas that may be promoted.

Verify that the eviction policy matches the desired data‑retention behavior; volatile‑lru never evicts keys that have no expiration time.
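Drift between the running value and redis.conf can be detected before a restart. The sketch below parses the file text and compares it with the runtime value in bytes (as returned by CONFIG GET maxmemory). It is a simplification under stated assumptions: it ignores `include` directives, treats the last maxmemory line as authoritative, and the helper names and the 8 GiB runtime value are hypothetical.

```python
import re

# Redis memory units: 'k', 'm', 'g' are powers of 1000; 'kb', 'mb', 'gb' powers of 1024.
_UNITS = {
    "": 1, "b": 1,
    "k": 1000, "kb": 1024,
    "m": 1000 ** 2, "mb": 1024 ** 2,
    "g": 1000 ** 3, "gb": 1024 ** 3,
}


def parse_memory(value: str) -> int:
    """Convert a value such as '3gb' or '512mb' to bytes (case-insensitive)."""
    match = re.fullmatch(r"(\d+)([a-z]*)", value.strip().lower())
    if not match or match.group(2) not in _UNITS:
        raise ValueError(f"unrecognized memory value: {value!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def maxmemory_from_conf(conf_text: str) -> int:
    """Return the maxmemory set in the config text, or 0 (no limit) if absent."""
    limit = 0
    for line in conf_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].lower() == "maxmemory":
            limit = parse_memory(parts[1])
    return limit


def has_drift(conf_text: str, runtime_maxmemory: int) -> bool:
    """True when the persisted limit differs from the running one."""
    return maxmemory_from_conf(conf_text) != runtime_maxmemory


conf = "port 6379\n# maxmemory <bytes>\nmaxmemory 3gb\n"
print(has_drift(conf, 8 * 1024 ** 3))
```

When drift is reported, running CONFIG REWRITE (or editing redis.conf by hand) brings the two back in line.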

Understanding the distinction between volatile runtime settings and persisted configuration, as well as the effect of replica-ignore-maxmemory, is essential for safe Redis failover and memory upgrades.

