Why Does a Redis Distributed Lock Expire Early? Time Jumps, GC, and Safer Alternatives

In high‑concurrency Java services, a Redis lock can appear to expire before its TTL because a JVM Stop‑The‑World pause freezes the application's threads while Redis keeps counting down. The holder resumes unaware that its lock is gone, which leads to data races unless the risk is reduced with a watchdog or closed off with optimistic locking at the database level.

IT Services Circle

Problem Overview

When using Redis for distributed locking, developers usually set a TTL so that a crashed holder cannot deadlock everyone else. However, under heavy load another thread may acquire the lock while the original holder still believes it owns it and that the TTL has not elapsed, resulting in dirty data.

Typical Distributed‑Lock Pseudocode

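// Assumes a Jedis client named `jedis` (redis.clients.jedis.Jedis, SetParams).
String owner = UUID.randomUUID().toString();   // unique value per acquisition
// 1. Acquire lock atomically with a 10‑second TTL: SET lock_key <owner> NX EX 10
if ("OK".equals(jedis.set("lock_key", owner, SetParams.setParams().nx().ex(10)))) {
    try {
        // 2. Execute business logic (≈200 ms)
        doBusiness();
    } finally {
        // 3. Release lock only if we still own it (atomic compare-and-delete in Lua)
        String script = "if redis.call('get', KEYS[1]) == ARGV[1] then "
                      + "return redis.call('del', KEYS[1]) else return 0 end";
        jedis.eval(script, List.of("lock_key"), List.of(owner));
    }
}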

This works in the vast majority of cases, but the rare 0.01 % scenario described below can break it.

Time‑Jump Issue Caused by JVM Stop‑The‑World (STW) GC

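During a long Full GC or a host‑level pause (for example, a frozen virtual machine), the JVM stops all application threads. Wall‑clock time does not stop; the paused threads simply cannot observe it passing, and Redis keeps counting down the key’s TTL the whole time. The timeline looks like this: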

(0 s) Thread A acquires the lock, TTL = 10 s.

(0.1 s) Thread A starts doBusiness() and runs only a few lines.

(0.2 s) A long Full GC begins; the JVM pauses for almost 12 seconds.

(10 s) The TTL runs out and Redis expires lock_key.

(10.3 s) Thread B acquires the lock because the key no longer exists.

(12 s) GC finishes, Thread A wakes up, unaware that 12 s have passed, and continues to write to the database in the belief that it still holds the lock.

The result is two threads updating the same data concurrently – the lock has effectively vanished.
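The timeline above can be replayed deterministically with a small in‑memory model. This is only a sketch: FakeRedis and PauseTimeline are hypothetical names invented for illustration, the clock is advanced by hand instead of waiting, and nothing here talks to a real Redis server.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for Redis with a manually advanced clock (milliseconds). */
class FakeRedis {
    private final Map<String, String> values = new HashMap<>();
    private final Map<String, Long> expiresAt = new HashMap<>();
    private long nowMs = 0;

    /** Moves the server clock forward; the server keeps counting while a client is paused. */
    void advance(long ms) { nowMs += ms; }

    private void evictIfExpired(String key) {
        Long deadline = expiresAt.get(key);
        if (deadline != null && nowMs >= deadline) {
            values.remove(key);
            expiresAt.remove(key);
        }
    }

    /** Models SET key value NX PX ttlMs: succeeds only if the key is absent. */
    boolean setIfAbsent(String key, String value, long ttlMs) {
        evictIfExpired(key);
        if (values.containsKey(key)) return false;
        values.put(key, value);
        expiresAt.put(key, nowMs + ttlMs);
        return true;
    }

    /** Returns the current value, or null if the key is missing or expired. */
    String get(String key) {
        evictIfExpired(key);
        return values.get(key);
    }
}

class PauseTimeline {
    /** Replays the timeline and returns the lock owner Redis reports when Thread A resumes. */
    static String ownerWhenAResumes() {
        FakeRedis redis = new FakeRedis();
        boolean aHolds = redis.setIfAbsent("lock_key", "thread_A", 10_000); // 0 s
        redis.advance(200);     // 0.2 s: A has run a few lines, then the JVM pauses
        redis.advance(10_100);  // 10.3 s: the TTL ran out at 10 s while A was frozen
        boolean bHolds = redis.setIfAbsent("lock_key", "thread_B", 10_000);
        redis.advance(1_700);   // 12 s: A resumes; its local flag aHolds is still true
        if (!(aHolds && bHolds)) throw new IllegalStateException("unexpected timeline");
        return redis.get("lock_key");
    }
}
```

Calling PauseTimeline.ownerWhenAResumes() yields thread_B even though Thread A's local state still says it acquired the lock, which is exactly the window in which both threads write.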

Extending the TTL Is Not a Real Fix

Side effects: if the service crashes while holding the lock, the key stays in Redis until the longer TTL elapses, blocking every other client for that whole period.

Uncontrollable: you cannot predict how long a future STW pause or network delay will be.

Watchdog (Redisson) Mechanism

Redisson implements a “watchdog” that automatically renews the lock before it expires.

Thread A acquires the lock without an explicit lease time, so Redisson applies the default lockWatchdogTimeout of 30 s as the initial TTL.

Redisson schedules a background renewal task for that lock.

Every 10 s (TTL/3) the task checks whether Thread A still holds the lock.

If the lock is still owned, the task resets the TTL to 30 s.

As long as the process remains alive and no pause outlasts the remaining TTL, the lock does not expire, even when the business logic runs far longer than 30 s.
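The renewal loop can be sketched in plain Java as follows. This is not Redisson's implementation: LeaseStore and Watchdog are hypothetical classes, the store is an in‑memory single‑key stand‑in for Redis with a hand‑advanced clock, and tick() is exposed separately so that the renewal logic can be exercised without real timers.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical single-key lock store; its clock is advanced by hand. */
class LeaseStore {
    private String owner;        // null when the lock is free
    private long expiresAtMs;
    private long nowMs;

    synchronized void advance(long ms) { nowMs += ms; }

    private void evictIfExpired() {
        if (owner != null && nowMs >= expiresAtMs) owner = null;
    }

    synchronized boolean tryLock(String who, long ttlMs) {
        evictIfExpired();
        if (owner != null) return false;
        owner = who;
        expiresAtMs = nowMs + ttlMs;
        return true;
    }

    /** Resets the TTL only if `who` still owns the lock (check and extend as one step). */
    synchronized boolean renew(String who, long ttlMs) {
        evictIfExpired();
        if (!who.equals(owner)) return false;
        expiresAtMs = nowMs + ttlMs;
        return true;
    }
}

/** Renews a lease every ttl/3 and stops scheduling once a renewal fails. */
class Watchdog {
    private final ScheduledExecutorService timer =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "lock-watchdog");
            t.setDaemon(true);   // must not keep the JVM alive on its own
            return t;
        });
    private final LeaseStore store;
    private final String owner;
    private final long ttlMs;
    private volatile boolean lost;

    Watchdog(LeaseStore store, String owner, long ttlMs) {
        this.store = store;
        this.owner = owner;
        this.ttlMs = ttlMs;
    }

    /** One renewal attempt; returns false once the lock has been lost. */
    boolean tick() {
        if (!store.renew(owner, ttlMs)) lost = true;
        return !lost;
    }

    void start() {
        long period = ttlMs / 3;
        timer.scheduleAtFixedRate(() -> { if (!tick()) timer.shutdown(); },
                                  period, period, TimeUnit.MILLISECONDS);
    }

    void stop() { timer.shutdownNow(); }

    boolean isLost() { return lost; }
}
```

In this model the renewal is an ownership check plus a TTL reset performed atomically, and the caller is expected to call stop() when the business logic finishes. Note that the renewal task is itself a JVM thread, so it is frozen by the same pauses as the thread it protects.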

Watchdog Limitations in Extreme Cases

In a “black‑swan” scenario where a Full GC lasts longer than the lock's remaining TTL, the renewal task is paused along with every other thread. With a 30 s TTL renewed every 10 s, the remaining TTL just before a renewal is 20 s, so a pause of a little over 20 s can be enough. The lock expires in Redis, Thread B acquires it, and both threads may write conflicting data. A watchdog therefore cannot guarantee mutual exclusion for critical financial workloads.

Ultimate Solution: Optimistic Lock (Fencing Token)

To achieve true safety, combine the Redis lock with a database‑level check, either a fencing token or a version‑based optimistic lock, so that the storage layer itself rejects stale writers. The fencing‑token workflow is:

When acquiring the Redis lock, the client also obtains a monotonically increasing token (e.g., 33), for example from a Redis counter incremented with INCR.

The application includes this token in every database update.

The update succeeds only if no newer token has already been written; otherwise it affects zero rows.

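Example SQL (the row stores the highest token it has accepted, and a write is applied only if its token is not older than that):

UPDATE account SET money = 100, current_token = 33 WHERE id = 1 AND current_token <= 33;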

Or a classic version‑based optimistic lock:

UPDATE account SET money = 100, version = version + 1
WHERE id = 1 AND version = old_version;

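If another thread has already updated the row with a higher token or a newer version, the WHERE clause matches no rows and the update affects zero rows. The application detects this from the affected‑row count and can retry or abort.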
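The fencing check can be illustrated with a minimal in‑memory sketch. TokenIssuer and FencedAccount are hypothetical names: the first stands in for whatever hands out increasing tokens, and the second models a single row that remembers the highest token it has accepted. A real system would perform the same comparison inside the database with a conditional UPDATE.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical token source; in practice this could be a shared counter in Redis. */
class TokenIssuer {
    private final AtomicLong counter = new AtomicLong();

    /** Returns a token strictly greater than every token issued before. */
    long next() { return counter.incrementAndGet(); }
}

/** Models one account row that remembers the highest fencing token it has accepted. */
class FencedAccount {
    private long money;
    private long currentToken;

    /**
     * Plays the role of a conditional UPDATE: the write is applied only if the
     * caller's token is not older than the stored one.
     * Returns the number of affected rows: 1 if applied, 0 if the token is stale.
     */
    synchronized int update(long newMoney, long token) {
        if (token < currentToken) return 0;   // stale writer, reject
        money = newMoney;
        currentToken = token;
        return 1;
    }

    synchronized long money() { return money; }
}
```

If Thread A takes token 1, stalls, and Thread B then takes token 2 and writes first, A's late update(100, 1) returns 0 and B's value is preserved. The comparison allows equal tokens so that one lock holder can issue several updates under the same token.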

Conclusion

The lock appears to expire early because the JVM’s pause stops the application’s notion of time while Redis’s TTL keeps running. For ordinary services the Redisson watchdog is sufficient, but for high‑value financial operations you must also protect the critical section with a database‑level optimistic lock to avoid data corruption.
