Why Did Our API Hang? Uncovering Redis Connection Pool Deadlocks in Spring

A Java Spring application repeatedly stalled because of Redis connection pool deadlocks. This investigation uses system tools, JVM thread dumps, and Arthas to pinpoint the issue, then shows how proper pool configuration and safe Redis access keep the API from freezing.

Architect's Tech Stack

Problem Overview

For about a week, every API in the internal sandbox environment intermittently stopped responding. Restarting the service fixed the problem temporarily, but the hangs became more frequent, prompting a deeper investigation.

Initial Investigation

We SSHed into the server and ran top to check system load; the machine appeared normal, so the next step was to examine the JVM thread stacks.

Inspecting Threads

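We ran top -H -p 12798 to find the resource‑intensive threads of the Java process, then ran jstack 12798 and searched its output for thread 12799 to see whether it was stuck on a lock. Note that jstack on older JDKs prints the native thread ID in hexadecimal (nid=0x...), so the decimal ID reported by top has to be converted before grepping for it.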
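The conversion from the decimal thread ID shown by top to the hexadecimal nid printed by jstack can be sketched as follows. This is a minimal illustration, assuming a JDK whose thread dumps use the nid=0x... form; the class and method names are hypothetical.

```java
public class ThreadIdToNid {
    // Formats a decimal OS thread ID (as listed by top -H) in the
    // nid=0x<hex> form that jstack uses on JDKs printing hex native IDs.
    static String toNid(long tid) {
        return "nid=0x" + Long.toHexString(tid);
    }

    public static void main(String[] args) {
        // 12799 is the thread ID from the investigation above.
        System.out.println(toNid(12799));
    }
}
```

The resulting string is what to search for in the jstack output instead of the raw decimal number.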

Discovering the Redis Blockage

Debugging revealed that many http-nio threads were in a waiting state, indicating that API requests were blocked while trying to obtain a Redis connection.

Analyzing Redis Connection Code

/**
 * Returns a Jedis instance to be used as a Redis connection. The instance can be newly created or retrieved from a
 * pool.
 */
protected Jedis fetchJedisConnector() {
    try {
        if (usePool && pool != null) {
            return pool.getResource();
        }
        Jedis jedis = new Jedis(getShardInfo());
        jedis.connect();
        return jedis;
    } catch (Exception ex) {
        throw new RedisConnectionFailureException("Cannot get Jedis connection", ex);
    }
}

The pool.getResource() call caused threads to wait indefinitely because the pool configuration never set maxWaitMillis. It was left at its default of -1, which means a borrower waits with no timeout when the pool is exhausted.

public T getResource() {
    try {
        return internalPool.borrowObject();
    } catch (Exception e) {
        throw new JedisConnectionException("Could not get a resource from the pool", e);
    }
}

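Further inspection showed that when borrowMaxWaitMillis < 0, the pool's borrowObject calls takeFirst, which waits on notEmpty.await() until another thread returns a connection. If no connection is ever returned, that wait never ends, confirming the missing timeout configuration.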

public E takeFirst() throws InterruptedException {
    this.lock.lock();
    try {
        Object x;
        while ((x = this.unlinkFirst()) == null) {
            this.notEmpty.await();
        }
        return (E) x;
    } finally {
        this.lock.unlock();
    }
}
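The difference between an untimed and a timed wait on an empty pool can be reproduced with the JDK alone. The sketch below uses java.util.concurrent.LinkedBlockingDeque as a stand-in for the pool's internal idle queue (an assumption for illustration; the real pool ships its own deque class), and the class and method names are hypothetical.

```java
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.TimeUnit;

public class BorrowTimeoutDemo {
    // Stand-in for the pool's idle-object queue; it is never filled,
    // which mimics a pool whose connections are all checked out.
    static final LinkedBlockingDeque<String> idle = new LinkedBlockingDeque<>();

    // Comparable to borrowing with a positive max wait:
    // gives up and returns null once waitMillis has elapsed.
    static String borrowWithTimeout(long waitMillis) throws InterruptedException {
        return idle.pollFirst(waitMillis, TimeUnit.MILLISECONDS);
    }

    // Starts an untimed borrow on a helper thread and reports whether
    // that thread is still waiting after graceMillis.
    static boolean untimedBorrowStillBlocked(long graceMillis) throws InterruptedException {
        Thread borrower = new Thread(() -> {
            try {
                idle.takeFirst(); // waits until something is put back
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        borrower.setDaemon(true);
        borrower.start();
        borrower.join(graceMillis);
        boolean blocked = borrower.isAlive();
        borrower.interrupt(); // stop the helper so the demo can exit
        return blocked;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("timed borrow returned: " + borrowWithTimeout(100));
        System.out.println("untimed borrow still blocked: " + untimedBorrowStillBlocked(200));
    }
}
```

In a request-handling thread, the untimed variant corresponds to the stuck http-nio threads seen in the thread dump, while the timed variant fails fast and lets the caller report an error.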

Fixing the Pool Configuration

Added a proper timeout to the Redis pool:

JedisConnectionFactory jedisConnectionFactory = new JedisConnectionFactory();
JedisPoolConfig config = new JedisPoolConfig();
config.setMaxWaitMillis(2000); // 2 seconds
jedisConnectionFactory.setPoolConfig(config);
jedisConnectionFactory.afterPropertiesSet();

After restarting the service, the problem resurfaced in a different form: instead of hanging, requests now failed, and the Tomcat access log showed many 500 errors caused by the following exception:

org.springframework.data.redis.RedisConnectionFailureException: Cannot get Jedis connection; nested exception is redis.clients.jedis.exceptions.JedisConnectionException: Could not get a resource from the pool
    at org.springframework.data.redis.connection.jedis.JedisConnectionFactory.fetchJedisConnector(JedisConnectionFactory.java:140)
    ...

Root Cause Analysis

The code stringRedisTemplate.getConnectionFactory().getConnection() obtained a Redis connection from the pool but never released it, leaving the connection in a non‑idle state and eventually exhausting the pool.

Cursor c = stringRedisTemplate.getConnectionFactory().getConnection().scan(options);
while (c.hasNext()) {
    // processing
}

Because the connection was not returned, subsequent requests blocked.
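The exhaustion mechanism can be illustrated without Redis at all. The following sketch models a fixed-size pool with a Semaphore and compares a workload that never returns its connections with one that returns them through try-with-resources. TinyPool, Lease, and the helper methods are hypothetical names invented for this illustration, not part of Jedis or Spring.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class PoolLeakDemo {
    /** Hypothetical fixed-size pool; each permit stands for one connection. */
    static final class TinyPool {
        private final Semaphore permits;
        private final long maxWaitMillis;

        TinyPool(int size, long maxWaitMillis) {
            this.permits = new Semaphore(size);
            this.maxWaitMillis = maxWaitMillis;
        }

        /** Borrows a connection, or throws once maxWaitMillis has elapsed. */
        Lease borrow() throws InterruptedException {
            if (!permits.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS)) {
                throw new IllegalStateException("Could not get a resource from the pool");
            }
            return new Lease(this);
        }

        int idle() {
            return permits.availablePermits();
        }
    }

    /** Handle for a borrowed connection; close() hands it back to the pool. */
    static final class Lease implements AutoCloseable {
        private final TinyPool pool;
        private boolean closed;

        Lease(TinyPool pool) {
            this.pool = pool;
        }

        @Override
        public void close() {
            if (!closed) {
                closed = true;
                pool.permits.release();
            }
        }
    }

    // Borrows without ever closing; returns how many requests got a connection.
    static int leakyRequests(TinyPool pool, int requests) throws InterruptedException {
        int served = 0;
        for (int i = 0; i < requests; i++) {
            try {
                pool.borrow(); // leaked: nothing returns this connection
                served++;
            } catch (IllegalStateException e) {
                // pool exhausted: this request fails
            }
        }
        return served;
    }

    // Same workload, but every borrowed connection is closed after use.
    static int safeRequests(TinyPool pool, int requests) throws InterruptedException {
        int served = 0;
        for (int i = 0; i < requests; i++) {
            try (Lease lease = pool.borrow()) {
                served++;
            }
        }
        return served;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("leaky served: " + leakyRequests(new TinyPool(2, 50), 5));
        System.out.println("safe served: " + safeRequests(new TinyPool(2, 50), 5));
    }
}
```

With a pool of two connections, the leaky workload can serve only as many requests as the pool has connections, which mirrors how the unreleased scan connection eventually starved every other request.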

Recommended Practices

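Instead of using the connection directly, wrap Redis operations in a RedisCallback so that Spring manages the connection lifecycle. Because the template releases the connection when the callback returns, iterate the cursor inside doInRedis rather than handing it back to the caller:

stringRedisTemplate.execute(new RedisCallback<Void>() {
    @Override
    public Void doInRedis(RedisConnection connection) throws DataAccessException {
        Cursor<byte[]> c = connection.scan(options);
        while (c.hasNext()) {
            byte[] key = c.next();
            // processing
        }
        return null;
    }
});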

Or explicitly release the connection after use, ideally in a finally block so that it is returned to the pool even when processing throws:

RedisConnectionUtils.releaseConnection(conn, factory);

Avoid the KEYS command in production, and configure the Redis pool with explicit limits and a wait timeout so that pool exhaustion surfaces as an error instead of a silent hang.

Written by Architect's Tech Stack