Diagnosing Redis Connection Pool Blocking in a Spring Boot Application
The article details a week‑long investigation of unresponsive APIs in a sandbox environment, tracing the root cause to Redis connection pool blocking in a Spring Boot service, and presents step‑by‑step debugging, code analysis, configuration fixes, and best‑practice recommendations for proper Redis usage.
The author reports a week‑long issue where APIs in an internal sandbox environment became completely unresponsive, prompting a systematic investigation.
Initial Investigation
Running top and then top -H -p 12798 (where 12798 is the process ID of the Java service) identified the most resource‑intensive threads, and jstack was then used to examine their stacks. The thread dump showed many threads parked on a lock while trying to obtain a Redis connection.
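To match a thread from top -H against a jstack dump, the decimal thread ID shown by top has to be converted to hexadecimal, because jstack prints each thread's native ID as a hex nid field. A minimal Java sketch of that conversion follows; the class name ThreadIdToNid and the thread ID 12806 are hypothetical and used only for illustration.

```java
/** Sketch: turn a decimal thread ID from `top -H` into the token to search for in jstack output. */
final class ThreadIdToNid {

    /** Formats a decimal native thread ID the way jstack prints it, e.g. nid=0x3206. */
    static String toNid(long decimalTid) {
        return "nid=0x" + Long.toHexString(decimalTid);
    }

    public static void main(String[] args) {
        // 12806 is a made-up thread ID, not one taken from the original investigation.
        System.out.println(toNid(12806L));
    }
}
```

Searching the jstack output for the resulting string locates the stack of the thread that top flagged.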
Jedis Pool Code Examination
/**
 * Returns a Jedis instance to be used as a Redis connection. The instance can be newly created or retrieved from a pool.
 */
protected Jedis fetchJedisConnector() {
    try {
        if (usePool && pool != null) {
            // Borrow from the pool; this is where the stuck threads were waiting
            return pool.getResource();
        }
        Jedis jedis = new Jedis(getShardInfo());
        jedis.connect();
        return jedis;
    } catch (Exception ex) {
        throw new RedisConnectionFailureException("Cannot get Jedis connection", ex);
    }
}
The call to pool.getResource() caused threads to wait indefinitely: the pool configuration had no maxWaitMillis setting, so a thread that found the pool empty stayed blocked inside the pool’s borrowObject logic.
public T getResource() {
    try {
        return internalPool.borrowObject();
    } catch (Exception e) {
        throw new JedisConnectionException("Could not get a resource from the pool", e);
    }
}
Further inspection of the pool’s borrowObject method showed that when borrowMaxWaitMillis < 0 the thread waits with no time limit until a connection is returned to the pool, confirming the missing timeout configuration.
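The difference between an unbounded and a bounded wait can be illustrated with a small stand-alone model. This is a simplified sketch built on java.util.concurrent, not the actual pool implementation used by Jedis; the class TinyPool and its methods are hypothetical names.

```java
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.TimeUnit;

/** Simplified model of a blocking connection pool (illustrative only). */
final class TinyPool {
    private final LinkedBlockingDeque<String> idle = new LinkedBlockingDeque<>();

    TinyPool(int size) {
        for (int i = 0; i < size; i++) {
            idle.addFirst("conn-" + i);
        }
    }

    /**
     * Borrows a connection. A negative maxWaitMillis waits without limit;
     * otherwise the call gives up once the timeout elapses.
     */
    String borrow(long maxWaitMillis) {
        try {
            String conn = (maxWaitMillis < 0)
                    ? idle.takeFirst()                                      // blocks until something is returned
                    : idle.pollFirst(maxWaitMillis, TimeUnit.MILLISECONDS); // null on timeout
            if (conn == null) {
                throw new IllegalStateException("Timeout waiting for idle object");
            }
            return conn;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while borrowing", e);
        }
    }

    /** Returns a connection to the pool so other callers can borrow it. */
    void giveBack(String conn) {
        idle.addFirst(conn);
    }

    public static void main(String[] args) {
        TinyPool pool = new TinyPool(1);
        pool.borrow(200); // takes the only connection and never returns it
        try {
            pool.borrow(200); // fails after the timeout instead of hanging
        } catch (IllegalStateException e) {
            System.out.println("second borrow failed: " + e.getMessage());
        }
        // Calling pool.borrow(-1) at this point would block this thread indefinitely.
    }
}
```

With a negative wait, an exhausted pool produces exactly the symptom seen in the thread dumps: request threads parked with nothing to wake them.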
Using Arthas for Thread Diagnosis
The author installed Alibaba’s Arthas and ran its thread and thread -b commands (the latter looks for a thread that is blocking others). The output showed many http-nio Tomcat threads in a waiting state, all blocked on the same lock object while acquiring a Redis connection.
Configuration Fix
A timeout was added to the Jedis pool configuration:
JedisPoolConfig config = new JedisPoolConfig();
// Wait at most 2 seconds for a free connection instead of blocking indefinitely
config.setMaxWaitMillis(2000);
JedisConnectionFactory jedisConnectionFactory = new JedisConnectionFactory();
jedisConnectionFactory.setPoolConfig(config);
jedisConnectionFactory.afterPropertiesSet();
After the application was restarted, the issue resurfaced in a different form: requests now returned 500 errors, with stack traces showing RedisConnectionFailureException caused by the inability to obtain a Jedis connection. The timeout had turned silent hangs into visible failures, but the pool was still running out of connections.
Root Cause Identification
The problematic code used stringRedisTemplate.getConnectionFactory().getConnection() to obtain a raw Redis connection for a SCAN operation without ever releasing it, causing the pool to exhaust its resources:
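// Leak: the connection obtained here is never released back to the pool
Cursor<byte[]> c = stringRedisTemplate.getConnectionFactory().getConnection().scan(options);
while (c.hasNext()) {
    // processing
}
Because the connection was never returned, every call permanently removed one connection from the pool. Once the pool was empty, subsequent requests blocked indefinitely (or, after the timeout fix, failed with an exception).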
Recommended Practices
Instead of directly accessing the low‑level connection, the author suggests using Spring’s RedisCallback to execute commands and automatically manage resource release:
stringRedisTemplate.execute(new RedisCallback<Cursor<byte[]>>() {
    @Override
    public Cursor<byte[]> doInRedis(RedisConnection connection) throws DataAccessException {
        // The template supplies the connection and releases it after the callback returns
        return connection.scan(options);
    }
});
Alternatively, after manual use, the connection should be released with:
RedisConnectionUtils.releaseConnection(conn, factory);
The article concludes with advice to avoid the KEYS command, configure Redis pools with appropriate timeouts, and prefer higher‑level Spring abstractions to prevent hidden blocking issues.
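The reason a callback-style API avoids this kind of leak is that the helper, not the caller, owns both the borrow and the release. The sketch below models that execute-around idea with plain java.util.concurrent types. It is not Spring's implementation; ScopedPool, execute, and idleCount are hypothetical names.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

/** Execute-around sketch: callers never hold a connection outside the callback. */
final class ScopedPool {
    private final BlockingQueue<String> idle;

    ScopedPool(int size) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add("conn-" + i);
        }
    }

    /** Runs the action with a borrowed connection and always returns it, even if the action throws. */
    <T> T execute(Function<String, T> action) {
        String conn;
        try {
            conn = idle.poll(200, TimeUnit.MILLISECONDS); // bounded wait, null on timeout
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while borrowing", e);
        }
        if (conn == null) {
            throw new IllegalStateException("Pool exhausted");
        }
        try {
            return action.apply(conn);
        } finally {
            idle.add(conn); // release happens here on every path
        }
    }

    /** Number of connections currently available for borrowing. */
    int idleCount() {
        return idle.size();
    }

    public static void main(String[] args) {
        ScopedPool pool = new ScopedPool(1);
        for (int i = 0; i < 3; i++) {
            // With a single connection, repeated calls succeed only because each one releases it.
            System.out.println(pool.execute(conn -> "used " + conn));
        }
    }
}
```

One consequence of this design is that the connection is only valid inside the callback. For a lazily fetched result such as a scan cursor, it is probably safest to iterate and collect the keys inside the callback rather than return the cursor itself; this is worth verifying against the Spring Data Redis version in use.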