Why My Java API Freezes: Diagnosing Redis Connection Pool Blocking and Fixes
An internal sandbox API hung repeatedly over the course of a week. A deep dive with top, jstack, and Arthas showed request threads stuck waiting on the Redis connection pool; setting a pool wait timeout and releasing connections properly restored service stability.
Problem Description
The internal sandbox environment experienced a week‑long API freeze where all API calls returned no response.
Initial Investigation
Restarting the application temporarily resolved the issue, but the problem recurred more frequently, prompting a deeper investigation.
Local IDE showed no errors; database and Redis were normal, so the suspicion shifted to the sandbox machines.
Server Inspection
SSH into the server and run top:
The machine appeared healthy, but the investigation continued by examining JVM thread stacks.
Identify Resource‑Intensive Threads
Run top -H -p 12798 to find the most CPU‑consuming threads:
JStack Analysis
Using jstack on the problematic process revealed several threads stuck in lock state, but no business‑related code was visible.
Inspect Redis Connection Code
/**
 * Returns a Jedis instance to be used as a Redis connection. The instance can be newly created or retrieved from a pool.
 * @return Jedis instance ready for wrapping into a RedisConnection.
 */
protected Jedis fetchJedisConnector() {
    try {
        if (usePool && pool != null) {
            return pool.getResource();
        }
        Jedis jedis = new Jedis(getShardInfo());
        // force initialization (see Jedis issue #82)
        jedis.connect();
        return jedis;
    } catch (Exception ex) {
        throw new RedisConnectionFailureException("Cannot get Jedis connection", ex);
    }
}
The waiting threads were parked inside pool.getResource(), that is, while trying to obtain a connection.
Pool Configuration Issue
public T getResource() {
    try {
        return internalPool.borrowObject();
    } catch (Exception e) {
        throw new JedisConnectionException("Could not get a resource from the pool", e);
    }
}
Once the pool is exhausted, internalPool.borrowObject() blocks indefinitely when borrowMaxWaitMillis < 0. No maximum wait time had been configured, so the default of -1 applied.
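The effect of a negative wait time can be reproduced without Redis. The sketch below is a simplified model of a blocking pool, not the commons-pool source; PoolWaitDemo, TinyPool, borrow, giveBack, and secondBorrowTimesOut are hypothetical names chosen for illustration. It assumes only the rule described above: a negative wait means "wait forever", while a non-negative wait gives up with an exception.

```java
import java.util.Arrays;
import java.util.NoSuchElementException;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.TimeUnit;

final class PoolWaitDemo {

    /** Simplified stand-in for a blocking object pool (hypothetical, for illustration only). */
    static final class TinyPool<T> {
        private final LinkedBlockingDeque<T> idle = new LinkedBlockingDeque<>();

        TinyPool(Iterable<T> resources) {
            for (T r : resources) {
                idle.addLast(r);
            }
        }

        /** A negative maxWaitMillis waits forever; otherwise the call gives up after the timeout. */
        T borrow(long maxWaitMillis) throws InterruptedException {
            if (maxWaitMillis < 0) {
                return idle.takeFirst(); // parks the caller until someone gives a resource back
            }
            T r = idle.pollFirst(maxWaitMillis, TimeUnit.MILLISECONDS);
            if (r == null) {
                throw new NoSuchElementException("Timeout waiting for idle object");
            }
            return r;
        }

        void giveBack(T resource) {
            idle.addFirst(resource);
        }
    }

    /**
     * Borrows twice from a one-element pool without giving anything back.
     * Returns true if the second borrow timed out. Passing a negative value
     * would make the second borrow hang, just like the frozen request threads.
     */
    static boolean secondBorrowTimesOut(long maxWaitMillis) {
        TinyPool<String> pool = new TinyPool<>(Arrays.asList("conn-1"));
        try {
            pool.borrow(maxWaitMillis); // borrowed and never given back
            pool.borrow(maxWaitMillis); // nothing left in the pool
            return false;
        } catch (NoSuchElementException e) {
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("second borrow timed out: " + secondBorrowTimesOut(200));
    }
}
```

With a finite wait the caller gets an exception it can log and handle; with a negative wait the same situation shows up only as a thread that never returns.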
Arthas Thread Inspection
Many http-nio threads were in WAITING state, indicating that Tomcat request threads were blocked.
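Idle Tomcat worker threads also report WAITING while they wait for work, so the thread state alone does not prove that requests are blocked; the stack frames do. As a rough in-process complement to jstack and Arthas, the sketch below counts threads by name prefix whose stack contains a given method. StuckThreadCounter and countThreadsIn are hypothetical names, and the code only sees threads of the JVM it runs in, so it would have to be called from inside the application (for example from a diagnostic endpoint).

```java
import java.util.Map;

final class StuckThreadCounter {

    /**
     * Counts live threads whose name starts with namePrefix and whose current
     * stack contains a frame for a method called methodName.
     */
    static int countThreadsIn(String namePrefix, String methodName) {
        int count = 0;
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            if (!entry.getKey().getName().startsWith(namePrefix)) {
                continue;
            }
            for (StackTraceElement frame : entry.getValue()) {
                if (frame.getMethodName().equals(methodName)) {
                    count++;
                    break; // count each thread at most once
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // In the incident described here, the interesting pair would be
        // Tomcat request threads parked in the pool's borrowObject call.
        System.out.println(countThreadsIn("http-nio", "borrowObject"));
    }
}
```

A count that approaches the size of the request thread pool suggests that nearly every request is waiting for a connection.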
Root Cause Identification
The scan code obtained Redis connections directly from the connection factory and never released them back to the pool. Each call to stringRedisTemplate.getConnectionFactory().getConnection() borrowed a connection that was never returned.
Cursor c = stringRedisTemplate.getConnectionFactory().getConnection().scan(options);
while (c.hasNext()) {
    // processing
}
Because the connection was not released, the pool was eventually exhausted, causing API timeouts and 500 errors.
Fix Implemented
Configure the Redis pool with a proper max‑wait timeout and ensure connections are released.
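JedisConnectionFactory jedisConnectionFactory = new JedisConnectionFactory();
JedisPoolConfig config = new JedisPoolConfig();
config.setMaxWaitMillis(2000); // give up after 2 seconds instead of waiting forever
// other config settings
jedisConnectionFactory.setPoolConfig(config); // otherwise the factory keeps its default pool settings
jedisConnectionFactory.afterPropertiesSet();
After redeploying, the problem came back, but this time as a visible error rather than a hang: the logs showed RedisConnectionFailureException: Cannot get Jedis connection.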
org.springframework.data.redis.RedisConnectionFailureException: Cannot get Jedis connection; nested exception is redis.clients.jedis.exceptions.JedisConnectionException: Could not get a resource from the pool
    at org.springframework.data.redis.connection.jedis.JedisConnectionFactory.fetchJedisConnector(JedisConnectionFactory.java:140)
    ...
The problematic code was borrowing a connection without returning it:
Cursor c = stringRedisTemplate.getConnectionFactory().getConnection().scan(options);
while (c.hasNext()) {
    // ...
}
Correct usage involves executing Redis commands via a callback that automatically releases the connection, e.g.:
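stringRedisTemplate.execute(new RedisCallback<Cursor>() {
    @Override
    public Cursor doInRedis(RedisConnection connection) throws DataAccessException {
        return connection.scan(options);
    }
});
The connection goes back to the pool as soon as doInRedis returns, so the cursor is best consumed inside the callback.
Or manually releasing the connection after use: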
RedisConnectionUtils.releaseConnection(conn, factory);
Additionally, avoid the KEYS command in production, and always configure the pool's maximum wait time so that an exhausted pool fails fast instead of hanging silently.
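The difference between leaking and releasing can be shown with a small stand-in for the pool. This is a sketch under stated assumptions, not Spring or Jedis code: LeaseDemo, Lease, borrow, and run are hypothetical names, a Semaphore plays the role of the pool's free connections, and the pool size of 8 and the 5 ms wait are arbitrary illustration values. The release sits in a finally block so that it also runs when the processing step throws.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

final class LeaseDemo {

    /** Stand-in for a pooled connection; close() hands the permit back exactly once. */
    static final class Lease implements AutoCloseable {
        private final Semaphore permits;
        private boolean released;

        Lease(Semaphore permits) {
            this.permits = permits;
        }

        @Override
        public void close() {
            if (!released) {
                released = true;
                permits.release();
            }
        }
    }

    /** Borrows a lease, or returns null if none becomes free within maxWaitMillis. */
    static Lease borrow(Semaphore permits, long maxWaitMillis) {
        try {
            boolean acquired = permits.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS);
            return acquired ? new Lease(permits) : null;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    /** Runs the given number of simulated requests and returns how many got a connection. */
    static int run(int poolSize, int calls, boolean release) {
        Semaphore permits = new Semaphore(poolSize);
        int served = 0;
        for (int i = 0; i < calls; i++) {
            Lease lease = borrow(permits, 5);
            if (lease == null) {
                continue; // pool exhausted: this request fails
            }
            try {
                served++; // the scan loop would run here
            } finally {
                if (release) {
                    lease.close(); // always hand the connection back
                }
            }
        }
        return served;
    }

    public static void main(String[] args) {
        System.out.println("leaking:   " + run(8, 20, false));
        System.out.println("releasing: " + run(8, 20, true));
    }
}
```

With these numbers the leaking run serves only the first 8 requests, one per pooled connection, while the releasing run serves all 20.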