
Why Did Our Redis Calls Take 1.2 Seconds? A Deep Dive into JedisPool Tuning

During a March 2023 load test, the team saw Redis request latency climb to 1.2 seconds. They traced the problem to misconfigured JedisPool parameters and the resulting connection churn, then fixed it by adjusting the pool settings, which substantially improved P50, P90 and P99 response times.

dbaplus Community

Background

During an online load test on 2023‑03‑08 the service exhibited severe latency spikes (P50 ≈ 400 ms, P90 ≈ 1.2 s, P99 ≈ 2 s). Detailed tracing identified Redis access latency of about 1.2 seconds as the dominant factor.

Why Was Redis Access So Slow?

Server side: The Redis instance (redis_amber_master_4xlarge_multithread, 16 CPU, 32 GB RAM, 480 GB SSD) can handle up to 240 k QPS and 30 k connections. Observed QPS during the peak was far below these limits.

Network: Bandwidth was not saturated and NIC retransmission rates were normal.

Client side: With the server and the network ruled out, the client was the remaining suspect, which led to a deeper client‑side investigation.

Client‑Side Diagnosis

JVM Full GC: ARMS monitoring showed increased young GC (YGC) counts but no Full GC (FGC) stop‑the‑world pauses.

JedisPool configuration: A heap dump revealed two critical metrics:

maxBorrowWaitTimeMillis, the longest time any thread has waited to borrow a connection from the pool, had reached about 1200 ms.

createdCount = 11555 and destroyedCount = 11553, indicating that connections were being created and destroyed repeatedly because connections returned above the max-idle limit were discarded.

These counters are maintained by the singleton JedisPool object for the lifetime of the JVM, so a post‑mortem heap dump was sufficient for analysis.
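The two counters can be read with simple arithmetic. The sketch below is not Jedis code: the class and method names are invented for illustration, and only the two counter values come from the heap dump described above.

```java
// Hypothetical helper for illustration; not part of Jedis or Commons Pool.
final class PoolChurn {
    /** Connections still alive: everything created minus everything destroyed. */
    static long liveConnections(long createdCount, long destroyedCount) {
        return createdCount - destroyedCount;
    }

    /** Fraction of all created connections that were later destroyed. */
    static double destroyedRatio(long createdCount, long destroyedCount) {
        return createdCount == 0 ? 0.0 : (double) destroyedCount / createdCount;
    }

    public static void main(String[] args) {
        long created = 11555;   // createdCount from the heap dump
        long destroyed = 11553; // destroyedCount from the heap dump
        System.out.println("live connections: " + liveConnections(created, destroyed));
        System.out.println("destroyed ratio:  " + destroyedRatio(created, destroyed));
    }
}
```

Only 2 of the 11555 connections ever created were still alive when the dump was taken, so nearly every connection the pool opened was thrown away again.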

How JedisPool Works

JedisPool is built on Apache Commons Pool 2's GenericObjectPool, and its settings class, redis.clients.jedis.JedisPoolConfig, extends GenericObjectPoolConfig. Important configuration properties (Spring Boot style) are:

spring.redis.jedis.pool.max-active=100
spring.redis.jedis.pool.max-idle=16
spring.redis.jedis.pool.time-between-eviction-runs-millis=30000
spring.redis.jedis.pool.min-idle=0
spring.redis.jedis.pool.test-while-idle=true
spring.redis.jedis.pool.num-tests-per-eviction-run=-1
spring.redis.jedis.pool.min-evictable-idle-time-millis=60000

Key behaviours:

max-active (mapped to maxTotal) limits the total number of connections (idle + active).

When a connection is returned, it is destroyed if keeping it would push the idle count above max-idle.

A background eviction thread runs every time-between-eviction-runs-millis (30 s in this configuration) to perform health checks and evictions.

min-idle=0 means the pool does not maintain a core size.

Root Cause: Pulse‑Style Request Pattern

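During a traffic burst, 84 connections were created and then destroyed within a short window (T2–T3). That number is consistent with the configuration above: under load the pool can grow to max-active = 100, but once the connections are returned it keeps only max-idle = 16, so 100 − 16 = 84 are discarded and have to be recreated for the next burst. Each creation incurs a TCP handshake and authentication, which adds significant overhead and produces the observed latency spikes.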
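The churn can be reproduced with a toy model of the borrow and return rules. This is a single-threaded sketch, not Commons Pool 2 code: the class and method names are invented, and it assumes each pulse needs 100 connections at the same moment, which the original test does not state explicitly.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Simplified, single-threaded model of the pool rules; illustration only.
final class PulseSimulation {
    final int maxTotal;
    final int maxIdle;
    final Deque<Integer> idle = new ArrayDeque<>(); // ids of idle connections
    int nextId = 0;
    int active = 0;
    long createdCount = 0;
    long destroyedCount = 0;

    PulseSimulation(int maxTotal, int maxIdle) {
        this.maxTotal = maxTotal;
        this.maxIdle = maxIdle;
    }

    /** Reuse an idle connection if one exists, otherwise create a new one. */
    int borrow() {
        Integer id = idle.pollFirst();
        if (id == null) {
            if (active + idle.size() >= maxTotal) {
                throw new IllegalStateException("pool exhausted");
            }
            id = nextId++;
            createdCount++; // in a real pool: TCP handshake plus authentication
        }
        active++;
        return id;
    }

    /** Keep the connection if the idle set has room, otherwise destroy it. */
    void giveBack(int id) {
        active--;
        if (idle.size() >= maxIdle) {
            destroyedCount++;
        } else {
            idle.addFirst(id);
        }
    }

    /** One traffic pulse: 'concurrency' requests hold a connection at once. */
    void pulse(int concurrency) {
        List<Integer> held = new ArrayList<>();
        for (int i = 0; i < concurrency; i++) {
            held.add(borrow());
        }
        for (int id : held) {
            giveBack(id);
        }
    }

    public static void main(String[] args) {
        PulseSimulation churning = new PulseSimulation(100, 16); // original settings
        PulseSimulation fixed = new PulseSimulation(100, 100);   // max-idle = max-active
        for (int i = 0; i < 3; i++) {
            churning.pulse(100); // assumed pulse size
            fixed.pulse(100);
        }
        System.out.println("max-idle=16:  created=" + churning.createdCount
                + " destroyed=" + churning.destroyedCount);
        System.out.println("max-idle=100: created=" + fixed.createdCount
                + " destroyed=" + fixed.destroyedCount);
    }
}
```

Under these assumptions the first pool destroys 84 connections after every pulse and has to recreate them on the next one, while the second pool creates its 100 connections once and never destroys any.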

Desired Pool Behaviour

Set max-active, max-idle and min-idle to the same stable value so that the pool size is fixed and connections are not created on‑the‑fly.

Enable periodic health‑checks to drop stale or timed‑out connections.

Avoid destroying idle connections solely because they have been idle too long.
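The first rule can be checked mechanically before a configuration ships. The class below is a hypothetical sketch, not a Jedis or Spring type. It also models one Commons Pool 2 detail worth knowing: GenericObjectPool reports a min-idle no larger than max-idle, so a min-idle set above max-idle is silently lowered.

```java
// Hypothetical value class for illustration; not a Jedis or Spring type.
final class PoolSettings {
    final int maxActive;
    final int maxIdle;
    final int minIdle;

    PoolSettings(int maxActive, int maxIdle, int minIdle) {
        this.maxActive = maxActive;
        this.maxIdle = maxIdle;
        this.minIdle = minIdle;
    }

    /** Models the cap that Commons Pool 2 applies: min-idle never exceeds max-idle. */
    int effectiveMinIdle() {
        return Math.min(minIdle, maxIdle);
    }

    /** True when the pool can neither shrink below nor grow above one size. */
    boolean isFixedSize() {
        return maxActive == maxIdle && maxIdle == minIdle;
    }

    public static void main(String[] args) {
        PoolSettings original = new PoolSettings(100, 16, 0); // settings before tuning
        PoolSettings fixed = new PoolSettings(500, 500, 500); // all three equal
        System.out.println("original fixed-size? " + original.isFixedSize());
        System.out.println("tuned fixed-size?    " + fixed.isFixedSize());
    }
}
```

A check like this could run in a unit test or at startup, so that a pool whose three limits drift apart is caught before it reaches a load test.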

Production Configuration Adjustments

# 4 machines × 500 = 2000 connections, still far below Redis capacity
spring.redis.jedis.pool.max-active=500
# same value as max-active and min-idle, so returned connections are kept
spring.redis.jedis.pool.max-idle=500
# keep-alive check every 30 s
spring.redis.jedis.pool.time-between-eviction-runs-millis=30000
# stable pool size
spring.redis.jedis.pool.min-idle=500
spring.redis.jedis.pool.test-while-idle=true
# test all idle connections on each run
spring.redis.jedis.pool.num-tests-per-eviction-run=-1
# never evict solely because of idle time
spring.redis.jedis.pool.min-evictable-idle-time-millis=-1
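Why min-evictable-idle-time-millis=-1 stops idle-time eviction can be shown with a small model. This is a simplified sketch with invented names. It assumes, as Commons Pool 2's default eviction policy appears to do, that a non-positive threshold is treated as "no limit", and it ignores the separate soft idle threshold.

```java
// Simplified model of the idle-time eviction test; names are illustrative.
final class IdleEvictionRule {
    /** A non-positive setting is treated as "no idle-time limit". */
    static long effectiveThresholdMillis(long minEvictableIdleTimeMillis) {
        return minEvictableIdleTimeMillis > 0 ? minEvictableIdleTimeMillis : Long.MAX_VALUE;
    }

    /** True when a connection would be evicted purely for sitting idle too long. */
    static boolean evictForIdleTime(long idleMillis, long minEvictableIdleTimeMillis) {
        return idleMillis > effectiveThresholdMillis(minEvictableIdleTimeMillis);
    }

    public static void main(String[] args) {
        long tenMinutes = 10 * 60 * 1000L;
        // Original setting (60000): a connection idle for ten minutes is evicted.
        System.out.println(evictForIdleTime(tenMinutes, 60000));
        // Tuned setting (-1): the same connection is kept.
        System.out.println(evictForIdleTime(tenMinutes, -1));
    }
}
```

With test-while-idle=true the eviction thread still validates idle connections on every run, so broken connections are dropped even though healthy ones are never evicted for age.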

Verification

A repeat load test on 2023‑04‑13 with the same traffic model produced the following improvements:

maxBorrowWaitTimeMillis fell by roughly 80 %.

createdCount dropped from 11555 to 500 (the initial pool size).

Business‑side latency improved markedly: P50 and P90 fell by ~60 %, P99 by ~70 %.

These results confirm that the mis‑configured JedisPool was the root cause of the Redis latency spikes and that the tuned settings restore high‑performance Redis access.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
