
Troubleshooting Dubbo Thread Pool Exhaustion: A Redis Performance Optimization Case Study

This case study covers a high-traffic Dubbo service, handling 1.8 billion requests per day, whose circuit breaker tripped periodically because the provider's thread pool was exhausted. The cause was traced to three factors: a bug that bypassed the local cache, latency spikes on Redis setex calls, and a commons-pool2 connection pool that was not warmed up. The fixes were to repair the bug, scale out Redis, tune the client pool parameters, and downgrade commons-pool2 so that the pool pre-warms, which depends on minEvictableIdleTimeMillis being configured.

vivo Internet Technology

This article documents how a high-traffic Dubbo interface at vivo was investigated and optimized. The interface handles 1.8 billion requests per day and was hitting circuit breaking at fixed times because its thread pool was exhausted.

Problem Background: The upstream caller reported that a specific Dubbo interface tripped its circuit breaker at fixed times each day, with exceptions indicating that the provider's Dubbo thread pool was exhausted. Failed requests reached 940,000 per day.

Root Cause Analysis: Traffic analysis showed spikes at fixed times that lined up with the error windows. The investigation identified three bottlenecks: (1) a code bug that fetched data directly from Redis instead of from the local cache, doubling Redis traffic; (2) Redis setex operations showing 99th-percentile latency spikes caused by connection pool issues; (3) the connection pool (commons-pool2) not warming up properly on cold start.
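The cache-bypass bug in item (1) can be illustrated with a minimal sketch. It is not the service's actual code: the class `TwoLevelCache`, its method names, and the `remoteLoader` function standing in for a Redis read are all hypothetical, and the sketch leaves out expiry and invalidation. It only shows why skipping the local cache turns every request into a remote call.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical two-level lookup: in-process cache first, remote store on a miss.
class TwoLevelCache {
    private final Map<String, String> local = new ConcurrentHashMap<>();
    private final Function<String, String> remoteLoader; // assumed stand-in for a Redis read
    private final AtomicInteger remoteCalls = new AtomicInteger();

    TwoLevelCache(Function<String, String> remoteLoader) {
        this.remoteLoader = remoteLoader;
    }

    // Intended path: only a local miss reaches the remote store.
    String get(String key) {
        return local.computeIfAbsent(key, k -> {
            remoteCalls.incrementAndGet();
            return remoteLoader.apply(k);
        });
    }

    // Buggy path: skips the local cache, so every call is a remote round trip.
    String getBypassingLocal(String key) {
        remoteCalls.incrementAndGet();
        return remoteLoader.apply(key);
    }

    int remoteCalls() {
        return remoteCalls.get();
    }

    public static void main(String[] args) {
        TwoLevelCache cached = new TwoLevelCache(k -> "value-of-" + k);
        TwoLevelCache bypassed = new TwoLevelCache(k -> "value-of-" + k);
        for (int i = 0; i < 1000; i++) {
            cached.get("item:42");
            bypassed.getBypassingLocal("item:42");
        }
        System.out.println("remote calls with local cache: " + cached.remoteCalls());
        System.out.println("remote calls when bypassed:    " + bypassed.remoteCalls());
    }
}
```

For 1000 reads of the same key, the cached path reaches the remote store once and the bypassing path reaches it 1000 times. In a real service the ratio depends on the local hit rate and expiry policy.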

Solutions Implemented: First, the team fixed the code bug that bypassed the local cache, which reduced Redis traffic. Next, they scaled Redis from 6 to 8 masters, which brought only minimal improvement. Finally, they tuned the client-side connection pool parameters (maxWaitMillis, blockWhenExhausted, etc.) and downgraded commons-pool2 from 2.6.2 to 2.4.2 so that the connection pool warms up properly through the startEvictor mechanism, with minEvictableIdleTimeMillis configured.
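To show why maxWaitMillis and blockWhenExhausted matter for a caller's thread pool, here is a simplified model of borrowing from a bounded pool. It is a sketch built on a plain `Semaphore`, not commons-pool2 code; the class `BoundedBorrow` and its methods are hypothetical, and only the two parameter names are taken from the article. In commons-pool2 the defaults are blockWhenExhausted = true and maxWaitMillis = -1, which means a borrower waits without a deadline when the pool is exhausted. The values the team actually chose are not stated here.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Simplified stand-in for a bounded connection pool; this is not commons-pool2 code.
class BoundedBorrow {
    private final Semaphore slots;
    private final boolean blockWhenExhausted;
    private final long maxWaitMillis; // negative means "wait without a deadline"

    BoundedBorrow(int maxTotal, boolean blockWhenExhausted, long maxWaitMillis) {
        this.slots = new Semaphore(maxTotal);
        this.blockWhenExhausted = blockWhenExhausted;
        this.maxWaitMillis = maxWaitMillis;
    }

    // Returns true if a slot was obtained, false if the pool stayed exhausted.
    boolean tryBorrow() {
        try {
            if (!blockWhenExhausted) {
                return slots.tryAcquire(); // fail immediately
            }
            if (maxWaitMillis < 0) {
                slots.acquire(); // unbounded wait: the calling thread is stuck until a return
                return true;
            }
            return slots.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS); // bounded wait
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    void giveBack() {
        slots.release();
    }

    public static void main(String[] args) {
        BoundedBorrow pool = new BoundedBorrow(2, true, 50);
        System.out.println("first borrow:  " + pool.tryBorrow());
        System.out.println("second borrow: " + pool.tryBorrow());
        System.out.println("third borrow:  " + pool.tryBorrow()); // gives up after about 50 ms
        pool.giveBack();
        System.out.println("after return:  " + pool.tryBorrow());
    }
}
```

With a bounded wait, a request thread that cannot get a connection fails quickly and is released, rather than sitting in the worker pool while more requests queue up behind it.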

Key Findings: Different commons-pool2 versions have different cold-start behaviors; the minEvictableIdleTimeMillis parameter is required for connection pool pre-warming to work. Default connection pool parameters are insufficient for high-traffic scenarios and require tuning.
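The effect of pre-warming can be sketched with a toy pool. This is an illustration under stated assumptions, not the commons-pool2 evictor: the class `WarmPool`, its `preWarm` method, and the `factory` supplier are hypothetical, and the sketch ignores eviction, validation, and size limits. It shows only that a warmed pool serves the first requests from idle connections, while a cold pool has to create a connection inside the request path.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Toy pool used only to contrast cold start with a pre-warmed start.
class WarmPool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory; // assumed stand-in for opening a Redis connection
    private final AtomicInteger created = new AtomicInteger();

    WarmPool(Supplier<T> factory) {
        this.factory = factory;
    }

    // Create connections ahead of traffic until minIdle of them are idle.
    synchronized void preWarm(int minIdle) {
        while (idle.size() < minIdle) {
            idle.push(create());
        }
    }

    // Reuse an idle connection if one exists; otherwise pay the creation cost now.
    synchronized T borrow() {
        T conn = idle.poll();
        return conn != null ? conn : create();
    }

    synchronized void giveBack(T conn) {
        idle.push(conn);
    }

    private T create() {
        created.incrementAndGet();
        return factory.get();
    }

    int created() {
        return created.get();
    }

    synchronized int idleCount() {
        return idle.size();
    }

    public static void main(String[] args) {
        WarmPool<String> cold = new WarmPool<>(() -> "conn");
        cold.borrow();
        System.out.println("cold pool, created during first request: " + cold.created());

        WarmPool<String> warm = new WarmPool<>(() -> "conn");
        warm.preWarm(8);
        int before = warm.created();
        warm.borrow();
        System.out.println("warm pool, created during first request: " + (warm.created() - before));
    }
}
```

In commons-pool2 the comparable work is done by the pool itself according to its configuration, which is why the version and the minEvictableIdleTimeMillis setting affect cold-start behavior.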

performance optimization, Redis, Dubbo, Connection Pool, troubleshooting, thread pool, circuit breaking, Commons Pool2