
Detailed Walkthrough of Druid Connection Pool Lifecycle and Configuration

This article provides an in‑depth analysis of Druid's connection‑pool lifecycle, covering initialization, connection acquisition, validation, eviction, keep‑alive, and recycling processes, while offering performance‑related configuration recommendations and code examples for Java developers.


Overview

This article uses getConnection as the entry point to explore the lifecycle of a connection in Druid, dividing the overall workflow into several main processes.

Main Process 1: Acquiring a Connection

The acquisition flow starts with init to initialize the pool if necessary, then runs each filter in the responsibility chain, and finally calls getConnectionDirect to obtain the real connection. If testOnBorrow is enabled, the connection is validated on every borrow, which adds a round trip per acquisition and can noticeably impact performance. Validation matters because the MySQL server closes idle connections after wait_timeout, which defaults to 8 hours but is often configured much shorter.

If testOnBorrow is false, Druid instead performs testWhileIdle checks driven by timeBetweenEvictionRunsMillis (default 60 s): a connection is validated on borrow only if its idle time exceeds this interval.

When a validation fails, testConnectionInternal triggers discardConnection, the connection is recycled, and the retry logic starts again. The number of retries can be limited with notFullTimeoutRetryCount, and the maximum total wait time is roughly 2 × maxWait.

Special Note ①

For best performance, keep testOnBorrow disabled and rely on the default idle-check interval (60 s). Verify the MySQL server's idle-connection timeout; if it is shorter than 60 s, lower timeBetweenEvictionRunsMillis so connections are checked before the server drops them.
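As a sketch of the recommendation above, the relevant settings can be applied through Druid's standard DruidDataSource setters (the URL, credentials, and concrete values here are illustrative assumptions, not recommendations for any specific deployment):

```java
import com.alibaba.druid.pool.DruidDataSource;

public class DruidValidationConfig {
    // Illustrative validation settings; adjust values to your environment.
    static DruidDataSource build() {
        DruidDataSource ds = new DruidDataSource();
        ds.setUrl("jdbc:mysql://localhost:3306/app"); // placeholder URL
        ds.setUsername("app");
        ds.setPassword("secret");

        ds.setTestOnBorrow(false);  // no validation round trip on every borrow
        ds.setTestWhileIdle(true);  // validate only sufficiently idle connections
        ds.setTimeBetweenEvictionRunsMillis(60_000); // keep below server wait_timeout
        ds.setValidationQuery("SELECT 1");           // fallback when driver has no ping
        return ds;
    }
}
```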

Special Note ②

To avoid unnecessary pool expansion and shrinkage, set minIdle and maxActive to the same value for high-QPS services, and enable keepAlive. For low-QPS admin back-ends, a smaller minIdle reduces wasted connections.
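A minimal sizing sketch under these assumptions (the pool sizes are illustrative placeholders):

```java
import com.alibaba.druid.pool.DruidDataSource;

public class DruidSizingConfig {
    // High-QPS service: fixed-size pool, kept warm.
    static DruidDataSource forHighQps() {
        DruidDataSource ds = new DruidDataSource();
        ds.setMinIdle(20);
        ds.setMaxActive(20);   // equal to minIdle: no expand/shrink churn
        ds.setKeepAlive(true); // periodically validate idle connections
        return ds;
    }

    // Low-QPS admin back-end: small floor, grow on demand.
    static DruidDataSource forAdminBackend() {
        DruidDataSource ds = new DruidDataSource();
        ds.setMinIdle(2);
        ds.setMaxActive(10);
        return ds;
    }
}
```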

Main Process 2: Initializing the Pool

If the pool is not yet initialized (tracked by the inited flag), init is called. This method creates a global reentrant lock, loads filters via SPI (such filters must be annotated with @AutoLoad), and allocates three arrays sized to maxActive: connections, evictConnections, and keepAliveConnections. It then creates initialSize physical connections and starts two daemon threads: one for adding connections and another for evicting idle connections.

Special Note ①

Pre‑warming the pool (calling init or getConnection early) avoids the costly initialization path on the first real request, which can cause long latency under high concurrency.
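A pre-warming sketch, assuming the pool is initialized explicitly during application startup (the initialSize value is illustrative; DruidDataSource.init is part of Druid's public API):

```java
import com.alibaba.druid.pool.DruidDataSource;
import java.sql.SQLException;

public class PoolPrewarm {
    // Call during application startup, before real traffic arrives.
    static void prewarm(DruidDataSource ds) throws SQLException {
        ds.setInitialSize(10); // physical connections created inside init()
        ds.init();             // pay the initialization cost now,
                               // not on the first getConnection under load
    }
}
```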

Special Note ②

The global lock creates two Condition objects: empty (waited on by the add-connection daemon) and notEmpty (waited on by threads requesting a connection). When the pool is exhausted, business threads signal empty and block on notEmpty; the daemon wakes, creates a new connection, and signals notEmpty.

Responsibility Chain (Process 1.1)

Each DruidAbstractDataSource holds a filters list. When getConnection is invoked, the FilterChain executes each filter's dataSource_getConnection method before finally calling getConnectionDirect.

Process 1.2: Getting a Connection from the Pool

getConnectionInternal first tries to take a connection from the tail of the pool (an O(1) operation). If none is available, it signals the add-connection daemon, then waits on notEmpty. The wait uses awaitNanos with an initial timeout of maxWait, decreasing after each wake-up. If the timeout expires, the acquisition fails and the retry logic in Process 1 is triggered.
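The empty/notEmpty handshake and the decreasing awaitNanos timeout can be sketched with a self-contained miniature pool. This is an illustration built on java.util.concurrent, not Druid's actual code; the class and its simplified logic are invented to mirror the description above:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Miniature pool illustrating the empty/notEmpty producer-consumer handshake.
class MiniPool {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition empty = lock.newCondition();    // add-connection daemon waits here
    private final Condition notEmpty = lock.newCondition(); // borrowers wait here
    private final Deque<String> connections = new ArrayDeque<>();
    private final int maxActive;
    private int createdCount = 0;
    private volatile boolean closed = false;

    MiniPool(int maxActive) {
        this.maxActive = maxActive;
        Thread creator = new Thread(this::createLoop, "add-connection-daemon");
        creator.setDaemon(true);
        creator.start();
    }

    private void createLoop() {
        while (!closed) {
            lock.lock();
            try {
                // Sleep until a borrower reports the pool is exhausted.
                while (!connections.isEmpty() || createdCount >= maxActive) {
                    if (closed) return;
                    empty.await(100, TimeUnit.MILLISECONDS);
                }
                connections.addLast("conn-" + (++createdCount)); // "physical" connect
                notEmpty.signal(); // wake one waiting borrower
            } catch (InterruptedException e) {
                return;
            } finally {
                lock.unlock();
            }
        }
    }

    String getConnection(long maxWaitMillis) {
        long nanos = TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
        lock.lock();
        try {
            while (connections.isEmpty()) {
                empty.signal();               // wake the add-connection daemon
                if (nanos <= 0) return null;  // maxWait exhausted; caller may retry
                nanos = notEmpty.awaitNanos(nanos); // remaining time after wake-up
            }
            return connections.pollLast();    // O(1) take from the tail
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        } finally {
            lock.unlock();
        }
    }

    void close() { closed = true; }
}
```

The real pool adds validation, recycling, and eviction on top of this skeleton, but the blocking structure is the same: borrowers never create connections themselves; they only signal the daemon and wait.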

How Druid Limits Blocking Threads

If maxWaitThreadCount is set to a positive value, Druid checks the current number of threads waiting on notEmpty (tracked by notEmptyWaitThreadCount). When the limit is exceeded, it throws an SQLException instead of blocking:

if (maxWaitThreadCount > 0 && notEmptyWaitThreadCount >= maxWaitThreadCount) {
    connectErrorCountUpdater.incrementAndGet(this);
    throw new SQLException("maxWaitThreadCount " + maxWaitThreadCount + ", current wait Thread count " + lock.getQueueLength());
}

Enabling this setting is rarely needed; it is usually better to tune maxActive appropriately and ensure connections are closed promptly.
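If the limit is genuinely needed, it can be set alongside maxWait through the standard setters (the values here are illustrative assumptions):

```java
import com.alibaba.druid.pool.DruidDataSource;

public class WaitLimitConfig {
    // Illustrative fail-fast configuration for an overloaded pool.
    static DruidDataSource build() {
        DruidDataSource ds = new DruidDataSource();
        ds.setMaxActive(20);
        ds.setMaxWait(3_000);          // a borrow fails after waiting 3 s
        ds.setMaxWaitThreadCount(100); // fail immediately once 100 threads queue
        return ds;
    }
}
```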

Process 1.3: Connection Validation

The validation checker is created during pool initialization. For MySQL it prefers the driver's ping mechanism; otherwise it falls back to executing the validation query (SELECT 1). The testConnectionInternal method invokes the checker's isValidConnection to determine whether a connection is still alive.

Process 1.4: Discarding a Bad Connection

If validation fails, the connection's activeCount is decremented and the underlying driver Connection is closed. When the wrapper DruidPooledConnection is closed normally instead, a recycle operation is triggered rather than a physical close.

Process 3: Add‑Connection Daemon

This daemon thread spends most of its time idle. When the pool is exhausted, it is woken via empty.signal, creates new connections, and then signals waiting business threads on notEmpty. It follows a producer-consumer model and never exceeds maxActive.

Process 4: Eviction & Keep‑Alive Daemon

The daemon periodically examines idle connections. It uses minEvictableIdleTimeMillis (default 30 min) and maxEvictableIdleTimeMillis (default 7 h) to decide which connections to move to evictConnections. Connections whose idle time exceeds keepAliveBetweenTimeMillis (default 60 s) are placed in keepAliveConnections for validation.

After building the two queues, the pool compresses the remaining connections, discards those in evictConnections , and validates those in keepAliveConnections , discarding any that fail.
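The eviction-related knobs described above can be set together; this sketch simply spells out the defaults mentioned in the text explicitly (values are illustrative, not tuning advice):

```java
import com.alibaba.druid.pool.DruidDataSource;

public class EvictionConfig {
    // Eviction/keep-alive settings, matching the defaults cited above.
    static DruidDataSource build() {
        DruidDataSource ds = new DruidDataSource();
        ds.setTimeBetweenEvictionRunsMillis(60_000);        // eviction daemon period
        ds.setMinEvictableIdleTimeMillis(30 * 60_000L);     // 30 min: shrinkable above minIdle
        ds.setMaxEvictableIdleTimeMillis(7 * 3_600_000L);   // 7 h: hard idle ceiling
        ds.setKeepAlive(true);                              // validate long-idle survivors
        return ds;
    }
}
```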

Process 4.2: Abandoned Connection Recovery

When removeAbandoned is true, a separate scan examines activeConnections for connections that have been borrowed longer than the configured timeout, moves them to abandonedList, and closes them to prevent connection leaks.
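A sketch of the corresponding configuration (the timeout value is an illustrative assumption; this feature is mainly a diagnostic aid for finding leaks, typically enabled in test environments rather than production):

```java
import com.alibaba.druid.pool.DruidDataSource;

public class AbandonedConfig {
    // Reclaim connections borrowed far longer than any legitimate request.
    static DruidDataSource build() {
        DruidDataSource ds = new DruidDataSource();
        ds.setRemoveAbandoned(true);
        ds.setRemoveAbandonedTimeoutMillis(300_000); // 5 min borrow budget
        ds.setLogAbandoned(true); // log the borrow stack trace of leaked connections
        return ds;
    }
}
```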

Process 5: Recycling a Connection

When a client calls close on a DruidPooledConnection, the wrapper invokes recycle. This method may roll back an uncommitted transaction, reset the connection state, optionally run testOnReturn validation, and finally put the connection back at the tail of the connections array, updating lastActiveTimeMillis.

If testOnReturn is disabled (the default) and keepAlive is false, the next idle check compares the difference between the current time and lastActiveTimeMillis against timeBetweenEvictionRunsMillis.
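Because close on the pooled wrapper means "recycle", the idiomatic borrow pattern is try-with-resources, which guarantees the connection returns to the pool even on error. A usage sketch against any DataSource (the query is illustrative):

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import javax.sql.DataSource;

public class BorrowRecycle {
    // close() on the pooled wrapper triggers recycle, not a physical close.
    static int queryOne(DataSource ds) throws SQLException {
        try (Connection conn = ds.getConnection();
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT 1")) {
            rs.next();
            return rs.getInt(1);
        } // conn.close() here returns the connection to the pool
    }
}
```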

Conclusion

The article has outlined the complete lifecycle of a Druid connection—from pool initialization, through acquisition, validation, eviction, and recycling—highlighting key configuration parameters that affect performance and reliability.

Tags: Java · Performance · Database · Connection Pool · Druid · Thread Management
Written by

Architect

Professional architect sharing high‑quality architecture insights. Topics include high‑availability, high‑performance, high‑stability architectures, big data, machine learning, Java, system and distributed architecture, AI, and practical large‑scale architecture case studies. Open to ideas‑driven architects who enjoy sharing and learning.
