Druid Crashed in Production? How to Optimize the Spring Boot Connection Pool
The article explains why Druid can fail in a live Spring Boot service and provides a comprehensive, step‑by‑step optimization guide covering core pool parameters, monitoring setup, security hardening, leak detection, dynamic tuning, and best‑practice pitfalls to achieve stable, high‑performance database connections.
Environment Preparation
Use the latest stable Druid version (≥ 1.2.38) and exclude older versions in pom.xml:
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>druid-spring-boot-starter</artifactId>
<version>1.2.38</version>
</dependency>Core Connection‑Pool Parameter Tuning
Pool Size Control
initialSize : CPU cores / 2 (e.g., 2 for a 4‑core machine) to limit startup cost.
minIdle : CPU cores * 1.5 (e.g., 6 for 4 cores) but must stay below the database max_connections (MySQL default 151).
maxActive : single‑connection QPS * 1.2 (e.g., 100 QPS → 120). Do not exceed ~200 to avoid Too many connections errors.
Connection Lifecycle Management
maxWait : 3000 ms to prevent long thread blocking.
timeBetweenEvictionRunsMillis : 10000 ms (10 s) to speed up reclamation of idle connections.
minEvictableIdleTimeMillis : 60000 ms (1 min) for short‑lived HTTP requests.
validationQuery : lightweight SQL such as SELECT 1 (MySQL) or SELECT 1 FROM DUAL (Oracle) to avoid full‑table scans.
testWhileIdle =true (recommended), testOnBorrow =false, testOnReturn =false to balance validation overhead.
Monitoring System Construction (Key Optimization Point)
Enable StatFilter (SQL Statistics)
spring:
datasource:
druid:
stat-filter:
enabled: true
slow-sql-millis: 2000
merge-sql: true
log-slow-sql: trueRecords execution count, latency, affected rows and flags queries slower than 2 s.
Configure Web Monitoring Page
spring:
datasource:
druid:
web-stat-filter:
enabled: true
url-pattern: /*
exclusions: "*.js,*.gif,*.jpg,*.png,*.css,*.ico,/druid/*"
stat-view-servlet:
enabled: true
url-pattern: /druid/*
allow: 127.0.0.1
login-username: admin
login-password: 123456
reset-enable: falseThe page /druid/statView.html shows active/idle connections, waiting queue length, SQL statistics and URI call stats. In production restrict IP access and enable authentication.
Log Integration (ELK or Prometheus + Grafana)
ELK solution – route Druid logs to Logstash via logback-spring.xml:
<logger name="com.alibaba.druid.pool.DruidDataSource" level="DEBUG">
<appender-ref ref="LOGSTASH"/>
</logger>Prometheus + Grafana solution – add exporters:
<dependency>
<groupId>io.micrometer</groupId>
<artifactId>micrometer-registry-prometheus</artifactId>
</dependency>
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>druid-prometheus-exporter</artifactId>
<version>1.2.38</version>
</dependency>Expose metrics and build Grafana dashboards for pool utilization and slow‑SQL distribution.
Security Hardening Configuration
Prevent SQL Injection (WallFilter)
spring:
datasource:
druid:
filters: wall,stat,slf4j
wall:
enabled: true
delete-allow: false
update-allow: false
procedure-allow: false
config:
select-allow: trueIntercepts dangerous statements such as DROP TABLE and requires a whitelist for legitimate queries.
Password Encryption
Implement AES or SHA‑256 encryption for DB passwords. Example custom callback:
public class MyPasswordCallback extends DecryptPasswordCallback {
public MyPasswordCallback() {
super("your-encryption-key"); // replace with actual key
}
}Configure encrypted password in application.yml:
spring:
datasource:
druid:
url: jdbc:mysql://...
username: root
password: encryptedPassword
filters: stat,wall
connection-properties: config.decrypt=true;config.decrypt.key=myKeyDefend Against CC Attacks (Connection Rate Limiting)
spring:
datasource:
druid:
stat-filter:
enabled: true
max-sql-execution-count-per-ip-per-minute: 1000
max-sql-execution-count-per-uri-per-minute: 500Connection Leak Detection (Production Essential)
Enable leak detection to catch connections not closed in finally blocks:
spring:
datasource:
druid:
remove-abandoned: true
remove-abandoned-timeout: 300 # seconds
log-abandoned: trueWhen a borrowed connection exceeds remove-abandoned-timeout, Druid forcibly reclaims it and logs the stack trace for debugging. Suitable for long‑running transactions; short‑lived connections usually do not need it.
Advanced Optimization Techniques
Dynamic Pool Parameter Adjustment (Runtime Tuning)
Use the JMX interface or programmatic access to modify pool size on the fly, e.g., during a traffic surge:
@Autowired
private DataSource dataSource;
public void adjustPoolSize() {
if (dataSource instanceof DruidDataSource) {
DruidDataSource druid = (DruidDataSource) dataSource;
druid.setMaxActive(200); // increase max connections
druid.setMinIdle(50); // increase min idle
}
}After adjustment, monitor DB load to avoid sudden pressure spikes.
Connection Pre‑Warm (Cold‑Start Optimization)
spring:
datasource:
druid:
initial-size: 10 # overrides default 0
test-on-borrow: false # skip validation during pre‑creationTransaction Isolation Level Optimization
Set the isolation level according to business needs (default READ_COMMITTED):
spring:
datasource:
druid:
default-transaction-isolation: 2 # TRANSACTION_READ_COMMITTEDPitfall Avoidance Guide
Avoid over‑configuring maxActive; leave ~20 % headroom below the DB's max_connections limit.
Base every parameter change on real monitoring data (pool utilization, wait queue length).
Disable debug‑related features (e.g., log-abandoned=true) in production after verification.
Ensure Druid version compatibility with Spring Boot and the DB driver (e.g., MySQL 8.0 requires com.mysql.cj.jdbc.Driver).
Summary
Optimal Druid performance requires aligning pool settings with business load, continuously monitoring key metrics, hardening security, and iteratively tuning based on observed data. The workflow consists of:
Basic parameter tuning (capacity, lifecycle).
Monitoring system setup (SQL stats, connection state).
Security hardening (injection protection, leak detection).
Ongoing iteration driven by monitoring insights.
The goal is to balance connection utilization, performance stability, and security.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
