Optimizing High‑Concurrency HttpClient Usage in Java: Reducing Latency from 250 ms to 80 ms
This article describes how a Java service handling QPS on the order of millions cut its average outbound HTTP request time from about 250 ms to about 80 ms by sharing a single HttpClient instance, pooling keep‑alive connections, and letting a ResponseHandler consume responses.
1. Analysis
The original implementation created a new HttpClient and HttpPost for every request, opened a fresh TCP connection each time, and duplicated the response entity into a string, leading to high latency (≈250 ms) and frequent thread‑exhaustion alerts.
Key problems identified:
Repeated creation of HttpClient instances, even though HttpClient is thread‑safe and one instance can be shared by all threads.
Repeated TCP three‑way handshakes and teardowns.
Unnecessary copying of HttpEntity content, causing extra memory usage.
2. Implementation
Three main actions were taken:
Use a singleton HttpClient.
Enable connection reuse with a pooling manager and a keep‑alive strategy.
Process responses via a ResponseHandler to avoid manual stream handling.
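For the first action, one common way to hold a single shared instance in Java is the initialization‑on‑demand holder idiom. The sketch below is only an illustration under assumptions: ExpensiveClient is a hypothetical stand‑in that counts its own constructions (so the example runs without the HttpClient dependency), and SharedClient and SharedClientDemo are made‑up names. In the real service the holder field would presumably be the pooled HttpClient built in section 2.3.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the real HttpClient; it only counts constructions.
final class ExpensiveClient {
    static final AtomicInteger CREATED = new AtomicInteger();

    ExpensiveClient() {
        CREATED.incrementAndGet();
    }
}

// Hypothetical holder: exposes one lazily created, shared client instance.
final class SharedClient {
    private SharedClient() {}

    // The JVM initializes Holder exactly once, on first access, and does so
    // in a thread-safe way, so no explicit locking is needed.
    private static final class Holder {
        static final ExpensiveClient INSTANCE = new ExpensiveClient();
    }

    static ExpensiveClient get() {
        return Holder.INSTANCE;
    }
}

final class SharedClientDemo {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(8);
        // Simulate many concurrent requests that all ask for the client.
        for (int i = 0; i < 1000; i++) {
            pool.execute(() -> SharedClient.get());
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        // prints "clients created: 1"
        System.out.println("clients created: " + ExpensiveClient.CREATED.get());
    }
}
```

Because HttpClient is thread‑safe, every worker thread can call such an accessor and share the same pooled instance instead of building a client per request.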
2.1 Keep‑Alive Strategy
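The strategy in this section honors the timeout parameter of the server's Keep-Alive response header (for example "timeout=5, max=100") and falls back to 60 seconds when the server does not send one. As a library‑free sketch of just that parsing step, the class below works on the raw header value; KeepAliveHeader and keepAliveMillis are hypothetical names, and treating a malformed or negative timeout as "use the default" is an assumption of this sketch rather than something the article specifies.

```java
// Hypothetical helper: parses the value of a Keep-Alive response header.
final class KeepAliveHeader {
    private KeepAliveHeader() {}

    /**
     * Returns the keep-alive duration in milliseconds advertised by a header
     * value such as "timeout=5, max=100". Returns defaultMillis when the
     * value is null, has no timeout parameter, or the timeout is malformed
     * or negative (assumption of this sketch).
     */
    static long keepAliveMillis(String headerValue, long defaultMillis) {
        if (headerValue == null) {
            return defaultMillis;
        }
        for (String element : headerValue.split(",")) {
            String[] pair = element.trim().split("=", 2);
            if (pair.length == 2 && pair[0].trim().equalsIgnoreCase("timeout")) {
                try {
                    long seconds = Long.parseLong(pair[1].trim());
                    if (seconds >= 0) {
                        return seconds * 1000L; // seconds to milliseconds
                    }
                } catch (NumberFormatException ignored) {
                    // malformed number: fall through to the default
                }
                return defaultMillis;
            }
        }
        return defaultMillis;
    }

    public static void main(String[] args) {
        System.out.println(keepAliveMillis("timeout=5, max=100", 60_000L)); // 5000
        System.out.println(keepAliveMillis("max=100", 60_000L));            // 60000
    }
}
```

The HttpClient version of the same logic, written against the ConnectionKeepAliveStrategy interface, follows.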
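import org.apache.http.HeaderElement;
import org.apache.http.HeaderElementIterator;
import org.apache.http.HttpResponse;
import org.apache.http.conn.ConnectionKeepAliveStrategy;
import org.apache.http.message.BasicHeaderElementIterator;
import org.apache.http.protocol.HTTP;
import org.apache.http.protocol.HttpContext;

ConnectionKeepAliveStrategy myStrategy = new ConnectionKeepAliveStrategy() {
    @Override
    public long getKeepAliveDuration(HttpResponse response, HttpContext context) {
        // Honor the server's "Keep-Alive: timeout=N" header when present.
        HeaderElementIterator it = new BasicHeaderElementIterator(
                response.headerIterator(HTTP.CONN_KEEP_ALIVE));
        while (it.hasNext()) {
            HeaderElement he = it.nextElement();
            String param = he.getName();
            String value = he.getValue();
            if (value != null && param.equalsIgnoreCase("timeout")) {
                try {
                    return Long.parseLong(value) * 1000; // seconds to milliseconds
                } catch (NumberFormatException ignore) {
                    // malformed timeout value: fall through to the default
                }
            }
        }
        return 60 * 1000; // default: 60 seconds
    }
};

2.2 Pooling Connection Manager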
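import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

PoolingHttpClientConnectionManager connectionManager = new PoolingHttpClientConnectionManager();
connectionManager.setMaxTotal(500);           // upper bound on connections across all routes
connectionManager.setDefaultMaxPerRoute(50);  // upper bound per target host; adjust per business needs

2.3 Building the HttpClient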
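import org.apache.http.client.config.RequestConfig;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

CloseableHttpClient httpClient = HttpClients.custom()
        .setConnectionManager(connectionManager)
        .setKeepAliveStrategy(myStrategy)
        .setDefaultRequestConfig(RequestConfig.custom()
                // Deprecated since HttpClient 4.4 in favor of
                // PoolingHttpClientConnectionManager.setValidateAfterInactivity(int).
                .setStaleConnectionCheckEnabled(true)
                .build())
        .build();

A background thread can periodically close expired or idle connections: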
import java.util.concurrent.TimeUnit;
import org.apache.http.conn.HttpClientConnectionManager;

public static class IdleConnectionMonitorThread extends Thread {
    private final HttpClientConnectionManager connMgr;
    private volatile boolean shutdown;

    public IdleConnectionMonitorThread(HttpClientConnectionManager connMgr) {
        super();
        this.connMgr = connMgr;
    }

    @Override
    public void run() {
        try {
            while (!shutdown) {
                synchronized (this) {
                    wait(5000); // wake every 5 seconds, or earlier when shutdown() notifies
                    // Close connections whose keep-alive period has expired.
                    connMgr.closeExpiredConnections();
                    // Also close connections that have been idle for more than 30 seconds.
                    connMgr.closeIdleConnections(30, TimeUnit.SECONDS);
                }
            }
        } catch (InterruptedException ex) {
            // terminate
        }
    }

    public void shutdown() {
        shutdown = true;
        synchronized (this) { notifyAll(); }
    }
}

2.4 Response Handling
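When a request is executed with a ResponseHandler, HttpClient itself consumes the entity after the handler returns, so the caller never has to close the content stream manually. A call then looks like httpClient.execute(httpPost, new BasicResponseHandler()), which returns the body of a successful response as a String. The following appears to be a simplified excerpt of the library's own execute(...) method and shows where the entity is consumed:

public <T> T execute(final HttpHost target, final HttpRequest request,
        final ResponseHandler<? extends T> responseHandler, final HttpContext context)
        throws IOException, ClientProtocolException {
    Args.notNull(responseHandler, "Response handler");
    final HttpResponse response = execute(target, request, context);
    try {
        return responseHandler.handleResponse(response);
    } finally {
        // Drain any remaining content so the connection can return to the pool.
        final HttpEntity entity = response.getEntity();
        EntityUtils.consume(entity);
    }
}

3. Additional Configurations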
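Timeout settings, expressed here with the RequestConfig API so that they plug into the builder from section 2.3 (the three setters correspond to the legacy HttpParams keys CONNECTION_TIMEOUT, SO_TIMEOUT and CONN_MANAGER_TIMEOUT):

int CONNECTION_TIMEOUT = 2 * 1000;  // max time to establish a connection: 2 seconds
int SO_TIMEOUT = 2 * 1000;          // max wait for response data on the socket: 2 seconds
int CONN_MANAGER_TIMEOUT = 500;     // max wait for a connection from the pool: 500 ms
RequestConfig requestConfig = RequestConfig.custom()
        .setConnectTimeout(CONNECTION_TIMEOUT)
        .setSocketTimeout(SO_TIMEOUT)
        .setConnectionRequestTimeout(CONN_MANAGER_TIMEOUT)
        .setStaleConnectionCheckEnabled(true)
        .build();
// On the HttpClients.custom() builder:
//   .setDefaultRequestConfig(requestConfig)
//   .setRetryHandler(new DefaultHttpRequestRetryHandler(0, false)) // no automatic retries

If an Nginx reverse proxy sits between the client and the target service, keep‑alive has to be enabled there as well: keepalive_timeout and keepalive_requests govern the client‑facing connections, and the keepalive directive in the upstream block governs the connections from Nginx to the backends (which also requires proxy_http_version 1.1 and an empty Connection header).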
After applying these changes, the average request latency dropped from ~250 ms to ~80 ms, and the service no longer triggered thread‑exhaustion alarms.
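The disappearance of the thread‑exhaustion alarms is consistent with the latency drop. Assuming each outbound call blocks one worker thread for its whole duration, Little's law puts the number of busy threads at roughly request rate times latency. The sketch below works through that arithmetic with the article's two latency figures; the QPS value is a hypothetical per‑instance load chosen only to make the numbers concrete, and LatencyMath is a made‑up name.

```java
// Back-of-the-envelope arithmetic; not a measurement of the actual service.
final class LatencyMath {
    private LatencyMath() {}

    // Little's law: requests in flight = arrival rate x time in system.
    static double inFlight(int qps, int latencyMs) {
        return qps * latencyMs / 1000.0;
    }

    // Upper bound on requests per second that one blocking thread can serve.
    static double perThreadThroughput(int latencyMs) {
        return 1000.0 / latencyMs;
    }

    public static void main(String[] args) {
        int beforeMs = 250;  // average latency before the changes (from the article)
        int afterMs = 80;    // average latency after the changes (from the article)
        int qps = 1000;      // hypothetical load on a single instance

        System.out.println(perThreadThroughput(beforeMs)); // 4.0
        System.out.println(perThreadThroughput(afterMs));  // 12.5
        System.out.println(inFlight(qps, beforeMs));       // 250.0
        System.out.println(inFlight(qps, afterMs));        // 80.0
    }
}
```

Under these assumptions the same load needs about a third of the threads it needed before, which leaves far more headroom in a fixed‑size worker pool.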
Code Ape Tech Column
Former Ant Group P8 engineer, pure technologist, sharing full‑stack Java, job interview and career advice through a column. Site: java-family.cn
