Backend Development · 11 min read

Optimizing High‑Concurrency HttpClient Usage in Java: Reducing Latency from 250 ms to 80 ms

This article details how the performance of a Java service handling on the order of a million QPS was dramatically improved by reusing a singleton HttpClient, enabling keep‑alive connection pooling, and handling responses efficiently, cutting average request time from 250 ms to about 80 ms.

Code Ape Tech Column

1. Analysis

The original implementation created a new HttpClient and HttpPost for every request, opened a fresh TCP connection each time, and duplicated the response entity into a string, leading to high latency (≈250 ms) and frequent thread‑exhaustion alerts.

Key problems identified:

Repeated creation of HttpClient instances, even though a single HttpClient is thread‑safe and designed to be shared.

Repeated TCP three‑way handshakes and teardowns.

Unnecessary copying of HttpEntity content, causing extra memory usage.
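For context, the problematic pattern looked roughly like this. This is a reconstructed sketch, not the original service code: every call constructs a throwaway client, pays a fresh TCP handshake, and copies the whole body into a String.

```java
import java.io.IOException;

import org.apache.http.HttpEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

public class PerRequestClient {
    // Anti-pattern: client construction plus a fresh TCP handshake on
    // every call, then the connection is discarded.
    static String post(String url, HttpEntity body) throws IOException {
        try (CloseableHttpClient client = HttpClients.createDefault()) { // new client each time
            HttpPost post = new HttpPost(url);
            post.setEntity(body);
            try (CloseableHttpResponse response = client.execute(post)) {
                return EntityUtils.toString(response.getEntity()); // extra full copy into a String
            }
        }
    }
}
```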

2. Implementation

Three main actions were taken:

Use a singleton HttpClient.

Enable connection reuse with a pooling manager and a keep‑alive strategy.

Process responses via a ResponseHandler to avoid manual stream handling.

2.1 Keep‑Alive Strategy

ConnectionKeepAliveStrategy myStrategy = new ConnectionKeepAliveStrategy() {
    @Override
    public long getKeepAliveDuration(HttpResponse response, HttpContext context) {
        // Honor the server's "Keep-Alive: timeout=..." hint when present
        HeaderElementIterator it = new BasicHeaderElementIterator(
            response.headerIterator(HTTP.CONN_KEEP_ALIVE));
        while (it.hasNext()) {
            HeaderElement he = it.nextElement();
            String param = he.getName();
            String value = he.getValue();
            if (value != null && param.equalsIgnoreCase("timeout")) {
                try {
                    return Long.parseLong(value) * 1000;
                } catch (NumberFormatException ignore) {
                    // malformed header value: fall through to the default
                }
            }
        }
        return 60 * 1000; // default: keep idle connections alive for 60 seconds
    }
};
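The timeout‑parsing rule above can be checked in isolation. Below is a pure‑JDK sketch of the same logic; the helper name `parseKeepAliveMillis` is ours, not part of the client API.

```java
public class KeepAliveParser {
    static final long DEFAULT_KEEP_ALIVE_MS = 60_000L;

    // Mirrors the strategy: scan a "Keep-Alive: timeout=5, max=100" header
    // value and return the timeout in milliseconds, else the default.
    static long parseKeepAliveMillis(String keepAliveValue) {
        if (keepAliveValue != null) {
            for (String element : keepAliveValue.split(",")) {
                String[] kv = element.trim().split("=", 2);
                if (kv.length == 2 && kv[0].trim().equalsIgnoreCase("timeout")) {
                    try {
                        return Long.parseLong(kv[1].trim()) * 1000;
                    } catch (NumberFormatException ignore) {
                        // malformed value: fall through to the default
                    }
                }
            }
        }
        return DEFAULT_KEEP_ALIVE_MS;
    }
}
```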

2.2 Pooling Connection Manager

PoolingHttpClientConnectionManager connectionManager = new PoolingHttpClientConnectionManager();
connectionManager.setMaxTotal(500);
connectionManager.setDefaultMaxPerRoute(50); // adjust per business needs
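The per‑route limit can be sanity‑checked with Little's law: connections in flight ≈ request rate to the route × average latency. The numbers below are illustrative assumptions (500 req/s to one host at the article's ~80 ms average), giving roughly 40 concurrent connections, so a limit of 50 leaves headroom:

```java
public class PoolSizing {
    public static void main(String[] args) {
        // Little's law: L = lambda * W
        double qpsPerRoute = 500.0;        // assumed request rate to a single host
        double avgLatencySeconds = 0.08;   // ~80 ms average request time
        long inFlight = Math.round(qpsPerRoute * avgLatencySeconds);
        System.out.println("~" + inFlight + " connections in flight per route");
    }
}
```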

2.3 Building the HttpClient

httpClient = HttpClients.custom()
    .setConnectionManager(connectionManager)
    .setKeepAliveStrategy(myStrategy)
    .setDefaultRequestConfig(RequestConfig.custom()
        .setStaleConnectionCheckEnabled(true)
        .build())
    .build();

A background thread can periodically close expired or idle connections:

public static class IdleConnectionMonitorThread extends Thread {
    private final HttpClientConnectionManager connMgr;
    private volatile boolean shutdown;
    public IdleConnectionMonitorThread(HttpClientConnectionManager connMgr) {
        super();
        this.connMgr = connMgr;
    }
    @Override
    public void run() {
        try {
            while (!shutdown) {
                synchronized (this) {
                    wait(5000);
                    connMgr.closeExpiredConnections();
                    connMgr.closeIdleConnections(30, TimeUnit.SECONDS);
                }
            }
        } catch (InterruptedException ex) {
            // terminate
        }
    }
    public void shutdown() {
        shutdown = true;
        synchronized (this) { notifyAll(); }
    }
}

2.4 Response Handling

Using a ResponseHandler lets the client consume and release the entity automatically, eliminating manual stream handling. For reference, this is how CloseableHttpClient.execute invokes the handler internally:

public <T> T execute(final HttpHost target, final HttpRequest request,
        final ResponseHandler<? extends T> responseHandler, final HttpContext context)
        throws IOException, ClientProtocolException {
    Args.notNull(responseHandler, "Response handler");
    final HttpResponse response = execute(target, request, context);
    try {
        return responseHandler.handleResponse(response);
    } finally {
        final HttpEntity entity = response.getEntity();
        EntityUtils.consume(entity);
    }
}
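On the calling side, the same mechanism reduces a request to a one‑liner. A sketch of a reusable handler; the `Handlers` class name and the endpoint URL in the usage line are our own illustration:

```java
import java.nio.charset.StandardCharsets;

import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.ResponseHandler;
import org.apache.http.util.EntityUtils;

public class Handlers {
    // Returns the body as a String for 2xx responses, throws otherwise.
    // The entity is consumed by execute()'s finally block either way.
    public static final ResponseHandler<String> BODY = response -> {
        int status = response.getStatusLine().getStatusCode();
        if (status < 200 || status >= 300) {
            throw new ClientProtocolException("Unexpected status: " + status);
        }
        return response.getEntity() == null
            ? null
            : EntityUtils.toString(response.getEntity(), StandardCharsets.UTF_8);
    };
}
```

A call then reads `String body = httpClient.execute(new HttpGet("http://service/api"), Handlers.BODY);` with no manual entity cleanup.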

3. Additional Configurations

Timeout settings:

RequestConfig requestConfig = RequestConfig.custom()
    .setConnectTimeout(2 * 1000)          // 2 s to establish the TCP connection
    .setSocketTimeout(2 * 1000)           // 2 s max wait for response data
    .setConnectionRequestTimeout(500)     // 500 ms max wait to lease a pooled connection
    .setStaleConnectionCheckEnabled(true) // as in section 2.3
    .build();

// Plug into the builder from section 2.3, and disable automatic retries:
httpClient = HttpClients.custom()
    .setConnectionManager(connectionManager)
    .setKeepAliveStrategy(myStrategy)
    .setDefaultRequestConfig(requestConfig)
    .setRetryHandler(new DefaultHttpRequestRetryHandler(0, false))
    .build();

If an Nginx reverse proxy sits in front of the service, keep‑alive must be enabled end to end: set keepalive_timeout and keepalive_requests for downstream client connections, and the keepalive directive in the upstream block so Nginx reuses connections to the backend.
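A minimal sketch of such an Nginx configuration; all names and values are illustrative assumptions to be tuned for real traffic:

```nginx
upstream backend_pool {
    server 10.0.0.10:8080;
    keepalive 100;            # idle keep-alive connections cached per worker to the upstream
}
server {
    keepalive_timeout  65s;   # how long an idle client connection stays open
    keepalive_requests 1000;  # requests allowed per client connection
    location / {
        proxy_pass http://backend_pool;
        proxy_http_version 1.1;          # HTTP/1.1 is required for upstream keep-alive
        proxy_set_header Connection "";  # strip any "Connection: close" header
    }
}
```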

After applying these changes, the average request latency dropped from ~250 ms to ~80 ms, and the service no longer triggered thread‑exhaustion alarms.

Tags: backend, Java, performance, keepalive, HttpClient, connection pooling
Written by Code Ape Tech Column
Former Ant Group P8 engineer and pure technologist, sharing full‑stack Java content, interview preparation, and career advice through this column. Site: java-family.cn
