Why Netty Is the Ideal Choice for High‑Performance HTTP Clients and Servers

This article explains how using Netty for HTTP handling—covering codec setup, HEAD request support, ByteBuf reference management, connection pooling, and full asynchronous processing—delivers superior performance, lower GC pressure, and better scalability compared to traditional servlet containers.

ITFLY8 Architecture Home

Our gateway now implements HTTP entirely on Netty, on both the client and the server side. Netty supports off‑heap memory and manual reference counting, which reduces GC pressure and leaves room for aggressive tuning; that is why we prefer it for the HTTP client.

Key features of our Netty‑based gateway service include:

Codec (encoding/decoding)

Reference‑count release

HEAD request handling

Connection pool

Connection reuse

Netty HTTP server

Fully asynchronous processing

HTTP Codec

Most articles present the Netty HTTP codec only as a demo and skip the details that matter in production. Our client pipeline is set up as follows:

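// Detect connections that receive no inbound data for the configured period.
// Note: the name readIdleSec suggests seconds, but the unit passed is MILLISECONDS; make sure the two agree.
channelPipeline.addLast("idleStateHandler", new SouthgateReadIdleStateHandler(readIdleSec, 0, 0, TimeUnit.MILLISECONDS));
// Outbound: serialize HttpRequest objects into bytes.
channelPipeline.addLast("httpEncode", new HttpRequestEncoder());
// Inbound: our own decoder replaces the stock one (see the HEAD Request section).
// channelPipeline.addLast("httpDecode", new HttpResponseDecoder());
channelPipeline.addLast("httpDecode", new SouthgateHttpResponseDecoder());
// Inbound: merge the response head and its content chunks into one FullHttpResponse.
channelPipeline.addLast("aggregator", new HttpObjectAggregator(MAX_CONTENT_LENGTH));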

httpEncode and httpDecode are essential; we also add an idle‑state handler to close idle connections and free resources.

We use HttpObjectAggregator to combine chunked HTTP messages into a full response, so handlers see a FullHttpResponse without dealing with LastHttpContent.

HEAD Request

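Netty’s HttpResponseDecoder does not know which request method produced the response it is parsing, and HttpRequestEncoder does not record it. A HEAD response may carry a Content‑Length header but never a body, so it gets mis‑parsed. The official solution is HttpClientCodec: its encoder queues the method of every outgoing request, and its decoder treats the response to a HEAD request as having no content, ending it with an empty LastHttpContent. The encoder side records the method: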

if (msg instanceof HttpRequest && !done) {
    queue.offer(((HttpRequest) msg).method());
}

During decode we retrieve the method from the queue:

HttpMethod method = queue.poll();

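Because HttpClientCodec keeps this method queue per connection and assumes that responses arrive in request order, we instead overrode the isContentAlwaysEmpty method in our own HttpResponseDecoder subclass (the SouthgateHttpResponseDecoder registered in the pipeline) to support connection reuse.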
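The method-queue idea described above can be modeled without Netty. The sketch below is a JDK-only illustration, not the gateway's decoder or Netty's own code: the class name HeadAwareConnection and its methods are hypothetical, and interim 1xx responses are ignored for brevity.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** JDK-only model of a per-connection request-method queue; not Netty code. */
final class HeadAwareConnection {
    // Methods of the requests written on this connection, oldest first.
    private final Queue<String> pendingMethods = new ArrayDeque<>();

    /** Encoder side: remember the method of every outgoing request. */
    void onRequestWritten(String method) {
        pendingMethods.offer(method);
    }

    /**
     * Decoder side: call once per final response head.
     * Returns true when the response must be treated as having no body,
     * even if it carries a Content-Length header.
     */
    boolean isContentAlwaysEmpty(int statusCode) {
        String method = pendingMethods.poll(); // FIFO: assumes in-order responses
        if ("HEAD".equals(method)) {
            return true;
        }
        // 204 and 304 never carry a body, whatever the method was.
        return statusCode == 204 || statusCode == 304;
    }
}
```

The FIFO queue only works while responses come back in the order the requests were written, which is the limitation that matters once a connection is shared by several in-flight requests.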

ByteBuf Release and Reference Counting

Reference Counting

Netty’s pooled ByteBufs start with a reference count of 1 and return to the pool only when the count drops to 0. Some code paths must release the buffer manually; for outbound writes, Netty’s encoder releases the message automatically once it has been encoded:

try {
    encode(ctx, cast, out);
} finally {
    ReferenceCountUtil.release(cast);
}

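Inbound handlers must release the messages they consume, using ReferenceCountUtil.release(...). SimpleChannelInboundHandler releases the message automatically when channelRead0 returns, so call retain() first if you hand the message to another thread.

Retries need the same care: each send releases one reference, so retain() the request before re‑sending it; otherwise the retry fails with an IllegalReferenceCountException because refCnt is already 0. We retain once per allowed retry up front: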

((FullHttpRequest) httpRequest).retain(event.getMaxRedoCount());

After a successful send, release any remaining references:

int refCnt = ((FullHttpResponse) httpResponse).refCnt();
if (refCnt > 0) {
    ReferenceCountUtil.release(httpResponse, refCnt);
}
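The counting rules behind the retry logic above can be checked with a small JDK-only model. This is a sketch under stated assumptions, not Netty's ByteBuf: CountedMessage and send are hypothetical names, and the model throws IllegalStateException where Netty would throw IllegalReferenceCountException.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** JDK-only model of reference counting; it only mimics the counting rules. */
final class CountedMessage {
    // A new message starts with one reference, like a freshly allocated buffer.
    private final AtomicInteger refCnt = new AtomicInteger(1);

    int refCnt() {
        return refCnt.get();
    }

    /** Adds references; fails if the message has already been freed. */
    CountedMessage retain(int increment) {
        if (increment <= 0) throw new IllegalArgumentException("increment must be > 0");
        for (;;) {
            int current = refCnt.get();
            if (current == 0) throw new IllegalStateException("refCnt: 0");
            if (refCnt.compareAndSet(current, current + increment)) return this;
        }
    }

    /** Drops references; returns true when the count reaches 0 (memory freed). */
    boolean release(int decrement) {
        if (decrement <= 0) throw new IllegalArgumentException("decrement must be > 0");
        for (;;) {
            int current = refCnt.get();
            if (current < decrement) {
                throw new IllegalStateException("refCnt: " + current + ", decrement: " + decrement);
            }
            if (refCnt.compareAndSet(current, current - decrement)) return current == decrement;
        }
    }

    /** Models one write attempt: the encoder releases the message once after encoding. */
    static void send(CountedMessage msg) {
        msg.release(1);
    }
}
```

With two allowed retries, retain(2) raises the count to 3. If the second attempt succeeds, two references have been consumed and one is left over, which is why the remaining count has to be released explicitly.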

PoolThreadCache

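Netty’s pooled allocator keeps a thread‑local cache (PoolThreadCache) of recently freed buffers. To avoid memory leaks when a buffer is released on a different thread from the one that allocated it, we disable the caches. Netty reads these system properties when its allocator classes initialize, so set them before any Netty class is loaded, and check the property names against your Netty version, since they have changed between releases: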

System.setProperty("io.netty.recycler.maxCapacity","0");
System.setProperty("io.netty.allocator.tinyCacheSize","0");
System.setProperty("io.netty.allocator.smallCacheSize","0");
System.setProperty("io.netty.allocator.normalCacheSize","0");

Connection Pool

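An HTTP/1.1 connection is exclusive: it carries one in‑flight request at a time. Without a pool, every concurrent request needs its own new connection, and high concurrency can exhaust connections.

Netty’s built‑in FixedChannelPool provides a fixed maximum number of connections, a bounded queue of pending acquires, an acquire timeout, and a choice of timeout action: fail the acquire (FAIL) or open a new connection (NEW).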

final SouthgateChannelPool fixedChannelPool = new SouthgateChannelPool(
    bootstrap,
    nettyClientChannelPoolHandler,
    new ChannelHealthChecker() {
        @Override
        public Future<Boolean> isHealthy(Channel channel) {
            EventLoop loop = channel.eventLoop();
            return channel.isOpen() && channel.isActive() && channel.isWritable()
                ? loop.newSucceededFuture(Boolean.TRUE)
                : loop.newSucceededFuture(Boolean.FALSE);
        }
    },
    FixedChannelPool.AcquireTimeoutAction.NEW,
    nettyConfig.getAcquireConnectionTimeout(),
    nettyConfig.getMaxConnections(),
    nettyConfig.getMaxPendingAcquires(),
    true,
    hostProfile);
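To make the pool semantics described above concrete (connection limit, acquire timeout, FAIL or NEW on timeout, health check on reuse), here is a JDK-only model. It is a blocking sketch under stated assumptions, not FixedChannelPool or the gateway's SouthgateChannelPool: BoundedPool and all of its members are hypothetical, and discarding over-limit connections on release is a simplification chosen for this example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;
import java.util.function.Supplier;

/** JDK-only model of a fixed-size pool with an acquire timeout; not Netty code. */
final class BoundedPool<C> {
    enum TimeoutAction { FAIL, NEW }

    private final Semaphore permits;                  // one permit per allowed connection
    private final long acquireTimeoutMillis;
    private final TimeoutAction action;
    private final Supplier<C> connector;              // opens a new connection
    private final Predicate<C> healthCheck;           // e.g. open, active and writable
    private final Deque<C> idle = new ArrayDeque<>();
    private final Set<C> overLimit = new HashSet<>(); // connections created by NEW

    BoundedPool(int maxConnections, long acquireTimeoutMillis, TimeoutAction action,
                Supplier<C> connector, Predicate<C> healthCheck) {
        this.permits = new Semaphore(maxConnections, true);
        this.acquireTimeoutMillis = acquireTimeoutMillis;
        this.action = action;
        this.connector = connector;
        this.healthCheck = healthCheck;
    }

    C acquire() {
        boolean gotPermit;
        try {
            gotPermit = permits.tryAcquire(acquireTimeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while acquiring", e);
        }
        if (!gotPermit) {
            if (action == TimeoutAction.FAIL) {
                throw new IllegalStateException("acquire timed out");
            }
            C extra = connector.get(); // NEW: exceed the limit instead of failing
            synchronized (this) {
                overLimit.add(extra);
            }
            return extra;
        }
        synchronized (this) {
            C candidate;
            while ((candidate = idle.pollFirst()) != null) {
                if (healthCheck.test(candidate)) {
                    return candidate; // reuse a healthy idle connection
                }
                // Unhealthy: drop it and look at the next idle connection.
            }
        }
        return connector.get();
    }

    synchronized void release(C connection) {
        if (overLimit.remove(connection)) {
            return; // over-limit connections are discarded, not pooled
        }
        if (healthCheck.test(connection)) {
            idle.addLast(connection);
        }
        permits.release();
    }
}
```

The trade-off between the two actions shows up directly: FAIL protects the backend by bounding connections strictly, while NEW favors the caller at the cost of temporarily exceeding the limit.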

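The health check requires the channel to be open, active, and writable. isWritable is governed by the write‑buffer watermarks: it turns false once the pending outbound bytes exceed the high watermark and turns true again only after they drop below the low watermark: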

bootstrap.option(ChannelOption.WRITE_BUFFER_WATER_MARK,
    new WriteBufferWaterMark(LOW_WATER_MARK, HIGH_WATER_MARK));
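The watermark behavior can be illustrated with a few lines of plain Java. This is a simplified single-threaded model, not Netty's ChannelOutboundBuffer; WritabilityTracker and its method names are hypothetical.

```java
/** JDK-only model of write-buffer watermarks; single-threaded for clarity. */
final class WritabilityTracker {
    private final long lowWaterMark;
    private final long highWaterMark;
    private long pendingBytes;
    private boolean writable = true;

    WritabilityTracker(long lowWaterMark, long highWaterMark) {
        if (lowWaterMark < 0 || highWaterMark < lowWaterMark) {
            throw new IllegalArgumentException("need 0 <= low <= high");
        }
        this.lowWaterMark = lowWaterMark;
        this.highWaterMark = highWaterMark;
    }

    /** Bytes were queued for writing but not yet flushed to the socket. */
    void onEnqueue(long bytes) {
        pendingBytes += bytes;
        if (pendingBytes > highWaterMark) {
            writable = false; // too much queued: signal back-pressure
        }
    }

    /** Bytes were flushed to the socket. */
    void onFlushed(long bytes) {
        pendingBytes = Math.max(0, pendingBytes - bytes);
        if (pendingBytes < lowWaterMark) {
            writable = true; // drained enough: accept writes again
        }
    }

    boolean isWritable() {
        return writable;
    }
}
```

The gap between the two marks gives hysteresis: a channel that has become unwritable stays that way until it has drained well below the high mark, so a health check based on isWritable does not flap on every flush.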

Connection Reuse

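HTTP/1.1 has no request ID, so a response can only be matched to its request by order, which makes each connection exclusive to one in‑flight request. Reusing a connection for several concurrent requests reduces the number of connections needed for high throughput.

Tomcat closes a keep‑alive connection after it has been idle for 20 seconds (the connectionTimeout in the stock server.xml) or after it has served maxKeepAliveRequests requests (configurable, default 100). By adding a unique request ID to a header, we can multiplex requests over one connection, much as RPC frameworks do.

Tomcat’s NIO connector still processes the requests of a single connection synchronously, one after another, so requests multiplexed onto one connection queue up behind each other; under load this causes back‑pressure and can overflow buffers.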
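The request-ID idea from this section can be sketched as a correlation table. This is a JDK-only illustration under assumptions, not the gateway's implementation: RequestCorrelator and Pending are hypothetical names, and it assumes the server echoes the request ID back in a response header.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** JDK-only model of matching responses to requests by ID on a shared connection. */
final class RequestCorrelator {
    /** The ID to send in the request header plus the future for its response. */
    static final class Pending {
        final long requestId;
        final CompletableFuture<String> response;

        Pending(long requestId, CompletableFuture<String> response) {
            this.requestId = requestId;
            this.response = response;
        }
    }

    private final AtomicLong nextId = new AtomicLong();
    private final Map<Long, CompletableFuture<String>> inFlight = new ConcurrentHashMap<>();

    /** Call before writing a request; several requests may be in flight at once. */
    Pending register() {
        long id = nextId.incrementAndGet();
        CompletableFuture<String> future = new CompletableFuture<>();
        inFlight.put(id, future);
        return new Pending(id, future);
    }

    /** Call when a response arrives; the ID is read from the echoed header. */
    boolean onResponse(long requestId, String body) {
        CompletableFuture<String> future = inFlight.remove(requestId);
        return future != null && future.complete(body);
    }

    int inFlightCount() {
        return inFlight.size();
    }
}
```

Because lookup is by ID rather than by arrival order, responses may return in any order. A production version would also need a timeout that removes and fails entries whose response never arrives.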

Why Use Netty at the Ingress

High performance: object pooling (Recycler), pooled memory allocation, edge‑triggered epoll in the native transport, and a workaround for the JDK NIO epoll busy‑loop bug.

Off‑heap memory reduces GC pressure.

Separate read/write buffers avoid extra copies compared with Tomcat’s heap buffers.

Netty can close excess connections early, preventing client timeouts under massive load.

Fully Asynchronous Design

The gateway must be asynchronous to handle backend services with varying latency without blocking incoming requests.

Tomcat as Container

With Tomcat as the container, Servlet 3.0 asynchronous support is required so that the response can be deferred until the backend result arrives without holding a container thread.

Netty Implementation

Netty’s event‑driven model provides a non‑blocking HTTP server and client, with a custom thread pool for async processing.

Industry leaders (e.g., Netflix Zuul 2) adopt a similar model, sharing the same event‑loop pool for ingress and outbound calls to minimize context switches, while guarding against blocking code.
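The non-blocking flow described in this section can be outlined with CompletableFuture. This is a JDK-only sketch, not the gateway's code: AsyncGatewaySketch and handle are hypothetical names, the backend is passed in as a function returning a future (standing in for a non-blocking client call), and the status strings are placeholders.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

/** JDK-only model of a handler that never blocks while the backend is working. */
final class AsyncGatewaySketch {
    /**
     * Returns immediately with a future. The response is built in callbacks
     * that run when the backend future completes, so no thread waits.
     */
    static CompletableFuture<String> handle(
            String request, Function<String, CompletableFuture<String>> backendCall) {
        return backendCall.apply(request)
                .thenApply(body -> "200 " + body)                // success path
                .exceptionally(error -> "502 backend failure");  // failure path
    }
}
```

A slow backend only delays its own response here; the thread that accepted the request is free to accept the next one as soon as handle returns.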

Conclusion

Our current gateway is built on HTTP/1.1; we are adding HTTP/2 support and exploring custom protocols for the future.

Thread model diagram
