
Why Netty Beats Tomcat: IO Models, Zero‑Copy, Off‑Heap Memory & Object Pools

This article examines why Netty has displaced Tomcat as the go-to high-performance server framework. It covers Java's I/O models, zero-copy techniques, off-heap memory, and Netty's custom object pool, and explains how these features let a server handle thousands of concurrent connections efficiently.

Sanyou's Java Diary

Introduction: Tomcat was long the standard Java web container, but its thread-per-request design limits practical concurrency to a few hundred connections (its default worker pool is 200 threads), even when configured with NIO or AIO connectors. Netty, by contrast, has become the high-performance networking framework underpinning Dubbo, Vert.x, API gateways, and more. This article explains why.

IO Models

Java provides several I/O models: blocking I/O (BIO), non-blocking I/O (NIO) combined with multiplexing (select, poll, epoll), and asynchronous I/O (AIO). Blocking vs. non-blocking describes whether a call waits until data is ready or returns immediately; synchronous vs. asynchronous describes who completes the operation: in synchronous I/O the calling thread copies the data itself, while in asynchronous I/O the kernel finishes the operation and notifies the caller.

In non-blocking NIO, a read on a channel with no data available returns 0 immediately, so the thread can move on to other work, whereas a BIO read blocks until data arrives. (A return of -1 signals end-of-stream, not an empty buffer.) Combined with I/O multiplexing, a few threads can manage many sockets by registering non-blocking SocketChannels with a Selector.
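The non-blocking behavior is easy to observe with a plain JDK Pipe, whose channels are selectable just like sockets (a minimal, self-contained sketch; no Netty involved):

```java
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;

public class NonBlockingReadDemo {
    // Reads once from an empty pipe in non-blocking mode: the call returns
    // 0 immediately instead of parking the thread the way a BIO read would.
    public static int readOnce() throws Exception {
        Pipe pipe = Pipe.open();
        pipe.source().configureBlocking(false); // switch the channel to non-blocking mode
        ByteBuffer buf = ByteBuffer.allocate(64);
        int n = pipe.source().read(buf);        // no data yet -> returns 0, does not block
        pipe.source().close();
        pipe.sink().close();
        return n;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readOnce()); // 0 — the thread was never blocked
    }
}
```

The same `configureBlocking(false)` + `Selector` combination is what lets a handful of threads watch thousands of connections.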

The multiplexing mechanisms differ in cost: select is capped at 1024 file descriptors and copies the whole fd set between user and kernel space on every call; poll removes the fd limit but still scans every descriptor linearly; epoll registers descriptors with the kernel once and uses event callbacks, returning only the descriptors that are actually ready, which avoids wasting CPU on idle connections.

Zero‑Copy

Netty leverages zero-copy to eliminate unnecessary copies between user and kernel space. A traditional file-to-socket transfer (read then write) copies the data from the kernel's page cache into a user-space buffer and then back into the kernel's socket buffer. Linux's sendfile system call moves the data entirely inside the kernel, skipping the user-space detour and the associated context switches, which improves throughput.
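In the JDK, sendfile is exposed through FileChannel.transferTo. A minimal file-to-file sketch (the same call performs a kernel-side transfer when the target is a socket channel):

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ZeroCopyDemo {
    // Copies src to dst via transferTo, i.e. without routing the bytes
    // through a user-space buffer. transferTo may move fewer bytes than
    // requested, so we loop until the whole file is transferred.
    public static long copy(Path src, Path dst) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst, StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            long size = in.size();
            long transferred = 0;
            while (transferred < size) {
                transferred += in.transferTo(transferred, size - transferred, out);
            }
            return transferred;
        }
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("zc", ".src");
        Path dst = Files.createTempFile("zc", ".dst");
        Files.write(src, "hello zero-copy".getBytes());
        System.out.println(copy(src, dst)); // 15 — bytes moved without a user-space copy
    }
}
```

Netty builds on this for static-content transfer, and additionally offers application-level zero-copy such as CompositeByteBuf, which aggregates buffers without copying their contents.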

Netty's threading follows the master-worker (boss-worker) reactor model: a boss event-loop group accepts new connections and registers them, while worker event loops handle the subsequent read/write processing.
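The pattern can be sketched with plain JDK NIO (Netty's boss/worker EventLoopGroups wrap and generalize this; the class and helper names here are illustrative, not Netty's API):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

public class MiniReactor {
    // "Boss" role: accept a connection and register it with a worker
    // selector. "Worker" role: a separate thread that waits for readiness
    // events and echoes one message back.
    public static String echoOnce() throws Exception {
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("127.0.0.1", 0));
        int port = ((InetSocketAddress) server.getLocalAddress()).getPort();

        // Client connects; the boss accepts and hands the channel to the worker selector.
        SocketChannel client = SocketChannel.open(new InetSocketAddress("127.0.0.1", port));
        SocketChannel accepted = server.accept();
        accepted.configureBlocking(false);
        Selector workerSelector = Selector.open();
        accepted.register(workerSelector, SelectionKey.OP_READ);

        Thread worker = new Thread(() -> {
            try {
                boolean done = false;
                while (!done && workerSelector.select(5000) > 0) {
                    Iterator<SelectionKey> it = workerSelector.selectedKeys().iterator();
                    while (it.hasNext()) {
                        SelectionKey key = it.next();
                        it.remove();
                        if (key.isReadable()) {
                            SocketChannel ch = (SocketChannel) key.channel();
                            ByteBuffer buf = ByteBuffer.allocate(64);
                            if (ch.read(buf) > 0) {
                                buf.flip();
                                ch.write(buf); // echo the bytes back to the client
                                done = true;
                            }
                        }
                    }
                }
            } catch (IOException ignored) { }
        });
        worker.start();

        client.write(ByteBuffer.wrap("ping".getBytes()));
        ByteBuffer reply = ByteBuffer.allocate(64);
        client.read(reply); // blocking read of the echoed bytes
        worker.join();
        workerSelector.close();
        server.close();
        client.close();
        accepted.close();
        return new String(reply.array(), 0, reply.position());
    }

    public static void main(String[] args) throws Exception {
        System.out.println(echoOnce());
    }
}
```

Netty runs many such worker loops (one per core by default) and pins each connection to one loop, so channel handlers need no locking.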

Off‑Heap Memory

Allocating on the Java heap is cheap, but I/O from heap buffers requires an extra copy into native memory, because the kernel cannot be handed a pointer into the GC-managed heap. Direct (off-heap) buffers are slower to allocate and free, but the kernel can read and write them in place, making the I/O itself faster. Netty can allocate direct buffers, e.g.:

ByteBuffer buffer = ByteBuffer.allocateDirect(10 * 1024 * 1024); // 10 MB outside the GC-managed heap

Netty supports both heap and direct buffers, in pooled and unpooled variants; pooling direct buffers amortizes their high allocation cost and is a key contributor to Netty's performance.
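The difference between the two buffer kinds is visible in the JDK API itself (a minimal sketch using only java.nio):

```java
import java.nio.ByteBuffer;

public class BufferKinds {
    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.allocate(1024);         // backed by a byte[] on the GC heap
        ByteBuffer direct = ByteBuffer.allocateDirect(1024); // native memory outside the heap

        System.out.println(heap.isDirect());   // false
        System.out.println(direct.isDirect()); // true
        System.out.println(heap.hasArray());   // true  — an accessible byte[] backs it
        System.out.println(direct.hasArray()); // false — no Java array; the kernel can use it in place
    }
}
```

When the JDK writes a heap buffer to a channel, it first copies the bytes into a temporary direct buffer; a direct buffer skips that step, which is why Netty prefers pooled direct buffers for socket I/O.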

High‑Performance Object Pool

Netty implements a custom object pool, Recycler, to cut allocation and GC overhead for frequently created short-lived objects. Recycler provides three methods:

get(): obtain an object from the pool, creating one via newObject when the pool is empty.

recycle(T, Handle): return an object to the pool (in current Netty versions, objects recycle themselves through their Handle).

newObject(Handle): a protected factory method that subclasses override to construct a fresh object.

Internally, Recycler is built from DefaultHandle, WeakOrderQueue, and Stack, with per-thread storage via FastThreadLocal, so that same-thread recycling needs no locks.
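Recycler itself is Netty-internal, but the core idea — consult a per-thread free list before allocating — can be sketched in plain Java. TinyPool and its method names below are illustrative only, not Netty's API, and this sketch omits the Stack/WeakOrderQueue machinery that makes cross-thread recycling safe:

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

public class TinyPool<T> {
    // Per-thread free list: get() reuses a recycled instance if one exists,
    // otherwise falls back to the factory (the newObject() analogue).
    private final ThreadLocal<ArrayDeque<T>> pool = ThreadLocal.withInitial(ArrayDeque::new);
    private final Supplier<T> factory;

    public TinyPool(Supplier<T> factory) {
        this.factory = factory;
    }

    public T get() {
        T obj = pool.get().pollFirst();
        return obj != null ? obj : factory.get();
    }

    public void recycle(T obj) {
        pool.get().offerFirst(obj); // return the instance to this thread's free list
    }

    public static void main(String[] args) {
        TinyPool<StringBuilder> p = new TinyPool<>(StringBuilder::new);
        StringBuilder a = p.get();
        p.recycle(a);
        StringBuilder b = p.get();
        System.out.println(a == b); // true — the instance was reused, not reallocated
    }
}
```

Netty's real Recycler adds capacity limits and a per-thread Stack plus WeakOrderQueues so objects recycled by a different thread can still flow back to their owner without contention.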

Overall, Netty’s advanced I/O model, zero‑copy, off‑heap buffers, and optimized object pooling give it a clear advantage over Tomcat for high‑concurrency server applications.
