Why Java 21 Virtual Threads Can Stall Tomcat: A Deep Dive into CloseWait Sockets and Lock Contention
This article investigates a production issue on Java 21 where virtual threads cause Tomcat to stop processing requests, closeWait sockets to accumulate, and locks to deadlock, detailing the diagnostics, thread‑dump analysis, lock inspection, and the eventual conclusions about virtual‑thread behavior.
0 Preface
In a previous post we described the performance benefits of migrating to Java 21 and using ZGC as the default garbage collector. Virtual threads were another exciting feature we adopted during that migration.
1 Problem
Several engineers reported intermittent time‑outs and suspended instances on services running on Java 21, Spring Boot 3, and embedded Tomcat. The affected instances stopped serving traffic while the JVM was still running. A clear symptom was a continuously increasing number of sockets in the closeWait state.
Tomcat throughput suddenly dropped to near zero.
The number of closeWait sockets kept rising, indicating connections were not being closed properly.
These two metrics are correlated and suggest a serious networking or application‑level problem.
2 Collected Diagnostics
The persistent closeWait sockets indicated that the remote peer closed the socket but the local instance never did, likely because the application failed to close it. Thread dumps from the affected instances were collected, but they showed an entirely idle JVM with no obvious activity.
Reviewing recent changes revealed that the services had enabled virtual threads. Virtual‑thread stack frames do not appear in a normal jstack dump, so we used jcmd Thread.dump_to_file to obtain a dump that includes virtual‑thread states, and also captured a heap dump as a last resort.
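The same kind of dump can also be requested from inside the JVM. The sketch below is not what the authors ran; it assumes JDK 21 or later and uses the `HotSpotDiagnosticMXBean.dumpThreads` method. The class name `VirtualThreadDumpSketch`, the helper `dumpAllThreads`, and the file naming are made up for illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.HotSpotDiagnosticMXBean.ThreadDumpFormat;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

public class VirtualThreadDumpSketch {

    /** Writes a thread dump that includes virtual threads and returns the file written. */
    static Path dumpAllThreads(Path directory) throws IOException {
        // dumpThreads expects an absolute path to a file that does not exist yet.
        Path target = directory.toAbsolutePath()
                .resolve("threads-" + System.nanoTime() + ".txt");
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        bean.dumpThreads(target.toString(), ThreadDumpFormat.TEXT_PLAIN);
        return target;
    }

    public static void main(String[] args) throws Exception {
        // Park one virtual thread so the dump has something to show.
        Thread sleeper = Thread.ofVirtual().name("demo-sleeper").start(() -> {
            try {
                Thread.sleep(5_000);
            } catch (InterruptedException ignored) {
                // interrupted by main once the dump is written
            }
        });
        Path dump = dumpAllThreads(Files.createTempDirectory("vt-dump"));
        System.out.println("Dump written to " + dump);
        sleeper.interrupt();
    }
}
```

This only helps while the JVM can still run application code; for an instance that is already stalled, an external tool such as jcmd remains the practical route.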
3 Analysis
The thread dump displayed thousands of “blank” virtual threads such as:
#119821 "" virtual
#119820 "" virtual
#119823 "" virtual
#120847 "" virtual
#119822 "" virtual
...
These are virtual threads that have been created but have not yet started running, so they have no stack trace. Their count roughly matches the number of closeWait sockets.
Virtual threads are not a 1:1 mapping to OS threads; they are tasks scheduled on a fork‑join pool. When a virtual thread blocks (e.g., on a Future), it releases its carrier OS thread and stays in memory until it can be resumed. The carrier OS thread can then be reused for other virtual threads.
Virtual thread details are described in JEP 444.
In our environment Tomcat used a blocking model, keeping a worker thread for the whole request lifecycle. Enabling virtual threads switched Tomcat to use a VirtualThreadExecutor, creating a new virtual thread for each incoming request.
The symptom therefore corresponds to Tomcat continuously creating new web‑worker virtual threads while no OS carrier threads are available to run them.
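The thread-per-request model described above can be sketched with the JDK's own per-task executor. This is not Tomcat's code; it assumes JDK 21, and the class name `VirtualThreadPerTaskSketch` and the helper `handleRequests` are hypothetical. Each simulated request gets a fresh virtual thread and blocks for a while, during which its carrier is free to run other virtual threads.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class VirtualThreadPerTaskSketch {

    /** Runs `requests` blocking tasks, one virtual thread each; returns how many completed. */
    static int handleRequests(int requests, Duration blockingTime) throws Exception {
        int completed = 0;
        // close() at the end of the try block waits for all submitted tasks.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<Future<Boolean>> results = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                results.add(executor.submit(() -> {
                    // Blocking outside synchronized code lets the virtual thread unmount.
                    Thread.sleep(blockingTime);
                    return Thread.currentThread().isVirtual();
                }));
            }
            for (Future<Boolean> result : results) {
                if (result.get()) {
                    completed++;
                }
            }
        }
        return completed;
    }

    public static void main(String[] args) throws Exception {
        int done = handleRequests(1_000, Duration.ofMillis(100));
        System.out.println(done + " requests completed");
    }
}
```

Far more tasks than carrier threads can be in flight here because each sleeping virtual thread gives up its carrier. The stall described in this article appears when that hand-back does not happen.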
4 Why Tomcat Stalls
If a virtual thread blocks while it is inside a synchronized block or method, it is pinned to its carrier OS thread: the carrier stays occupied for as long as the virtual thread is blocked. Several of the captured stack traces show virtual threads parked inside synchronized code, for example:
#119515 "" virtual
java.base/jdk.internal.misc.Unsafe.park(Native Method)
java.base/java.lang.VirtualThread.parkOnCarrierThread(VirtualThread.java:661)
...
brave.RealSpan.finish(RealSpan.java:134)
...
These virtual threads are pinned to the four OS threads of the fork‑join pool (the instance has 4 vCPU). All four OS threads are occupied, so newly created virtual threads cannot be scheduled, leaving their sockets open and causing the closeWait count to climb.
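A minimal way to see pinning in isolation is to block inside a synchronized block on a virtual thread. The sketch below is an illustration and not the code from the incident. The names `PinningSketch` and `blockWhileHoldingMonitor` are hypothetical. On JDK 21, running it with the JVM option `-Djdk.tracePinnedThreads=full` should print the stack of the thread that blocked while pinned.

```java
public class PinningSketch {

    private static final Object MONITOR = new Object();

    /** Blocks inside a synchronized block on a virtual thread; returns true once it finishes. */
    static boolean blockWhileHoldingMonitor() throws InterruptedException {
        boolean[] finished = new boolean[1];
        Thread virtualThread = Thread.ofVirtual().name("pinned-demo").start(() -> {
            synchronized (MONITOR) {
                try {
                    // On JDK 21 the carrier thread stays occupied during this sleep.
                    Thread.sleep(100);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                finished[0] = true;
            }
        });
        virtualThread.join(); // join makes the write to finished[0] visible here
        return finished[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("finished = " + blockWhileHoldingMonitor());
    }
}
```

A single pinned thread is harmless. The problem in this article arises when as many virtual threads are pinned as there are carrier threads.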
5 Who Holds the Lock?
Thread‑dump analysis showed six threads contending for the same ReentrantLock and its associated Condition. Four virtual threads were pinned; one virtual thread was not pinned but was still waiting for the lock; and one regular platform thread was also waiting.
Because Java 21’s jcmd dump does not include lock owner information, we turned to the heap dump.
6 Checking the Lock
Using Eclipse MAT we inspected the lock object in the heap dump (its state is kept by AbstractQueuedSynchronizer). The exclusiveOwnerThread field was null, indicating no thread currently owned the lock. The lock’s wait‑queue showed an ExclusiveNode whose waiter pointed to one of the virtual threads (#119516).
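In a live JVM, the owner and wait queue that were read from the heap dump can also be read through ReentrantLock's protected `getOwner()` and `getQueuedThreads()` methods. The sketch below is a debugging aid under that assumption, not something the article's authors used. `LockInspectionSketch`, `InspectableLock`, and `describe` are hypothetical names.

```java
import java.util.Collection;
import java.util.concurrent.locks.ReentrantLock;

public class LockInspectionSketch {

    /** Exposes the protected introspection methods of ReentrantLock. */
    static final class InspectableLock extends ReentrantLock {
        Thread owner() {
            return getOwner(); // null when no thread holds the lock
        }

        Collection<Thread> queued() {
            return getQueuedThreads(); // best-effort snapshot of waiting threads
        }
    }

    static String describe(InspectableLock lock) {
        return "owner=" + lock.owner() + ", queued=" + lock.queued();
    }

    public static void main(String[] args) throws InterruptedException {
        InspectableLock lock = new InspectableLock();
        lock.lock();
        Thread waiter = new Thread(() -> {
            lock.lock();
            lock.unlock();
        }, "waiter");
        waiter.start();
        // Wait until the second thread is really queued behind the owner.
        while (!lock.hasQueuedThread(waiter)) {
            Thread.onSpinWait();
        }
        System.out.println(describe(lock));
        lock.unlock();
        waiter.join();
        System.out.println(describe(lock));
    }
}
```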
Further inspection of ReentrantLock.Sync.tryRelease() revealed that after a successful release the lock signals the next node in the queue, but the head of the queue still references the thread that has just released the lock. This creates a transient state where the lock appears unowned while the next thread has not yet acquired it.
The simplified acquisition loop in AbstractQueuedSynchronizer.acquire() is:
while (true) {
    if (tryAcquire()) {
        return; // lock acquired
    }
    park(); // wait until a releasing thread unparks this one
}
All waiting threads were parked in this loop, at line 754 of AbstractQueuedSynchronizer.java. When the lock is released, the next thread is unparked, retries the loop, and should acquire the lock, resetting the queue head. However, in our dump the thread #119516 remained parked because no OS carrier thread was free to run it.
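The park-and-retry shape of that loop, and the window in which a lock is released but not yet re-acquired, can be seen in a much smaller lock. The sketch below is adapted from the first-in-first-out mutex example in the LockSupport Javadoc. It is not the AbstractQueuedSynchronizer implementation, and the names `ParkingMutexSketch`, `FifoMutex`, and `countWithMutex` are hypothetical.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

public class ParkingMutexSketch {

    /** Non-reentrant FIFO mutex built on park/unpark. */
    static final class FifoMutex {
        private final AtomicBoolean locked = new AtomicBoolean(false);
        private final Queue<Thread> waiters = new ConcurrentLinkedQueue<>();

        void lock() {
            boolean wasInterrupted = false;
            Thread current = Thread.currentThread();
            waiters.add(current);
            // Retry until this thread is first in the queue and wins the CAS.
            while (waiters.peek() != current || !locked.compareAndSet(false, true)) {
                LockSupport.park(this);
                if (Thread.interrupted()) {
                    wasInterrupted = true; // remember the interrupt, keep waiting
                }
            }
            waiters.remove();
            if (wasInterrupted) {
                current.interrupt();
            }
        }

        void unlock() {
            locked.set(false);
            // From here until the head waiter actually runs, nobody owns the lock.
            LockSupport.unpark(waiters.peek());
        }
    }

    /** Increments a shared counter from several threads under the mutex. */
    static int countWithMutex(int threads, int incrementsPerThread) throws InterruptedException {
        FifoMutex mutex = new FifoMutex();
        int[] counter = new int[1];
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    mutex.lock();
                    try {
                        counter[0]++;
                    } finally {
                        mutex.unlock();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return counter[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("count = " + countWithMutex(4, 10_000));
    }
}
```

With platform threads the unparked waiter always gets CPU time eventually, so the unowned window is brief. A virtual thread in the same position also needs a free carrier, which is exactly what was missing in the dump.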
7 The Lock That Never Runs
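Four virtual threads were pinned to the four OS carrier threads while waiting to acquire the lock, so none of them could give its carrier back. The regular platform thread (#107) has its own OS thread, but it was also parked waiting for the lock. The remaining virtual thread (#119516), the one at the front of the wait queue, was not pinned: it had released its carrier when it parked and therefore could not make progress without a free OS thread. This situation is a variant of a classic deadlock: instead of two locks there is one lock plus the fork‑join pool acting as a semaphore, and all of the pool’s permits are held by threads waiting for that same lock.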
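The "one lock plus a semaphore" picture can be modeled with ordinary platform threads. The sketch below is a deliberately simplified model, not a reproduction of the incident. A `Semaphore` stands in for the carrier threads, a `CountDownLatch` stands in for the lock hand-over that only the next-in-line thread can complete, and every name (`CarrierStarvationModel`, `nextInLineGetsCarrier`) is hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class CarrierStarvationModel {

    /** Returns true if the next-in-line thread obtains a carrier permit within the timeout. */
    static boolean nextInLineGetsCarrier(int carriers, int pinnedWaiters, long timeoutMillis)
            throws InterruptedException {
        Semaphore carrierPermits = new Semaphore(carriers);    // models the carrier threads
        CountDownLatch lockHandedOver = new CountDownLatch(1); // opens once the next owner ran
        CountDownLatch pinnedReady =
                new CountDownLatch(Math.min(carriers, pinnedWaiters));

        for (int i = 0; i < pinnedWaiters; i++) {
            Thread pinned = new Thread(() -> {
                try {
                    carrierPermits.acquire(); // occupy a carrier...
                    pinnedReady.countDown();
                    lockHandedOver.await();   // ...and keep it while waiting for the lock
                    carrierPermits.release();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            pinned.setDaemon(true);
            pinned.start();
        }
        pinnedReady.await();

        // The signalled waiter is unpinned: it needs a free carrier before it can run.
        boolean scheduled = carrierPermits.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
        if (scheduled) {
            carrierPermits.release();
        }

        lockHandedOver.countDown(); // end the model so the pinned threads can finish
        return scheduled;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("4 carriers, 4 pinned: scheduled = "
                + nextInLineGetsCarrier(4, 4, 200));
        System.out.println("4 carriers, 3 pinned: scheduled = "
                + nextInLineGetsCarrier(4, 3, 200));
    }
}
```

In the model the stall is ended artificially by opening the latch after the timeout. In the real JVM nothing plays that role, which is why the instance never recovered on its own.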
8 Conclusion
Virtual threads promise performance gains by reducing thread‑creation and context‑switch overhead. Java 21’s implementation works well for many workloads, but edge cases like lock contention with synchronized blocks can still cause stalls. Adopting virtual threads carefully and monitoring for such patterns is essential for high‑performance Java applications, and future JDK releases are expected to address these integration issues.
JavaEdge
First‑line development experience at multiple leading tech firms; now a software architect at a Shanghai state‑owned enterprise and founder of Programming Yanxuan. Nearly 300k followers online; expertise in distributed system design, AIGC application development, and quantitative finance investing.
