Why Does Oracle JDK Leave CLOSE_WAIT Sockets Open? A Deep Dive and Reproduction
This article documents a real‑world troubleshooting session of an Oracle JDK bug that caused a massive increase in CLOSE_WAIT connections, detailing the problem analysis, code inspection, reproduction steps, and the subsequent bug report to Oracle.
After a system rollout, the number of TCP connections in the CLOSE_WAIT state grew rapidly, triggering multiple alerts and indicating a resource‑leak problem.
Analysis Process
Step 1: Problem Focus
The first step was to identify which IP pairs were generating the CLOSE_WAIT sockets using netstat:
    netstat -np | grep tcp | grep "CLOSE_WAIT"
Sample output showed many lines similar to:
    tcp 3547 0 10.107.17.xxx:34602 yyy.12.230.115:443 CLOSE_WAIT 19819/java
Step 2: IP Investigation
Three remote IP addresses (e.g., yyy.12.230.115 and zzz.202.32.241) accounted for most of the CLOSE_WAIT sockets. Further investigation showed that they belong to an image CDN service.
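The ranking of remote endpoints described above can also be done programmatically. The following is a minimal sketch, not the tooling used in the original investigation: it assumes netstat -n style lines with the columns shown in the sample output, and the class name CloseWaitCounter and the sample addresses (taken from documentation ranges) are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: counts CLOSE_WAIT sockets per remote IP.
final class CloseWaitCounter {

    /** Expects lines shaped like: proto recv-q send-q local remote state [pid/program]. */
    static Map<String, Integer> countByRemoteIp(List<String> netstatLines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : netstatLines) {
            String[] cols = line.trim().split("\\s+");
            if (cols.length < 6 || !"CLOSE_WAIT".equals(cols[5])) {
                continue; // not a CLOSE_WAIT line, or not in the expected shape
            }
            String remote = cols[4];
            int colon = remote.lastIndexOf(':');
            String ip = colon > 0 ? remote.substring(0, colon) : remote;
            counts.merge(ip, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Sample addresses are placeholders, not the ones from the incident.
        List<String> sample = Arrays.asList(
            "tcp 3547 0 10.0.0.5:34602 192.0.2.115:443 CLOSE_WAIT 19819/java",
            "tcp 0 0 10.0.0.5:34700 192.0.2.115:443 CLOSE_WAIT 19819/java",
            "tcp 0 0 10.0.0.5:34800 198.51.100.241:443 CLOSE_WAIT 19819/java",
            "tcp 0 0 10.0.0.5:34900 203.0.113.9:443 ESTABLISHED 19819/java");
        System.out.println(countByRemoteIp(sample));
        // prints {192.0.2.115=2, 198.51.100.241=1}
    }
}
```

Sorting by count, or grouping by local/remote pair instead of remote IP, would be a small variation on the same loop.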
Step 3: Code Analysis
Reviewing the Java source that fetched images uncovered the use of javax.imageio.ImageIO.read(URL). The method opens a TCP connection to the URL and reads the stream, but it does not explicitly close the underlying socket when the image cannot be retrieved, which leaves the socket in CLOSE_WAIT. The JDK implementation is:
    public static BufferedImage read(URL input) throws IOException {
        if (input == null) {
            throw new IllegalArgumentException("input == null!");
        }
        InputStream istream = null;
        try {
            // Establish TCP connection and obtain stream
            istream = input.openStream();
        } catch (IOException e) {
            throw new IIOException("Can't get input stream from URL!", e);
        }
        ImageInputStream stream = createImageInputStream(istream);
        BufferedImage bi;
        try {
            bi = read(stream);
            if (bi == null) {
                stream.close();
            }
        } finally {
            istream.close();
        }
        return bi;
    }
The missing socket closure can cause the observed CLOSE_WAIT accumulation, especially when the CDN returns errors or timeouts.
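For comparison, a caller can manage the connection lifecycle itself instead of relying on ImageIO.read(URL). The sketch below is an assumption about a reasonable defensive pattern, not a fix confirmed by the original investigation or by Oracle: the class name SafeImageFetcher and the timeoutMillis parameter are hypothetical. It opens the connection explicitly, sets timeouts, closes the stream with try-with-resources, and calls disconnect() on HTTP(S) connections on every path, including when getInputStream() throws.

```java
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;
import javax.imageio.ImageIO;

// Hypothetical helper: reads an image while releasing the connection explicitly.
final class SafeImageFetcher {

    /** Returns the decoded image, or null if no registered reader can decode it. */
    static BufferedImage read(URL url, int timeoutMillis) throws IOException {
        URLConnection conn = url.openConnection();
        conn.setConnectTimeout(timeoutMillis);
        conn.setReadTimeout(timeoutMillis);
        // ImageIO.read(InputStream) does not close the stream it is given,
        // so try-with-resources closes it here.
        try (InputStream in = conn.getInputStream()) {
            return ImageIO.read(in);
        } finally {
            // Runs on success and on failure (e.g. an HTTP error response).
            if (conn instanceof HttpURLConnection) {
                ((HttpURLConnection) conn).disconnect();
            }
        }
    }
}
```

The trade-off is that calling disconnect() after every request gives up HTTP keep-alive reuse, so this pattern may cost extra connection setups against a healthy CDN.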
Step 4: Reproduction
A small Java program was created to simulate the issue by repeatedly invoking ImageIO.read on a non‑existent image URL using a thread pool of 100 workers and 5,000 tasks:
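    import java.awt.image.BufferedImage;
    import java.io.File;
    import java.net.URL;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import javax.imageio.ImageIO;

    // The class name is arbitrary; the original listing showed only the two methods.
    public class CloseWaitRepro {
        public static void main(String[] args) throws InterruptedException {
            // 100 worker threads share 5,000 fetch tasks
            ExecutorService ex = Executors.newFixedThreadPool(100);
            for (int i = 0; i < 5000; i++) {
                ex.execute(task());
            }
            // The pool is never shut down, so the JVM stays alive after the
            // tasks finish and the sockets can still be inspected with netstat.
        }

        private static Runnable task() {
            return new Runnable() {
                @Override
                public void run() {
                    // URL of an image that does not exist
                    String url = "https://vivobbs.xx.yy.zz/invalid.jpg";
                    File file = null;
                    BufferedImage image = null;
                    try {
                        // The suffix needs the leading dot to produce a .jpg file
                        file = File.createTempFile("abc", ".jpg");
                        URL u = new URL(url);
                        image = ImageIO.read(u);
                    } catch (Throwable e) {
                        e.printStackTrace();
                    } finally {
                        if (file != null) file.delete();
                        if (image != null) image.flush();
                    }
                }
            };
        }
    }
Running this program reproduced a large number of CLOSE_WAIT sockets, confirming the hypothesis.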
Step 5: Bug Reporting
The findings were reported to Oracle, including logs and screenshots of the TCP state and email communication. Oracle acknowledged the issue and accepted the bug report.
Overall, the case highlights a gap in the JDK’s handling of socket lifecycles for failed image reads and underscores the need for deeper understanding of TCP state transitions when diagnosing similar problems.
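The state transition behind the symptom can be observed in isolation. A socket enters CLOSE_WAIT when the peer has closed its side (sent a FIN) and the local application has not yet called close(). The following loopback sketch illustrates this; it is not taken from the original investigation, and the class name CloseWaitDemo is hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo: the server closes first, the client lingers in CLOSE_WAIT.
final class CloseWaitDemo {

    /** Returns what the client reads after the server has closed its side. */
    static int readAfterPeerClose() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
            client.setSoTimeout(5000);      // do not block forever if something goes wrong
            Socket accepted = server.accept();
            accepted.close();               // the server side closes, sending a FIN
            int b = client.getInputStream().read(); // end of stream is reported as -1
            // At this point the client socket is in CLOSE_WAIT: the peer has
            // closed, but this side has not called close() yet. It leaves that
            // state only when try-with-resources closes the client below.
            return b;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readAfterPeerClose());
    }
}
```

In the incident, the CDN played the role of the server closing first, and the sockets that the Java process never closed accumulated in the same state.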
vivo Internet Technology
Sharing practical vivo Internet technology insights and salon events, plus the latest industry news and hot conferences.
