
Investigation of the "Too many open files" Error in Tomcat with Apollo Configuration Center

This article analyzes a production incident where a Java web application using Apollo configuration center encountered "Too many open files" errors, detailing the fault symptoms, root cause analysis involving Tomcat's classloader and file‑descriptor limits, and presenting remediation and preventive measures.

Ctrip Technology

The author, a technical expert from Ctrip's framework R&D department, describes a recurring "Too many open files" error observed on Linux when a Java web application integrated with Apollo configuration center modified its configuration in production.

After the configuration change, the application began throwing a flood of errors as Redis connections failed, as shown by the stack trace:

Caused by: redis.clients.jedis.exceptions.JedisConnectionException: Could not get a resource from the pool
Caused by: java.net.SocketException: Too many open files

Initial investigation revealed that only 5 out of 20 machines successfully received the configuration notification, while the remaining 15 reported NoClassDefFoundError for com.ctrip.framework.apollo.model.ConfigChange. The author hypothesized that a shortage of file descriptors prevented the JVM from loading the required JAR files.

Further analysis identified the root cause: the process limit Max Open Files was set to 4096 on many machines (instead of the intended 65536). When the configuration change triggered Tomcat's WebappClassLoader to load a previously unused class, it opened all dependent JAR files at once, quickly exhausting the file‑descriptor limit and causing both the NoClassDefFoundError and Redis connection failures.
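The mismatch between intended and actual limits can be confirmed directly on the host. A minimal sketch for checking a process's effective limit and current descriptor usage on Linux (the `PID` assignment is a placeholder; substitute the Tomcat process ID):

```shell
# Soft limit inherited by processes started from this shell
ulimit -n

PID=$$   # placeholder: substitute the Tomcat PID here

# Effective "Max open files" of an already-running process, from /proc
grep "Max open files" /proc/$PID/limits

# Number of file descriptors the process currently holds
ls /proc/$PID/fd | wc -l
```

Note that the limit recorded in /proc/$PID/limits is the one in force when the process started; changing shell limits afterwards does not affect a running JVM.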

Empirical verification was performed using lsof commands. Immediately after a configuration push, the number of open file handles jumped from ~192 to ~422, with ~228 handles belonging to WEB-INF/lib JAR files. After about 30 seconds, Tomcat's background thread released the handles, returning the count to ~194.
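The handle counts described above can be reproduced with a pipeline along these lines (a sketch; the `pgrep` pattern used to locate Tomcat is an assumption, and lsof output varies by version and platform):

```shell
# Hypothetical way to locate the Tomcat process; adjust the pattern as needed
PID=$(pgrep -f org.apache.catalina.startup.Bootstrap | head -1)

# Total open file handles held by the process
lsof -p "$PID" | wc -l

# Handles that point at JARs under WEB-INF/lib
lsof -p "$PID" | grep -c 'WEB-INF/lib'
```

Running the last command immediately after a configuration push, and again roughly 30 seconds later, makes the open-then-release behavior of the classloader visible.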

The article also documents the class‑loading mechanism of Tomcat 7.0.72, showing how it initially opens all JAR files, searches for the required class, and later closes the files via a periodic cleanup thread.

Based on the findings, the author proposes several optimization measures:

Increase the operating‑system Max Open Files limit for production services.

Enhance application monitoring and alerting for connection counts and file‑descriptor usage.

Initialize middleware clients early at startup to preload classes and avoid on‑the‑fly loading.

When incidents occur, retain a problematic instance for post‑mortem analysis instead of immediate restart.
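The first measure can be applied at several layers. The snippet below is a sketch; the user name, file paths, and systemd directive are common conventions that should be verified against your distribution:

```shell
# One-off: raise the soft limit for the current shell and its children,
# e.g. at the top of Tomcat's startup script.
# (Fails if the value exceeds the hard limit; check with: ulimit -Hn)
ulimit -n 65536

# Persistent for login sessions, via /etc/security/limits.conf:
#   appuser  soft  nofile  65536
#   appuser  hard  nofile  65536

# For systemd-managed services the unit file takes precedence:
#   [Service]
#   LimitNOFILE=65536

# Verify the limit that newly started processes will inherit
ulimit -n
```

Whichever layer is used, the service must be restarted for the new limit to take effect, since a process's limits are fixed at startup.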

In summary, the incident was caused by an insufficient file‑descriptor limit combined with Tomcat's class‑loader behavior, leading to cascading failures in Redis connectivity and service availability.

Tags: Java, Redis, ClassLoader, Tomcat, Apollo, file descriptors, Too many open files
Written by Ctrip Technology
Official Ctrip Technology account, sharing and discussing growth.