
Why Kubernetes Pods Fail with “Too Many Open Files” and How to Fix It

This article explains the “Too many open files” error in Kubernetes: what it means (the process has exceeded the system's file-handle limit), how to inspect current usage with ulimit and lsof, and how to raise the limits temporarily or permanently and track down handle leaks in application code.

Full-Stack DevOps & Kubernetes

The error message "Failed to create pod sandbox... socket: too many open files" indicates that the process has exceeded the operating system's limit on open file descriptors, which includes regular files, sockets, and other handles.

Root cause analysis

When a process opens more files or network connections than the system permits, the kernel returns this error. You can view the current limits with ulimit -a; the relevant entry is "open files (-n)", the maximum number of file descriptors a single process may hold.
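For example (the values vary per system; the -S and -H flags select the soft and hard limit respectively):

```shell
ulimit -a      # all per-process limits for the current shell
ulimit -n      # just the open-files (nofile) limit
ulimit -Sn     # soft limit: what the process is actually held to
ulimit -Hn     # hard limit: the ceiling a non-root user may raise -n up to
```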

Check current open file count

Run lsof | wc -l to see the total number of open files on the host.

Use watch "lsof | wc -l" to monitor the count in real time.

Check a specific process

Run lsof -p <pid> | wc -l (e.g., lsof -p 1234 | wc -l) to see how many handles a given process holds.
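Note that lsof prints one line per task/FD pair, so its totals can overcount. As a cross-check, the kernel's own view in /proc can be read directly; this sketch lists the ten biggest descriptor holders on the host:

```shell
# Count open descriptors per process straight from /proc.
# Reading other users' fd directories requires root.
for p in /proc/[0-9]*; do
  n=$(ls "$p/fd" 2>/dev/null | wc -l)
  echo "$n ${p##*/} $(cat "$p/comm" 2>/dev/null)"
done | sort -rn | head -10
```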

Solution: increase the allowed number of open files

Temporary (lost when the shell session ends): ulimit -n 1024000. Note that a non-root user can only raise the soft limit up to the current hard limit (commonly 4096).

Permanent (takes effect for new login sessions; a reboot also works):

Edit /etc/security/limits.conf.

Add the following lines at the end:

* soft nofile 1024000
* hard nofile 1024000
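The limits.conf entries are applied by PAM at login, so the change can be verified from a fresh session without waiting for a reboot (the PID below is a placeholder for illustration):

```shell
# In a NEW login session after editing /etc/security/limits.conf:
ulimit -n                              # should now print 1024000

# For a process that is already running, check its effective limit in /proc.
PID=1234                               # placeholder: the process you care about
grep "Max open files" /proc/$PID/limits
```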

Inspect the application for handle leaks

If you control the program, estimate the expected number of file or socket handles. When the count looks abnormal, capture detailed information with lsof -p <pid> > openfiles.log and analyze the log.
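A quick way to make sense of the captured log is to group its entries. The column positions below follow lsof's default output (column 5 is TYPE, the last column is NAME); the PID is a placeholder:

```shell
PID=1234                               # placeholder: the suspect process

# Capture the full descriptor table once, then analyze the snapshot.
lsof -p "$PID" > openfiles.log

# Group by descriptor type (REG, sock, FIFO, ...) to see what dominates.
awk 'NR > 1 { print $5 }' openfiles.log | sort | uniq -c | sort -rn

# Group by name to catch a single file opened many times.
awk 'NR > 1 { print $NF }' openfiles.log | sort | uniq -c | sort -rn | head
```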

Are all opened files actually needed?

Locate the code that opens these files.

Check whether files are closed properly after writing.

Verify that network connections are closed (e.g., using post.releaseConnection()), noting that this may only return the connection to a pool rather than truly closing it.

Improperly closed sockets can linger in the pool for several seconds; under heavy load, the accumulated sockets can exceed the system's ulimit -n (open files) limit, triggering the “Too many open files” exception.
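Leaked connections typically show up as sockets stuck in CLOSE_WAIT. A minimal sketch for counting them (the /proc fallback assumes Linux, where state code 08 in the kernel tables means CLOSE_WAIT):

```shell
# With iproute2 / lsof installed:
ss -tan state close-wait | wc -l       # one line per stuck socket (plus header)
lsof -iTCP -sTCP:CLOSE_WAIT            # same view, attributed to processes

# Portable fallback reading the kernel tables directly (column 4 is the state):
awk 'FNR > 1 && $4 == "08"' /proc/net/tcp /proc/net/tcp6 2>/dev/null | wc -l
```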

Tags: Kubernetes · DevOps · ulimit · system limits · Too many open files · lsof
Written by Full-Stack DevOps & Kubernetes

Focused on sharing DevOps, Kubernetes, Linux, Docker, Istio, microservices, Spring Cloud, Python, Go, databases, Nginx, Tomcat, cloud computing, and related technologies.