Why Kubernetes Pods Fail with “Resource temporarily unavailable” – Understanding PID Limits
This article analyzes a Kubernetes‑Docker environment where Java pods encounter “fork: Resource temporarily unavailable” errors, tracing the issue through kernel event logs, ulimit settings, PID limits, and related sysctl parameters, and provides detailed recommendations for kernel and user‑level configuration to prevent such failures.
Background
Runtime environment: Kubernetes + Docker, application: Java.
Problem Description
1. The Kubernetes event center showed only a routine cluster alarm, which did not reveal the root cause.
2. Suspicion first fell on the Docker service. The docker and kubelet logs showed the kubelet failing to create threads because a process limit had been reached, and pointed to raising the user’s ulimit -u.
$ journalctl -u "kubelet" --no-pager --follow
... (log excerpt showing runtime errors and thread creation failures) ...
3. System logs indicated a fork failure: fork: Resource temporarily unavailable.
$ dmesg -TL
-bash: fork: retry: No child processes
[Fri Sep 17 18:25:53 2021] Linux version 5.11.1-1.el7.elrepo.x86_64 ...
4. Attempting to start a new container resulted in thread‑initialization failures due to insufficient resources.
$ docker run -it --rm tomcat bash
runtime/cgo: pthread_create failed: Resource temporarily unavailable
SIGABRT: abort
...
Fault Analysis
The initial hypothesis was that ulimit -u was too low. The current limits were inspected:
$ ulimit -a
... max user processes (-u) 249047 ...
The settings in /etc/security/limits.conf and /etc/security/limits.d/20.nproc.conf were not restrictive enough to explain the failure:
# limits.conf defaults
root soft nofile 65535
root hard nofile 65535
* soft nofile 65535
* hard nofile 65535
# 20.nproc.conf
* soft nproc 65536
root soft nproc unlimited
Monitoring showed a peak of 457 processes and a maximum of 32,616 threads, well below the ulimit -u value of 249047.
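The process and thread figures come from monitoring whose tooling is not shown in this article. As a rough stand-in, the sketch below counts both directly from /proc. The function names and the optional root argument are illustrative assumptions, not part of any existing tool.

```shell
#!/bin/sh
# Sketch: count processes and threads by walking /proc.
# Function names are hypothetical; the optional argument exists only so the
# functions can be pointed at a different directory tree.

# Each numeric directory directly under the root is one process.
count_processes() {
    root="${1:-/proc}"
    n=0
    for d in "$root"/[0-9]*; do
        [ -d "$d" ] && n=$((n + 1))
    done
    echo "$n"
}

# Each entry under <pid>/task is one thread (the main thread included).
count_threads() {
    root="${1:-/proc}"
    n=0
    for d in "$root"/[0-9]*/task/[0-9]*; do
        [ -d "$d" ] && n=$((n + 1))
    done
    echo "$n"
}

echo "processes: $(count_processes)"
echo "threads:   $(count_threads)"
```

On a node running many JVMs, the thread count is usually the figure worth watching, since it can be far larger than the process count.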
Further investigation of kernel parameters revealed relevant PID‑related settings:
kernel.core_uses_pid = 1 (affects core dump naming, not the issue)
kernel.ns_last_pid = 23068 (last allocated PID)
kernel.pid_max = 32768 (maximum PID value)
user.max_pid_namespaces = 253093 (max PID namespaces per user)
kernel.cad_pid = 1 (reboot signal, unrelated)
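Of these settings, kernel.pid_max is the one a thread count can run into, because Linux allocates thread IDs from the same number space as process IDs. The sketch below is a hypothetical helper (pid_headroom is not an existing tool) that takes a pid_max value and a task count and reports how much room is left. It is fed here with the figures quoted in this article.

```shell
#!/bin/sh
# Sketch: how close is a task count to kernel.pid_max?
# pid_headroom is a hypothetical helper name.
# Usage: pid_headroom <pid_max> <task_count>
pid_headroom() {
    pid_max="$1"
    tasks="$2"
    free=$((pid_max - tasks))
    used_pct=$((tasks * 100 / pid_max))
    echo "free=$free used=${used_pct}%"
}

# Figures from this article: pid_max 32768, peak thread count 32616.
pid_headroom 32768 32616
```

With the article's numbers this prints free=152 used=99%. On a live host the first argument can be read from /proc/sys/kernel/pid_max.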
Checking kernel.pid_max:
$ sysctl -a | grep pid_max
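kernel.pid_max = 32768
Every thread takes an ID from the same pool as processes, so the observed peak of 32,616 threads left almost no headroom under this ceiling, which is consistent with the fork failures.
The kernel source shows the defaults and bounds for pid_max and threads-max. For pid_max: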
int pid_max = PID_MAX_DEFAULT;
#define RESERVED_PIDS 300
int pid_max_min = RESERVED_PIDS + 1;
int pid_max_max = PID_MAX_LIMIT;
For kernel.threads-max, the kernel derives the value from the total number of RAM pages, so that thread structures can consume at most one eighth of memory:
static void set_max_threads(unsigned int max_threads_suggested) {
    u64 threads;
    unsigned long nr_pages = totalram_pages();

    /* Guard against overflow of nr_pages * PAGE_SIZE in 64 bits. */
    if (fls64(nr_pages) + fls64(PAGE_SIZE) > 64)
        threads = MAX_THREADS;
    else
        threads = div64_u64((u64)nr_pages * (u64)PAGE_SIZE,
                            (u64)THREAD_SIZE * 8U);

    if (threads > max_threads_suggested)
        threads = max_threads_suggested;

    max_threads = clamp_t(u64, threads, MIN_THREADS, MAX_THREADS);
}
The constant FUTEX_TID_MASK (0x3fffffff), which the kernel uses as MAX_THREADS, is the upper bound for kernel.threads-max.
#define FUTEX_TID_MASK 0x3fffffff
The vm.max_map_count parameter limits the number of VMA (virtual memory area) mappings per process, defaulting to 65,530.
# cat /proc/sys/vm/max_map_count
65530
Configuration Recommendations
Parameter Boundaries
Parameter Name               Scope
kernel.pid_max               System‑wide limit
kernel.threads-max           System‑wide limit
vm.max_map_count             Process‑level limit
/etc/security/limits.conf    User‑level limit
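To see where a given host stands on each of these boundaries, the values can be read back from /proc. This is a minimal sketch: show_limits and its optional root argument are illustrative names, and the "max user processes" figure applies only to the user and session running the script.

```shell
#!/bin/sh
# Sketch: print the limits discussed above for the current host.
# show_limits is a hypothetical helper; the optional argument lets it read
# from a directory tree other than /proc.
show_limits() {
    proc="${1:-/proc}"
    for key in kernel/pid_max kernel/threads-max vm/max_map_count; do
        if [ -r "$proc/sys/$key" ]; then
            value="$(cat "$proc/sys/$key")"
        else
            value="unavailable"
        fi
        # Convert the path form (kernel/pid_max) to the sysctl name.
        name="$(echo "$key" | tr '/' '.')"
        echo "$name = $value"
    done
    # Soft "Max processes" limit of this session (what ulimit -u reports in bash).
    nproc="$(awk '/^Max processes/ {print $3}' "$proc/self/limits" 2>/dev/null || true)"
    echo "max user processes = ${nproc:-unavailable}"
}

show_limits
```

Reading the files under /proc/sys is equivalent to querying the same keys with sysctl, and avoids depending on the sysctl binary being installed in a minimal container image.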
Summary Recommendations
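Raise kernel.pid_max when a node runs many containers or high‑concurrency workloads, since every thread consumes a PID; the default of 32768 is easily exhausted, and 64-bit kernels accept values up to 4,194,304. This avoids “Resource temporarily unavailable” errors.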
Leave kernel.threads-max at its kernel‑calculated value; manually increasing it may cause memory exhaustion.
Tune vm.max_map_count only after performance testing; setting it too high can increase memory overhead.
Ensure user‑level limits in limits.conf do not exceed the global kernel limits.
Linux does not impose per‑CPU thread creation limits; memory availability is the primary constraint.
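As a concrete form of the first recommendation, the sketch below validates a proposed kernel.pid_max and prints the matching sysctl line. The lower bound of 301 is RESERVED_PIDS + 1 from the kernel source quoted earlier; the upper bound of 4194304 assumes a 64-bit kernel. The helper name, the example value, and the drop-in file name in the comments are assumptions chosen for illustration.

```shell
#!/bin/sh
# Sketch: validate a proposed kernel.pid_max and print a sysctl.d line.
# pid_max_conf_line is a hypothetical helper name.
# Bounds: 301 (RESERVED_PIDS + 1) up to 4194304, assuming a 64-bit kernel.
pid_max_conf_line() {
    value="$1"
    case "$value" in
        ''|*[!0-9]*) echo "not a number: $value" >&2; return 1 ;;
    esac
    if [ "$value" -lt 301 ] || [ "$value" -gt 4194304 ]; then
        echo "out of range: $value" >&2
        return 1
    fi
    echo "kernel.pid_max = $value"
}

pid_max_conf_line 4194304

# To apply (as root); the file name 90-pid-max.conf is only an example:
#   pid_max_conf_line 4194304 > /etc/sysctl.d/90-pid-max.conf
#   sysctl --system
```

Writing the setting to a file under /etc/sysctl.d keeps it across reboots, whereas sysctl -w changes only the running kernel.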
MaGe Linux Operations
