Mastering ulimit and cgroup: Limit Files & Threads in Docker/Kubernetes
This article explains how Linux's ulimit and cgroup mechanisms can be used to restrict file descriptors and thread counts in Docker and Kubernetes environments, compares configuration methods, presents experimental results, and offers practical recommendations for setting limits at the container, pod, and host levels.
Background
In Linux, ulimit caps the resources a process may use (open file descriptors, process and thread count, memory size, and so on) so that a buggy or malicious program cannot exhaust the host. Containers share the host kernel, so they need the same kind of limits.
Limiting Methods
ulimit: Docker supports --ulimit per container and default-ulimits in the daemon configuration for all containers. Kubernetes does not currently support ulimit directly.
cgroup: Docker supports memory, CPU, and PID limits via cgroups; thread counts can be capped with --pids-limit. Kubernetes can limit threads by enabling the SupportPodPidsLimit feature gate and setting --pod-max-pids on the kubelet.
Configuration files: /etc/security/limits.conf provides persistent per-user limits and /etc/sysctl.conf provides persistent system-wide limits. The ulimit command only affects the current shell session and its child processes.
Experimental Comparison
Environment
Local: Ubuntu 16.04.6 LTS, Docker 18.09.7, base image alpine:v3.9
Kubernetes: kubelet v1.10.11.1, Docker 18.09.6
ulimit
User-level resource limits have a soft and a hard threshold.
soft: the value the kernel actually enforces; a process can raise it, but not beyond the hard limit
hard: the ceiling for the soft limit; any process can lower it, but only root (CAP_SYS_RESOURCE) can raise it
Modification methods: temporarily with the ulimit command, permanently in /etc/security/limits.conf. The permanent limits are applied by the PAM module pam_limits when a user logs in.
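The soft/hard relationship can be tried out in a throwaway subshell, so the calling shell keeps its own limits. This is a minimal sketch: the function name soft_hard_demo and the values 64, 128, and 256 are illustrative assumptions, and it presumes the current hard limit for open files is at least 128.

```shell
# Demonstrates soft vs. hard limits for open files (ulimit -n).
# The function body is a subshell "( ... )", so nothing leaks to the caller.
soft_hard_demo() (
  ulimit -Sn 64     # lower the soft limit
  ulimit -Hn 128    # lower the hard limit; an unprivileged user cannot raise it again
  ulimit -Sn 128    # raising the soft limit up to the hard limit is allowed
  # Raising the soft limit past the hard limit is rejected.
  if ! ulimit -Sn 256 2>/dev/null; then
    echo "soft limit cannot exceed the hard limit" >&2
  fi
  ulimit -Sn        # print the resulting soft limit
)

soft_hard_demo
```

Under the stated assumption the function should report a soft limit of 128, since the attempt to go to 256 is refused.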
File Descriptor Limit
RLIMIT_NOFILE
This specifies a value one greater than the maximum file descriptor number that can be opened by this process.
Attempts to exceed this limit yield the error EMFILE.
Since Linux 4.5, this limit also defines the maximum number of file descriptors that an unprivileged process may have "in flight" to other processes via UNIX domain sockets.
The nofile limit controls the maximum number of open files per process.
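Because the limit is a per-process attribute, the values a process actually received can be read back from /proc/<pid>/limits. The following is a small sketch, assuming a Linux /proc filesystem and awk are available; nofile_limits is a hypothetical helper name, and 100 and 200 are example values only.

```shell
# Print "<soft> <hard>" for the open-files limit of a process.
# With no argument, /proc/self resolves to the awk process itself,
# which inherits its limits from the calling shell.
nofile_limits() {
  awk '/^Max open files/ { print $4, $5 }' "/proc/${1:-self}/limits"
}

# Apply soft=100, hard=200 inside a subshell only, then read them back.
( ulimit -Sn 100; ulimit -Hn 200; nofile_limits )
```

Passing a PID instead (nofile_limits 1, for example) shows the limits of that process, which is a quick way to confirm what a containerized process was started with.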
Set the container's nofile limit to 100:200 (soft 100, hard 200) with --ulimit, the container-level equivalent of ulimit -n.
$ docker run -d --ulimit nofile=100:200 cr.d.xiaomi.net/containercloud/alpine:webtool top
Inside the container, the soft limit is 100.
# ulimit -a
-f: unlimited
-t: unlimited
-d: unlimited
-s: 8192
-c: unlimited
-m: unlimited
-l: 64
-p: unlimited
-n: 100
-v: unlimited
-w: unlimited
-e: 0
-r: 0
Run ApacheBench with 90 concurrent HTTP requests: this works.
# ab -n 1000000 -c 90 http://61.135.169.125:80/ &
# lsof | wc -l
108
# lsof | grep -c ab
94
Run ApacheBench with 100 concurrent requests: this fails because of the nofile limit.
# ab -n 1000000 -c 100 http://61.135.169.125:80/
socket: No file descriptors available (24)
Thread Limit
RLIMIT_NPROC
This is a limit on the number of extant processes (or threads) for the real user ID of the calling process.
If the current number of processes for the UID reaches this limit, fork(2) fails with EAGAIN.
The limit is not enforced for processes with CAP_SYS_ADMIN or CAP_SYS_RESOURCE.
The nproc limit applies per UID and is ineffective for the root user.
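Since the limit is charged per UID rather than per process, it helps to see how many tasks a UID currently owns. The sketch below counts them by scanning /proc; it assumes a Linux /proc filesystem and awk, and count_tasks_for_uid is a hypothetical helper name.

```shell
# Count tasks (processes and threads) whose real UID matches the given UID.
# Defaults to the UID of the caller.
count_tasks_for_uid() {
  uid=${1:-$(id -u)}
  count=0
  for status in /proc/[0-9]*/task/[0-9]*/status; do
    # Tasks may exit while we scan, so read errors are ignored.
    real_uid=$(awk '/^Uid:/ { print $2; exit }' "$status" 2>/dev/null)
    if [ "$real_uid" = "$uid" ]; then
      count=$((count + 1))
    fi
  done
  echo "$count"
}

echo "tasks owned by UID $(id -u): $(count_tasks_for_uid)"
```

Inside a container, /proc only lists tasks in that container's PID namespace, so the number seen there can be lower than the total the kernel counts for the UID. Running the helper on the host gives the figure that is compared against nproc.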
Container UID
All containers on the same host share the host kernel. Docker isolates the PID, UTS, and network namespaces, but user namespaces are disabled by default, so a UID inside a container is the same UID on the host and in every other container.
$ docker run -d cr.d.xiaomi.net/containercloud/alpine:webtool top
Inside the container, the process runs as UID 0 (root), but with fewer Linux capabilities than root on the host.
# id
uid=0(root) gid=0(root) groups=0(root),1(bin),2(daemon)...
# su operator
$ id
uid=11(operator) gid=0(root) groups=0(root)
$ sleep 100 &
$ ps -ef | grep sleep
app 19302 ... sleep 100
Verifying ulimit Under Different Users
Set the container's nproc limit to 10:20 (soft 10, hard 20), the container-level equivalent of ulimit -u; the default user is root.
# ulimit -a
-p: processes 10
-n: file descriptors 1048576
Start 30 background processes as root: this succeeds.
# for i in `seq 30`; do sleep 100 & done
# ps | wc -l
36
Switch to the operator user and start more processes: fork fails once the UID reaches its soft limit of 10.
# su operator
$ for i in `seq 8`; do sleep 100 & done
$ sleep 100 &
sh: can't fork: Resource temporarily unavailable
Verifying ulimit Across Containers with Same UID
Run four containers as the operator user, each with nproc=3:3. Because the limit is counted per UID across the whole host, the fourth container fails to start.
$ docker run -d --ulimit nproc=3:3 --name nproc1 -u operator ...
$ docker run -d --ulimit nproc=3:3 --name nproc2 -u operator ...
$ docker run -d --ulimit nproc=3:3 --name nproc3 -u operator ...
$ docker run -d --ulimit nproc=3:3 --name nproc4 -u operator ...
$ docker ps -a | grep nproc
nproc4 Exited (1) ...
Summary
The nofile limit caps open file descriptors per process and applies to every user, including root.
The nproc limit caps the total number of processes and threads per UID and is not enforced for root.
In production, the nproc limit can occasionally cause fork failures, especially when several containers run under the same UID and one of them leaks threads.
cgroup
The cgroup pids controller caps the number of tasks (processes and threads) in a group, so configuring Docker or the kubelet to limit PIDs also limits the thread count. When several limits apply, the effective limit is the smallest of them.
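The "smallest configured value wins" rule can be sketched by walking from a cgroup directory up through its parents and keeping the lowest pids.max found. This is an illustration under assumptions, not the kernel's implementation: effective_pids_max is a hypothetical helper, and the demo runs on a scratch directory whose names (node/pod/container) are made up.

```shell
# Walk from a cgroup directory up to the filesystem root and print the
# smallest numeric pids.max found; prints "max" if no level sets a limit.
effective_pids_max() {
  dir=$1
  lowest=max
  while [ -n "$dir" ] && [ "$dir" != "/" ] && [ "$dir" != "." ]; do
    if [ -r "$dir/pids.max" ]; then
      value=$(cat "$dir/pids.max")
      case $value in
        ''|*[!0-9]*) ;;   # "max" or not a number: no limit at this level
        *)
          if [ "$lowest" = "max" ] || [ "$value" -lt "$lowest" ]; then
            lowest=$value
          fi
          ;;
      esac
    fi
    dir=$(dirname "$dir")
  done
  echo "$lowest"
}

# Demo on a scratch directory that imitates a three-level cgroup tree.
tree=$(mktemp -d)
mkdir -p "$tree/node/pod/container"
echo max > "$tree/node/pids.max"
echo 150 > "$tree/node/pod/pids.max"
echo 300 > "$tree/node/pod/container/pids.max"
effective_pids_max "$tree/node/pod/container"   # prints 150
rm -rf "$tree"
```

On a real host the argument would be the container's directory under the pids hierarchy, so a container-level value of 300 would still be bounded by a pod-level value of 150.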
Docker: use --pids-limit when starting a container.
Kubelet: enable SupportPodPidsLimit and set --pod-max-pids (e.g., 150).
# kubelet ... --feature-gates=SupportPodPidsLimit=true --pod-max-pids=150 ...
Start 100 background processes as root: this succeeds (106 processes in total).
# for i in `seq 100`; do sleep 1000 & done
# ps | wc -l
106
Start another 100 as operator: fork fails once the pod reaches 150 processes.
# su operator
$ for i in `seq 100`; do sleep 1000 & done
sh: can't fork: Resource temporarily unavailable
$ ps | wc -l
150
The cgroup shows the PID count at the limit.
# cat /sys/fs/cgroup/pids/.../pids.current
150
# cat /sys/fs/cgroup/pids/.../pids.max
150
limits.conf / sysctl.conf
limits.conf defines ulimit settings; files in /etc/security/limits.d/ override it. sysctl.conf provides machine-level limits (e.g., fs.file-max, kernel.pid_max).
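The machine-level values can be inspected without the sysctl binary, because each key maps to a file under /proc/sys. The helper below is a rough sketch for simple dotted keys; sysctl_get and the PROC_SYS override are hypothetical names introduced only for this example.

```shell
# Read a sysctl value through /proc/sys,
# e.g. kernel.pid_max -> /proc/sys/kernel/pid_max.
# PROC_SYS is an assumed override so the helper can point at a test directory.
sysctl_get() {
  key_path=$(printf '%s' "$1" | tr '.' '/')
  cat "${PROC_SYS:-/proc/sys}/$key_path"
}

# Show the two machine-level limits mentioned above, if they are readable.
for key in fs.file-max kernel.pid_max; do
  echo "$key = $(sysctl_get "$key" 2>/dev/null || echo unavailable)"
done
```

Reading works from inside a container as well, since the values belong to the shared kernel; only writing is restricted.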
# docker run -d --ulimit nofile=100:200 ...
# docker exec -it <container> sh
# ulimit -a
-n: file descriptors 100
# echo "fs.file-max=5" >> /etc/sysctl.conf
# sysctl -p
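sysctl: error setting key 'fs.file-max': Read-only file system
The write fails because /proc/sys is mounted read-only in an unprivileged container. These parameters belong to the shared host kernel rather than to the container, so changing them where the mount is writable (for example in a privileged container) would affect the host: Docker's isolation is not complete.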
Conclusion
Recommended solutions:
File descriptor limit: modify Docker daemon default-ulimits to set process‑level FD limits.
Thread limit: configure kubelet with --feature-gates=SupportPodPidsLimit=true and --pod-max-pids to limit PIDs at the cgroup level.
Other considerations: tune the node's kernel.pid_max parameter; relax or raise the nproc limits for non-root users in images.
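For the file descriptor recommendation, the daemon-level default is expressed as a default-ulimits entry in the Docker daemon configuration. The sketch below only writes a sample fragment to a scratch file; write_default_ulimits is a hypothetical helper, and 65536 is a placeholder value, not a figure taken from the experiments above.

```shell
# Write a sample daemon.json fragment with a default nofile limit.
# Arguments: <output path> <soft limit> <hard limit>
write_default_ulimits() {
  cat > "$1" <<EOF
{
  "default-ulimits": {
    "nofile": {
      "Name": "nofile",
      "Soft": $2,
      "Hard": $3
    }
  }
}
EOF
}

# Generate the fragment in a temporary file and show it.
sample=$(mktemp)
write_default_ulimits "$sample" 65536 65536
cat "$sample"
rm -f "$sample"
```

Merging such a fragment into /etc/docker/daemon.json and restarting the Docker daemon applies it to containers started afterwards; a per-container --ulimit still takes precedence over the default.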
MaGe Linux Operations
