Why Kubernetes OOM Kills Use WSS, Not RSS – Diagnose & Fix Container Memory
After moving IoT services to Kubernetes, containers were OOM-killed even though their RSS stayed below the memory limit: Kubernetes bases OOM decisions on the Working Set Size (WSS) metric, which includes file cache. This article explains how WSS is calculated, reproduces the issue, and offers practical mitigation strategies.
Background and Failure Phenomenon
After migrating IoT business services to Kubernetes, containers were frequently OOM-killed. Although the container's actual RSS (and the monitored container_memory_rss metric) stayed within the memory limit, the working-set metric (container_memory_working_set_bytes) often exceeded 95 % of the limit.
We discovered that Kubernetes uses WSS, not RSS, as the basis for OOM kills.
How WSS Is Calculated
Kubernetes obtains memory metrics from the cAdvisor component, which derives them from two files in the container's cgroup (v1) filesystem: memory.usage_in_bytes and memory.stat. cAdvisor computes the working set by subtracting total_inactive_file (read from memory.stat) from the usage value; since usage is roughly rss + cache, this yields container_memory_working_set_bytes = rss + cache - total_inactive_file. In other words, WSS includes the file-system cache.
Thus:
container_memory_rss = rss (the usual resident set size)
container_memory_usage_bytes = rss + cache
container_memory_working_set_bytes = rss + cache - total_inactive_file
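To verify this relationship on a live node, the working set can be recomputed by hand from the same cgroup files cAdvisor reads. A minimal sketch, assuming cgroup v1 and the standard /sys/fs/cgroup/memory mount:

```sh
#!/bin/sh
# Recompute the working set the way cAdvisor does (cgroup v1 assumed;
# run inside the container, or point CG at its cgroup dir on the host).
CG=/sys/fs/cgroup/memory
usage=$(cat "$CG/memory.usage_in_bytes")
rss=$(awk '$1 == "total_rss" {print $2}' "$CG/memory.stat")
cache=$(awk '$1 == "total_cache" {print $2}' "$CG/memory.stat")
inactive_file=$(awk '$1 == "total_inactive_file" {print $2}' "$CG/memory.stat")
echo "rss=$rss cache=$cache usage=$usage"
echo "working_set=$((usage - inactive_file))  # ~ rss + cache - total_inactive_file"
```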
Reproducing the Issue
Prepare a program (m.c) that can precisely control memory usage (a sketch follows this list).
Prepare a script (s.sh) that continuously writes a file of a specified size (also sketched below).
Run both inside a container with a severely constrained memory limit.
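The article does not reproduce m.c itself; a minimal sketch of such a program, assuming it takes a size in MB and touches every allocated page so the memory actually becomes resident, could look like this:

```c
/* m.c (sketch) -- hold a precise amount of resident memory.
 * Usage: ./m <MB>   (argument handling is an assumption) */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    long mb = (argc > 1) ? atol(argv[1]) : 64;
    for (long i = 0; i < mb; i++) {
        char *p = malloc(1024 * 1024);
        if (!p) { perror("malloc"); return 1; }
        memset(p, 1, 1024 * 1024);  /* touch every page so it counts as RSS */
    }
    printf("holding %ld MB, sleeping...\n", mb);
    pause();                        /* keep the memory resident until killed */
    return 0;
}
```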
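s.sh is likewise not shown; a minimal equivalent that keeps rewriting a file of a given size, so the page cache keeps growing (the /tmp/bigfile path and 1 GB default are assumptions), could be:

```sh
#!/bin/sh
# s.sh (sketch) -- continuously rewrite a file of the given size (MB)
# so its pages keep landing in the container's page cache.
SIZE_MB=${1:-1024}
while true; do
    dd if=/dev/zero of=/tmp/bigfile bs=1M count="$SIZE_MB" 2>/dev/null
done
```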
Before starting the write script, docker stats shows 96 MB (76 %) memory usage. After the script starts writing a 1 GB file, memory quickly climbs to 99 % and the m program is killed.
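Such a snapshot can be taken with docker stats; for example (the container name m-test is hypothetical):

```sh
# One-shot snapshot of the container's memory usage vs. its limit
docker stats --no-stream m-test
```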
Kernel logs confirm an OOM kill triggered by the container’s memory cgroup.
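To confirm on the host that the kill came from the container's memory cgroup rather than node-wide memory pressure, the kernel ring buffer can be searched (the exact message wording varies by kernel version):

```sh
# Look for the memory-cgroup OOM record in the kernel log
dmesg -T | grep -i -E "memory cgroup out of memory|oom-kill"
```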
Solutions
Clear Logs
The earliest production workaround was simply to empty large log files, which shrinks the cache component of WSS.
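For example, truncating the file in place (the path is illustrative) releases its cached pages without invalidating the file descriptor of whatever process is still writing to it:

```sh
# Truncate in place; deleting an open log file would not free the space
# until the writer closes it, while truncation keeps the fd valid.
truncate -s 0 /var/log/app/app.log
# equivalent shell builtin form:
: > /var/log/app/app.log
```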
Drop All Cache
Writing "3" to /proc/sys/vm/drop_caches frees the page cache (along with dentry and inode caches), but /proc/sys is mounted read-only inside containers, so the write must be performed on the host, where it affects every container on the node.
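A minimal host-side example:

```sh
# Run on the host -- this affects every container on the node.
sync                                # write dirty pages back first
echo 3 > /proc/sys/vm/drop_caches   # 1 = page cache only; 3 = + dentries/inodes
```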
Drop Cache for Specific Files
The vmtouch tool (compiled from source) can evict cache for selected files inside a container.
After mounting the compiled binary into the container, running vmtouch -e lowers WSS, though it adds complexity and can consume ~10 % CPU.
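A typical invocation against a hot file (the path is illustrative):

```sh
# Report how much of the file is currently resident in page cache
vmtouch /var/log/app/app.log
# Evict the file's pages from the cache, lowering the container's WSS
vmtouch -e /var/log/app/app.log
```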
Adjust Kernel Parameters
Increasing vm.vfs_cache_pressure and vm.min_free_kbytes accelerates cache reclamation, effectively reducing WSS: a higher vfs_cache_pressure makes the kernel reclaim dentry and inode caches more aggressively, and a larger min_free_kbytes raises the free-memory watermarks so kswapd starts reclaiming page cache earlier.
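A host-side sketch; the article gives no concrete values, so the ones below are illustrative only:

```sh
# Host-level tuning; values are illustrative, tune for your nodes.
sysctl -w vm.vfs_cache_pressure=200   # default 100; higher reclaims dentries/inodes sooner
sysctl -w vm.min_free_kbytes=524288   # raise watermarks so kswapd reclaims earlier
# Persist across reboots:
cat >>/etc/sysctl.conf <<'EOF'
vm.vfs_cache_pressure = 200
vm.min_free_kbytes = 524288
EOF
```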
After tuning, the same write test no longer triggers OOM; memory usage stays around 88 %.
These approaches demonstrate how file‑system cache contributes to WSS‑based OOM kills and provide practical mitigation techniques for Kubernetes clusters.
