
Analysis of Linux System Crash Caused by Memory Leak and the Role of min_free_kbytes

This article documents a system crash triggered by a memory leak in a network-chip SDK, explains how low free memory and the kswapd process lead to deadlock, and shows how adjusting the kernel parameter min_free_kbytes can prevent the freeze. It also highlights the importance of resource monitoring and tuning.

Kuaishou Tech

The article records a system crash caused by a memory leak during large‑scale configuration operations on a network chip, introducing the kernel parameter min_free_kbytes as a potential mitigation.

Background: The platform consists of a control platform, the Kwai Network Operating System (KNOS), and an ASIC chip. During integration testing, repeated bulk add/delete operations left the device answering ping while SSH logins failed, and the system eventually deadlocked.

Problem: The system becomes unresponsive while still answering ping, which points to a kernel-level issue on the device rather than a network connectivity problem.

Investigation Process: Reproduction steps were established first. SSH debug output then showed the client hanging right after sending its version string, which suggests the server side was blocked.

Resource monitoring (using top, free, docker stats, and sar) revealed a gradual decrease in free memory and an increase in %commit, buffer, and cache usage. The kswapd0 process was observed consuming CPU as memory pressure grew.
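
The trend described above can also be tracked with a small script. The following is a minimal sketch, not the tooling used in the original investigation: the function name meminfo_summary is hypothetical, and the commit percentage is computed as Committed_AS relative to RAM plus swap, which is assumed to match what sar reports as %commit.

```shell
#!/bin/sh
# Hypothetical helper: summarize meminfo-formatted text read from stdin.
# All values in /proc/meminfo are reported in kB.
meminfo_summary() {
    awk '
        $1 == "MemTotal:"     { total = $2 }
        $1 == "MemFree:"      { free = $2 }
        $1 == "MemAvailable:" { avail = $2 }
        $1 == "SwapTotal:"    { swap = $2 }
        $1 == "Committed_AS:" { committed = $2 }
        END {
            # Assumed sar-style %commit: committed memory relative to RAM + swap.
            if (total + swap > 0)
                pct = 100 * committed / (total + swap)
            printf "free_kb=%d avail_kb=%d commit_pct=%.1f\n", free, avail, pct
        }'
}

# Usage on a live system: take one sample every 5 seconds until interrupted.
# while :; do meminfo_summary < /proc/meminfo; sleep 5; done
```

A steadily shrinking free_kb together with a growing commit_pct across samples would be consistent with the leak pattern described here.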

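Linux Memory Reclamation Mechanism: Each Linux memory zone has three watermarks: min, low, and high. When a zone's free memory falls below low, kswapd is woken to reclaim pages in the background and keeps working until free memory is back above high. If free memory drops below min, allocations fall back to synchronous (direct) reclaim, so the allocating task itself blocks while pages are freed. If memory cannot be reclaimed, this can end in deadlock.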
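
To see where a zone currently sits relative to these watermarks, the values can be read from /proc/zoneinfo (in pages, not kB). The sketch below is illustrative only: zone_watermark_state is a hypothetical name, and the parser assumes the usual layout in which each zone lists "pages free" followed by min, low, and high, which may vary between kernel versions.

```shell
#!/bin/sh
# Hypothetical helper: classify each zone from zoneinfo-formatted text on stdin.
# Prints one line per zone; all counts are in pages.
zone_watermark_state() {
    awk '
        # Zone header, e.g. "Node 0, zone   Normal"
        $1 == "Node" && $3 == "zone" { node = $2; sub(/,$/, "", node); zone = $4 }
        $1 == "pages" && $2 == "free" { free = $3 + 0 }
        $1 == "min"  { min = $2 + 0 }
        $1 == "low"  { low = $2 + 0 }
        $1 == "high" {
            high = $2 + 0
            if (free < min)      state = "below-min"   # direct reclaim territory
            else if (free < low) state = "below-low"   # kswapd should be running
            else                 state = "ok"
            printf "node%s/%s free=%d min=%d low=%d high=%d %s\n",
                   node, zone, free, min, low, high, state
        }'
}

# Usage on a live system:
# zone_watermark_state < /proc/zoneinfo
```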

Root Cause: The ASIC SDK leaks memory. Once free memory falls below the low watermark, kswapd runs but cannot bring free memory back up to the high watermark, which leads to heavy swapping, rising load, and eventually deadlock.
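
The summary does not say how the leak was traced to the SDK. One possible way to confirm a suspicion of this kind is to sample the resident set size of the suspect process during the bulk operations. The sketch below assumes the leak shows up in that process's VmRSS (a leak in kernel memory would not); the function names and the example PID are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: extract the resident set size (kB) from
# /proc/<pid>/status-formatted text on stdin.
rss_kb() {
    awk '$1 == "VmRSS:" { print $2 }'
}

# Hypothetical helper: given one kB sample per line on stdin,
# print the growth between the first and the last sample.
rss_growth_kb() {
    awk 'NR == 1 { first = $1 } { last = $1 } END { if (NR > 0) print last - first }'
}

# Usage on a live system (1234 is a placeholder PID for the suspect process):
# for i in 1 2 3 4 5 6; do rss_kb < /proc/1234/status; sleep 10; done | rss_growth_kb
```

Growth that continues across repeated add/delete cycles and never returns to the baseline would point at a leak rather than a normal working-set change.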

min_free_kbytes: This parameter sets the minimum amount of free memory, in kilobytes, that the kernel tries to keep in reserve; the per-zone watermarks are derived from it. The article shows how to view and modify it:

cat /proc/sys/vm/min_free_kbytes
sysctl -w vm.min_free_kbytes=265536

After increasing min_free_kbytes, the system no longer deadlocked; instead, memory allocation failures or OOM messages appeared.
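
Because a larger reserve trades usable memory for headroom, it can help to check what share of RAM a proposed value takes before applying it. The following is a rough sketch with a hypothetical function name, min_free_share; it does not tell you what value is right for a given device, only how large the reserve is relative to MemTotal.

```shell
#!/bin/sh
# Hypothetical helper: report what share of MemTotal a proposed
# vm.min_free_kbytes value would reserve.
# $1 = proposed value in kB; meminfo-formatted text on stdin.
min_free_share() {
    awk -v want="$1" '
        $1 == "MemTotal:" { total = $2 }
        END {
            if (total <= 0) { print "MemTotal not found"; exit 1 }
            printf "min_free_kbytes=%d is %.2f%% of MemTotal (%d kB)\n",
                   want, 100 * want / total, total
        }'
}

# Usage on a live system, with the value from the article:
# min_free_share 265536 < /proc/meminfo
#
# sysctl -w does not survive a reboot. To persist the setting, a drop-in file
# can be used (the file name below is only an example), followed by a reload:
# echo 'vm.min_free_kbytes = 265536' > /etc/sysctl.d/90-min-free.conf
# sysctl --system
```

Setting the value too high leaves too little memory for normal allocations and can itself trigger OOM, so the reserve should be raised in steps and validated under the same bulk-operation load.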

Conclusion:

Linux can deadlock under severe memory pressure without triggering OOM.

Proper tuning of kernel parameters like min_free_kbytes is essential for high‑load environments.

Continuous monitoring of CPU, memory, and disk usage aids rapid troubleshooting.

References include links to BMC, ASIC, SSH debugging, SAR tool, physical memory articles, and detailed documentation on min_free_kbytes.
