
Why Linux Triggers OOM Killer and How to Manage Memory Reclamation

This article explains Linux virtual memory, the page‑fault allocation process, the two memory‑reclaim paths (kswapd and direct reclaim), OOM killer scoring, swappiness tuning, NUMA‑aware reclamation, and practical steps to protect critical processes from being killed.

Liangxu Linux

Virtual Memory Purpose

Each process has its own page table, giving it an isolated virtual address space that prevents address conflicts between processes. Page‑table entries also store permission bits and presence flags, enhancing memory‑access security.

Memory Allocation and Page Fault Handling

When an application calls malloc, the kernel reserves virtual address space but does not allocate physical pages yet. The first access to that memory raises a page fault: the CPU traps into kernel mode and runs the page-fault handler.

The handler checks for free physical pages. If available, it allocates them and creates a mapping. If not, the kernel starts reclaiming memory, either via asynchronous background reclamation (kswapd) or synchronous direct reclaim.

Memory Reclamation Strategies

Background reclamation (kswapd) : Activated when memory pressure rises; it runs asynchronously and does not block processes.

Direct reclaim : Used when background reclamation cannot keep up; it runs synchronously, blocking the requesting process.

If direct reclaim still cannot satisfy the request, the OOM (Out‑of‑Memory) mechanism is triggered, and the OOM killer selects a victim process.

What Memory Can Be Reclaimed?

File‑backed pages : Clean pages can be freed instantly; dirty pages must be written back to disk before release.

Anonymous pages : Memory with no backing file, such as the heap, the stack, and anonymous mmap regions; reclaimed through the swap subsystem, which writes infrequently used pages to disk.

Both types use an LRU algorithm that maintains active_list (recently used pages) and inactive_list (rarely used pages). Pages nearer the tail of the inactive list are reclaimed first.
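The two-list idea can be sketched as a toy model. This is only an illustration of the active/inactive behavior described above, not the kernel's implementation; the class name TwoListLRU and its methods are hypothetical.

```python
from collections import OrderedDict


class TwoListLRU:
    """Toy model of the active/inactive LRU lists (not kernel code)."""

    def __init__(self):
        # In each OrderedDict the first key is the tail (oldest page)
        # and the last key is the head (most recently added page).
        self.active = OrderedDict()
        self.inactive = OrderedDict()

    def touch(self, page):
        """Record one access to `page`."""
        if page in self.active:
            self.active.move_to_end(page)   # already hot: move to the head
        elif page in self.inactive:
            del self.inactive[page]         # accessed again: promote
            self.active[page] = True
        else:
            self.inactive[page] = True      # first access: start as inactive

    def age(self, count=1):
        """Demote the `count` oldest active pages to the inactive list."""
        for _ in range(min(count, len(self.active))):
            page, _flag = self.active.popitem(last=False)
            self.inactive[page] = True

    def reclaim(self, count=1):
        """Evict up to `count` pages from the tail of the inactive list."""
        victims = []
        for _ in range(min(count, len(self.inactive))):
            page, _flag = self.inactive.popitem(last=False)
            victims.append(page)
        return victims
```

Touching pages a, b, c and then a again promotes a to the active list, so reclaim(1) evicts b, the oldest page that was only used once.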

Performance Impact of Reclamation

Background reclamation is asynchronous and has minimal impact.

Direct reclaim is synchronous, causing latency spikes and higher CPU utilization.

Reclaiming dirty file pages or swapping anonymous pages incurs disk I/O, which can noticeably degrade system responsiveness.

Adjusting Reclamation Preference

The kernel parameter /proc/sys/vm/swappiness (0-100; newer kernels accept up to 200) controls the bias toward swapping anonymous pages (higher values) versus reclaiming file pages (lower values). The kernel default is 60. The system below has been tuned to 0, which favors file-page reclamation.

# cat /proc/sys/vm/swappiness
0
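The same value can be read from a script. The helper below is a minimal sketch; read_vm_sysctl and check_swappiness are hypothetical names, and the base path is a parameter so the function can be pointed at a test directory instead of the live /proc tree.

```python
from pathlib import Path


def read_vm_sysctl(name, base="/proc/sys/vm"):
    """Return a vm sysctl such as 'swappiness' as an integer."""
    return int(Path(base, name).read_text().strip())


def check_swappiness(value, maximum=100):
    """Validate a value against the 0-100 range used in this article.

    `maximum` is a parameter because the accepted upper bound
    depends on the kernel version.
    """
    if not 0 <= value <= maximum:
        raise ValueError(f"swappiness {value} is outside 0-{maximum}")
    return value
```

On a Linux host, read_vm_sysctl("swappiness") returns the same number that the cat command prints. Changing the value requires root and is usually done with sysctl rather than from application code.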

Monitoring Reclamation Activity

Use sar -B 1 to observe the following metrics:

pgscank/s : Pages scanned per second by kswapd.

pgscand/s : Pages scanned per second by direct reclaim.

pgsteal/s : Pages actually reclaimed per second by kswapd and direct reclaim combined.

Large pgscand values often indicate frequent direct reclaim, which can cause noticeable jitter.
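Where sar is not installed, a similar signal can be derived from two snapshots of /proc/vmstat. The sketch below assumes a reasonably recent kernel that exposes the system-wide counters pgscan_kswapd and pgscan_direct (older kernels split them per zone); the function names are hypothetical.

```python
def parse_vmstat(text):
    """Parse /proc/vmstat-style 'name value' lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def direct_reclaim_share(before, after):
    """Fraction of scanned pages that direct reclaim scanned between snapshots.

    Exact counter names are used on purpose: a prefix match on
    'pgscan_direct' would also pick up 'pgscan_direct_throttle'.
    """
    kswapd = after.get("pgscan_kswapd", 0) - before.get("pgscan_kswapd", 0)
    direct = after.get("pgscan_direct", 0) - before.get("pgscan_direct", 0)
    total = kswapd + direct
    return direct / total if total else 0.0
```

Taking one snapshot, sleeping for a few seconds, and taking another gives a ratio: a share that stays well above zero suggests that kswapd is not keeping up and processes are reclaiming memory themselves.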

Memory Watermarks and kswapd Triggering

The kernel defines three watermarks (pages_min, pages_low, pages_high) that divide the amount of free memory into four ranges. kswapd wakes up when pages_free falls below pages_low and goes back to sleep once it rises above pages_high. If pages_free drops below pages_min, the allocating process falls into direct reclaim.
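The wake and sleep rules form a small state machine with hysteresis: between pages_low and pages_high, kswapd simply keeps doing whatever it was already doing. A minimal sketch of that logic, with a hypothetical function name and no claim to match the kernel's actual code paths:

```python
def next_reclaim_state(pages_free, wm_min, wm_low, wm_high, kswapd_running):
    """Return (kswapd_running, direct_reclaim) for the current free-page count."""
    if pages_free < wm_low:
        kswapd_running = True       # below low: wake background reclaim
    elif pages_free > wm_high:
        kswapd_running = False      # above high: kswapd goes back to sleep
    # Between low and high the previous kswapd state is kept (hysteresis).
    direct_reclaim = pages_free < wm_min   # below min: reclaim synchronously
    return kswapd_running, direct_reclaim
```

With watermarks of 100, 125, and 150 pages, a drop to 120 free pages wakes kswapd, and it stays awake at 140 free pages because the high watermark has not been reached yet.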

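These thresholds are derived from /proc/sys/vm/min_free_kbytes. In simplified form, with min_free_kbytes converted from KiB to pages (the kernel computes the values per memory zone, and newer kernels also factor in vm.watermark_scale_factor):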

pages_min = min_free_kbytes
pages_low = pages_min * 5 / 4
pages_high = pages_min * 3 / 2

Increasing min_free_kbytes can provoke earlier background reclamation, reducing direct reclaim spikes, but it also reserves more memory, potentially limiting usable RAM for applications.
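To get a feel for the numbers, the simplified formulas can be worked through in a few lines. This sketch assumes 4 KiB pages and treats the result as system-wide totals; the function name is hypothetical.

```python
def watermarks_from_min_free_kbytes(min_free_kbytes, page_size_kb=4):
    """Apply the article's simplified formulas; results are in pages."""
    pages_min = min_free_kbytes // page_size_kb   # KiB -> pages
    pages_low = pages_min * 5 // 4
    pages_high = pages_min * 3 // 2
    return pages_min, pages_low, pages_high


# Example: min_free_kbytes = 65536 (64 MiB)
# -> (16384, 20480, 24576) pages, i.e. 64 MiB, 80 MiB, and 96 MiB.
```

Doubling min_free_kbytes doubles all three thresholds, which is why raising it wakes kswapd earlier but also keeps more memory out of reach of applications.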

NUMA‑Aware Reclamation

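On NUMA systems, when the local node runs short of memory the kernel can either reclaim pages on that node or fall back to free memory on a remote node. The kernel parameter /proc/sys/vm/zone_reclaim_mode controls this behavior; the non-zero values are bit flags and can be combined: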

0 (default): Try remote nodes before local reclamation.

1: Reclaim only local memory.

2: Reclaim local memory, writing dirty file pages to disk.

4: Reclaim local memory using swap.

Setting the mode to 0 is recommended to avoid excessive direct reclaim when other nodes have free memory.
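Because the non-zero values are bit flags, a setting such as 3 means "1 and 2 together". A small decoder, sketched here with hypothetical names and with descriptions paraphrased from the list above, makes a configured value easier to read:

```python
ZONE_RECLAIM_FLAGS = {
    1: "reclaim local memory",
    2: "write dirty file pages to disk during local reclaim",
    4: "swap anonymous pages during local reclaim",
}


def decode_zone_reclaim_mode(value):
    """Return the descriptions of every flag set in `value` (empty list for 0)."""
    return [desc for bit, desc in ZONE_RECLAIM_FLAGS.items() if value & bit]
```

An empty result corresponds to mode 0, where the kernel prefers memory on other nodes over local reclaim.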

Protecting Critical Processes from OOM Kill

The kernel function oom_badness() scores each process based on the number of pages it uses (resident pages plus swap entries and page-table pages) and the configurable oom_score_adj value (range -1000 to 1000). The score is calculated as:

// points = process_pages + oom_score_adj * totalpages / 1000

Higher scores increase the likelihood of being selected by the OOM killer. Adjusting oom_score_adj can lower a process's chance of termination (e.g., set to -1000 for essential services like sshd) or raise it for less critical workloads.

However, setting business‑critical services to -1000 is discouraged because a memory leak could prevent the OOM killer from freeing memory, leading to system instability.
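The effect of oom_score_adj is easier to see with numbers. The sketch below applies the formula above to a few made-up processes; the function names and process data are hypothetical, the integer division order is an assumption, and -1000 is modelled as "never selected".

```python
def oom_points(process_pages, oom_score_adj, totalpages):
    """Score one process with the article's formula (higher = more likely killed)."""
    if not -1000 <= oom_score_adj <= 1000:
        raise ValueError("oom_score_adj must be between -1000 and 1000")
    if oom_score_adj == -1000:
        return None                      # exempt from the OOM killer
    # Assumption: divide first so the arithmetic stays in whole pages.
    return process_pages + oom_score_adj * (totalpages // 1000)


def pick_victim(processes, totalpages):
    """Return the name with the highest score, or None if all are exempt.

    `processes` maps a name to a (process_pages, oom_score_adj) tuple.
    """
    scores = {
        name: oom_points(pages, adj, totalpages)
        for name, (pages, adj) in processes.items()
    }
    eligible = {name: s for name, s in scores.items() if s is not None}
    return max(eligible, key=eligible.get) if eligible else None
```

With 4,000,000 total pages, a database using 2,000,000 pages at oom_score_adj -500 scores 0, while a batch job using 500,000 pages at +300 scores 1,700,000 and is chosen first. The adjustment protects the database without making it unkillable.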

Practical Recommendations

Set /proc/sys/vm/swappiness to a low value (e.g., 0) to favor file‑page reclamation.

Adjust /proc/sys/vm/min_free_kbytes to trigger kswapd earlier, but balance against overall memory availability.

Configure /proc/sys/vm/zone_reclaim_mode to 0 on NUMA servers so that allocations can use free memory on other nodes before local reclaim is forced.

Fine‑tune oom_score_adj for critical processes, avoiding -1000 for regular applications.

By understanding the allocation, reclamation, and OOM scoring mechanisms, administrators can mitigate performance degradation and protect essential services from unexpected termination.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

Liangxu Linux

Liangxu, a self‑taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge—fundamentals, applications, tools, plus Git, databases, Raspberry Pi, etc. (Reply “Linux” to receive essential resources.)
