Understanding Linux Kernel Memory Management: OOM, mmap, and Cache

This article explains how Linux manages process memory, covering allocation via mmap and brk, OOM killer selection, the role of page cache in different mapping types, manual and automatic memory reclamation, and related kernel parameters.

Liangxu Linux

Process Memory Allocation and Mapping

When a program starts, the kernel's exec() path loads the ELF binary and creates the initial memory mappings. The code (text) segment, read-only data and initialized data are mapped from the executable file, while BSS and the initial stack are set up as anonymous memory; the kernel uses the same mapping machinery that backs mmap(). The heap is created lazily, typically on the first malloc() call. If the allocator can extend the program break, the kernel grows the brk region; otherwise the allocator calls mmap() to create an anonymous region, and the kernel adds a VMA (virtual memory area) to the process's mm_struct. User-space allocators such as ptmalloc, tcmalloc or jemalloc manage these virtual regions; the kernel assigns physical pages only when a page is first touched and a page fault occurs. Large allocations are often satisfied directly by mmap(). When memory is freed, regions obtained with mmap() are released with munmap(); other blocks return to the allocator's free pool, and the allocator may later hand them back to the kernel.
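The gap between reserving virtual address space and consuming physical memory can be observed from user space. The following is a minimal sketch, assuming a Linux system with /proc mounted; the helper names resident_pages and demand_paging_demo are illustrative, not part of any library. It maps an anonymous private region and compares the resident set size before and after the pages are touched.

```python
import mmap
import os

PAGE_SIZE = os.sysconf("SC_PAGE_SIZE")


def resident_pages():
    """Return this process's resident set size in pages (2nd field of /proc/self/statm)."""
    with open("/proc/self/statm") as f:
        return int(f.read().split()[1])


def demand_paging_demo(size=64 * 1024 * 1024):
    """Return RSS in pages: (before mapping, after mapping, after touching)."""
    before = resident_pages()
    # Creating the mapping only reserves virtual address space.
    region = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    mapped = resident_pages()
    # The first write to each page triggers a page fault that allocates it.
    for offset in range(0, size, PAGE_SIZE):
        region[offset] = 1
    touched = resident_pages()
    region.close()
    return before, mapped, touched
```

Calling demand_paging_demo() should show almost no RSS growth between the first two values and growth of roughly size / PAGE_SIZE pages between the last two; the exact numbers vary with kernel version and counter batching.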

Out‑of‑Memory (OOM) Handling

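If the system cannot satisfy a memory allocation even after reclaim, the OOM killer is invoked. For each candidate process the kernel computes an oom_score. Older kernels derived it from factors such as total RSS, runtime, nice value, root privilege, number of child processes and the value written to /proc/[pid]/oom_adj (range -16 to 15; -17 makes the process immune). Since Linux 2.6.36 the score is based mainly on the process's memory footprint (RSS, swap and page tables), adjusted by /proc/[pid]/oom_score_adj (range -1000 to 1000; -1000 makes the process immune), and oom_adj is deprecated. The process with the highest score is selected for termination. The behaviour of memory over-commit can be tuned with /proc/sys/vm/overcommit_memory: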

0 – heuristic over‑commit (default).

1 – always allow over‑commit.

2 – strict limit: the total committed address space is limited to swap + RAM * overcommit_ratio / 100 (overcommit_ratio is a percentage, 50 by default).
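The strict-mode limit and the per-process OOM values can be illustrated with a short sketch. The function names commit_limit_kib and read_oom_values are hypothetical helpers. The arithmetic is a simplification: it ignores details such as huge page reservations and the alternative overcommit_kbytes setting. The reader function assumes a Linux kernel that exposes oom_score_adj under /proc.

```python
def commit_limit_kib(ram_kib, swap_kib, overcommit_ratio=50):
    """Approximate commit limit in KiB for overcommit_memory=2.

    Simplified formula: swap + RAM * overcommit_ratio / 100.
    """
    return swap_kib + ram_kib * overcommit_ratio // 100


def read_oom_values(pid="self"):
    """Return (oom_score, oom_score_adj) for a process, read from /proc."""
    values = []
    for name in ("oom_score", "oom_score_adj"):
        with open(f"/proc/{pid}/{name}") as f:
            values.append(int(f.read()))
    return tuple(values)
```

For example, with 16 GiB of RAM, 4 GiB of swap and the default ratio of 50, commit_limit_kib(16 * 1024 * 1024, 4 * 1024 * 1024) gives 12 GiB expressed in KiB. The limit the kernel is actually enforcing appears as CommitLimit in /proc/meminfo.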

Where Allocated Memory Resides

Shared file mapping

Executable code and shared libraries are mapped from the underlying file, and their pages live in the page cache. A test that creates a 1 GiB file, maps it and reads through the mapping shows the buff/cache column of free increasing by roughly 1 GiB, confirming that the pages are cached.

Private file mapping

Data segments are mapped as private file mappings. After dropping caches with echo 1 > /proc/sys/vm/drop_caches, mapping the file privately and writing to it, both used and buff/cache increase. This demonstrates copy-on-write: the file is first read into the page cache, then a private copy of each page is allocated on the first write.
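The semantic difference between shared and private file mappings can be checked without measuring memory at all. The sketch below uses a hypothetical helper, write_through_mapping, together with Python's mmap access modes: ACCESS_WRITE creates a shared mapping and ACCESS_COPY a private copy-on-write mapping. It writes one byte through each kind of mapping and then reads the file back.

```python
import mmap
import os
import tempfile


def write_through_mapping(access):
    """Write b"X" at offset 0 via a mapping; return the file's first byte afterwards."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"a" * mmap.PAGESIZE)
        with mmap.mmap(fd, mmap.PAGESIZE, access=access) as region:
            region[0:1] = b"X"
            if access == mmap.ACCESS_WRITE:
                region.flush()  # msync(): write the shared page back to the file
        with open(path, "rb") as f:
            return f.read(1)
    finally:
        os.close(fd)
        os.unlink(path)
```

With mmap.ACCESS_WRITE the file's first byte becomes b"X", because the write lands in the shared page cache page. With mmap.ACCESS_COPY the file still starts with b"a", because the write went to a private copy of the page.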

Private anonymous mapping

Heap, BSS and stack are anonymous private mappings. Creating a 1 GiB anonymous mapping increases only the used column; buff/cache stays unchanged because the pages are not part of the page cache.

Shared anonymous mapping

When a parent and child share memory via mmap(MAP_SHARED|MAP_ANONYMOUS), the pages are backed by shmem, the kernel's internal tmpfs, and are accounted as page cache. Only buff/cache grows, showing that shared anonymous pages are counted as cache rather than as used memory.
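The different accounting of private and shared anonymous mappings is also visible per process. As a sketch, and as an alternative to watching free, the code below reads the RssAnon, RssFile and RssShmem fields of /proc/self/status. It assumes a kernel that provides these fields (Linux 4.5 or later); the helper names are illustrative.

```python
import mmap


def rss_breakdown_kib():
    """Return RssAnon, RssFile and RssShmem (in kB) from /proc/self/status."""
    wanted = ("RssAnon", "RssFile", "RssShmem")
    result = {}
    with open("/proc/self/status") as f:
        for line in f:
            key, _, rest = line.partition(":")
            if key in wanted:
                result[key] = int(rest.split()[0])
    return result


def growth_after_touching(flags, size=32 * 1024 * 1024):
    """Map `size` anonymous bytes with `flags`, touch every page,
    and return how much each RSS counter grew (in kB)."""
    before = rss_breakdown_kib()
    region = mmap.mmap(-1, size, flags=flags | mmap.MAP_ANONYMOUS)
    for offset in range(0, size, mmap.PAGESIZE):
        region[offset] = 1  # fault the page in
    after = rss_breakdown_kib()
    region.close()
    return {key: after[key] - before[key] for key in after}
```

growth_after_touching(mmap.MAP_PRIVATE) should attribute the growth to RssAnon, while growth_after_touching(mmap.MAP_SHARED) should attribute it to RssShmem, matching the used versus buff/cache split described above.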

System Memory Reclamation

Manual reclamation

Writing to /proc/sys/vm/drop_caches forces the kernel to free reclaimable memory:

1 – drop clean page cache.

2 – drop dentries and inodes.

3 – drop both.

drop_caches frees only clean objects, so run sync first to write dirty pages back and make them droppable.
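The sync-then-drop sequence can be wrapped in a small helper. This is a sketch that must run as root to have any effect on the real file; the function name drop_caches and its path parameter are assumptions introduced here so the target file can be swapped out for experimentation.

```python
import os


def drop_caches(mode=3, path="/proc/sys/vm/drop_caches"):
    """Flush dirty data, then ask the kernel to drop clean caches.

    mode: 1 = page cache, 2 = dentries and inodes, 3 = both.
    """
    if mode not in (1, 2, 3):
        raise ValueError("mode must be 1, 2 or 3")
    os.sync()  # write dirty pages back so they become clean and droppable
    with open(path, "w") as f:
        f.write(f"{mode}\n")
```

Dropping caches is a diagnostic tool rather than a tuning knob: the discarded data has to be read from disk again the next time it is needed.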

tmpfs and in‑memory filesystems

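tmpfs and ramfs store their file contents directly in the page cache; procfs and sysfs are also memory-only filesystems, but they generate their contents on demand rather than caching file data. Creating a file under /dev/shm increases buff/cache, confirming that tmpfs pages are cached. Because these pages have no backing file on disk, drop_caches cannot discard them; tmpfs pages can only be swapped out, or freed by deleting the file.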

POSIX and System V shared memory

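Both mechanisms are implemented on top of tmpfs. POSIX shared memory objects are files under /dev/shm, and for System V segments the kernel creates an internal tmpfs file (via shmem_kernel_file_setup) and maps it into the attaching processes. The pages cannot be dropped while the shared memory object exists; they are freed only when the object is removed and the last mapping goes away.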
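The tmpfs backing of POSIX shared memory can be observed directly. The sketch below assumes Linux with /dev/shm mounted and Python 3.8 or later; shm_backing_file_demo is an illustrative name. It creates a shared memory object through the standard multiprocessing.shared_memory module and checks for the corresponding file.

```python
import os
from multiprocessing import shared_memory


def shm_backing_file_demo(size=1024 * 1024):
    """Create a POSIX shared memory object and look for its file under /dev/shm.

    Returns (file existed while the object was open, file exists after unlink).
    """
    shm = shared_memory.SharedMemory(create=True, size=size)
    path = os.path.join("/dev/shm", shm.name)
    try:
        shm.buf[0] = 1  # touch the first page of the object
        existed_while_open = os.path.exists(path)
    finally:
        shm.close()
        shm.unlink()  # shm_unlink(): remove the object's name
    return existed_while_open, os.path.exists(path)
```

While the object exists, its pages are counted under Shmem in /proc/meminfo and under buff/cache in free; once it is unlinked and unmapped, the memory is released.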

Automatic reclamation

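The kernel daemon kswapd scans the LRU lists in the background. It is woken when free pages fall below the low watermark pages_low; it then moves the least recently used pages from the active list to the inactive list and frees pages from the inactive list until the high watermark pages_high is reached. File pages are written back first if dirty; clean pages are simply freed. Anonymous pages have no backing file, so they must be written to swap before they can be freed. The sysctl /proc/sys/vm/swappiness (0-100, extended to 0-200 in newer kernels) controls the relative preference for swapping anonymous pages versus reclaiming page cache; higher values favour swapping. A separate sysctl, /proc/sys/vm/vfs_cache_pressure (default 100), controls how aggressively the dentry and inode caches are reclaimed.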
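The per-zone watermarks that kswapd works between are exposed in /proc/zoneinfo. The following sketch parses them. The function names are hypothetical, the parser relies on the usual "Node N, zone NAME" header and "min/low/high" line layout, and the SAMPLE text uses made-up numbers purely to show the expected shape of the input.

```python
def parse_watermarks(zoneinfo_text):
    """Return {(node, zone): {"min": n, "low": n, "high": n}} from /proc/zoneinfo text."""
    zones = {}
    current = None
    for line in zoneinfo_text.splitlines():
        parts = line.split()
        if len(parts) == 4 and parts[0] == "Node" and parts[2] == "zone":
            # Header line such as "Node 0, zone   Normal"
            current = zones.setdefault((parts[1].rstrip(","), parts[3]), {})
        elif current is not None and len(parts) == 2 and parts[0] in ("min", "low", "high"):
            current[parts[0]] = int(parts[1])  # watermark in pages
    return zones


def read_sysctl(name):
    """Read an integer tunable such as 'vm/vfs_cache_pressure' from /proc/sys."""
    with open(f"/proc/sys/{name}") as f:
        return int(f.read())


# Illustrative input in the /proc/zoneinfo layout; the numbers are made up.
SAMPLE = """\
Node 0, zone   Normal
  pages free     51200
        min      11264
        low      14080
        high     16896
"""
```

On a live system, parse_watermarks(open("/proc/zoneinfo").read()) returns one entry per zone, and read_sysctl("vm/vfs_cache_pressure") returns the current value of that tunable.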

Summary

The Linux kernel builds a process address space from file-backed and anonymous mappings and manages allocation through brk and mmap. Memory is reclaimed manually via drop_caches and automatically by kswapd, while tmpfs-backed pages, including POSIX and System V shared memory, stay in memory or swap until their files or segments are removed. When reclamation cannot free enough pages, the OOM killer selects a victim based on oom_score and terminates it.
