
Understanding Linux Memory Management: From Process Allocation to OOM

This article explores Linux kernel memory management, detailing process address space layout, allocation mechanisms, OOM handling, various mmap mappings, cache behavior, tmpfs usage, and the kernel's automatic memory reclamation strategies.

Efficient Ops

The author became interested in Linux kernel memory management after attending a talk on OOM during an internship. This article records and shares what was learned while studying the topic.

1. Process Memory Allocation

When a program is started from a terminal, the shell forks and the child calls exec to load the executable. The kernel maps the code, data, bss, and stack segments into the new address space; the heap is created only when the process first requests memory. After exec, the dynamic linker maps the required shared libraries with mmap before the program's own code begins executing. The whole sequence can be traced with strace.

The first malloc typically triggers a brk system call; if no heap VMA exists yet, the kernel creates an anonymous one and adds it to the process's mm_struct. The user-space allocator (ptmalloc, tcmalloc, jemalloc, etc.) then manages the returned virtual memory, which is backed by physical pages only when it is first accessed. Large allocations may bypass the heap and use mmap directly; that memory is also purely virtual until it is first touched. Calling free on mmap-allocated memory results in munmap; otherwise the memory goes back to the allocator, which may later release it to the kernel. All memory is reclaimed when the process exits.
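The gap between virtual and physical memory described above can be observed from user space. The following is a minimal sketch, assuming a Linux system with /proc mounted; the helper names (parse_statm_rss, current_rss, lazy_mapping_demo) are illustrative and not part of any library. It maps anonymous private memory and compares the resident set size before and after touching the pages.

```python
import mmap
import os

PAGE_SIZE = os.sysconf("SC_PAGE_SIZE")


def parse_statm_rss(statm_text, page_size=PAGE_SIZE):
    """Return the resident set size in bytes from the text of /proc/<pid>/statm.

    The second field of statm is the number of resident pages.
    """
    return int(statm_text.split()[1]) * page_size


def current_rss():
    """RSS of this process in bytes (Linux only: reads /proc/self/statm)."""
    with open("/proc/self/statm") as f:
        return parse_statm_rss(f.read())


def lazy_mapping_demo(size=64 * 1024 * 1024):
    """Map `size` bytes anonymously, then write to every page.

    Returns (rss_before, rss_after_mmap, rss_after_touch) in bytes.
    """
    before = current_rss()
    # fileno=-1 requests an anonymous mapping; MAP_PRIVATE keeps it private.
    buf = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    after_mmap = current_rss()
    for offset in range(0, size, PAGE_SIZE):
        buf[offset] = 1  # the first write to each page causes a page fault
    after_touch = current_rss()
    buf.close()
    return before, after_mmap, after_touch


if __name__ == "__main__":
    b, m, t = lazy_mapping_demo()
    print(f"growth after mmap:  {(m - b) // 1024} KiB")
    print(f"growth after touch: {(t - m) // 1024} KiB")
```

On a typical system the first number should stay near zero while the second approaches the mapping size, although the exact figures depend on the allocator and the kernel configuration.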

2. OOM After Memory Exhaustion

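When the system runs out of memory, the OOM killer selects a process to terminate. Each process receives an oom_score, and the process with the highest score is killed. In older kernels the score considered not only memory usage but also runtime, priority, user ID, child count, and the oom_adj value; since kernel 2.6.36 it is based mainly on the process's memory footprint, adjusted by oom_score_adj.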

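The score can be influenced via /proc/<pid>/oom_adj, which accepts values from -16 to 15; the special value -17 exempts the process from OOM killing. On kernels 2.6.36 and later, oom_adj is deprecated in favor of /proc/<pid>/oom_score_adj, which ranges from -1000 to 1000, with -1000 exempting the process.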
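To see which processes the OOM killer would currently prefer, the per-process oom_score files can be read and sorted. This is a small sketch under the assumption of a Linux procfs layout; the function names and the proc_root parameter are hypothetical conveniences (the parameter makes it possible to point the code at a test directory).

```python
import os


def read_int(path):
    """Read a single integer from a procfs-style file, or None if unreadable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None  # the process exited, or access was denied


def rank_by_oom_score(proc_root="/proc"):
    """Return [(pid, oom_score, name)] sorted by descending oom_score."""
    ranked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "sys" or "meminfo"
        score = read_int(os.path.join(proc_root, entry, "oom_score"))
        if score is None:
            continue
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                name = f.read().strip()
        except OSError:
            name = "?"
        ranked.append((int(entry), score, name))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


if __name__ == "__main__":
    for pid, score, name in rank_by_oom_score()[:5]:
        print(f"{pid:>7} {score:>5} {name}")
```

The first entry in the result is the most likely victim at the moment of the call; scores change as memory usage changes.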

The kernel parameter /proc/sys/vm/overcommit_memory controls allocation policies:

0 – heuristic overcommit: allocations are allowed unless they are obviously excessive relative to available memory.

1 – always allow overcommit.

2 – strict accounting: total committed memory may not exceed swap + RAM * overcommit_ratio / 100 (overcommit_ratio defaults to 50).

These settings determine whether a large virtual allocation (e.g., a 24 GB Redis instance on an 8 GB machine) is refused up front, or is granted and may later trigger the OOM killer when the pages are actually touched.
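The three policies can be illustrated with a simplified model. The sketch below is an approximation for reasoning about the settings, not the kernel's actual accounting: mode 2 follows the formula given above, while the mode 0 check is reduced to "refuse a single request larger than RAM plus swap", which is an assumption that ignores the finer points of the real heuristic. All function and parameter names are illustrative.

```python
GIB = 1024 ** 3


def commit_limit(ram_bytes, swap_bytes, overcommit_ratio=50):
    """Commit limit for mode 2: swap + RAM * overcommit_ratio / 100."""
    return swap_bytes + ram_bytes * overcommit_ratio // 100


def allocation_allowed(mode, request_bytes, committed_bytes,
                       ram_bytes, swap_bytes, overcommit_ratio=50):
    """Simplified model of vm.overcommit_memory (not the kernel's exact logic)."""
    if mode == 1:
        return True  # always overcommit
    if mode == 2:
        limit = commit_limit(ram_bytes, swap_bytes, overcommit_ratio)
        return committed_bytes + request_bytes <= limit
    if mode == 0:
        # Assumption: only obviously excessive single requests are refused.
        return request_bytes <= ram_bytes + swap_bytes
    raise ValueError(f"unknown overcommit mode: {mode}")


if __name__ == "__main__":
    # The 24 GB request on an 8 GB machine without swap, under each mode.
    for mode in (0, 1, 2):
        ok = allocation_allowed(mode, 24 * GIB, 0, 8 * GIB, 0)
        print(f"mode {mode}: {'allowed' if ok else 'refused'}")
```

Under this model, only mode 1 grants the 24 GB request on the 8 GB machine; whether the process later survives depends on how much of that memory it actually touches.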

3. Where Does Allocated Memory Reside?

Memory can be mapped as shared file mappings, private file mappings, private anonymous mappings, or shared anonymous mappings.
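These four categories can be read off /proc/<pid>/maps. The following sketch parses that file's text and labels each mapping; it assumes the standard maps line format (address range, permissions, offset, device, inode, optional path) and treats paths starting with /dev/zero as shared anonymous memory, which is how such mappings usually appear. The names Mapping, classify, and parse_maps are illustrative.

```python
from collections import namedtuple

Mapping = namedtuple("Mapping", "start end perms inode path kind")


def classify(perms, inode, path):
    """Label a mapping as one of the four combinations of shared/private, file/anonymous."""
    sharing = "shared" if perms[3] == "s" else "private"
    # Anonymous memory has inode 0; shared anonymous memory is backed by
    # shmem and typically shows up as "/dev/zero (deleted)".
    anonymous = inode == 0 or path.startswith("/dev/zero")
    return f"{sharing} {'anonymous' if anonymous else 'file'}"


def parse_maps(text):
    """Parse the text of /proc/<pid>/maps into a list of Mapping tuples."""
    mappings = []
    for line in text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue
        addr, perms, _offset, _dev, inode = fields[:5]
        path = fields[5].strip() if len(fields) > 5 else ""
        start, end = (int(part, 16) for part in addr.split("-"))
        kind = classify(perms, int(inode), path)
        mappings.append(Mapping(start, end, perms, int(inode), path, kind))
    return mappings


if __name__ == "__main__":
    with open("/proc/self/maps") as f:
        for m in parse_maps(f.read()):
            print(f"{m.kind:<18} {(m.end - m.start) // 1024:>8} KiB  {m.path}")
```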

3.1 Shared File Mapping

Code segments and shared libraries are mapped from files on disk. Creating a large file and mapping it with MAP_SHARED shows an increase in buff/cache once the pages are read, indicating that they are stored in the kernel page cache.
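The buff/cache figure reported by free can be reproduced from /proc/meminfo, which makes it easy to take a reading before and after an experiment like the one above. This is a minimal sketch; it assumes the procps definition of buff/cache as Buffers + Cached + SReclaimable, and the function names are illustrative.

```python
def parse_meminfo(text):
    """Parse the text of /proc/meminfo into {field: value}, values mostly in KiB."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            info[key.strip()] = int(parts[0])
    return info


def buff_cache_kib(info):
    """buff/cache as shown by free: buffers plus page cache plus reclaimable slab."""
    return info["Buffers"] + info["Cached"] + info["SReclaimable"]


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        print(f"buff/cache: {buff_cache_kib(parse_meminfo(f.read()))} KiB")
```

Running it before and after reading a large file through a shared mapping should show the difference described in this section.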

3.2 Private File Mapping

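Data segments are privately mapped. Reading such a mapping pulls the file's pages into the page cache, so buff/cache grows. The first write to a page triggers copy-on-write: the kernel copies the page to a new private page before the write takes effect, so used grows as well. The write never reaches the underlying file.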
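The copy-on-write behavior is easy to verify: a write through a private mapping is visible in the mapping but not in the file, while a write through a shared mapping reaches the file. The sketch below demonstrates this on a Unix system; the function name write_via_mapping is hypothetical.

```python
import mmap
import os
import tempfile


def write_via_mapping(path, flags):
    """Overwrite the first byte of `path` through a mapping created with `flags`.

    Returns (byte seen through the mapping, byte read back from the file).
    """
    with open(path, "r+b") as f:
        m = mmap.mmap(f.fileno(), 0, flags=flags,
                      prot=mmap.PROT_READ | mmap.PROT_WRITE)
        m[0:1] = b"X"
        seen_in_mapping = m[0:1]
        if flags & mmap.MAP_SHARED:
            m.flush()  # push the modified page back to the file
        m.close()
    with open(path, "rb") as f:
        return seen_in_mapping, f.read(1)


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.write(fd, b"hello")
    os.close(fd)
    print("private:", write_via_mapping(path, mmap.MAP_PRIVATE))
    print("shared: ", write_via_mapping(path, mmap.MAP_SHARED))
    os.remove(path)
```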

3.3 Private Anonymous Mapping

Segments like bss, heap, and stack are anonymous and private. Allocating 1 GB with an anonymous private mmap and writing to it increases only used, not buff/cache, because anonymous private pages do not go through the page cache.

3.4 Shared Anonymous Mapping

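When parent and child processes share memory via an anonymous mmap with MAP_SHARED, the kernel backs the region with shared memory pages that are accounted in the page cache. As a result buff/cache grows as the pages are touched, while used stays roughly flat.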
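The difference between shared and private anonymous mappings shows up as soon as a process forks. In the sketch below (Unix only, since it relies on os.fork; the function name is illustrative), the child writes into a mapping created before the fork and the parent checks whether it can see the write.

```python
import mmap
import os


def child_write_visible_to_parent(flags, message=b"ping"):
    """Fork, let the child write `message` into an anonymous mapping,
    and report whether the parent sees it afterwards."""
    buf = mmap.mmap(-1, mmap.PAGESIZE, flags=flags | mmap.MAP_ANONYMOUS)
    pid = os.fork()
    if pid == 0:
        # Child: write and exit immediately without running cleanup handlers.
        try:
            buf[:len(message)] = message
        finally:
            os._exit(0)
    os.waitpid(pid, 0)
    seen = buf[:len(message)]
    buf.close()
    return seen == message


if __name__ == "__main__":
    print("MAP_SHARED: ", child_write_visible_to_parent(mmap.MAP_SHARED))
    print("MAP_PRIVATE:", child_write_visible_to_parent(mmap.MAP_PRIVATE))
```

With MAP_SHARED both processes reference the same physical pages; with MAP_PRIVATE the child's write lands on its own copy-on-write copy and the parent still sees zeros.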

4. System Memory Reclamation

4.1 Manual Reclamation

Writing to /proc/sys/vm/drop_caches can free page cache (value 1), dentries and inodes (value 2), or both (value 3). Dirty pages must be flushed with sync before they can be dropped.
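The sync-then-drop sequence can be wrapped in a small helper. This is a sketch that assumes root privileges when pointed at the real file; the function name and the path parameter are illustrative, with the parameter there so the helper can be exercised against an ordinary file.

```python
import os

DROP_CACHES = "/proc/sys/vm/drop_caches"

MODES = {
    1: "page cache",
    2: "dentries and inodes",
    3: "page cache, dentries and inodes",
}


def drop_caches(mode, path=DROP_CACHES):
    """Flush dirty pages, then ask the kernel to drop the selected caches.

    Returns a description of what was requested.
    """
    if mode not in MODES:
        raise ValueError(f"mode must be 1, 2 or 3, got {mode!r}")
    os.sync()  # dirty pages cannot be dropped, so write them back first
    with open(path, "w") as f:
        f.write(f"{mode}\n")
    return MODES[mode]


if __name__ == "__main__":
    print("dropped:", drop_caches(3))  # needs root on a real system
```

Dropping caches is a diagnostic aid; the next access to the dropped data has to go back to disk.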

4.2 tmpfs

tmpfs, procfs, sysfs, and ramfs are memory‑based filesystems. Unlike ramfs, tmpfs can use swap and has size limits. Files created in tmpfs are stored in the page cache; creating a 1 GB file in /dev/shm increases buff/cache accordingly.

4.3 Shared Memory

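POSIX and System V shared memory are implemented on top of tmpfs: a file is created in a tmpfs instance and then mapped. The pages therefore also count as page cache, and they cannot be freed by dropping caches while the segment still exists, although they can be swapped out.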
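For POSIX shared memory the tmpfs backing is directly visible, because the segment appears as a file under /dev/shm. The sketch below uses Python's multiprocessing.shared_memory module, which creates POSIX shared memory on Unix; it assumes a Linux system where /dev/shm is a tmpfs mount, and the function name is illustrative. System V segments are not covered here.

```python
import os
from multiprocessing import shared_memory


def posix_shm_backing_file(size=1024 * 1024):
    """Create a POSIX shared memory segment and look for its file in /dev/shm.

    Returns (expected path, whether the path existed while the segment was alive).
    """
    shm = shared_memory.SharedMemory(create=True, size=size)
    try:
        shm.buf[0] = 42  # touch the first page so it is actually allocated
        path = os.path.join("/dev/shm", shm.name)
        return path, os.path.exists(path)
    finally:
        shm.close()
        shm.unlink()  # removes the tmpfs file and frees the pages


if __name__ == "__main__":
    path, existed = posix_shm_backing_file()
    print(path, "existed:", existed)
```

Until unlink is called, the pages stay in memory (or swap) even if no process has the segment mapped, which is why forgotten segments show up as unexplained buff/cache usage.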

4.4 Automatic Reclamation

The kernel's kswapd daemon is woken when free memory in a zone falls below the low watermark; it then scans the LRU lists and reclaims pages until free memory is back above the high watermark. Clean file pages are freed directly and dirty ones are written back first; anonymous pages are swapped out. The vm.swappiness parameter controls how strongly the kernel prefers swapping anonymous pages over reclaiming file cache.
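The watermarks that drive kswapd are exposed per zone in /proc/zoneinfo. The following sketch extracts them; it assumes the usual zoneinfo layout ("Node N, zone NAME" headers followed by "pages free", "min", "low", and "high" lines, all counted in pages), and the function names are illustrative.

```python
def parse_zone_watermarks(zoneinfo_text):
    """Parse /proc/zoneinfo text into {"node:zone": {"free", "min", "low", "high"}}."""
    zones = {}
    current = None
    for line in zoneinfo_text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "Node" and "zone" in parts:
            # Header such as "Node 0, zone   Normal"
            current = zones.setdefault(f"{parts[1].rstrip(',')}:{parts[-1]}", {})
        elif current is not None:
            if len(parts) == 3 and parts[:2] == ["pages", "free"]:
                current["free"] = int(parts[2])
            elif (len(parts) == 2 and parts[0] in ("min", "low", "high")
                  and parts[1].isdigit() and parts[0] not in current):
                # Per-CPU pageset lines use "high:" with a colon and are skipped.
                current[parts[0]] = int(parts[1])
    return zones


def below_low_watermark(zone):
    """True when free pages are under the low watermark, the point at which kswapd is woken."""
    return zone["free"] < zone["low"]


if __name__ == "__main__":
    with open("/proc/zoneinfo") as f:
        for name, zone in parse_zone_watermarks(f.read()).items():
            print(name, zone)
```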

5. Summary

The article reviewed the Linux process address space, explained how memory allocation, OOM handling, and various mmap mappings interact with the page cache, described manual and automatic reclamation mechanisms, and highlighted the role of tmpfs and shared memory in cache usage.
