Understanding Linux Memory Management: From Process Allocation to OOM
This article explores Linux kernel memory management, detailing process address space layout, allocation mechanisms, OOM handling, various mmap mappings, cache behavior, tmpfs usage, and the kernel's automatic memory reclamation strategies.
Before writing this blog, the author became interested in Linux kernel memory management after an OOM talk during an internship. The article records and shares the knowledge gained through study.
1. Process Memory Allocation
When a program is started from a terminal, the shell forks and calls exec to load the executable; the kernel maps the code, data, bss, and stack segments into the new address space via mmap. The heap is mapped only when memory is first requested. After exec, the dynamic linker loads the required shared libraries before the program's own code begins executing. The whole sequence can be traced with strace.
The first malloc typically triggers a brk system call; if no heap VMA exists yet, the kernel creates one as an anonymous mapping and adds it to the process's mm_struct. The user-space allocator (ptmalloc, tcmalloc, jemalloc, etc.) then manages the returned virtual memory, which is backed by physical pages only when it is first accessed and a page fault occurs. Large allocations may bypass the heap and use mmap directly; this also returns virtual memory that becomes physical on first access. Calling free on mmap-allocated memory calls munmap; otherwise the memory is returned to the allocator, which may later release it back to the kernel. All memory is reclaimed when the process exits.
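The lazy backing described above can be observed from user space. The following is a minimal sketch, assuming a Linux system with /proc mounted; the helper names (parse_status_kb, rss_kb, demo) are hypothetical and chosen only for illustration. It maps an anonymous private region and compares the process's resident set size before and after touching the pages.

```python
import mmap
from pathlib import Path


def parse_status_kb(status_text, field):
    """Return a kB-valued field (e.g. VmRSS) from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        key, _, rest = line.partition(":")
        if key == field:
            return int(rest.split()[0])
    raise KeyError(field)


def rss_kb():
    """Resident set size of the current process in kB (Linux only)."""
    return parse_status_kb(Path("/proc/self/status").read_text(), "VmRSS")


def demo(size=64 * 1024 * 1024):
    """Return RSS before mapping, after mapping, and after touching pages."""
    before = rss_kb()
    # Anonymous private mapping: only virtual address space is reserved here.
    region = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    mapped = rss_kb()
    # Writing one byte per page forces a page fault for every page.
    for offset in range(0, size, mmap.PAGESIZE):
        region[offset] = 1
    touched = rss_kb()
    region.close()
    return before, mapped, touched
```

On a typical Linux machine the second value should stay close to the first, while the third should grow by roughly the mapped size; exact numbers depend on the allocator and interpreter.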
2. OOM After Memory Exhaustion
When the system runs out of memory, the OOM killer selects a process to terminate. In older kernels, selection considers not only memory usage but also runtime, priority, user ID, child count, and the oom_adj value; newer kernels rely mainly on the memory footprint plus the per-process adjustment. Each process receives an oom_score, and the process with the highest score is killed.
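The score can be influenced via /proc/<pid>/oom_adj, which accepts values from -16 to 15 (higher values make the process more likely to be killed); the special value -17 makes a process immune to OOM killing. On newer kernels oom_adj is deprecated in favor of /proc/<pid>/oom_score_adj, which ranges from -1000 (never kill) to 1000.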
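As a rough illustration, the score and its adjustment can be read straight from /proc. This sketch assumes a kernel that exposes oom_score_adj (range -1000 to 1000) alongside the legacy oom_adj; the function names are hypothetical.

```python
from pathlib import Path

OOM_SCORE_ADJ_MIN = -1000  # process is never selected by the OOM killer
OOM_SCORE_ADJ_MAX = 1000   # process is the preferred victim


def validate_oom_score_adj(value):
    """Reject values outside the range accepted by oom_score_adj."""
    if not OOM_SCORE_ADJ_MIN <= value <= OOM_SCORE_ADJ_MAX:
        raise ValueError(f"oom_score_adj out of range: {value}")
    return value


def read_oom_info(pid="self"):
    """Return oom_score and oom_score_adj for a process, or None if unreadable."""
    base = Path("/proc") / str(pid)
    info = {}
    for name in ("oom_score", "oom_score_adj"):
        try:
            info[name] = int((base / name).read_text().strip())
        except (OSError, ValueError):
            info[name] = None  # not Linux, no permission, or process gone
    return info


def set_oom_score_adj(value, pid="self"):
    """Write a new adjustment; lowering it usually needs extra privileges."""
    validate_oom_score_adj(value)
    (Path("/proc") / str(pid) / "oom_score_adj").write_text(f"{value}\n")
```

Calling read_oom_info() for the current process before and after set_oom_score_adj(500) is one way to see how the adjustment shifts the score.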
The kernel parameter /proc/sys/vm/overcommit_memory controls allocation policies:
0 – heuristic overcommit (the default): obviously excessive allocations are refused, moderate overcommit is allowed.
1 – always allow overcommit.
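2 – strict accounting: total committed memory may not exceed swap + RAM * overcommit_ratio / 100 (overcommit_ratio defaults to 50).
These settings determine whether a large virtual allocation (e.g., a 24 GB Redis instance on an 8 GB machine) is refused at allocation time or is granted and may later trigger OOM once the pages are actually used.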
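The strict limit in mode 2 is simple arithmetic. The sketch below is a simplified model, assuming no huge pages and ignoring the kernel's admin and user reserves; the function names are hypothetical.

```python
GIB_KB = 1024 * 1024  # one GiB expressed in kB


def commit_limit_kb(ram_kb, swap_kb, overcommit_ratio=50):
    """Simplified commit limit: swap + RAM * overcommit_ratio / 100."""
    return swap_kb + ram_kb * overcommit_ratio // 100


def strict_mode_allows(request_kb, committed_kb, ram_kb, swap_kb,
                       overcommit_ratio=50):
    """Would overcommit_memory=2 accept this request (simplified model)?"""
    limit = commit_limit_kb(ram_kb, swap_kb, overcommit_ratio)
    return committed_kb + request_kb <= limit
```

With 8 GiB of RAM, no swap, and the default ratio, commit_limit_kb(8 * GIB_KB, 0) gives 4 GiB, so a 24 GiB request would be refused in mode 2 under this model. On a real system the corresponding values are reported as CommitLimit and Committed_AS in /proc/meminfo.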
3. Where Does Allocated Memory Reside?
Memory can be mapped as shared file mappings, private file mappings, private anonymous mappings, or shared anonymous mappings.
3.1 Shared File Mapping
Code from the executable and its shared libraries is mapped from the corresponding files on disk. Creating a large file and mapping it with MAP_SHARED shows an increase in buff/cache (as reported by free), indicating that the pages are stored in the kernel page cache.
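The buff/cache figure can be approximated from /proc/meminfo. The sketch below assumes the definition given in the free man page (buffers plus cache, where cache is Cached plus SReclaimable); the function names are hypothetical.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo text into a dict of integer values."""
    fields = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    return fields


def buff_cache_kb(fields):
    """Approximate the buff/cache column of free, in kB."""
    return (fields.get("Buffers", 0)
            + fields.get("Cached", 0)
            + fields.get("SReclaimable", 0))


def current_buff_cache_kb(path="/proc/meminfo"):
    """Read the live value on a Linux system."""
    with open(path) as f:
        return buff_cache_kb(parse_meminfo(f.read()))
```

Sampling current_buff_cache_kb() before and after reading a large file through a shared mapping is one way to reproduce the experiment described above.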
3.2 Private File Mapping
Data segments are mapped privately from the file. Writing to such a mapping triggers copy-on-write: the kernel copies the affected page to a new private page before the write, while the original page stays in the page cache. As a result, both used and buff/cache increase.
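Copy-on-write on a private file mapping can be shown with a small file. This is a minimal sketch using Python's mmap with ACCESS_COPY, which corresponds to a private mapping on Unix; the function name is hypothetical.

```python
import mmap
import os
import tempfile


def private_write_leaves_file_unchanged():
    """Write through a private mapping and return (mapping view, file bytes)."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"original")
        path = f.name
    try:
        with open(path, "r+b") as f:
            # ACCESS_COPY: writes go to private copies of the pages,
            # never to the underlying file.
            view = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY)
            view[:8] = b"modified"
            seen_in_mapping = bytes(view[:8])
            view.close()
        with open(path, "rb") as f:
            on_disk = f.read()
    finally:
        os.unlink(path)
    return seen_in_mapping, on_disk
```

The mapping sees the modified bytes while the file on disk keeps its original content, which is the behavior a private data segment relies on.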
3.3 Private Anonymous Mapping
Segments like bss, heap, and stack are anonymous and private. Allocating 1 GB via mmap and touching the pages shows an increase only in used, not in buff/cache, because anonymous private pages are not part of the page cache.
3.4 Shared Anonymous Mapping
When parent and child processes share memory via mmap with MAP_SHARED, the pages reside in the cache, so buff/cache grows while used reflects the shared usage.
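A shared anonymous mapping between parent and child can be sketched as follows, assuming a Unix system where os.fork is available; the function name is hypothetical and the message must fit in one page.

```python
import mmap
import os


def share_with_child(message):
    """Let a forked child write into a shared anonymous page; return what the parent reads."""
    buf = mmap.mmap(-1, mmap.PAGESIZE,
                    flags=mmap.MAP_SHARED | mmap.MAP_ANONYMOUS)
    pid = os.fork()
    if pid == 0:
        # Child: the mapping is inherited and refers to the same pages.
        try:
            buf[:len(message)] = message
        finally:
            os._exit(0)
    # Parent: wait for the child, then read what it wrote.
    os.waitpid(pid, 0)
    data = bytes(buf[:len(message)])
    buf.close()
    return data
```

Because the mapping is MAP_SHARED, the parent sees the child's write. With MAP_PRIVATE instead, the child's write would land on its own copy and the parent would read zero bytes.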
4. System Memory Reclamation
4.1 Manual Reclamation
Writing to /proc/sys/vm/drop_caches can free page cache (value 1), dentries and inodes (value 2), or both (value 3). Dirty pages must be flushed with sync before they can be dropped.
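The manual procedure amounts to a sync followed by a write. The sketch below assumes root privileges when run against the real file; the path parameter and function name are hypothetical conveniences so the logic can be tried against an ordinary file.

```python
import os

DROP_MODES = {
    1: "page cache",
    2: "dentries and inodes",
    3: "page cache, dentries and inodes",
}


def drop_caches(mode, path="/proc/sys/vm/drop_caches"):
    """Flush dirty pages, then ask the kernel to drop clean caches."""
    if mode not in DROP_MODES:
        raise ValueError(f"mode must be one of {sorted(DROP_MODES)}")
    if hasattr(os, "sync"):
        os.sync()  # dirty pages cannot be dropped until written back
    with open(path, "w") as f:
        f.write(f"{mode}\n")
```

This is intended for experiments only; dropping caches on a busy system forces later reads to go back to disk.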
4.2 tmpfs
tmpfs, procfs, sysfs, and ramfs are memory‑based filesystems. Unlike ramfs, tmpfs can use swap and has size limits. Files created in tmpfs are stored in the page cache; creating a 1 GB file in /dev/shm increases buff/cache accordingly.
4.3 Shared Memory
POSIX and System V shared memory are implemented by creating files in tmpfs and mapping them, so they also occupy page cache and cannot be reclaimed while referenced.
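POSIX shared memory can be tried from Python's multiprocessing.shared_memory module (Python 3.8+). The sketch below assumes a Linux system where POSIX shared memory objects appear under /dev/shm; the function name is hypothetical.

```python
import os
from multiprocessing import shared_memory


def shm_roundtrip(payload):
    """Create a shared memory segment, attach to it by name, and read it back."""
    shm = shared_memory.SharedMemory(create=True, size=len(payload))
    try:
        shm.buf[:len(payload)] = payload
        # On Linux the segment is expected to show up as a tmpfs file here.
        visible = os.path.exists(os.path.join("/dev/shm", shm.name))
        other = shared_memory.SharedMemory(name=shm.name)
        data = bytes(other.buf[:len(payload)])
        other.close()
    finally:
        shm.close()
        shm.unlink()  # remove the name so the pages can be freed
    return data, visible
```

Until unlink is called and every mapping is closed, the segment keeps occupying memory, which matches the point that referenced shared memory cannot be reclaimed.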
4.4 Automatic Reclamation
The kernel's kswapd daemon is woken when free memory falls below the low watermark; it then scans the LRU lists and reclaims pages until free memory reaches the high watermark. Clean file pages are freed directly, and dirty ones are written back first; anonymous pages are swapped out. The vm.swappiness parameter (default 60) controls the tendency to swap anonymous pages versus reclaiming cache.
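The watermark behavior can be summarized in a toy model. This is a deliberately simplified sketch, not the kernel's actual logic: it assumes three per-zone watermarks (min, low, high) and treats free memory below min as the point where allocating processes reclaim memory themselves; the function and state names are hypothetical.

```python
def reclaim_state(free_pages, wmark_min, wmark_low, wmark_high,
                  kswapd_running=False):
    """Classify the reclaim activity for a given amount of free memory."""
    if free_pages < wmark_min:
        # Too little memory left to wait for background reclaim.
        return "direct reclaim"
    if free_pages < wmark_low:
        # Below the low watermark kswapd is woken up.
        return "kswapd reclaiming"
    if free_pages < wmark_high and kswapd_running:
        # Once running, kswapd keeps going until the high watermark.
        return "kswapd reclaiming"
    return "idle"
```

The gap between the low and high watermarks gives the model hysteresis: kswapd starts late and stops only after building up some headroom. The real per-zone values can be inspected in /proc/zoneinfo.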
5. Summary
The article reviewed the Linux process address space, explained how memory allocation, OOM handling, and various mmap mappings interact with the page cache, described manual and automatic reclamation mechanisms, and highlighted the role of tmpfs and shared memory in cache usage.
Efficient Ops
This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.
