How the Linux Kernel Manages Memory: Allocation, OOM, and Recovery
This article explains Linux kernel memory management by covering process address space layout, allocation mechanisms, OOM killer behavior, overcommit settings, various types of file and anonymous mappings, tmpfs usage, and both manual and automatic memory reclamation techniques.
1. Process Memory Allocation
When a program is started from a terminal, the shell forks and calls exec to load the executable into memory; the kernel maps the code, data, BSS, and stack segments into the new address space, while the heap is created on demand. After exec, the dynamic linker loads the required shared libraries before the program's own code begins to run. The whole sequence can be traced with strace.
On the first malloc, the allocator usually issues a brk system call to grow the heap. If no heap VMA exists yet, the kernel creates an anonymous mapping and adds the VMA to the process's red-black tree (kernels from 6.1 onward use a maple tree for this). The user-space allocator (ptmalloc, tcmalloc, jemalloc, etc.) then subdivides this region and returns the requested block. Large allocations may bypass the heap and use mmap directly. In both cases the returned memory is only virtual until it is first accessed, at which point a page fault makes the kernel allocate physical pages.
When free is called, memory obtained via mmap is released with munmap. Memory obtained via the heap is returned to the allocator, which may later give it back to the kernel.
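The lazy allocation described above can be observed from user space. The following is a minimal sketch, assuming a Linux system where /proc/self/statm is readable; the helper names (parse_statm, resident_pages, touch_pages, demo) are hypothetical and chosen only for illustration. It creates an anonymous mapping and compares the resident page count before and after every page has been written once.

```python
import mmap


def parse_statm(text: str) -> tuple[int, int]:
    """Return (total_pages, resident_pages) from the contents of /proc/<pid>/statm."""
    fields = text.split()
    return int(fields[0]), int(fields[1])


def resident_pages() -> int:
    """Resident set size of the current process in pages (Linux only)."""
    with open("/proc/self/statm") as f:
        return parse_statm(f.read())[1]


def touch_pages(buf: mmap.mmap, page_size: int = mmap.PAGESIZE) -> int:
    """Write one byte into every page so the kernel has to back it with RAM."""
    touched = 0
    for offset in range(0, len(buf), page_size):
        buf[offset] = 1
        touched += 1
    return touched


def demo(size: int = 64 * 1024 * 1024) -> int:
    """Map `size` bytes anonymously and return how many resident pages touching adds."""
    region = mmap.mmap(-1, size)   # anonymous mapping, virtual only at this point
    before = resident_pages()
    touch_pages(region)            # page faults allocate physical pages here
    after = resident_pages()
    region.close()
    return after - before
```

Calling demo() on a Linux machine should report a growth of roughly size divided by the page size, although the exact figure depends on the allocator and on what else the interpreter does in between.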
2. OOM After Memory Exhaustion
The OOM (Out-of-Memory) killer selects a process to terminate when the system runs out of memory. Selection factors include memory usage, runtime, priority, user ID, number of child processes, and the oom_adj score, although the exact weighting differs between kernel versions. The kernel computes an oom_score for each process, and the process with the highest score is killed.
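Administrators can influence the decision by writing to /proc/<pid>/oom_adj. Values range from -16 (least likely to be killed) to 15 (most likely to be killed), and the special value -17 exempts the process from the OOM killer entirely, giving it VIP-like protection. Newer kernels deprecate oom_adj in favor of /proc/<pid>/oom_score_adj, which ranges from -1000 (exempt) to 1000.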
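To see which processes the OOM killer would currently prefer, the per-process oom_score files can be read and ranked. This is a rough sketch, assuming a Linux /proc filesystem; the function names are hypothetical, and the ranking only mirrors the "highest score is killed" rule stated above rather than the kernel's full selection logic.

```python
from pathlib import Path


def read_oom_score(pid: int, proc_root: str = "/proc") -> int:
    """Read the kernel-computed oom_score for one process."""
    return int(Path(proc_root, str(pid), "oom_score").read_text())


def collect_scores(proc_root: str = "/proc") -> dict[int, int]:
    """Map every visible PID to its oom_score, skipping processes that vanish."""
    scores: dict[int, int] = {}
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue
        pid = int(entry.name)
        try:
            scores[pid] = read_oom_score(pid, proc_root)
        except (OSError, ValueError):
            continue  # process exited or its files are not readable
    return scores


def likely_victims(scores: dict[int, int], top: int = 3) -> list[int]:
    """Return the PIDs with the highest scores, most likely victim first."""
    ranked = sorted(scores, key=lambda pid: scores[pid], reverse=True)
    return ranked[:top]
```

A typical use would be likely_victims(collect_scores()), which lists the candidates the kernel would consider first at that moment.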
The /proc/sys/vm/overcommit_memory setting controls allocation policy:
0 – heuristic over-commit (the default): modest over-commit is allowed, but obviously excessive virtual allocations are refused.
1 – always allow over-commit; OOM occurs only when physical memory is truly exhausted.
2 – strict accounting: committed memory may never exceed swap + RAM * overcommit_ratio / 100 (overcommit_ratio is a percentage); allocation fails once the limit is reached.
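The limit used in mode 2 is simple arithmetic and can be sketched as follows. This is a simplified model: it treats overcommit_ratio as a percentage of RAM and ignores details such as huge pages that the kernel also accounts for. The function names and the figures in the usage note are illustrative assumptions.

```python
def commit_limit_kb(ram_kb: int, swap_kb: int, overcommit_ratio: int) -> int:
    """Approximate commit limit for overcommit_memory=2, in kB.

    overcommit_ratio is the percentage from /proc/sys/vm/overcommit_ratio.
    """
    return swap_kb + ram_kb * overcommit_ratio // 100


def allocation_allowed(committed_kb: int, request_kb: int, limit_kb: int) -> bool:
    """In strict mode a request succeeds only if it stays within the limit."""
    return committed_kb + request_kb <= limit_kb
```

For example, with 8,000,000 kB of RAM, 2,000,000 kB of swap, and a ratio of 50, the limit comes out at 6,000,000 kB, so a 600,000 kB request fails once 5,500,000 kB is already committed.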
3. Where Allocated Memory Resides
Linux uses two main mapping types:
File mappings (code, data, shared libraries) are cached in the page cache. When multiple processes map the same file, they share the same physical pages.
Anonymous mappings (heap, BSS, stack, malloc via brk or mmap) are not backed by a file and reside in regular RAM until swapped out.
Experiments show that shared file mappings increase buff/cache, while private anonymous mappings increase only used memory.
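Shared anonymous mappings (e.g., mmap with MAP_SHARED | MAP_ANONYMOUS) are backed by the kernel's internal shared-memory filesystem, so they also use the page cache; the memory appears in buff/cache and is visible to all participating processes.
Tmpfs (including /dev/shm) creates files in a memory-backed filesystem. These files are stored in the page cache, but because no copy exists on disk their pages cannot simply be dropped: they stay allocated for as long as the file exists or is still referenced, although they can be swapped out under memory pressure.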
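A shared anonymous mapping can be demonstrated with a parent and a child process. The sketch below assumes a Unix-like system where os.fork and the MAP_SHARED and MAP_ANONYMOUS flags are available; the function name is hypothetical. The child writes into the mapping, and the parent reads the same bytes back after the child exits.

```python
import mmap
import os


def share_via_anonymous_mapping(message: bytes) -> bytes:
    """Let a forked child write `message` into a shared anonymous page."""
    if len(message) > mmap.PAGESIZE:
        raise ValueError("message must fit into a single page")
    buf = mmap.mmap(
        -1,                                   # no backing file: anonymous
        mmap.PAGESIZE,
        flags=mmap.MAP_SHARED | mmap.MAP_ANONYMOUS,
        prot=mmap.PROT_READ | mmap.PROT_WRITE,
    )
    pid = os.fork()
    if pid == 0:
        # Child: write into the shared page, then exit without cleanup handlers.
        buf[: len(message)] = message
        os._exit(0)
    os.waitpid(pid, 0)                        # parent waits for the child
    data = bytes(buf[: len(message)])         # same physical page, so data is visible
    buf.close()
    return data
```

With MAP_PRIVATE instead of MAP_SHARED the child's write would land in its own copy-on-write page and the parent would read zeros.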
POSIX and System V shared memory are implemented on top of tmpfs, so their pages are also part of the page cache and share the same reclamation constraints.
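One way to watch this accounting is through /proc/meminfo, where the Shmem field counts tmpfs and shared-memory pages and those pages are also included in Cached. The parser below is a small sketch; the function names are hypothetical, and subtracting Shmem from Cached is only a rough estimate of cache that is backed by files on disk.

```python
def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into {field: value}; most values are in kB."""
    values: dict[str, int] = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            values[name.strip()] = int(parts[0])
    return values


def read_meminfo(path: str = "/proc/meminfo") -> dict[str, int]:
    """Read and parse the live meminfo file (Linux only)."""
    with open(path) as f:
        return parse_meminfo(f.read())


def file_backed_cache_kb(info: dict[str, int]) -> int:
    """Rough estimate of page cache that is not tmpfs or shared memory."""
    return info["Cached"] - info["Shmem"]
```

Writing a large file to /dev/shm and calling read_meminfo() before and after should show Shmem and Cached rising together.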
4. Memory Reclamation
4.1 Manual Reclamation
Writing to /proc/sys/vm/drop_caches forces the kernel to drop clean caches:
echo 1 > /proc/sys/vm/drop_caches # drop page cache
echo 2 > /proc/sys/vm/drop_caches # drop dentries and inodes
echo 3 > /proc/sys/vm/drop_caches # drop both
Dirty pages are not dropped by this mechanism; run sync first so that they are written back and become clean.
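The same procedure can be scripted. The sketch below assumes root privileges and a Linux system when used against the real file; the function name and its path and flush parameters are hypothetical conveniences so the logic can be tried against another file first.

```python
import os

# Meaning of each accepted value, as listed above.
VALID_MODES = {1: "page cache", 2: "dentries and inodes", 3: "both"}


def drop_caches(mode: int,
                path: str = "/proc/sys/vm/drop_caches",
                flush: bool = True) -> str:
    """Flush dirty pages, then ask the kernel to drop the selected caches."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")
    if flush:
        os.sync()              # write back dirty pages so they become droppable
    payload = f"{mode}\n"
    with open(path, "w") as f:
        f.write(payload)
    return payload
```

Dropping caches on a busy system usually causes a temporary slowdown, because the data has to be read from disk again.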
4.2 Automatic Reclamation
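The kernel’s kswapd daemon is woken when free memory in a zone falls below the low watermark (pages_low). It scans the LRU lists, moves pages that have not been referenced recently from the active list to the inactive list, and frees pages from the inactive list until the free-page target (pages_high) is reached. When free memory falls below the minimum watermark (pages_min), the allocating process runs a more aggressive direct reclaim pass itself.
File pages are reclaimed by freeing the cache: clean pages are simply dropped, while dirty pages are written back first. Anonymous pages are reclaimed by swapping them out to disk.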
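The threshold behavior can be expressed as a small model. This is a deliberately simplified sketch, assuming three per-zone watermarks (minimum, low, and high, with the high one corresponding to the pages_high target); the function names are hypothetical, and the real kernel logic considers many more factors.

```python
def kswapd_active(free_pages: int, low: int, high: int, currently_active: bool) -> bool:
    """Model kswapd's hysteresis: start below `low`, keep going until `high`."""
    if free_pages < low:
        return True                # memory is short: background reclaim starts
    if free_pages >= high:
        return False               # target reached: kswapd goes back to sleep
    return currently_active        # between the marks: keep the current state


def needs_direct_reclaim(free_pages: int, minimum: int) -> bool:
    """Below the minimum watermark the allocating task reclaims memory itself."""
    return free_pages < minimum
```

The gap between the low and high marks is what keeps background reclaim from starting and stopping on every allocation.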
The vm.swappiness parameter (0-100; up to 200 on kernels 5.8 and later) controls the balance between swapping anonymous pages and reclaiming cache; higher values favor swapping.
5. Summary
The article reviewed the Linux process address space, explained how memory is allocated via brk and mmap, described the OOM killer’s decision process and over-commit policies, distinguished between file-backed and anonymous mappings, and covered both manual (drop_caches) and automatic (kswapd, swap) memory reclamation mechanisms.
Efficient Ops
This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.
