
How Linux Kernel Manages Memory: Allocation, OOM, and Recovery

This article explains Linux kernel memory management by covering process address space layout, allocation mechanisms, OOM killer behavior, overcommit settings, various types of file and anonymous mappings, tmpfs usage, and both manual and automatic memory reclamation techniques.

Efficient Ops

1. Process Memory Allocation

When a program is started from a terminal, the shell forks and the child calls exec to load the executable into memory; the code, data, BSS, and stack segments are mapped via mmap, while the heap is created on demand. After exec, the dynamic linker loads the required shared libraries before the program's own code begins to run. The whole sequence can be traced with strace.

On the first malloc, the allocator typically issues a brk system call to grow the heap. If no heap VMA exists yet, the kernel creates an anonymous mapping and adds the VMA to the process's VMA index (a red-black tree in older kernels, a maple tree since Linux 6.1). The user-space allocator (ptmalloc, tcmalloc, jemalloc, etc.) then subdivides this region and returns the requested block. Large allocations may bypass the heap and use mmap directly (glibc's default threshold is 128 KiB). In both cases the returned memory is only virtual until it is first accessed, at which point a page fault causes the kernel to allocate physical pages.

When free is called, memory obtained via mmap is released with munmap. Memory obtained from the heap is returned to the allocator, which may later give it back to the kernel.
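The brk, mmap, and munmap calls described above can be observed by tracing a command. The following is a minimal sketch, assuming strace is installed; count_mem_syscalls is a hypothetical helper name, and it only matches lines that begin with the system call name, which is how strace prints them when it is not following child processes.

```shell
# count_mem_syscalls: hypothetical helper that reads strace-style lines on
# stdin and counts the calls to brk, mmap, and munmap.
count_mem_syscalls() {
    awk '
        /^brk\(/    { brk++ }
        /^mmap\(/   { mmap++ }
        /^munmap\(/ { munmap++ }
        END { printf "brk=%d mmap=%d munmap=%d\n", brk, mmap, munmap }
    '
}

# Example use (trace.log is an arbitrary file name chosen for illustration):
#   strace -e trace=brk,mmap,munmap -o trace.log ls > /dev/null
#   count_mem_syscalls < trace.log
```

The exact counts depend on the program, the C library, and the allocator in use, so they are best read as a rough picture of which path the allocator took.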

2. OOM After Memory Exhaustion

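The OOM (Out-of-Memory) killer selects a process to terminate when the system runs out of memory. In older kernels the selection factors included memory usage, runtime, priority, user ID, number of child processes, and the oom_adj value; current kernels rely mainly on the process's memory footprint plus the administrator's adjustment. The kernel computes an oom_score for each process (readable at /proc/<pid>/oom_score), and the process with the highest score is killed.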

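Administrators can influence the decision by writing to /proc/<pid>/oom_adj. Values range from -16 (least likely to be killed) to 15 (most likely to be killed); the special value -17 exempts the process from the OOM killer entirely, giving it VIP-like protection. Newer kernels deprecate oom_adj in favor of /proc/<pid>/oom_score_adj, which ranges from -1000 (exempt) to 1000.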
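To see which processes currently look most attractive to the OOM killer, the per-process scores can be read from /proc and sorted. This is a rough sketch: list_oom_scores and top_oom_candidates are hypothetical helper names, and the real choice is made by the kernel at OOM time, so the ranking here is only an approximation of what would happen.

```shell
# list_oom_scores: print "pid oom_score" for every process we can read.
# Processes may exit during the scan, so read errors are skipped.
list_oom_scores() {
    for dir in /proc/[0-9]*; do
        score=$(cat "$dir/oom_score" 2>/dev/null) || continue
        echo "${dir#/proc/} $score"
    done
}

# top_oom_candidates [N]: hypothetical helper; reads "pid score" lines on
# stdin and prints the N highest-scoring entries (default 3), highest first.
# Ties are broken by the lower pid.
top_oom_candidates() {
    sort -k2,2nr -k1,1n | head -n "${1:-3}"
}

# Example use:
#   list_oom_scores | top_oom_candidates 5
```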

The /proc/sys/vm/overcommit_memory setting controls the allocation policy:

0 – heuristic over-commit: modest over-commit is allowed, but obviously excessive virtual allocations are refused.

1 – always allow over-commit; allocations are never refused, and the OOM killer runs only when physical memory is truly exhausted.

2 – strict accounting: committed memory may never exceed swap + RAM * overcommit_ratio / 100; allocation fails once the limit is reached.
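The mode 2 limit is simple arithmetic and can be worked through by hand. The sketch below treats overcommit_ratio as a percentage and ignores refinements such as reserved huge pages or vm.overcommit_kbytes; the function names are hypothetical.

```shell
# commit_limit_kb RAM_KB SWAP_KB RATIO
# Simplified mode-2 limit: swap + RAM * overcommit_ratio / 100.
# Assumption: no huge pages are reserved and vm.overcommit_kbytes is unset.
commit_limit_kb() {
    echo $(( $2 + $1 * $3 / 100 ))
}

# current_commit_limit_kb: same calculation using this machine's values.
current_commit_limit_kb() {
    ram=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
    swap=$(awk '/^SwapTotal:/ { print $2 }' /proc/meminfo)
    ratio=$(cat /proc/sys/vm/overcommit_ratio)
    commit_limit_kb "$ram" "$swap" "$ratio"
}

# Example: 8 GiB RAM, 2 GiB swap, ratio 50
#   commit_limit_kb 8388608 2097152 50
```

With 8 GiB of RAM, 2 GiB of swap, and a ratio of 50, the limit comes to 2 GiB + 4 GiB = 6 GiB (6291456 kB). The result of current_commit_limit_kb should be close to the CommitLimit line in /proc/meminfo.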

3. Where Allocated Memory Resides

Linux uses two main mapping types:

File mappings (code, data, shared libraries) are cached in the page cache. When multiple processes map the same file, they share the same physical pages.

Anonymous mappings (heap, BSS, stack, and malloc memory obtained via brk or mmap) are not backed by a file and reside in regular RAM until swapped out.
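Both mapping types show up in /proc/<pid>/maps, where file mappings carry a path in the last column and anonymous mappings either have no name or a bracketed label such as [heap]. The sketch below counts them; classify_maps is a hypothetical helper, and its categories are a simplification (for example, tmpfs and shared-memory files also carry a path and are counted as file mappings here).

```shell
# classify_maps: hypothetical helper that reads /proc/<pid>/maps-style lines
# on stdin and counts mappings by kind, based on the sixth (name) column.
classify_maps() {
    awk '
        $6 == "[heap]"   { heap++;  next }
        $6 ~ /^\[stack/  { stack++; next }
        $6 ~ /^\//       { file++;  next }
        NF < 6           { anon++;  next }
                         { other++ }
        END {
            printf "file=%d heap=%d stack=%d anon=%d other=%d\n",
                   file, heap, stack, anon, other
        }
    '
}

# Example use:
#   classify_maps < /proc/self/maps
```

Entries such as [vdso] fall into the "other" bucket, since they are kernel-provided regions rather than ordinary file or anonymous mappings.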

Experiments show that shared file mappings increase buff/cache, while private anonymous mappings increase only used memory (both are columns in the output of free).

Shared anonymous mappings (e.g., mmap with MAP_SHARED | MAP_ANONYMOUS) also use the page cache; the memory appears in buff/cache and is visible to all participating processes.

Tmpfs (including /dev/shm) creates files in a memory-backed filesystem. These files are stored in the page cache. Because they have no backing store on disk, their pages cannot simply be dropped while the files exist or are referenced, but they can be swapped out.

POSIX and System V shared memory are implemented on top of tmpfs, so their pages are also part of the page cache and share the same reclamation constraints.
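The share of the page cache taken by tmpfs and shared memory can be read from /proc/meminfo, where on typical kernels the Shmem field covers these pages and is itself counted inside Cached. A minimal sketch follows; meminfo_kb and cache_breakdown are hypothetical helper names.

```shell
# meminfo_kb FIELD: print the value (in kB) of FIELD from meminfo-style
# input on stdin. The match is exact, so "Cached" does not hit "SwapCached".
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }'
}

# cache_breakdown [FILE]: show cache, shared-memory, and anonymous totals.
# FILE defaults to /proc/meminfo.
cache_breakdown() {
    src=${1:-/proc/meminfo}
    for field in Cached Shmem AnonPages; do
        echo "$field=$(meminfo_kb "$field" < "$src")"
    done
}

# Example use:
#   cache_breakdown
```

Writing a file into /dev/shm and running cache_breakdown before and after should show Shmem and Cached grow together, while AnonPages stays roughly unchanged.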

4. Memory Reclamation

4.1 Manual Reclamation

Writing to /proc/sys/vm/drop_caches forces the kernel to drop clean caches:

echo 1 > /proc/sys/vm/drop_caches # drop page cache
echo 2 > /proc/sys/vm/drop_caches # drop dentries and inodes
echo 3 > /proc/sys/vm/drop_caches # drop both

Dirty pages must be flushed with sync before they can be dropped.
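The sync-then-drop sequence can be wrapped in a small function that also rejects invalid values. This is only a sketch: drop_caches as a function name and its optional second argument (an alternate target file, useful for dry runs) are assumptions made for illustration.

```shell
# drop_caches MODE [TARGET]
# MODE is 1 (page cache), 2 (dentries and inodes), or 3 (both).
# TARGET defaults to the real control file; pass another path for a dry run.
drop_caches() {
    mode=$1
    target=${2:-/proc/sys/vm/drop_caches}
    case "$mode" in
        1|2|3) ;;
        *) echo "usage: drop_caches 1|2|3 [target]" >&2; return 2 ;;
    esac
    sync                       # flush dirty pages so they become droppable
    echo "$mode" > "$target"   # needs root when target is the real /proc file
}

# Example use (as root):
#   drop_caches 3
```

Dropping caches discards data the kernel would otherwise reuse, so it is mostly useful for experiments and measurements rather than routine operation.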

4.2 Automatic Reclamation

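The kernel’s kswapd daemon is woken when free memory in a zone falls below the low watermark (pages_low). It scans the LRU lists, demotes cold pages from the active list to the inactive list, and frees pages from the inactive list until the free-page target (pages_high) is reached. If free memory drops below the minimum watermark (pages_min), the allocating process itself performs a more aggressive, synchronous pass known as direct reclaim.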
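The threshold behavior can be sketched as a small decision function over one zone's free-page count and its min, low, and high watermarks, which are listed per zone in /proc/zoneinfo. This is a deliberate simplification of the kernel's logic, and reclaim_state and its output labels are hypothetical.

```shell
# reclaim_state FREE MIN LOW HIGH   (all values in pages, for one zone)
# Simplified model of which reclaim path is expected at a given free level.
reclaim_state() {
    if [ "$1" -lt "$2" ]; then
        echo "direct-reclaim"          # below min: allocator reclaims itself
    elif [ "$1" -lt "$3" ]; then
        echo "kswapd-wakes"            # below low: background reclaim starts
    elif [ "$1" -lt "$4" ]; then
        echo "between-low-and-high"    # kswapd keeps going if already running
    else
        echo "balanced"                # at or above high: kswapd can sleep
    fi
}

# Example with watermarks min=16, low=20, high=24:
#   reclaim_state 18 16 20 24
```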

File pages are reclaimed by dropping them from the cache; dirty pages are written back to their file first. Anonymous pages have no backing file, so they are reclaimed by swapping them out to a swap device.

The vm.swappiness parameter (0-100; the upper bound is 200 since Linux 5.8) controls the balance between swapping anonymous pages and reclaiming cache; higher values favor swapping.

5. Summary

The article reviewed the Linux process address space, explained how memory is allocated via brk and mmap, described the OOM killer’s decision process and over-commit policies, distinguished between file-backed and anonymous mappings, and covered both manual (drop_caches) and automatic (kswapd, swap) memory reclamation mechanisms.
