
Unlocking Linux Memory Management: From Virtual Memory to Kernel Allocation

This article explains Linux’s comprehensive memory management system, covering physical and virtual memory concepts, paging, page tables, the MMU, the buddy allocator, slab allocator, memory reclamation strategies such as LRU and swap, monitoring tools, and practical optimization techniques for both user‑space and kernel‑space allocations.


1. Linux Memory Management

1.1 Physical and Virtual Memory

Physical memory (RAM) refers to the actual memory modules installed on the motherboard, providing fast read/write access for the CPU and serving as temporary storage for the operating system and applications.

Virtual memory gives each process the illusion of a large, private address space. When physical RAM runs short, the kernel extends it with swap space on disk (a swap partition or swap file; pagefile.sys is the Windows equivalent), typically sized at 1–2 × the physical memory.

Why use virtual memory? When programs collectively require more RAM than is physically available, the operating system moves inactive pages to disk, allowing active processes to continue operating without interruption.

Virtual memory workflow consists of six steps:

① The CPU splits a logical address into a page number a and an offset b, then looks up the address translation table using a.

② If the page is present in physical memory, proceed to step ④; otherwise, locate a free frame or evict a less‑used page to secondary storage.

③ Load the required page from secondary storage into the free frame and record the mapping in the address translation table.

④ Retrieve the physical frame number corresponding to the logical page number from the translation table.

⑤ Combine the physical frame number with the offset b to obtain the physical address.

⑥ Access the needed information in main memory using the physical address.

1.2 Paging Mechanism

Paging divides memory into fixed‑size blocks called pages (commonly 4 KB). Larger pages such as 64 KB or 2 MB (HugePages) are also supported on some architectures. Fixed page sizes simplify management and improve allocation efficiency.

The page table is the core data structure of paging, mapping virtual pages to physical pages. Each process has its own page table. On x86_64 Linux, a four‑level hierarchy (PGD → PUD → PMD → PTE) is used.

When the CPU receives a virtual address, it traverses the hierarchy to locate the corresponding physical page frame, then adds the page offset to form the final physical address.

2. Key Memory Management Components

2.1 Memory Management Unit (MMU)

The MMU translates virtual addresses to physical addresses using page tables. It also enforces access permissions (read, write, execute) and raises exceptions on violations, protecting processes from illegal memory accesses.

To speed up translation, the MMU caches recent page‑table entries in the Translation Lookaside Buffer (TLB). A TLB hit avoids a full page‑table walk, reducing latency.

2.2 Buddy System

The buddy allocator manages large physical memory blocks. Memory is split into power‑of‑two sized chunks (1, 2, 4, 8 pages, etc.) and organized into separate free lists. Allocation finds the smallest fitting chunk; if none is available, a larger chunk is split into two “buddies”.

When a block is freed, the allocator checks whether its buddy is also free; if so, they are merged, recursively reducing fragmentation.

2.3 Slab Allocator

The slab allocator handles frequent allocation of small kernel objects (e.g., task_struct, inode). It maintains caches (slabs) of pre‑allocated objects of a given size, allowing fast allocation and reuse without invoking the buddy system.

Each slab consists of one or more contiguous pages divided into equal‑sized slots. Slabs can be in three states: free, partially used, or full. The allocator keeps a pool of free slabs to satisfy allocation requests quickly.

3. Memory Allocation and Reclamation

3.1 Allocation Process

In user space, programs typically use malloc. Small allocations are served from the heap (expanded via brk), while large allocations (usually >128 KB) are satisfied with mmap, creating a separate memory‑mapped region.

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int *ptr = (int *)malloc(10 * sizeof(int));
    if (ptr == NULL) {
        printf("Memory allocation failed\n");
        return 1;
    }
    /* ... use ptr ... */
    free(ptr);  /* forgetting this call causes a memory leak */
    return 0;
}

In kernel space, large allocations use the buddy system, while small, frequent allocations use the slab allocator. The kernel first searches the appropriate free list; if none matches, it splits a larger block, or falls back to a slab cache.

3.2 Reclamation Strategies

The kernel employs an LRU‑based page reclamation algorithm. Pages are placed on two doubly‑linked lists: active_list (frequently accessed) and inactive_list (rarely accessed). Pages with the PG_referenced flag cleared are candidates for eviction.

When memory pressure rises, the kernel scans inactive_list from the tail, writing dirty pages out to swap or their backing files and simply discarding clean ones, then frees the physical frames. Active pages with a cleared reference flag are demoted to the inactive list, ensuring the most active pages stay resident.

Swap space provides additional backing storage. Anonymous pages are written to swap partitions, while file‑backed pages are written back to their files. The kswapd kernel thread monitors low‑watermarks and triggers reclamation as needed.

The /proc/sys/vm/swappiness parameter (0‑100) controls the aggressiveness of swapping anonymous pages versus reclaiming file cache.

4. Monitoring and Optimizing Memory Usage

4.1 Monitoring Tools

free shows a quick overview of total, used, free, buffered, cached, and available memory.

total        used        free      shared  buff/cache   available
Mem:          7.8Gi       317Mi       6.0Gi       1.0Mi       1.4Gi       7.2Gi
Swap:         4.0Gi          0B       4.0Gi

top provides a real‑time view, with columns such as VIRT, RES, and SHR. Press Shift+M to sort processes by memory usage.

vmstat reports system activity at intervals, including swap‑in/out rates (si, so) and free memory.

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
1  0      0 4777576 153688 1299500   0    0     0     0 1500 2098  0  0 98  0  0

The virtual file /proc/meminfo contains detailed counters (MemTotal, MemFree, MemAvailable, Buffers, Cached, etc.) that many tools parse for deeper analysis.

MemTotal:        3855952 kB
MemFree:         2040864 kB
MemAvailable:    3356504 kB
Buffers:           39224 kB
Cached:          1400764 kB
SwapTotal:            0 kB
SwapFree:             0 kB
...

4.2 Optimization Strategies

When memory pressure is observed, increasing swap size can provide relief. Create a swap partition with fdisk or parted, format it with mkswap /dev/sdXn, and enable it with swapon /dev/sdXn. Alternatively, create a swap file:

dd if=/dev/zero of=/swapfile bs=1M count=2048   # 2 GB file
chmod 600 /swapfile                             # swapon refuses world-readable files
mkswap /swapfile
swapon /swapfile

Application‑level optimizations include eliminating memory leaks (always pair malloc with free) and using tools like valgrind to detect leaks.

For Java applications, tune the JVM heap with -Xms and -Xmx to avoid frequent garbage‑collection cycles.

System‑wide tuning can adjust vm.swappiness (e.g., set to 10 in /etc/sysctl.conf and apply with sysctl -p) to favor keeping file cache in RAM and reducing unnecessary swapping.

Written by Deepin Linux

Research areas: Windows & Linux platforms, C/C++ backend development, embedded systems and Linux kernel, etc.