Understanding Linux Virtual Memory: Address Space Management, Segmentation, and Paging
This article explains how Linux manages virtual memory by describing the organization of virtual and physical address spaces, the mechanisms of segmentation and paging, multi‑level page tables, and the layout of user‑mode and kernel‑mode memory, including relevant kernel code examples.
Linux separates user‑mode and kernel‑mode memory management: user processes cannot directly access physical memory and must use virtual memory, while the kernel handles the actual physical memory through its memory‑management subsystem.
The article focuses on three key questions: how the virtual address space is managed, how the physical address space is managed, and how the two are mapped.
Virtual Address Space Management
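On x86, a virtual address can be interpreted in two ways: as a segment selector plus an offset (segmentation) or as a page number plus an offset (paging). Under segmentation, the selector indexes a segment descriptor table such as the Global Descriptor Table (GDT), where each entry holds a base address, a limit, and a privilege level. Modern Linux sets the base of every code and data segment to zero, as the GDT definitions below show (base 0, limit 0xfffff), so segmentation is used mainly for privilege checking and the real address translation is left to paging.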
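To make the selector format concrete, here is a minimal sketch of how a 16‑bit x86 segment selector splits into a table index, a table indicator, and a requested privilege level. The struct and function names are illustrative and do not come from the kernel. For example, the selector 0x10 decodes to GDT index 2 at privilege level 0, and 0x33 decodes to GDT index 6 at privilege level 3.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of an x86 segment selector (the 16-bit value loaded into CS, DS, ...). */
struct selector_fields {
    unsigned index; /* bits 15..3: entry number in the descriptor table */
    unsigned ti;    /* bit 2: 0 = GDT, 1 = LDT */
    unsigned rpl;   /* bits 1..0: requested privilege level (0 = kernel, 3 = user) */
};

/* Split a selector into its three fields. */
static struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;

    f.index = sel >> 3;
    f.ti    = (sel >> 2) & 0x1u;
    f.rpl   = sel & 0x3u;
    return f;
}

/* Build a GDT selector (TI = 0) from an entry index and a privilege level. */
static uint16_t make_gdt_selector(unsigned index, unsigned rpl)
{
    assert(index < 8192 && rpl < 4); /* 13-bit index, 2-bit RPL */
    return (uint16_t)((index << 3) | rpl);
}
```

Because the base is zero in every descriptor, the index only selects which privilege and type attributes apply; the offset passes through unchanged as the linear address.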
#define GDT_ENTRY_KERNEL32_CS 1
#define GDT_ENTRY_KERNEL_CS 2
#define GDT_ENTRY_KERNEL_DS 3
#define GDT_ENTRY_DEFAULT_USER32_CS 4
#define GDT_ENTRY_DEFAULT_USER_DS 5
#define GDT_ENTRY_DEFAULT_USER_CS 6
DEFINE_PER_CPU_PAGE_ALIGNED(struct gdt_page, gdt_page) = { .gdt = {
#ifdef CONFIG_X86_64
    [GDT_ENTRY_KERNEL32_CS]       = GDT_ENTRY_INIT(0xc09b, 0, 0xfffff),
    [GDT_ENTRY_KERNEL_CS]         = GDT_ENTRY_INIT(0xa09b, 0, 0xfffff),
    [GDT_ENTRY_KERNEL_DS]         = GDT_ENTRY_INIT(0xc093, 0, 0xfffff),
    [GDT_ENTRY_DEFAULT_USER32_CS] = GDT_ENTRY_INIT(0xc0fb, 0, 0xfffff),
    [GDT_ENTRY_DEFAULT_USER_DS]   = GDT_ENTRY_INIT(0xc0f3, 0, 0xfffff),
    [GDT_ENTRY_DEFAULT_USER_CS]   = GDT_ENTRY_INIT(0xa0fb, 0, 0xfffff),
#else
    [GDT_ENTRY_KERNEL_CS]         = GDT_ENTRY_INIT(0xc09a, 0, 0xfffff),
    [GDT_ENTRY_KERNEL_DS]         = GDT_ENTRY_INIT(0xc092, 0, 0xfffff),
    [GDT_ENTRY_DEFAULT_USER_CS]   = GDT_ENTRY_INIT(0xc0fa, 0, 0xfffff),
    [GDT_ENTRY_DEFAULT_USER_DS]   = GDT_ENTRY_INIT(0xc0f2, 0, 0xfffff),
    ...
#endif
} };
EXPORT_PER_CPU_SYMBOL_GPL(gdt_page);
Paging Mechanism
Paging divides virtual and physical memory into fixed‑size pages (typically 4 KiB). A virtual address is split into a page number and an offset within the page. On a 32‑bit system with a single‑level page table, 20 bits select the page and 12 bits give the offset, so the table needs 2^20 entries of 4 bytes each, or 4 MiB of page‑table memory per process, even if most of the address space is unused. To reduce this overhead, Linux uses multi‑level paging (two‑level on classic 32‑bit x86, four‑level on x86‑64): the page directory points to page tables, which in turn point to physical pages, and page tables for unused regions are simply never allocated.
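The arithmetic behind these figures can be checked with a small sketch. The helper names are hypothetical, and the two‑level case assumes the classic 32‑bit layout of 1024 four‑byte entries per table.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed for a flat, single-level table that covers the whole
 * virtual address space: one entry per virtual page. */
static uint64_t flat_table_bytes(unsigned va_bits, unsigned page_shift,
                                 unsigned pte_bytes)
{
    assert(va_bits > page_shift && va_bits - page_shift < 64);
    return ((uint64_t)1 << (va_bits - page_shift)) * pte_bytes;
}

/* Bytes needed for a two-level layout when only `tables_in_use`
 * second-level page tables have been allocated: one page directory
 * plus that many page tables, each 1024 entries * 4 bytes = 4 KiB. */
static uint64_t two_level_bytes(unsigned tables_in_use)
{
    const uint64_t table_bytes = 1024u * 4u;

    assert(tables_in_use <= 1024);
    return table_bytes + (uint64_t)tables_in_use * table_bytes;
}
```

With these definitions, flat_table_bytes(32, 12, 4) evaluates to 4194304 bytes (4 MiB), while a process that touches only three 4 MiB regions needs two_level_bytes(3), which is 16384 bytes (16 KiB).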
Translation steps for a two‑level page table:
1. Use the top 10 bits of the virtual address to locate a page‑directory entry (PDE) that references a page table.
2. Use the next 10 bits to select a page‑table entry (PTE) containing the physical page number and flags.
3. Add the low 12‑bit offset to the physical page base to obtain the final physical address.
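The steps above can be sketched as a user‑space simulation. This is not kernel code: the types and names are made up for illustration, and the directory holds ordinary C pointers, whereas a real PDE stores a physical frame address plus flag bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ENTRIES     1024u        /* 2^10 entries per level */
#define PTE_PRESENT 0x1u         /* bit 0: mapping is valid */
#define FRAME_MASK  0xFFFFF000u  /* upper 20 bits: physical page base */

struct page_table     { uint32_t pte[ENTRIES]; };
struct page_directory { struct page_table *pde[ENTRIES]; };

/* Translate a 32-bit virtual address.
 * Returns 0 and stores the result in *pa, or -1 if nothing is mapped. */
static int translate(const struct page_directory *pd, uint32_t va, uint32_t *pa)
{
    uint32_t dir_idx = va >> 22;             /* step 1: top 10 bits */
    uint32_t tbl_idx = (va >> 12) & 0x3FFu;  /* step 2: next 10 bits */
    uint32_t offset  = va & 0xFFFu;          /* step 3: low 12 bits */
    const struct page_table *pt;
    uint32_t pte;

    assert(pd != NULL && pa != NULL);

    pt = pd->pde[dir_idx];
    if (pt == NULL)
        return -1;                           /* no page table for this region */

    pte = pt->pte[tbl_idx];
    if (!(pte & PTE_PRESENT))
        return -1;                           /* page not present */

    *pa = (pte & FRAME_MASK) + offset;
    return 0;
}
```

For instance, if directory slot 1 points to a table whose entry 2 holds 0x00ABC000 with the present bit set, the virtual address 0x00402345 translates to the physical address 0x00ABC345.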
User‑Mode Memory Layout
The kernel represents each process with struct task_struct, which contains a pointer to struct mm_struct. The mm_struct holds fields such as mmap (the linked list of memory areas), task_size (the upper limit of user space), and total_vm (the total number of mapped pages), along with others that record where the code, data, heap, stack, and argument regions begin and end.
struct task_struct {
    ...
    struct mm_struct *mm;
    ...
};
During execve, the kernel calls setup_new_exec to initialise mm_struct, then maps the ELF binary sections (code, data) via elf_map, sets up the stack with setup_arg_pages, and creates the initial heap with set_brk. The resulting user‑space layout includes text, data, BSS, heap, stack, and the argument/environment vectors.
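One way to observe this layout from user space is to sample the address of one object from each region. The sketch below uses hypothetical names; it assumes a hosted C environment and a platform where a function pointer can be converted to uintptr_t (true on Linux). Where a small malloc allocation lands depends on the allocator, so the heap sample is only indicative.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

static int initialized_global = 42; /* expected in the data segment */
static int zeroed_global;           /* expected in BSS */

struct layout_sample {
    uintptr_t text, data, bss, heap, stack;
};

/* Record one representative address per region.
 * Returns 0 on success, -1 if the heap allocation fails. */
static int sample_layout(struct layout_sample *out)
{
    int local = 0;                /* lives on the stack */
    void *heap_obj = malloc(16);  /* small request, usually served from the heap */

    assert(out != NULL);
    if (heap_obj == NULL)
        return -1;

    out->text  = (uintptr_t)&sample_layout;
    out->data  = (uintptr_t)&initialized_global;
    out->bss   = (uintptr_t)&zeroed_global;
    out->heap  = (uintptr_t)heap_obj;
    out->stack = (uintptr_t)&local;

    free(heap_obj);
    return 0;
}
```

Printing the five fields with printf and the PRIxPTR format macro, and comparing them with /proc/self/maps, shows which mapping each address falls into.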
Kernel‑Mode Memory Layout
All processes share a single kernel virtual address space. In 32‑bit kernels the layout is roughly:
First 896 MiB of kernel space: direct‑map region, where virtual address = physical address + PAGE_OFFSET.
8 MiB hole for guard pages.
VMALLOC region for dynamic kernel allocations.
PKMAP/PER‑CPU region for persistent mappings.
FIXED mapping area for special purposes.
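The direct‑map rule for the 32‑bit layout can be sketched as plain arithmetic. The constants assume the common 3 GiB/1 GiB split, in which PAGE_OFFSET is 0xC0000000; the macro and function names are illustrative and are not the kernel's own helpers.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_OFFSET_32 0xC0000000u  /* assumed 3G/1G user/kernel split */
#define LOWMEM_BYTES   (896u << 20) /* size of the direct-map window */

/* True if a physical address is covered by the direct map. */
static int is_lowmem(uint32_t phys)
{
    return phys < LOWMEM_BYTES;
}

/* Direct-map translation: add the fixed offset. */
static uint32_t phys_to_virt_32(uint32_t phys)
{
    assert(is_lowmem(phys));
    return phys + PAGE_OFFSET_32;
}

/* Inverse translation: subtract the fixed offset. */
static uint32_t virt_to_phys_32(uint32_t virt)
{
    assert(virt >= PAGE_OFFSET_32 && virt - PAGE_OFFSET_32 < LOWMEM_BYTES);
    return virt - PAGE_OFFSET_32;
}
```

Under these assumptions, physical address 0x00100000 maps to virtual address 0xC0100000. Physical memory at or above 896 MiB (0x38000000) falls outside the window and has to be reached through the other regions in the list.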
In 64‑bit kernels the kernel half of the address space starts at 0xffff800000000000 with a large hole, followed by a direct‑map region, a vmalloc region, and further special‑purpose mapping areas. The address space is vastly larger: with four‑level paging, user space and kernel space each get up to 128 TiB.
brk System Call (Heap Management)
The brk syscall (defined as SYSCALL_DEFINE1(brk, unsigned long, brk)) adjusts the end of the process heap. The kernel page‑aligns both the old and the new break. If the two fall on the same page boundary, no mapping changes and only the recorded break is updated. If the new break is lower and crosses a page boundary, the kernel releases the freed pages with __do_munmap. If it is higher, the kernel uses find_vma to check that the enlarged heap would not run into the next vm_area_struct, then extends the heap mapping.
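The three cases can be sketched as a small decision function. This is a simplified model rather than the kernel's implementation: the names are invented, the page size is assumed to be 4 KiB, and checks such as resource limits are left out.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE 4096UL
/* Round an address up to the next page boundary. */
#define SKETCH_PAGE_ALIGN(a) \
    (((a) + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1))

enum brk_action {
    BRK_SAME_PAGE, /* only the recorded break changes */
    BRK_SHRINK,    /* whole pages are unmapped */
    BRK_EXPAND     /* the heap mapping grows */
};

/* Decide what a brk request implies for the heap mapping. */
static enum brk_action classify_brk(unsigned long old_brk, unsigned long new_brk)
{
    unsigned long old_end, new_end;

    /* Guard against wrap-around when rounding up. */
    assert(old_brk <= ~0UL - SKETCH_PAGE_SIZE);
    assert(new_brk <= ~0UL - SKETCH_PAGE_SIZE);

    old_end = SKETCH_PAGE_ALIGN(old_brk);
    new_end = SKETCH_PAGE_ALIGN(new_brk);

    if (old_end == new_end)
        return BRK_SAME_PAGE;
    if (new_brk <= old_brk)
        return BRK_SHRINK;
    return BRK_EXPAND;
}
```

Moving the break from 0x1000100 to 0x1000800 stays within one page, so it is classified as BRK_SAME_PAGE, while moving it to 0x1003000 crosses page boundaries and yields BRK_EXPAND.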
360 Smart Cloud