
How Linux Manages User and Kernel Memory: A Deep Dive into mm_struct and Virtual Address Layout

This article explains the complete memory layout of a Linux process, detailing user‑space structures like mm_struct and vm_area_struct, the role of TASK_SIZE, how ELF binaries are loaded, the sys_brk implementation, and the differences between 32‑bit and 64‑bit kernel virtual address spaces.

Liangxu Linux

User‑Space Memory Layout

When a process starts, its virtual address space is divided into several regions: code (text), initialized data, uninitialized data (BSS), heap, stack, and memory‑mapped areas. The kernel describes this address space with struct mm_struct, reached through task_struct->mm.

Key fields include:

struct mm_struct *mm;      /* in task_struct: points to the address space */
unsigned long task_size;   /* in mm_struct: size of the task's virtual address space */

task_size defines the boundary between user space and kernel space. On 32‑bit systems, #define TASK_SIZE PAGE_OFFSET (0xC0000000) gives user space the lower 3 GB and kernel space the upper 1 GB. On 64‑bit x86 the limit is ((1UL << 47) - PAGE_SIZE), yielding a 128 TB user space and an equally large kernel region at the top of the canonical address range.

mm_struct Fields Describing User‑Space Regions

unsigned long mmap_base;   /* base of mmap area */
unsigned long total_vm;   /* total pages mapped */
unsigned long locked_vm;  /* pages locked in memory */
unsigned long pinned_vm;  /* pages that cannot be moved */
unsigned long data_vm;    /* VM_WRITE & ~VM_SHARED & ~VM_STACK */
unsigned long exec_vm;    /* VM_EXEC & ~VM_WRITE & ~VM_STACK */
unsigned long stack_vm;   /* VM_STACK */
unsigned long start_code, end_code;
unsigned long start_data, end_data;
unsigned long start_brk, brk, start_stack;
unsigned long arg_start, arg_end, env_start, env_end;

These members track the size and location of each region. For example, start_brk and brk delimit the heap; start_stack marks the stack base; mmap_base is the starting address for memory‑mapped areas, which grow downward from high addresses.

vm_area_struct – Describing Individual Regions

Inside mm_struct, the VMAs themselves are kept on a sorted linked list and in a red‑black tree for fast lookup (recent kernels replace both with a maple tree):

struct vm_area_struct *mmap;   /* sorted list of VMAs */
struct rb_root mm_rb;          /* red-black tree of VMAs */

Each virtual memory area (VMA) is represented by struct vm_area_struct:

struct vm_area_struct {
    unsigned long vm_start;   /* start address */
    unsigned long vm_end;     /* end address (first byte after) */
    struct vm_area_struct *vm_next, *vm_prev;
    struct rb_node vm_rb;
    struct mm_struct *vm_mm;
    const struct vm_operations_struct *vm_ops;
    struct file *vm_file;    /* file backing the VMA, if any */
    void *vm_private_data;
} __randomize_layout;
vm_start and vm_end define the region, while the linked‑list and red‑black‑tree fields allow fast lookup and modification.

Loading an ELF Binary (load_elf_binary)

The function load_elf_binary performs several steps to set up the process address space:

1. Calls setup_new_exec to initialise mmap_base.

2. Calls setup_arg_pages to create the stack VMA and set mm->arg_start and mm->start_stack.

3. Maps the ELF code segment with elf_map.

4. Initialises the heap via set_brk, setting mm->start_brk = mm->brk.

5. Maps the dynamic linker (which in turn loads the required shared libraries) with load_elf_interp.

After these steps the process address space is fully laid out.

sys_brk System Call Implementation

SYSCALL_DEFINE1(brk, unsigned long, brk)
{
    unsigned long newbrk, oldbrk;
    struct mm_struct *mm = current->mm;
    struct vm_area_struct *next;
    LIST_HEAD(uf);          /* userfaultfd unmap list (locking elided) */

    newbrk = PAGE_ALIGN(brk);
    oldbrk = PAGE_ALIGN(mm->brk);
    if (oldbrk == newbrk)
        goto set_brk;

    /* Shrink: release the pages between the new and old break. */
    if (brk <= mm->brk) {
        if (!do_munmap(mm, newbrk, oldbrk - newbrk, &uf))
            goto set_brk;
        goto out;
    }

    /* Expand: refuse if the heap would run into the next VMA. */
    next = find_vma(mm, oldbrk);
    if (next && newbrk + PAGE_SIZE > vm_start_gap(next))
        goto out;
    if (do_brk(oldbrk, newbrk - oldbrk, &uf) < 0)
        goto out;

set_brk:
    mm->brk = brk;
    return brk;

out:
    return mm->brk;         /* on failure, report the unchanged break */
}

The call aligns the old and new break values to page boundaries, shrinks the heap with do_munmap when decreasing, or expands it after checking for overlapping VMAs. Expansion ultimately calls do_brk, which creates or merges a new vm_area_struct for the heap.

do_brk and VMA Creation

static int do_brk(unsigned long addr, unsigned long len, struct list_head *uf)
{
    return do_brk_flags(addr, len, 0, uf);
}

static int do_brk_flags(unsigned long addr, unsigned long request,
                        unsigned long flags, struct list_head *uf)
{
    struct mm_struct *mm = current->mm;
    struct vm_area_struct *vma, *prev;
    struct rb_node **rb_link, *rb_parent;
    unsigned long len = PAGE_ALIGN(request);
    pgoff_t pgoff = addr >> PAGE_SHIFT;

    /* Locate the insertion point in the list and red-black tree. */
    find_vma_links(mm, addr, addr + len, &prev, &rb_link, &rb_parent);

    /* First try to extend an adjacent VMA with compatible flags. */
    vma = vma_merge(mm, prev, addr, addr + len, flags, NULL, NULL,
                    pgoff, NULL, NULL_VM_UFFD_CTX);
    if (vma)
        goto out;

    /* Otherwise allocate and link a fresh vm_area_struct. */
    vma = kmem_cache_zalloc(vm_area_cachep, GFP_KERNEL);
    if (!vma)
        return -ENOMEM;
    vma->vm_mm = mm;
    vma->vm_start = addr;
    vma->vm_end = addr + len;
    vma->vm_pgoff = pgoff;
    vma->vm_flags = flags;
    vma_link(mm, vma, prev, rb_link, rb_parent);
out:
    /* Account the new pages against the process's VM counters. */
    mm->total_vm += len >> PAGE_SHIFT;
    mm->data_vm += len >> PAGE_SHIFT;
    if (flags & VM_LOCKED)
        mm->locked_vm += len >> PAGE_SHIFT;
    vma->vm_flags |= VM_SOFTDIRTY;
    return 0;
}

If the new heap region can be merged with an existing VMA, no new structure is allocated; otherwise a fresh vm_area_struct is created and linked into both the list and the red‑black tree.

Kernel‑Space Layout (32‑bit)

In 32‑bit kernels the virtual address space is 1 GB, with the lower 896 MB used for a direct‑mapped zone that provides a 1‑to‑1 mapping to the first 896 MB of physical memory (virtual address minus PAGE_OFFSET).

Within the direct‑mapped zone, __pa(vaddr) converts a virtual address to its physical counterpart and __va(paddr) does the reverse.

High memory (physical addresses above 896 MB) is not permanently mapped; it is accessed via kmap / kmap_atomic after allocating a struct page.

Above the direct map, an 8 MB guard gap separates it from VMALLOC_START. The range VMALLOC_START to VMALLOC_END is the kernel's dynamic allocation area (vmalloc). PKMAP_BASE to FIXADDR_START is the persistent kernel mapping area for high‑memory pages, and FIXADDR_START to FIXADDR_TOP is reserved for fixed mappings.

Kernel structures such as task_struct, kernel stacks, and code/data segments are placed within the direct‑mapped zone.

Kernel‑Space Layout (64‑bit)

64‑bit kernels have a vastly larger virtual address space, eliminating the need for a separate high‑memory region. The layout includes:

From 0xffff800000000000 upward lies kernel space, beginning with an 8 TB hole.

__PAGE_OFFSET_BASE (0xffff880000000000): start of a 64 TB direct‑mapped region where subtracting PAGE_OFFSET from a virtual address yields the physical address.

VMALLOC_START (0xffffc90000000000) to VMALLOC_END (0xffffe90000000000): a 32 TB vmalloc area.

VMEMMAP_START (0xffffea0000000000): a 1 TB region storing the struct page descriptors for all physical pages.

__START_KERNEL_map (0xffffffff80000000): a 512 MB region containing the kernel's code, globals, and BSS.

This completes the description of Linux’s virtual memory layout for both user and kernel spaces.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Tags: memory management, mm_struct, sys_brk, Virtual Address Space
Written by

Liangxu Linux

Liangxu, a self‑taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge—fundamentals, applications, tools, plus Git, databases, Raspberry Pi, etc. (Reply “Linux” to receive essential resources.)
