Fundamentals 15 min read

How Does a Linux Process Use Memory? A Deep Dive

This article explains the Linux kernel’s memory management for a process, covering virtual address spaces, page‑fault handling, VMA allocation, stack and heap initialization, sbrk/brk usage, thread‑stack creation, and the glibc ptmalloc heap allocator with concrete code examples and diagrams.

Linux Kernel Journey

Each Linux process has its own virtual address space represented by a mm_struct inside the process’s task_struct. The virtual memory is divided into regions called VMAs, each described by a vm_area_struct.
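The VMA list of a running process can be observed from user space. The following is a minimal sketch, assuming a Linux system with /proc mounted; the function name dump_vmas is hypothetical and not part of any kernel or libc API. Each line of /proc/self/maps corresponds to one vm_area_struct of the calling process.

```c
#include <assert.h>
#include <stdio.h>

/* Copy /proc/self/maps to `out` (skipped when out is NULL) and return the
 * number of lines seen. Each line describes one VMA of the calling process.
 * Returns -1 if the file cannot be opened.
 * dump_vmas is a hypothetical helper name used only for this sketch. */
long dump_vmas(FILE *out)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    if (maps == NULL)
        return -1;

    long lines = 0;
    int c;
    int last = '\n';
    while ((c = fgetc(maps)) != EOF) {
        if (out != NULL)
            fputc(c, out);
        if (c == '\n')
            lines++;
        last = c;
    }
    if (last != '\n')   /* count a final line that lacks a newline */
        lines++;

    fclose(maps);
    assert(lines >= 0);
    return lines;
}
```

Calling dump_vmas(stdout) prints one line per region: the address range, the permissions, and the backing file or a pseudo-name such as [heap] or [stack].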

When a process accesses a virtual address that is not yet backed by physical memory, the CPU raises a page fault. The kernel entry point do_user_addr_fault locates the relevant VMA via find_vma, then calls handle_mm_fault, which in turn calls __handle_mm_fault. The fault information is packed into a struct vm_fault that holds the faulting VMA and address together with pointers to the page‑table entries (pmd, pud, pte).
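Demand paging can be made visible from user space by counting minor faults around the first touch of freshly mapped memory. This is a rough sketch, assuming Linux and a libc that reports ru_minflt through getrusage; the helper name faults_from_touching is hypothetical. The exact count varies with kernel configuration (for example, transparent huge pages), so treat the number as indicative only.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Map `npages` of anonymous memory, write one byte to each page, and return
 * how many minor page faults the process took in between (-1 on error).
 * faults_from_touching is a hypothetical helper name for this sketch. */
long faults_from_touching(size_t npages)
{
    assert(npages > 0);

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    size_t len = npages * (size_t)page;

    void *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    struct rusage before, after;
    long result = -1;
    if (getrusage(RUSAGE_SELF, &before) == 0) {
        volatile unsigned char *p = mem;
        for (size_t i = 0; i < npages; i++)
            p[i * (size_t)page] = 1;   /* first write normally faults the page in */

        if (getrusage(RUSAGE_SELF, &after) == 0)
            result = after.ru_minflt - before.ru_minflt;
    }

    munmap(mem, len);
    return result;
}
```

The mmap call only creates a VMA. Physical pages are allocated later, inside the fault path described above, when each page is first written.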

Walking the four‑level page table, __handle_mm_fault checks each level in turn and allocates any page‑table page that is missing. For anonymous memory the path continues to do_anonymous_page, which calls alloc_zeroed_user_highpage_movable and ultimately alloc_pages to obtain a zero-filled physical page.
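To make the four-level walk concrete, the sketch below splits a virtual address into the index used at each level. It assumes the x86-64 layout with 4 KB pages (9 index bits per level and a 12-bit page offset), which the text above does not state explicitly; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Indices selected by a user-space virtual address at each page-table level.
 * Assumption: x86-64, four levels, 4 KB pages (9+9+9+9 index bits, 12 offset bits). */
struct va_parts {
    unsigned pgd;     /* bits 47..39 */
    unsigned pud;     /* bits 38..30 */
    unsigned pmd;     /* bits 29..21 */
    unsigned pte;     /* bits 20..12 */
    unsigned offset;  /* bits 11..0, byte offset inside the page */
};

/* split_vaddr is a hypothetical helper, not a kernel function. */
struct va_parts split_vaddr(uint64_t va)
{
    assert(va < (1ULL << 47));   /* lower (user) half of the address space */

    struct va_parts parts;
    parts.pgd    = (unsigned)((va >> 39) & 0x1ff);
    parts.pud    = (unsigned)((va >> 30) & 0x1ff);
    parts.pmd    = (unsigned)((va >> 21) & 0x1ff);
    parts.pte    = (unsigned)((va >> 12) & 0x1ff);
    parts.offset = (unsigned)(va & 0xfff);
    return parts;
}
```

Each index selects one of 512 entries in the table at that level. A missing entry at any level is what causes __handle_mm_fault to allocate a new page-table page.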

During process startup, after the ELF file is parsed, the kernel creates the address space, sets up an initial stack VMA of one page (4 KB), maps the ELF segments (and the dynamic linker, if the binary requests one) via elf_map, and initializes the heap. Each of these steps allocates a VMA through vm_area_alloc. For the stack, __bprm_mm_init sets vma->vm_end = STACK_TOP_MAX and vma->vm_start = vma->vm_end - PAGE_SIZE. For the executable and its libraries, mmap_region creates the VMAs, and for the heap do_brk_flags either extends an existing VMA through vma_merge or allocates a new one.
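The stack and heap VMAs set up at startup show up under the pseudo-names [stack] and [heap] in /proc/self/maps. The sketch below looks up one of them by name; it assumes Linux with /proc mounted, and find_named_vma is a hypothetical helper. When called from the main thread, the address of a local variable should fall inside the [stack] range.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Find the first line of /proc/self/maps that contains `name`
 * (for example "[stack]" or "[heap]") and report its address range.
 * Returns 0 on success, -1 if the file or the region is not found.
 * find_named_vma is a hypothetical helper name for this sketch. */
int find_named_vma(const char *name, uintptr_t *start, uintptr_t *end)
{
    assert(name != NULL && start != NULL && end != NULL);

    FILE *maps = fopen("/proc/self/maps", "r");
    if (maps == NULL)
        return -1;

    char line[512];
    int found = -1;
    while (fgets(line, sizeof line, maps) != NULL) {
        unsigned long lo, hi;
        /* Each line starts with "<start>-<end>" in hexadecimal. */
        if (strstr(line, name) != NULL &&
            sscanf(line, "%lx-%lx", &lo, &hi) == 2) {
            *start = (uintptr_t)lo;
            *end = (uintptr_t)hi;
            found = 0;
            break;
        }
    }

    fclose(maps);
    return found;
}
```

The [heap] entry may be absent until the program break has moved for the first time, so a lookup for it can legitimately return -1 early in a process's life.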

The sbrk / brk system calls adjust the program break stored in mm_struct->brk. An example program shows sbrk(0) to read the current break, brk(curr_brk+4096) to grow it, and brk(old_brk) to shrink it, with the resulting changes visible in /proc/<pid>/maps.

All threads in a process share the same address space, but each thread has its own stack. pthread_create resolves to __pthread_create_2_1, which calls allocate_stack to determine the stack size, obtain memory via get_cached_stack or mmap, and fill in a struct pthread (fields tid, stackblock, stackblock_size). The stack is limited to 32 MB. Automatic growth of a stack VMA is handled in the kernel by expand_stack and expand_downwards, which check the stack limit (the one ulimit -s reports) via acct_stack_growth. Thread creation finally issues the clone system call, which creates a new task_struct that shares the parent’s mm_struct, fs_struct, and file descriptor table.
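The idea that a thread stack is just an mmap'ed region can be reproduced from user space. The sketch below does not call glibc's internal allocate_stack; instead it maps a region itself and hands it to the thread through the POSIX pthread_attr_setstack call, then checks that the thread's local variables live inside that region. The names stack_probe, probe_stack, and run_thread_on_mmap_stack are hypothetical, and the program may need to be linked with -pthread on older glibc versions.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <limits.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

struct stack_probe {
    uintptr_t lo, hi;  /* bounds of the caller-provided stack region */
    int inside;        /* set by the thread: 1 if its local lies in [lo, hi) */
};

static void *probe_stack(void *arg)
{
    struct stack_probe *sp = arg;
    int local = 0;
    uintptr_t addr = (uintptr_t)&local;
    sp->inside = (addr >= sp->lo && addr < sp->hi);
    return NULL;
}

/* Run a thread on a stack obtained directly from mmap.
 * Returns 1 if the thread's locals were inside that mapping, 0 if not,
 * and -1 if any setup step failed. */
int run_thread_on_mmap_stack(size_t stack_size)
{
    assert(stack_size >= (size_t)PTHREAD_STACK_MIN);

    void *stack = mmap(NULL, stack_size, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
    if (stack == MAP_FAILED)
        return -1;

    struct stack_probe sp = { (uintptr_t)stack,
                              (uintptr_t)stack + stack_size, 0 };
    pthread_attr_t attr;
    pthread_t tid;
    int rc = -1;

    if (pthread_attr_init(&attr) == 0) {
        if (pthread_attr_setstack(&attr, stack, stack_size) == 0 &&
            pthread_create(&tid, &attr, probe_stack, &sp) == 0 &&
            pthread_join(tid, NULL) == 0)
            rc = sp.inside;
        pthread_attr_destroy(&attr);
    }

    munmap(stack, stack_size);   /* safe: the thread has been joined */
    return rc;
}
```

A call such as run_thread_on_mmap_stack(1 << 20) runs the thread on a 1 MiB mapping. Because the region is an ordinary mapping of fixed size, it does not grow the way the main thread's stack VMA can.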

The glibc ptmalloc heap allocator is illustrated with its main arena (static struct malloc_state main_arena) and the malloc_chunk layout (header fields mchunk_prev_size, mchunk_size, and linked‑list pointers). Free chunks are organized into fastbins, smallbins, largebins, the unsorted bin, and the top chunk. Macros such as fastbin_index(sz), smallbin_index(sz), and largebin_index_64(sz) compute the appropriate bin. Allocation starts in _int_malloc, which normalizes the request size, then tries the fastbins, the smallbins, the unsorted bin, and finally the top chunk; if the top chunk is too small, sysmalloc asks the kernel for more memory, either by extending the heap with brk or by calling mmap.
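The size normalization and bin selection can be worked through with plain arithmetic. The sketch below re-implements the calculation outside glibc under stated assumptions: a 64-bit build where SIZE_SZ is 8, chunks are 16-byte aligned, and the minimum chunk size is 32 bytes. The function and constant names are hypothetical stand-ins, not glibc symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed 64-bit parameters (typical x86-64 glibc build, not read from glibc). */
enum {
    SIZE_SZ_64    = 8,   /* bytes in one size field of the chunk header */
    ALIGN_MASK_64 = 15,  /* 16-byte chunk alignment, minus one */
    MIN_CHUNK_64  = 32   /* smallest chunk the allocator hands out */
};

/* Chunk size used for a request of `req` bytes: add one size field,
 * round up to the alignment, and never go below the minimum chunk. */
size_t chunk_size_for_request(size_t req)
{
    assert(req <= 0xffffffffu);  /* keep the sketch clear of overflow handling */
    size_t padded = req + SIZE_SZ_64 + ALIGN_MASK_64;
    if (padded < MIN_CHUNK_64)
        return MIN_CHUNK_64;
    return padded & ~(size_t)ALIGN_MASK_64;
}

/* Same arithmetic as fastbin_index(sz) when SIZE_SZ == 8. */
unsigned fastbin_index_64(size_t chunk)
{
    assert(chunk >= MIN_CHUNK_64);
    return (unsigned)(chunk >> 4) - 2;
}

/* Same arithmetic as smallbin_index(sz) when SMALLBIN_WIDTH == 16
 * and SMALLBIN_CORRECTION == 0. */
unsigned smallbin_index_64(size_t chunk)
{
    return (unsigned)(chunk >> 4);
}
```

Under these assumptions a 24-byte request fits in a 32-byte chunk (fastbin 0, smallbin 2), while a 25-byte request needs a 48-byte chunk (fastbin 1, smallbin 3).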

/* Abbreviated struct vm_fault: state carried through the page-fault walk */
struct vm_fault {
  const struct {
    struct vm_area_struct *vma; /* faulting VMA */
    unsigned long address;      /* faulting address */
  };
  pmd_t *pmd;   /* PMD entry matching the address */
  pud_t *pud;   /* PUD entry matching the address */
  pte_t *pte;   /* PTE matching the address; NULL until the table exists */
};
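/* Abbreviated: set up the initial one-page stack VMA during exec */
static int __bprm_mm_init(struct linux_binprm *bprm) {
  struct vm_area_struct *vma = NULL;
  struct mm_struct *mm = bprm->mm;
  bprm->vma = vma = vm_area_alloc(mm);
  if (!vma) return -ENOMEM;
  /* ... */
  vma->vm_end = STACK_TOP_MAX;             /* top of the user address space */
  vma->vm_start = vma->vm_end - PAGE_SIZE; /* initially a single page */
  bprm->p = vma->vm_end - sizeof(void *);  /* initial stack pointer */
  return 0;
}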
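/* Demo: move the program break and watch [heap] in /proc/<pid>/maps */
#define _DEFAULT_SOURCE /* exposes brk()/sbrk() under strict -std= modes */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main(void) {
  /* Unbuffered stdio, so glibc does not malloc I/O buffers and move the break itself */
  setvbuf(stdin, NULL, _IONBF, 0);
  setvbuf(stdout, NULL, _IONBF, 0);
  char *old_brk = sbrk(0);               /* current program break */
  printf("pid %d, break %p\n", (int)getpid(), (void *)old_brk);
  getchar();                             /* pause: inspect /proc/<pid>/maps */
  if (brk(old_brk + 4096) != 0) {        /* grow the heap by one page */
    perror("brk");
    return EXIT_FAILURE;
  }
  printf("after grow: %p\n", sbrk(0));
  getchar();
  if (brk(old_brk) != 0) {               /* restore the original break */
    perror("brk");
    return EXIT_FAILURE;
  }
  printf("after shrink: %p\n", sbrk(0));
  getchar();
  return 0;
}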
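/* Abbreviated from glibc's pthread_create implementation */
int __pthread_create_2_1(pthread_t *newthread, const pthread_attr_t *attr,
          void *(*start_routine)(void *), void *arg) {
    /* ... declarations of iattr, stackaddr, stacksize, retval omitted ... */
    struct pthread *pd = NULL;
    /* Pick a stack size, then reuse a cached stack or mmap a new one */
    int err = allocate_stack(iattr, &pd, &stackaddr, &stacksize);
    /* ... */
    /* Start the new thread on that stack (ends in the clone system call) */
    retval = create_thread(pd, iattr, &stopped_start, stackaddr,
            stacksize, &thread_ran);
    /* ... */
}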
static struct malloc_state main_arena = {
  .mutex = _LIBC_LOCK_INITIALIZER,
  .next = &main_arena,
  .attached_threads = 1
};
struct malloc_chunk {
  INTERNAL_SIZE_T mchunk_prev_size;  /* size of previous chunk (if free) */
  INTERNAL_SIZE_T mchunk_size;       /* size of this chunk, including overhead */
  struct malloc_chunk *fd;           /* forward link (used only if free) */
  struct malloc_chunk *bk;           /* backward link (used only if free) */
  struct malloc_chunk *fd_nextsize;  /* large bins only: next larger size */
  struct malloc_chunk *bk_nextsize;  /* large bins only: previous size */
};
#define fastbin_index(sz) ((((unsigned int)(sz)) >> (SIZE_SZ == 8 ? 4 : 3)) - 2)
#define smallbin_index(sz) ((SMALLBIN_WIDTH == 16 ? ((unsigned)(sz)) >> 4 : ((unsigned)(sz)) >> 3) + SMALLBIN_CORRECTION)
#define largebin_index_64(sz) (\
    ((((unsigned long)(sz)) >> 6) <= 48) ? 48 + (((unsigned long)(sz)) >> 6) : \
    ((((unsigned long)(sz)) >> 9) <= 20) ? 91 + (((unsigned long)(sz)) >> 9) : \
    ((((unsigned long)(sz)) >> 12) <= 10) ? 110 + (((unsigned long)(sz)) >> 12) : \
    ((((unsigned long)(sz)) >> 15) <= 4) ? 119 + (((unsigned long)(sz)) >> 15) : \
    ((((unsigned long)(sz)) >> 18) <= 2) ? 124 + (((unsigned long)(sz)) >> 18) : 126)
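/* Heavily abbreviated outline of glibc's _int_malloc */
static void *_int_malloc(mstate av, size_t bytes) {
    INTERNAL_SIZE_T nb = checked_request2size(bytes); /* padded, aligned size */
    mchunkptr victim;
    INTERNAL_SIZE_T size;
    int iters = 0;
    if ((unsigned long)nb <= (unsigned long)get_max_fast()) {
        /* fastbin allocation */
    }
    if (in_smallbin_range(nb)) {
        /* smallbin allocation */
    }
    for (;;) {
        /* search the unsorted bin, giving up after MAX_ITERS chunks */
        victim = unsorted_chunks(av)->bk;
        if (++iters >= MAX_ITERS) break;
    }
use_top:
    victim = av->top;
    size = chunksize(victim);
    if ((unsigned long)size >= (unsigned long)(nb + MINSIZE)) {
        /* split the request off the top chunk and return it */
    }
    /* top chunk too small: ask the system for more memory */
    void *p = sysmalloc(nb, av);
    return p;
}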
/* Abbreviated sysmalloc: only one mmap call is shown */
static void *sysmalloc(INTERNAL_SIZE_T nb, mstate av) {
    char *mm;
    /* ... */
    mm = sysmalloc_mmap(nb, mp_.hp_pagesize, mp_.hp_flags, av);
    return mm;
}
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
