Linux Kernel Source Analysis: Understanding Heap Memory Management
This article explains how the Linux kernel implements heap memory management, covering the heap data structure, the mm_struct fields start_brk and brk, the brk/sbrk and malloc/free allocation methods, the brk and mmap system calls, and the internal glibc structures heap_info and malloc_state that track heap state.
Heap
The heap is the region of a process's memory used for dynamic allocation: it stores and manages objects created at runtime. The classic brk‑managed heap occupies a contiguous range of virtual addresses.
Conceptually the heap consists of a chain of memory blocks, each block holding a pointer to the next. When a program requests memory, the heap allocator finds a sufficiently large contiguous region and returns it, allowing objects to be created and destroyed without compile‑time size knowledge.
Unlike the stack, whose frames are allocated and released automatically on function entry and exit, heap allocation is not automatic; it requires explicit calls such as malloc or new, and the memory must be released explicitly with free or delete. This manual control lets programs adjust memory usage to actual needs.
Heap Memory Management
In Linux, heap memory is the region used for dynamic allocation. It is managed by the programmer through functions like malloc(), calloc(), or realloc().
Key characteristics of heap memory:
Variable size – the heap can grow or shrink at runtime.
Manual management – developers must allocate and free memory, avoiding leaks and dangling pointers.
Random access – any allocated block can be accessed directly.
Long lifetime – a block persists until it is explicitly freed or the process exits.
The heap is a contiguous virtual‑address region that typically grows upward (from lower to higher addresses). The Linux kernel implements heap management via system calls and internal functions; developers interact with it through the brk and mmap system calls.
The kernel’s mm_struct structure holds the fields start_brk (heap start address) and brk (heap end address). The address range between them defines the current heap size.
By changing the end address through the brk system call, a process asks the kernel to expand or shrink the heap; the kernel updates mm->brk and with it the virtual address range the heap occupies.
mm_struct start_brk and brk members
start_brk represents the heap’s starting address. brk represents the heap’s ending address.
The virtual address space also contains other regions (code, data, stack). The heap occupies the region between start_brk and brk.
Dynamic Allocation of Heap Memory
Linux provides two common mechanisms for dynamic heap allocation:
brk() and sbrk() – legacy UNIX interfaces (marked LEGACY in SUSv2 and removed from POSIX.1‑2001). brk() sets the heap’s end address directly; sbrk() moves the end address by a signed increment to allocate or release memory.
malloc() and free() – higher‑level C library functions that allocate a block of the requested size and return a pointer, handling bookkeeping internally.
brk/sbrk are simple and give direct control over heap size but require careful manual management. malloc/free are easier to use and safer, but their implementation involves locks and additional bookkeeping to be thread‑safe.
brk System Call
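The brk system call is defined in mm/mmap.c. The listing below is a simplified sketch of its logic, not the verbatim kernel source: expand_brk is an illustrative helper name, and mmap_sem is the lock's name in older kernels (it was renamed mmap_lock in Linux 5.8).
SYSCALL_DEFINE1(brk, unsigned long, brk)
{
    struct mm_struct *mm = current->mm;   // current process memory descriptor
    unsigned long newbrk, oldbrk;

    down_write(&mm->mmap_sem);            // protect against concurrent access
    oldbrk = mm->brk;                     // save the current end address
    newbrk = PAGE_ALIGN(brk);             // align to page boundary
    if (newbrk < mm->start_brk || newbrk > TASK_SIZE) {
        up_write(&mm->mmap_sem);
        return oldbrk;                    // invalid range: break unchanged
    }
    if (expand_brk(mm, newbrk)) {
        up_write(&mm->mmap_sem);
        return oldbrk;                    // resize failed: break unchanged
    }
    mm->brk = newbrk;                     // update end address
    up_write(&mm->mmap_sem);
    return newbrk;                        // success: report the new end address
}
Calling brk with a larger value expands the heap; calling it with a smaller value contracts the heap, releasing the excess virtual pages. The raw system call returns the resulting break (the new value on success, the unchanged one on failure); the glibc brk() wrapper converts this into 0 or -1 with errno set to ENOMEM.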
mmap System Call
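The mmap path starts in the same source file: mm/mmap.c contains the generic mmap_pgoff entry point, while the mmap system call proper is a thin architecture‑specific wrapper that forwards to it (on x86‑64, for example, it lives in arch/x86/kernel/sys_x86_64.c). The simplified sketch below shows the overall shape:
SYSCALL_DEFINE6(mmap, unsigned long, addr, unsigned long, len,
                unsigned long, prot, unsigned long, flags,
                unsigned long, fd, off_t, offset)
{
    /* ... */
    if (!(flags & MAP_FIXED)) {
        addr = vm_mmap_pgoff(file, addr, len, prot, flags, offset);
        if (unlikely(IS_ERR_VALUE(addr)))
            return addr;
        goto out;
    }
    /* ... */
out:
    return addr;
}
On success, mmap returns the start address of the mapped region. Unless MAP_FIXED is set, addr is only a hint, so the returned address may differ from the requested one, and the kernel chooses it entirely when addr is NULL. On failure the system call returns a negative error code, which the C library wrapper turns into MAP_FAILED with errno set.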
heap_info
In glibc, each heap belonging to a non‑main arena is described by a heap_info structure defined in malloc/arena.c. Important fields include:
ar_ptr – pointer to the arena the heap belongs to.
prev – link to the previous heap_info in the same arena.
size – current heap size in bytes.
mprotect_size – number of bytes of the heap that have been made readable and writable with mprotect.
pad – padding that keeps the data following the header properly aligned, so that sizeof(heap_info) + 2 * SIZE_SZ is a multiple of MALLOC_ALIGNMENT.
typedef struct _heap_info {
    mstate ar_ptr;              /* arena this heap belongs to */
    struct _heap_info *prev;    /* previous heap */
    size_t size;                /* current size in bytes */
    size_t mprotect_size;       /* bytes mprotect'ed PROT_READ|PROT_WRITE */
    char pad[-6 * SIZE_SZ & MALLOC_ALIGN_MASK];   /* alignment padding */
} heap_info;
malloc_state
The malloc_state structure (also called the arena header) tracks the state of one arena. Key members include:
mutex – serializes access to the arena in multithreaded programs.
fastbinsY[NFASTBINS] – fast bins: singly linked lists of recently freed small chunks.
top – pointer to the top chunk, the unallocated region at the end of the arena's memory.
last_remainder – the remainder left over from the most recent split of a small request.
bins[NBINS * 2 - 2] – the normal bins, organized by size class; the first of them is the unsorted list of recently freed chunks.
binmap[BINMAPSIZE] – bitmap indicating which bins may be non‑empty.
system_mem – memory currently obtained from the system for this arena.
max_system_mem – the highest value system_mem has reached.
In current glibc the fast‑bin size limit and the unsorted chunk list are not separate members: the limit is kept in a global variable, and the unsorted list is reached through bins.
Threads do not each own a private malloc_state. glibc creates additional arenas on demand, up to a limit, and attaches each thread to one of them, so several threads may share an arena; the mutex keeps their heap operations from interfering with one another.
