How Go’s Runtime Manages Memory: From Kernel SLAB to mheap and mcache
This article explains Go's memory allocator by comparing it with the Linux kernel SLAB allocator, detailing the page‑heap (mheap), span structures, per‑P caches (mcache), and the mallocgc path for small and large allocations, with key source code excerpts along the way.
1. Kernel page‑memory management
Modern servers use a NUMA layout: each CPU and its directly attached memory form a node. A node is divided into several zones, and each zone contains many pages (normally 4 KB). The kernel tracks free pages with the buddy allocator, represented by struct zone and its free_area array.
The allocator function alloc_pages walks the buddy lists to obtain a contiguous run of physical pages. Kernel subsystems such as the SLAB allocator request pages directly from alloc_pages and then carve objects out of them.
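The buddy allocator hands out physically contiguous runs of 2^order pages, so every request is first rounded up to a power-of-two page count. The sketch below illustrates that rounding in Go; `orderFor` and the constants are illustrative, not kernel code.

```go
package main

import (
	"fmt"
	"math/bits"
)

const pageSize = 4096 // typical kernel page size

// orderFor returns the buddy-allocator "order" for a request of n bytes:
// the request is rounded up to 2^order contiguous pages, just as
// alloc_pages(gfp_mask, order) hands out power-of-two page runs.
func orderFor(n uintptr) uint {
	pages := (n + pageSize - 1) / pageSize
	if pages <= 1 {
		return 0
	}
	return uint(bits.Len(uint(pages - 1))) // ceil(log2(pages))
}

func main() {
	for _, n := range []uintptr{100, 5000, 3 * pageSize, 16 * pageSize} {
		fmt.Printf("%6d bytes -> order %d (%d pages)\n", n, orderFor(n), 1<<orderFor(n))
	}
}
```

Note the cost of the power-of-two rounding: a 3-page request consumes a 4-page block, which is exactly the internal fragmentation that SLAB caches exist to avoid for small objects.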
1.1 SLAB cache allocation
The SLAB allocator groups objects of the same size into a cache. Each cache consists of one or more pages; objects are allocated from these pages with minimal fragmentation and low overhead.
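The core SLAB idea — carve a page into equal-sized slots once, then serve allocations from a free list — can be sketched in a few lines of Go. This is a toy model for illustration only; the type and method names are invented and the real kernel cache is far more elaborate (per-CPU magazines, constructors, coloring).

```go
package main

import "fmt"

const slabPage = 4096

// slabCache mimics the SLAB idea: carve one page into fixed-size
// objects up front, then serve allocations from a free list.
type slabCache struct {
	objSize int
	free    []int // indices of free object slots
	mem     [slabPage]byte
}

func newSlabCache(objSize int) *slabCache {
	c := &slabCache{objSize: objSize}
	for i := slabPage/objSize - 1; i >= 0; i-- {
		c.free = append(c.free, i)
	}
	return c
}

// alloc pops a free slot and returns its backing bytes.
func (c *slabCache) alloc() []byte {
	if len(c.free) == 0 {
		return nil // a real cache would grab another page here
	}
	i := c.free[len(c.free)-1]
	c.free = c.free[:len(c.free)-1]
	return c.mem[i*c.objSize : (i+1)*c.objSize]
}

// freeSlot returns a slot to the free list.
func (c *slabCache) freeSlot(i int) { c.free = append(c.free, i) }

func main() {
	c := newSlabCache(64)
	fmt.Println("objects per page:", slabPage/c.objSize) // 64
	_ = c.alloc()
	fmt.Println("free slots left:", len(c.free)) // 63
}
```

Because every slot has the same size, allocation and free are O(1) list operations and the only waste is the unusable tail of the page.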
2. Go runtime memory management
Go’s allocator follows the same two‑level idea (page heap + size‑class caches) but works on virtual memory. It consists of three main components:
mheap – a page‑heap that obtains large virtual‑memory regions via mmap and manages them with a sparse array of heapArena structures.
mcentral – central lists of mspan objects, one per size class, each protected by its own lock.
mcache – a per‑P (processor) cache that holds mspan lists for fast, lock‑free allocation of small objects.
2.1 Page‑heap (mheap)
When the runtime needs pages, the request ends up in mheap.alloc. The global variable mheap_ contains a two‑level sparse array arenas with roughly 4 million entries. Each entry points to a heapArena that manages 64 MB (8192 × 8 KB pages) of memory. This design can address up to 256 TB of virtual address space.
```go
// runtime/mheap.go (excerpt)
type mheap struct {
	arenas  [1 << arenaL1Bits]*[1 << arenaL2Bits]*heapArena // two-level sparse array of arenas
	central [numSpanClasses]struct{ mcentral mcentral }
	// …other fields…
}

var mheap_ mheap
```

Each heapArena stores an array of mspan pointers, a bitmap for GC marking, and a pageInUse bitmap that records which 8 KB pages are allocated.
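The sizes quoted above are easy to verify: 8192 pages of 8 KB make one 64 MB arena, and ~4 million arena entries cover 256 TB. A quick check, using constants that mirror the linux/amd64 values described in the text:

```go
package main

import "fmt"

func main() {
	const (
		pageSize      = 8 << 10 // 8 KB runtime page
		pagesPerArena = 8192    // pages in one heapArena
		arenaEntries  = 1 << 22 // ~4 M entries in the arenas array (linux/amd64)
	)
	arenaBytes := uint64(pageSize) * pagesPerArena
	fmt.Println("arena size (MB):", arenaBytes>>20)                        // 64
	fmt.Println("addressable (TB):", arenaEntries*arenaBytes>>40)          // 256
}
```

So the sparse arenas array itself costs only a few tens of megabytes of pointers even though it indexes a quarter-petabyte of virtual address space.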
```go
// runtime/mheap.go (excerpt)
type heapArena struct {
	spans     [pagesPerArena]*mspan // 8192 spans per arena
	bitmap    [heapArenaBitmapBytes]byte
	pageInUse [pagesPerArena / 8]uint8 // one bit per page
	// …other fields…
}
```

The method mheap.grow obtains memory from the OS by calling sysAlloc (which ultimately reaches mmap via sysReserve), creates a new heapArena, and stores it in the arenas array.
```go
// runtime/malloc.go (excerpt)
func (h *mheap) sysAlloc(n uintptr) (v unsafe.Pointer, size uintptr) {
	// calls sysReserve / sysReserveAligned → mmap
	// returns a pointer to a newly reserved virtual region
}
```

2.2 Object allocation (mcache & mspan)
For objects ≤ 32 KB Go uses per‑P caches (mcache) that hold mspan lists. An mspan describes a contiguous run of pages and the size class of objects it stores.
```go
// runtime/mheap.go (excerpt)
type mspan struct {
	next, prev *mspan
	startAddr  uintptr // start address of the span
	npages     uintptr // number of 8 KB pages in the span
	elemsize   uintptr // size of each object in this span
	nelems     uintptr // how many objects the span can hold
	// …other fields…
}
```

The fields startAddr and npages locate the memory region; elemsize and nelems define the capacity for a particular size class. When a P's mcache runs out of spans it pulls a fresh span from the global mcentral, which in turn allocates a new span from mheap. If mheap has no free pages, it grows by requesting more memory from the OS.
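The relationship between npages, elemsize, and nelems is simple integer division; the helper below is an illustrative sketch of it (not a runtime function), using the 8 KB page size from the text:

```go
package main

import "fmt"

const pageSize = 8 << 10 // 8 KB runtime page

// spanCapacity shows how nelems follows from npages and elemsize in the
// mspan fields above: how many objects fit, and how many bytes are left
// over as internal fragmentation at the tail of the span.
func spanCapacity(npages, elemsize uintptr) (nelems, waste uintptr) {
	total := npages * pageSize
	nelems = total / elemsize
	waste = total - nelems*elemsize
	return
}

func main() {
	n, w := spanCapacity(1, 48) // a one-page span of the 48-byte size class
	fmt.Println("objects:", n, "tail waste:", w)
}
```

For the 48-byte class a one-page span holds 170 objects with only 32 bytes unusable, which is the "acceptable internal waste" the size-class table is tuned for.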
2.3 Allocation path (mallocgc)
The central allocation routine mallocgc decides the strategy based on the requested size:
If size ≤ maxSmallSize (≈32 KB), the per‑P mcache supplies an mspan of the appropriate size class. Tiny, pointer‑free allocations (< 16 bytes) have a dedicated fast path that packs several objects into a single 16‑byte block.
If size > maxSmallSize, the request is rounded up to a whole number of 8 KB pages and allocated directly from the page heap via mheap.alloc.
```go
// runtime/malloc.go (simplified excerpt; c is the current P's mcache, h is mheap_)
func mallocgc(size uintptr, typ *_type, needzero bool) unsafe.Pointer {
	var v unsafe.Pointer
	if size <= maxSmallSize {
		// small allocation: take a span of the right size class from the mcache
		span := c.alloc[spanClass]
		v = nextFreeFast(span) // fast scan of the span's allocation bitmap
		// …on a full span, refill from mcentral via c.nextFree…
	} else {
		// large allocation: whole pages straight from the page heap
		npages := (size + pageSize - 1) >> pageShift
		span := h.alloc(npages, spanClass)
		v = unsafe.Pointer(span.base()) // address of the allocated pages
	}
	// …zeroing, GC bitmap setup, profiling…
	return v
}
```

This mirrors the kernel's SLAB design: large objects are taken directly from a page heap, while small objects are served from size‑class caches to keep fragmentation low and allocation fast.
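The three-way decision mallocgc makes can be captured in a few lines. The sketch below is a classifier for illustration only (allocPath is not a runtime function); the thresholds are the ones named in the text:

```go
package main

import "fmt"

const (
	maxTinySize  = 16      // tiny-allocator threshold
	maxSmallSize = 32 << 10 // 32 KB small-object threshold
)

// allocPath classifies a request the way mallocgc routes it: tiny
// pointer-free objects, small objects via mcache size classes, and
// large objects straight from mheap.
func allocPath(size uintptr, hasPointers bool) string {
	switch {
	case size < maxTinySize && !hasPointers:
		return "tiny"
	case size <= maxSmallSize:
		return "small (mcache)"
	default:
		return "large (mheap)"
	}
}

func main() {
	fmt.Println(allocPath(8, false))      // tiny
	fmt.Println(allocPath(8, true))       // small (mcache)
	fmt.Println(allocPath(1024, false))   // small (mcache)
	fmt.Println(allocPath(64<<10, false)) // large (mheap)
}
```

Note that an 8-byte struct containing a pointer skips the tiny path: the tiny allocator packs unrelated objects into one block, which is only safe when the GC never has to scan them individually.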
3. Key data structures and algorithms
heapArena bitmap – a heapArenaBitmapBytes‑byte array used by the GC; it records, for each 8‑byte word of the arena, whether that word holds a pointer and whether the object still needs scanning.
pageInUse bitmap – one bit per page (8192 pages per arena) indicating whether the page is currently allocated.
mheap.grow – reserves a new 64 MB arena with sysReserve, updates the arenas array, and links the arena into the heap.
mheap.allocSpan – allocates a new mspan from the arena’s free pages, initializes its metadata, and returns it to the caller.
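A one-bit-per-page bitmap like pageInUse reduces to three bitwise operations. The helper type below is an illustrative stand-in, not the runtime's API:

```go
package main

import "fmt"

const pagesPerArena = 8192

// pageBitmap mirrors the shape of heapArena.pageInUse: one bit per
// 8 KB page, packed eight pages to a byte.
type pageBitmap [pagesPerArena / 8]uint8

func (b *pageBitmap) set(page int)   { b[page/8] |= 1 << (page % 8) }
func (b *pageBitmap) clear(page int) { b[page/8] &^= 1 << (page % 8) }
func (b *pageBitmap) inUse(page int) bool {
	return b[page/8]&(1<<(page%8)) != 0
}

func main() {
	var b pageBitmap
	b.set(4100)
	fmt.Println(b.inUse(4100), b.inUse(4101)) // true false
	b.clear(4100)
	fmt.Println(b.inUse(4100)) // false
}
```

The whole arena's page state fits in 1 KB (8192 bits), so scanning for free runs stays cache-friendly.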
4. Summary of the design
Go’s memory allocator solves the classic problem of providing flexible, low‑fragmentation allocation on top of fixed‑size pages while keeping the fast‑path lock‑free for the common case. It does so by:
Batch‑requesting large virtual‑memory regions (64 MB arenas) with mheap, reducing system‑call overhead and ensuring contiguous address space.
Using per‑P mcache and mspan to emulate the kernel’s SLAB caches, allowing rapid allocation of small objects with acceptable internal waste.
Routing allocations larger than 32 KB directly to the page heap, bypassing the cache hierarchy.
The overall architecture combines ideas from the Linux kernel’s SLAB allocator and Google’s TCMalloc, adapting them to a garbage‑collected runtime that must manage both small and large objects efficiently.