
Inside Go’s Runtime: How mcache and mheap Manage Memory

This article provides a detailed technical analysis of Go's runtime memory management, covering the initialization of the mheap structure, small‑object allocation via mcache, large‑object handling, the three‑color mark‑and‑sweep garbage collector, memory release mechanisms, and the optimization techniques that coordinate mcache and mheap for efficient concurrent execution.


Memory Management Initialization

Source file: runtime/malloc.go

mheap initialization: The global heap mheap_ is set up during mallocinit. It obtains memory from the operating system through sysAlloc; since Go 1.11 the heap grows as a set of arenas (64 MB each on 64-bit platforms) rather than one large contiguous region.

mcache initialization: Each logical processor (P) gets its own mcache that caches spans of small objects and reduces contention on the global heap.

func mallocinit() {
    mheap_.init() // initialize global heap
}

func allocmcache() *mcache {
    c := new(mcache) // the real runtime carves this from a fixed-size allocator
    for i := range c.alloc {
        c.alloc[i] = &emptymspan // spans are filled lazily by refill on first miss
    }
    return c
}

Memory Allocation

Go distinguishes between small objects (≤ 32 KB) and large objects (> 32 KB).
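This size-based dispatch can be sketched as a plain function. The constants mirror the runtime's maxTinySize and maxSmallSize; note that in the real runtime the tiny path additionally requires the object to contain no pointers:

```go
package main

import "fmt"

// path mimics mallocgc's size-based dispatch (simplified sketch).
func path(size uintptr) string {
	const maxTinySize = 16        // bytes; tiny also requires no pointers
	const maxSmallSize = 32 << 10 // 32 KB, boundary inclusive
	switch {
	case size <= maxTinySize:
		return "tiny"
	case size <= maxSmallSize:
		return "small"
	default:
		return "large"
	}
}

func main() {
	fmt.Println(path(8), path(1024), path(64<<10))
}
```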

Small‑object allocation (≤ 32 KB)

Uses the per‑P mcache, which holds spans for each of the size classes generated in runtime/sizeclasses.go (67 classes, plus class 0 for large objects).

Allocation calls mcache.alloc. If the cached span for that size class is full, a fresh span is fetched from mheap (through mcentral in the real runtime).

func mallocgc(size uintptr, typ *_type, needzero bool) unsafe.Pointer {
    if size <= maxSmallSize {
        c := getmcache() // current P's mcache
        s := c.alloc(size, needzero)
        return s
    }
    // large objects handled below
}

Large‑object allocation (> 32 KB)

Allocates a contiguous span directly from mheap via allocSpan.

func mallocgc(size uintptr, typ *_type, needzero bool) unsafe.Pointer {
    if size > maxSmallSize {
        s := mheap_.allocSpan(size)
        return s
    }
    // small objects handled above
}

Garbage Collection (GC)

Source file: runtime/mgc.go

The Go GC implements a concurrent tri‑color mark‑and‑sweep algorithm with two main phases:

Mark phase: Traverses the roots (globals, goroutine stacks, registers) and marks every reachable object.

Sweep phase: Reclaims unmarked objects, returning memory to mcache or mheap.

func gcSweep() {
    // Simplified: real sweeping is lazy and incremental, performed
    // by background sweepers and piggybacked on allocation.
    for _, s := range mheap_.spans {
        s.sweep()
    }
}

Memory Release

Source file: runtime/malloc.go

Large spans are returned to the OS with sysUnused or sysFree.

Freed small objects are returned to their span in the per‑P mcache; spans that become fully free are eventually handed back to the global mheap free list.

Optimization Mechanisms

Thread‑local cache (mcache)

Eliminates global lock contention for small allocations.

Provides fast allocation from cached spans.

Memory alignment

Allocated addresses are aligned according to their size class (e.g., to 8 or 16 bytes) for efficient access.

Free list

Freed spans are placed on per‑size‑class free lists for quick reuse.

GC trigger condition

GC runs when the heap has grown by a configurable percentage over the live heap left by the previous cycle (GOGC, default 100%).

mcache Module

Data structure

type mcache struct {
    alloc        [numSpanClasses]*mspan // one span per size class
    tiny         uintptr               // tiny object cache base address
    tinyoffset   uintptr               // offset within tiny cache
    local_nlookup uintptr              // number of local allocations
    // ... other fields omitted ...
}

Key fields:

alloc: Stores spans indexed by size class for reuse.

tiny and tinyoffset: Support the tiny allocator, which packs no‑pointer objects smaller than 16 bytes into a shared block.

local_nlookup: Counts allocations performed by this cache.
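The tiny/tinyoffset pair implements a bump allocator. A minimal, self-contained sketch of that logic (the tinyCache type and its alloc method are illustrations, not the runtime's types):

```go
package main

import "fmt"

const tinySize = 16 // mirrors the runtime's maxTinySize

// tinyCache mimics mcache's tiny/tinyoffset pair: small no-pointer
// objects are bump-allocated out of one shared 16-byte block.
type tinyCache struct {
	block  [tinySize]byte
	offset uintptr
}

// alloc returns an offset within the block, or false when the request
// no longer fits and a fresh block would be needed.
func (c *tinyCache) alloc(size, align uintptr) (uintptr, bool) {
	off := (c.offset + align - 1) &^ (align - 1) // round up to alignment
	if off+size > tinySize {
		return 0, false
	}
	c.offset = off + size
	return off, true
}

func main() {
	var c tinyCache
	a, _ := c.alloc(1, 1) // 1-byte object at offset 0
	b, _ := c.alloc(4, 4) // 4-byte object, aligned up to offset 4
	fmt.Println(a, b)
}
```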

Allocation method

func (c *mcache) alloc(size uintptr, needzero bool) unsafe.Pointer {
    sc := sizeToClass(size) // map size to size class
    s := c.alloc[sc]
    if s == nil || s.freeindex == s.nelems { // cache miss
        s = mheap_.allocSpan(sc) // fetch from global heap
        if s == nil {
            throw("out of memory")
        }
        c.alloc[sc] = s
    }
    // Carve the next free slot out of the span (simplified).
    obj := s.base() + uintptr(s.freeindex)*s.elemsize
    s.freeindex++
    return unsafe.Pointer(obj)
}

Release method

func (c *mcache) releaseAll() {
    for i := range c.alloc {
        s := c.alloc[i]
        if s != nil {
            mheap_.freeSpan(s) // return to global heap
            c.alloc[i] = nil
        }
    }
}

mheap Module

Data structure

type mheap struct {
    spans    []*mspan               // all allocated spans indexed by page
    freelist [numSpanClasses]*mspan // per‑class free list of spans
    arenas   [maxArenas]*heapArena // underlying memory regions
    lock     mutex                  // protects concurrent heap operations
    // ... other fields omitted ...
}

Key fields:

spans: Global list of all spans, indexed by page.

freelist: Per‑size‑class list of free spans.

arenas: Low‑level memory regions obtained from the OS.

lock: Global mutex protecting concurrent heap operations.

Span allocation

func (h *mheap) allocSpan(sc spanClass) *mspan {
    lock(&h.lock)
    s := h.freelist[sc]
    if s != nil {
        h.freelist[sc] = s.next
        unlock(&h.lock)
        return s
    }
    unlock(&h.lock)
    return h.grow(sc) // expand from arenas when free list empty
}

Span reclamation

func (h *mheap) freeSpan(s *mspan) {
    lock(&h.lock)
    sc := s.spanclass()
    s.reset()
    s.next = h.freelist[sc]
    h.freelist[sc] = s
    unlock(&h.lock)
}

Heap growth

func (h *mheap) grow(sc spanClass) *mspan {
    npage := pagesNeeded(sc) // pages for this span class (simplified helper)
    p := sysAlloc(npage*_PageSize, &memstats.heap_sys) // fresh OS memory
    if p == nil {
        throw("out of memory")
    }
    s := newMSpan()
    s.init(p, npage)
    return s
}

Collaboration Workflow

Allocation

Small objects: mcache provides fast allocation from cached spans.

Large objects: Directly allocated from mheap via allocSpan.

Reclamation

During GC, and when a P is released, mcache.releaseAll returns the cached spans to mheap's free list.

GC involvement

GC triggers mcache.releaseAll and mheap.freeSpan to recycle dead objects and spans.

Summary

mcache is a per‑P local cache that speeds up small‑object allocation and reduces lock contention. mheap is the global heap manager handling large allocations, span management, and overall memory reclamation.

The two structures cooperate through shared spans, balancing performance and memory utilization.

The garbage collector periodically marks reachable objects and sweeps unreachable ones, returning memory to mcache or mheap, and large spans may be released back to the operating system via sysUnused / sysFree.

Tags: memory management, concurrency, Go, runtime, garbage collection, mcache, mheap
Written by

Java Architecture Stack

Dedicated to original, practical tech insights—from skill advancement to architecture, front‑end to back‑end, the full‑stack path, with Wei Ge guiding you.
