
Understanding Linux Kernel Memory: Nodes, Zones, Buddy System, and SLAB Allocator

This article explains how Linux 3.10 organizes memory using NUMA nodes, zones, the buddy system, and the SLAB allocator, providing commands, code examples, and visual diagrams to illustrate each layer of the kernel's efficient memory management.

ITPUB

1. Node Division (NUMA)

Modern servers use a NUMA architecture in which each CPU socket and its directly attached memory form a node. The dmidecode command lists CPU details and memory modules, showing which DIMM belongs to which CPU. Example output:

Processor Information  // first CPU
    Socket Designation: CPU1
    Version: Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
    Core Count: 8
    Thread Count: 16
Processor Information  // second CPU
    Socket Designation: CPU2
    Version: Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
    Core Count: 8

Memory modules can be inspected similarly, revealing four DIMMs per CPU on the example machine. The numactl --hardware command displays each node's CPUs and memory size:

numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 16 17 18 19 20 21 22 23
node 0 size: 65419 MB
node 1 cpus: 8 9 10 11 12 13 14 15 24 25 26 27 28 29 30 31
node 1 size: 65536 MB
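
As a rough illustration, the numactl --hardware output above can be turned into a per-node summary with a few lines of Python. This is a minimal sketch that only handles the sample text shown here; the function name parse_numactl is hypothetical, and real output contains additional lines (free memory, node distances) that the sketch simply ignores.

```python
import re

SAMPLE = """\
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 16 17 18 19 20 21 22 23
node 0 size: 65419 MB
node 1 cpus: 8 9 10 11 12 13 14 15 24 25 26 27 28 29 30 31
node 1 size: 65536 MB
"""

def parse_numactl(text):
    """Return {node_id: {"cpus": [...], "size_mb": int}} (hypothetical helper)."""
    nodes = {}
    for line in text.splitlines():
        m = re.match(r"node (\d+) cpus:(.*)", line)
        if m:
            cpus = [int(c) for c in m.group(2).split()]
            nodes.setdefault(int(m.group(1)), {})["cpus"] = cpus
            continue
        m = re.match(r"node (\d+) size: (\d+) MB", line)
        if m:
            nodes.setdefault(int(m.group(1)), {})["size_mb"] = int(m.group(2))
    return nodes

for node_id, info in sorted(parse_numactl(SAMPLE).items()):
    print(f"node {node_id}: {len(info['cpus'])} CPUs, {info['size_mb']} MB")
```

On the sample machine this reports 16 logical CPUs per node, which matches the 8 cores and 16 threads that dmidecode lists for each socket.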

2. Zone Division

Each node is further split into zones, which are contiguous ranges of physical memory. Common zones on x86‑64 are:

ZONE_DMA – low‑address range for ISA DMA devices.

ZONE_DMA32 – for 32‑bit DMA devices, only present on 64‑bit kernels.

ZONE_NORMAL – all remaining memory managed by the kernel.

ZONE_HIGHMEM exists only on 32‑bit systems and is rarely used today.
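
The zone boundaries can be made concrete with a small sketch. It assumes the conventional x86‑64 limits (ZONE_DMA below 16 MB, ZONE_DMA32 below 4 GB, ZONE_NORMAL above that), which the list above does not spell out. The helper name zone_for_address is hypothetical, and this is not how the kernel itself looks up a zone.

```python
MB = 1024 ** 2
GB = 1024 ** 3

# Exclusive upper bounds of the low zones, assuming conventional x86-64 limits.
ZONE_LIMITS = [
    ("ZONE_DMA", 16 * MB),   # ISA DMA devices can only address the first 16 MB
    ("ZONE_DMA32", 4 * GB),  # 32-bit DMA devices can only address the first 4 GB
]

def zone_for_address(phys_addr):
    """Return the name of the zone a physical address would fall into."""
    for name, limit in ZONE_LIMITS:
        if phys_addr < limit:
            return name
    return "ZONE_NORMAL"

print(zone_for_address(1 * MB))   # ZONE_DMA
print(zone_for_address(1 * GB))   # ZONE_DMA32
print(zone_for_address(8 * GB))   # ZONE_NORMAL
```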

A zone contains many pages, each typically 4 KB. The /proc/zoneinfo file shows per‑zone page statistics:

# cat /proc/zoneinfo
Node 0, zone      DMA
    pages free     3973
        managed  3973
Node 0, zone    DMA32
    pages free     390390
        managed  427659
Node 0, zone   Normal
    pages free     15021616
        managed  15990165
Node 1, zone   Normal
    pages free     16012823
        managed  16514393

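Multiplying a page count by 4 KB converts it to bytes: the managed count gives the memory the zone manages, and the free count gives the part that is currently unallocated. For the Node 1 Normal zone, 16514393 managed pages × 4 KB ≈ 63 GiB, of which 16012823 free pages × 4 KB ≈ 61 GiB (about 65.6 GB in decimal units).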
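
The same arithmetic can be applied to every zone at once. The following is a minimal sketch that parses only the trimmed sample shown above, assuming 4 KB pages; the helper names parse_zoneinfo and pages_to_gib are hypothetical, and a real /proc/zoneinfo contains many more fields that this sketch skips.

```python
import re

PAGE_SIZE = 4096  # bytes; assumes the 4 KB page size used in this article

SAMPLE = """\
Node 0, zone      DMA
    pages free     3973
        managed  3973
Node 0, zone    DMA32
    pages free     390390
        managed  427659
Node 0, zone   Normal
    pages free     15021616
        managed  15990165
Node 1, zone   Normal
    pages free     16012823
        managed  16514393
"""

def parse_zoneinfo(text):
    """Return {(node, zone): {"free": pages, "managed": pages}}."""
    zones, current = {}, None
    for line in text.splitlines():
        m = re.match(r"Node (\d+), zone\s+(\w+)", line)
        if m:
            current = (int(m.group(1)), m.group(2))
            zones[current] = {}
            continue
        if current is None:
            continue  # ignore anything before the first zone header
        m = re.match(r"\s+pages free\s+(\d+)", line)
        if m:
            zones[current]["free"] = int(m.group(1))
            continue
        m = re.match(r"\s+managed\s+(\d+)", line)
        if m:
            zones[current]["managed"] = int(m.group(1))
    return zones

def pages_to_gib(pages):
    """Convert a page count to GiB."""
    return pages * PAGE_SIZE / 1024 ** 3

for (node, zone), counts in parse_zoneinfo(SAMPLE).items():
    print(f"node {node} {zone:<6} managed {pages_to_gib(counts['managed']):6.2f} GiB, "
          f"free {pages_to_gib(counts['free']):6.2f} GiB")
```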

3. Buddy System for Free Page Management

The kernel represents each zone with a struct zone. Its free_area array (size MAX_ORDER = 11) holds one free list per order from 0 to 10, covering block sizes of 4 KB, 8 KB, …, 4 MB.

//file: include/linux/mmzone.h
#define MAX_ORDER 11
struct zone {
    struct free_area    free_area[MAX_ORDER];   /* one free list per order */
    ...
};
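
The relationship between an order and its block size is just a shift of the page size. The sketch below tabulates it, assuming 4 KB pages. The helpers block_size and order_for are hypothetical names introduced for illustration; order_for is similar in spirit to the kernel's get_order() but is not that function.

```python
PAGE_SIZE = 4096  # assumes 4 KB pages
MAX_ORDER = 11

def block_size(order):
    """Size in bytes of one free block on the list free_area[order]."""
    return PAGE_SIZE << order

def order_for(size):
    """Smallest order whose block can hold `size` bytes."""
    order = 0
    while block_size(order) < size:
        order += 1
    if order >= MAX_ORDER:
        raise ValueError("request exceeds the largest buddy block")
    return order

for order in range(MAX_ORDER):
    print(f"order {order:2d}: {block_size(order) // 1024:5d} KB")
```

The loop prints 4 KB for order 0 up to 4096 KB (4 MB) for order 10, which are the eleven block sizes the free_area array covers.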

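The alloc_pages(gfp_mask, order) function searches these lists to allocate a contiguous block of 2^order pages. For example, an 8 KB request (order = 1) takes a two‑page block from the order‑1 list; if that list is empty, a larger block is split in half until a block of the right size is available.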

struct page *alloc_pages(gfp_t gfp_mask, unsigned int order);

In the buddy system, two "buddies" are equal‑size, adjacent blocks that were split from the same larger block. When both buddies are free, the kernel merges them back into that larger block.
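
Splitting and merging can be sketched with a toy allocator. This is a simplified model, not the kernel's implementation: it manages a single 4 MB region, tracks blocks by page frame number (pfn), and uses the XOR trick to find a block's buddy. The names ToyBuddy and buddy_of are hypothetical.

```python
MAX_ORDER = 11

def buddy_of(pfn, order):
    """Page frame number of the buddy of the block that starts at `pfn`."""
    return pfn ^ (1 << order)

class ToyBuddy:
    """Toy buddy allocator over 2**(MAX_ORDER - 1) pages; not the kernel's code."""

    def __init__(self):
        # free_area[order] holds the starting pfn of every free block of that order.
        self.free_area = [set() for _ in range(MAX_ORDER)]
        self.free_area[MAX_ORDER - 1].add(0)  # start with one free 4 MB block

    def alloc(self, order):
        # Find the smallest non-empty list that can satisfy the request.
        for current in range(order, MAX_ORDER):
            if self.free_area[current]:
                pfn = self.free_area[current].pop()
                # Split down to the requested order, freeing each upper half.
                while current > order:
                    current -= 1
                    self.free_area[current].add(pfn + (1 << current))
                return pfn
        return None  # no free block is large enough

    def free(self, pfn, order):
        # Merge with the buddy for as long as the buddy is also free.
        while order < MAX_ORDER - 1:
            other = buddy_of(pfn, order)
            if other not in self.free_area[order]:
                break
            self.free_area[order].remove(other)
            pfn = min(pfn, other)
            order += 1
        self.free_area[order].add(pfn)

zone = ToyBuddy()
pfn = zone.alloc(1)            # request 8 KB (order 1)
print(pfn, buddy_of(pfn, 1))   # 0 2
zone.free(pfn, 1)              # buddies merge all the way back up
print(zone.free_area[MAX_ORDER - 1])  # {0}
```

Allocating one order‑1 block from a fresh region leaves one free block on each list from order 1 to order 9; freeing it merges every pair back into the original 4 MB block.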

4. SLAB Allocator

While the buddy system works with whole pages, many kernel objects are much smaller. The SLAB (or SLUB) allocator sits on top of the buddy system and manages caches of objects of a fixed size.

Each kmem_cache keeps, for every NUMA node, three linked lists of slabs: partial, full, and free. A slab consists of one or more contiguous pages and stores objects of the same size.

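//file: include/linux/slab_def.h
struct kmem_cache {
    struct kmem_cache_node **node;      /* one entry per NUMA node */
    ...
};

//file: mm/slab.h
struct kmem_cache_node {
    struct list_head slabs_partial;     /* slabs with some objects in use */
    struct list_head slabs_full;        /* slabs with every object in use */
    struct list_head slabs_free;        /* slabs with no objects in use */
    ...
};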
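
How slabs move between the three lists can be sketched with a toy model. It is a simplification under stated assumptions: it counts objects instead of tracking individual slots, it ignores per‑CPU caches, and it creates a new slab object where the kernel would request pages from the buddy system. The names ToySlab and ToyCache are hypothetical.

```python
class ToySlab:
    """One slab: a fixed number of equally sized object slots."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.in_use = 0

class ToyCache:
    """Toy model of the three slab lists of one node; not the kernel's code."""

    def __init__(self, objs_per_slab):
        self.objs_per_slab = objs_per_slab
        self.slabs_partial, self.slabs_full, self.slabs_free = [], [], []

    def alloc(self):
        # Prefer a partially used slab, then an empty one, then grow the cache.
        if self.slabs_partial:
            slab = self.slabs_partial.pop()
        elif self.slabs_free:
            slab = self.slabs_free.pop()
        else:
            slab = ToySlab(self.objs_per_slab)  # stands in for a buddy-system request
        slab.in_use += 1
        target = self.slabs_full if slab.in_use == slab.capacity else self.slabs_partial
        target.append(slab)
        return slab

    def free(self, slab):
        # Remove the slab from its current list, then refile it by occupancy.
        source = self.slabs_full if slab.in_use == slab.capacity else self.slabs_partial
        source.remove(slab)
        slab.in_use -= 1
        target = self.slabs_free if slab.in_use == 0 else self.slabs_partial
        target.append(slab)

cache = ToyCache(objs_per_slab=2)
first = cache.alloc()    # new slab, now partial
second = cache.alloc()   # same slab, now full
third = cache.alloc()    # cache grows by a second slab
print(len(cache.slabs_partial), len(cache.slabs_full), len(cache.slabs_free))  # 1 1 0
```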

When a cache needs more memory, it calls kmem_getpages, which ultimately invokes alloc_pages_exact_node (a wrapper around __alloc_pages) to obtain whole pages from the buddy system.

//file: mm/slab.c
static void *kmem_getpages(struct kmem_cache *cachep,
            gfp_t flags, int nodeid)
{
    ...
    flags |= cachep->allocflags;
    if (cachep->flags & SLAB_RECLAIM_ACCOUNT)
        flags |= __GFP_RECLAIMABLE;
    page = alloc_pages_exact_node(nodeid, ...);
    ...
}

//file: include/linux/gfp.h
static inline struct page *alloc_pages_exact_node(int nid,
            gfp_t gfp_mask, unsigned int order)
{
    return __alloc_pages(gfp_mask, order, node_zonelist(nid, gfp_mask));
}

Typical kernel caches (e.g., TCP socket structures) are visible via /proc/slabinfo or the slabtop command. The output includes objsize (object size) and objperslab (objects per slab), plus pagesperslab to compute memory consumption.

# cat /proc/slabinfo | grep TCP
TCP              288    384   1984   16    8

Interpretation: the columns are name, active_objs, num_objs, objsize, objperslab, and pagesperslab. Each TCP slab occupies 8 pages (8 × 4 KB = 32 KB); each object is 1984 bytes; a slab holds 16 objects (1984 × 16 = 31,744 bytes = 31 KB), leaving 1 KB (about 3%) unused, which is an acceptable cost given the low fragmentation and fast allocation that the SLAB mechanism provides.
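
The per‑slab arithmetic can be reproduced from the slabinfo line itself. This is a minimal sketch assuming 4 KB pages and the column order name, active_objs, num_objs, objsize, objperslab, pagesperslab; the helper name slab_usage is hypothetical.

```python
PAGE_SIZE = 4096  # assumes 4 KB pages

def slab_usage(line):
    """Compute per-slab and per-cache sizes from one /proc/slabinfo line."""
    name, active, total, objsize, objperslab, pagesperslab = line.split()[:6]
    slab_bytes = int(pagesperslab) * PAGE_SIZE
    used_bytes = int(objsize) * int(objperslab)
    return {
        "name": name,
        "slab_bytes": slab_bytes,                  # memory taken by one slab
        "used_bytes": used_bytes,                  # part of a slab that holds objects
        "unused_bytes": slab_bytes - used_bytes,   # left over in each slab
        # whole slabs needed for num_objs objects, times the slab size
        "cache_bytes": int(total) // int(objperslab) * slab_bytes,
    }

info = slab_usage("TCP              288    384   1984   16    8")
print(info["slab_bytes"], info["used_bytes"], info["unused_bytes"])  # 32768 31744 1024
print(info["cache_bytes"] // 1024, "KB")                            # 768 KB
```

With 384 objects at 16 per slab, the TCP cache in the sample holds 24 slabs, or 768 KB in total.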

5. Summary

The Linux kernel manages memory in layers: NUMA nodes, zones, the buddy system, and the SLAB allocator. Nodes and zones provide a hierarchical view of physical memory, the buddy system handles page‑level allocation, and the SLAB allocator builds on it to serve small kernel objects quickly and with little fragmentation.
