
Unlocking Linux NUMA: How the Kernel Detects and Manages Non‑Uniform Memory Access

This article explains the hardware basis of NUMA, how Linux reads ACPI SRAT/SLIT tables to discover CPU‑memory topology, the kernel functions that initialize NUMA structures, and how tools like numactl can be used to optimize application performance on multi‑node servers.

IT Services Circle

On multi‑socket servers, Non‑Uniform Memory Access (NUMA) placement can significantly affect how fast a Linux program runs, which is why many organizations use numactl to bind services to specific NUMA nodes.

NUMA Overview

NUMA (Non‑Uniform Memory Access) means that a CPU core accesses memory attached to its own node faster than memory attached to other nodes. Modern CPUs integrate memory controllers, and a server can host multiple CPUs, each with its own memory slots. Reaching memory attached to another CPU requires crossing an inter‑CPU link such as Intel's UPI, which adds latency.

Servers often support 2, 4, or 8 CPUs, each with multiple memory channels. How far a memory module is from the requesting core, in terms of the controllers and links a request must traverse, determines access latency. That is the essence of NUMA.

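Running numactl --hardware on a VM shows two nodes, each with its own CPUs and memory size. The node distance matrix reports 10 for local access and 20 for access to the other node. These are relative costs rather than measured latencies, with 10 as the local baseline.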

# numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3
node 0 size: 7838 MB
node 0 free: 6208 MB
node 1 cpus: 4 5 6 7
node 1 size: 7934 MB
node 1 free: 6589 MB
node distances:
node   0   1
 0:  10  20
 1:  20  10
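
The same distance rows are also exposed through sysfs, in /sys/devices/system/node/nodeN/distance. As a minimal sketch, assuming that file holds one whitespace‑separated row per node, the following C helpers parse such a row. The function names parse_distance_row and read_node_distances are illustrative and are not part of numactl or the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse one whitespace-separated distance row such as "10 20\n".
 * Returns the number of values stored in out (at most max). */
size_t parse_distance_row(const char *row, int *out, size_t max)
{
    assert(row != NULL && out != NULL);
    size_t n = 0;
    const char *p = row;
    while (n < max) {
        char *end;
        long v = strtol(p, &end, 10);
        if (end == p)          /* nothing left to convert */
            break;
        out[n++] = (int)v;
        p = end;
    }
    return n;
}

/* Read the distance row of one node from sysfs.
 * Returns 0 if the file is missing (for example on a non-NUMA kernel). */
size_t read_node_distances(int node, int *out, size_t max)
{
    char path[64];
    char line[256];
    size_t n = 0;

    snprintf(path, sizeof(path),
             "/sys/devices/system/node/node%d/distance", node);
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return 0;
    if (fgets(line, sizeof(line), f) != NULL)
        n = parse_distance_row(line, out, max);
    fclose(f);
    return n;
}
```

On the VM shown above, read_node_distances(0, buf, 8) would be expected to fill buf with the first row of the matrix. On a machine without NUMA sysfs entries it simply returns 0.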

How Linux Reads NUMA Information

Kernel discovers memory nodes via ACPI

The firmware (UEFI/BIOS) provides ACPI tables that describe hardware topology. Linux reads the SRAT (System Resource Affinity Table) to map CPUs and memory to nodes, and the SLIT (System Locality Information Table) to obtain inter‑node distances.

During early boot on x86, setup_arch() calls e820__memory_setup() and e820__memblock_setup(), then initmem_init(), which eventually invokes acpi_numa_init() to parse the SRAT table.

//file:drivers/acpi/numa/srat.c
int __init acpi_numa_init(void)
{
    // parse SRAT table, extract CPU_AFFINITY, MEMORY_AFFINITY, etc.
    if (!acpi_table_parse(ACPI_SIG_SRAT, acpi_parse_srat)) {
        ...
    }
    ...
}

The parsed data is stored in a global numa_meminfo structure, which holds triples of (start address, end address, node id) for each memory block.

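//file:arch/x86/mm/numa.c
struct numa_memblk {
    u64 start;   // start address of the range
    u64 end;     // end address of the range
    int nid;     // node that owns the range
};
struct numa_meminfo {
    int nr_blks;                              // number of valid entries in blk[]
    struct numa_memblk blk[NR_NODE_MEMBLKS];
};
static struct numa_meminfo numa_meminfo __initdata_or_meminfo;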

Memblock allocator integrates NUMA data

After numa_meminfo is populated, numa_register_memblks() links each memblock region to its node, allocates a pglist_data object per node, and optionally dumps the memblock state for debugging.

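//file:arch/x86/mm/numa.c
static int __init numa_register_memblks(struct numa_meminfo *mi)
{
    int i, nid;

    // Tag every memblock region with the node that owns it.
    for (i = 0; i < mi->nr_blks; i++) {
        struct numa_memblk *mb = &mi->blk[i];
        memblock_set_node(mb->start, mb->end - mb->start,
                          &memblock.memory, mb->nid);
    }
    // Allocate a pglist_data object for each possible node.
    for_each_node_mask(nid, node_possible_map) {
        alloc_node_data(nid);
    }
    // Prints the memblock state only when memblock debugging is enabled.
    memblock_dump_all();
    return 0;
}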

When the kernel is booted with memblock=debug, memblock_dump_all() prints the memblock configuration, and each region now carries its node identifier (for example, "bytes on node 0").

[    0.010796] MEMBLOCK configuration:
[    0.010797]  memory[0] [0x0000000000001000-0x000000000009efff], 0x000000000009e000 bytes on node 0
[    0.010800]  memory[1] [0x0000000000100000-0x00000000bffd9fff], 0x00000000bfeda000 bytes on node 0
[    0.010801]  memory[2] [0x0000000100000000-0x000000023fffffff], 0x0000000140000000 bytes on node 0
[    0.010802]  memory[3] [0x0000000240000000-0x000000043fffffff], 0x0000000200000000 bytes on node 1
...
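
One way to read such a dump is to add up the region sizes per node. The sketch below does that for the four regions shown above. The struct sketch_region type and sum_bytes_per_node are illustrative names, and only the regions visible in the log are included, since the remaining lines are elided.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one line of the MEMBLOCK dump. */
struct sketch_region {
    uint64_t base;  /* first address of the region */
    uint64_t size;  /* size in bytes */
    int nid;        /* node printed at the end of the line */
};

/* The four regions visible in the log above. */
static const struct sketch_region logged_regions[4] = {
    {0x0000000000001000ULL, 0x000000000009e000ULL, 0},
    {0x0000000000100000ULL, 0x00000000bfeda000ULL, 0},
    {0x0000000100000000ULL, 0x0000000140000000ULL, 0},
    {0x0000000240000000ULL, 0x0000000200000000ULL, 1},
};

/* Accumulate region sizes into totals[nid]; regions with an
 * out-of-range node id are ignored. */
void sum_bytes_per_node(const struct sketch_region *r, size_t n,
                        uint64_t *totals, int nr_nodes)
{
    assert(r != NULL && totals != NULL && nr_nodes > 0);
    for (int k = 0; k < nr_nodes; k++)
        totals[k] = 0;
    for (size_t i = 0; i < n; i++) {
        if (r[i].nid >= 0 && r[i].nid < nr_nodes)
            totals[r[i].nid] += r[i].size;
    }
}
```

For the logged regions this gives 0x1fff78000 bytes on node 0 and 0x200000000 bytes (8 GiB) on node 1.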

Summary

On modern servers, NUMA describes how CPUs and memory modules are physically arranged, and that arrangement directly affects memory latency. Linux obtains the topology by reading the ACPI SRAT and SLIT tables, stores it in numa_meminfo, and passes the node information on to the memblock allocator. With the hardware NUMA map available, administrators can use numactl and related tools to bind processes to suitable nodes and improve performance. Binding is not free of risk, though: a process strictly bound to one node can run short of memory there even while other nodes still have free memory.
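
Any tool that binds a process to a node first needs that node's CPU list, which sysfs exposes in range notation (for example 0-3) in /sys/devices/system/node/nodeN/cpulist. As a minimal sketch, assuming that comma‑separated range format, the function below expands such a list into a boolean map. The name parse_cpulist and the limit SKETCH_MAX_CPUS are hypothetical, and this is not how numactl itself is implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define SKETCH_MAX_CPUS 1024  /* arbitrary upper bound for this sketch */

/* Expand a cpulist such as "0-3,8" into cpus[]; cpus[i] is true if CPU i
 * is listed. Returns the number of CPUs found, or -1 on malformed input. */
int parse_cpulist(const char *list, bool cpus[SKETCH_MAX_CPUS])
{
    assert(list != NULL && cpus != NULL);
    int count = 0;
    const char *p = list;

    for (int i = 0; i < SKETCH_MAX_CPUS; i++)
        cpus[i] = false;

    while (*p != '\0' && *p != '\n') {
        char *end;
        long lo = strtol(p, &end, 10);
        if (end == p || lo < 0)
            return -1;
        long hi = lo;
        p = end;
        if (*p == '-') {               /* range "lo-hi" */
            p++;
            hi = strtol(p, &end, 10);
            if (end == p || hi < lo)
                return -1;
            p = end;
        }
        if (hi >= SKETCH_MAX_CPUS)
            return -1;
        for (long c = lo; c <= hi; c++) {
            if (!cpus[c]) {
                cpus[c] = true;
                count++;
            }
        }
        if (*p == ',')
            p++;
        else if (*p != '\0' && *p != '\n')
            return -1;
    }
    return count;
}
```

The resulting map could then be copied into a cpu_set_t and passed to sched_setaffinity(2). That covers only CPU placement; restricting where memory is allocated requires a memory policy, for example through set_mempolicy(2) or libnuma.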

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
