
Inside Linux Physical Memory Management: From FLATMEM to NUMA, Watermarks, and Page Structures

This article provides an in‑depth, step‑by‑step explanation of how the Linux kernel organizes and manages physical memory, covering memory models (FLATMEM, DISCONTIGMEM, SPARSEMEM), NUMA vs. UMA architectures, zone partitioning, watermarks, reserved pages, hot‑cold page handling, and the detailed struct page layout used for both anonymous and file‑backed pages.

Bin's Tech Cabin

Nov 21, 2022


1. Physical Memory Models

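The kernel supports three models. FLATMEM treats physical memory as one contiguous range described by a single global mem_map array of struct page. DISCONTIGMEM keeps one node_mem_map array per node, so holes between nodes do not cost struct page entries. SPARSEMEM splits memory into fixed-size sections, each with its own mem_map that can be brought online or offline independently, which is what makes sparse layouts and memory hot-plug practical.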
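The practical difference between the models is how a page frame number (PFN) is turned into a struct page. The sketch below models only the index arithmetic, not the kernel code itself. The constants are assumptions: 4 KiB pages and the x86_64 section size of 128 MiB (SECTION_SIZE_BITS = 27). The function names are hypothetical.

```python
PAGE_SHIFT = 12          # assumption: 4 KiB pages
SECTION_SIZE_BITS = 27   # assumption: x86_64 value, 128 MiB per section
PFN_SECTION_SHIFT = SECTION_SIZE_BITS - PAGE_SHIFT
PAGES_PER_SECTION = 1 << PFN_SECTION_SHIFT


def flatmem_pfn_to_index(pfn, arch_pfn_offset=0):
    """FLATMEM: one global mem_map, so the PFN is (almost) the array index."""
    return pfn - arch_pfn_offset


def sparsemem_pfn_to_slot(pfn):
    """SPARSEMEM: pick the section first, then the page inside that section."""
    section_nr = pfn >> PFN_SECTION_SHIFT
    index_in_section = pfn & (PAGES_PER_SECTION - 1)
    return section_nr, index_in_section
```

With these values a section covers 32768 pages, so PFN 32773 lands in section 1 at index 5.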

2. Memory Architectures

There are two hardware architectures to consider. In UMA (Uniform Memory Access), all CPUs reach the same memory at the same cost, and the kernel models it as a single node. In NUMA (Non‑Uniform Memory Access), each CPU or group of CPUs has a local memory node; accessing another node's memory works but incurs higher latency.

3. NUMA Nodes and Zones

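Each NUMA node is represented by a struct pglist_data, whose node_zones array holds the node's struct zone objects (ZONE_DMA, ZONE_DMA32, ZONE_NORMAL, ZONE_HIGHMEM, ZONE_MOVABLE; which of these exist depends on the architecture and kernel configuration). Each zone keeps its own per‑zone statistics and its own buddy allocator state: the free_area array, indexed by allocation order, from which pages are allocated.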
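The buddy allocator behind free_area relies on one piece of arithmetic: a free block of 2^order pages has exactly one buddy, found by flipping bit `order` of its PFN. A minimal sketch of that arithmetic follows. The function names are hypothetical, and the list handling in free_area is left out.

```python
def find_buddy_pfn(pfn, order):
    """PFN of the buddy block for a block of 2**order pages starting at pfn."""
    return pfn ^ (1 << order)


def merged_block_pfn(pfn, order):
    """Start PFN of the order+1 block formed when pfn and its buddy merge."""
    return pfn & ~(1 << order)
```

For example, the order-2 block at PFN 12 has its buddy at PFN 8, and merging the two yields an order-3 block starting at PFN 8.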

4. Reserved Memory and Low‑Memory Reserves

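Each zone tracks a reserve for high‑order atomic allocations (nr_reserved_highatomic) and a lowmem_reserve array derived from /proc/sys/vm/lowmem_reserve_ratio. The low‑memory reserve protects lower zones such as ZONE_DMA and ZONE_DMA32: an allocation that could have been served from a higher zone may fall back to a lower zone only if that lower zone still has more free pages than its reserve, so fallback traffic cannot exhaust memory that only the lower zone can provide.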
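The reserve each zone holds back against a higher zone is the total managed pages of all zones above it, up to and including that higher zone, divided by the lower zone's ratio. The sketch below follows that rule as I understand it from setup_per_zone_lowmem_reserve(). The zone sizes in the usage note are invented for illustration, and the ratios 256, 256, 32 are the commonly seen defaults for DMA, DMA32 and NORMAL.

```python
def lowmem_reserve(managed_pages, ratios):
    """Return reserve[i][j]: pages zone i keeps back from allocations
    whose preferred zone is j (j > i). Zones are ordered low to high."""
    n = len(managed_pages)
    reserve = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        ratio = ratios[i]
        upper_pages = 0
        for j in range(i + 1, n):
            upper_pages += managed_pages[j]
            # A zero ratio or an empty zone disables the reserve.
            if ratio and managed_pages[i]:
                reserve[i][j] = upper_pages // ratio
    return reserve
```

With hypothetical zones DMA = 4000, DMA32 = 500000 and NORMAL = 3000000 managed pages, `lowmem_reserve([4000, 500000, 3000000], [256, 256, 32])` gives DMA a reserve of 1953 pages against DMA32 allocations and 13671 against NORMAL allocations, while DMA32 keeps 11718 pages back from NORMAL allocations.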

5. Watermarks (WMARK_MIN, WMARK_LOW, WMARK_HIGH)

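Each zone has three watermarks derived from min_free_kbytes. When a zone's free pages fall below WMARK_LOW, the allocator wakes the kswapd daemon, which reclaims in the background until free pages climb back above WMARK_HIGH. If free pages fall below WMARK_MIN, the allocating task itself must perform direct reclaim before its request can succeed. min_free_kbytes is distributed across zones in proportion to their managed pages to produce WMARK_MIN; the gap between the marks is widened by watermark_scale_factor. Both can be tuned via /proc/sys/vm/min_free_kbytes and /proc/sys/vm/watermark_scale_factor.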
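As a rough worked example of how the three marks relate, the sketch below follows the arithmetic used by recent kernels as I understand it: WMARK_MIN is the zone's proportional share of min_free_kbytes, and the gap to LOW and HIGH is the larger of min/4 and managed_pages * watermark_scale_factor / 10000. It is a simplified sketch that ignores highmem zones and watermark boosting. The function name and the input values are assumptions.

```python
def zone_watermarks(min_free_kbytes, zone_managed, total_managed,
                    watermark_scale_factor=10, page_shift=12):
    """Return (WMARK_MIN, WMARK_LOW, WMARK_HIGH) in pages for one zone."""
    # Convert the global tunable from KiB to pages.
    pages_min = min_free_kbytes >> (page_shift - 10)
    # Each zone gets a share proportional to its managed pages.
    wmark_min = pages_min * zone_managed // total_managed
    # The gap between marks: at least min/4, or the scale-factor share.
    delta = max(wmark_min >> 2,
                zone_managed * watermark_scale_factor // 10000)
    return wmark_min, wmark_min + delta, wmark_min + 2 * delta
```

With min_free_kbytes = 65536 and two zones of 1,000,000 and 3,000,000 managed pages, the first zone gets (4096, 5120, 6144) and the second (12288, 15360, 18432).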

6. Hot and Cold Pages

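For reclaim, pages sit on LRU lists: active (recently used, "hot") and inactive (reclaim candidates, "cold"). Anonymous and file‑backed pages have separate active and inactive lists, which lets the kernel balance reclaim between the two kinds according to the swappiness setting. Separately, each zone keeps per‑CPU lists of free pages so that single‑page allocations and frees can usually avoid taking the zone lock.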
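The split into anonymous/file and active/inactive lists is encoded in the kernel's lru_list enum as a small index built from two offsets. The sketch below mirrors that encoding (LRU_BASE, LRU_ACTIVE, LRU_FILE as in include/linux/mmzone.h). The helper name lru_index is hypothetical.

```python
LRU_BASE, LRU_ACTIVE, LRU_FILE = 0, 1, 2
LRU_UNEVICTABLE = 4

LRU_NAMES = ["inactive_anon", "active_anon",
             "inactive_file", "active_file", "unevictable"]


def lru_index(is_file, is_active, unevictable=False):
    """Pick which of the five LRU lists a page belongs on."""
    if unevictable:
        return LRU_UNEVICTABLE
    index = LRU_BASE
    if is_file:
        index += LRU_FILE
    if is_active:
        index += LRU_ACTIVE
    return index
```

For example, `LRU_NAMES[lru_index(is_file=True, is_active=False)]` is "inactive_file", the list reclaim normally scans first for page cache.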

7. struct page Overview

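The kernel describes every physical page with a struct page. Key fields include:
flags – status bits (e.g., PG_locked, PG_dirty, PG_active, PG_lru).
mapping – points to the struct address_space for file pages, or to a struct anon_vma for anonymous pages; the low bit of the pointer is set in the anonymous case to distinguish the two.
index – the page's offset within the file for page‑cache pages, or for anonymous pages the linear page index derived from the faulting address and the VMA's vm_pgoff.
_mapcount – the number of page table entries mapping this page, stored as the count minus one, so an unmapped page holds -1.
_refcount – the number of references held on the page; the page can be freed only when this drops to zero.
lru – the list head linking the page into the appropriate LRU list.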
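The pointer tagging in mapping and the off-by-one storage of _mapcount can be illustrated with plain integers. The sketch below assumes the flag values from include/linux/page-flags.h in 5.x-era kernels (PAGE_MAPPING_ANON = 0x1, PAGE_MAPPING_MOVABLE = 0x2, both set for KSM pages). The function names and the sample address are hypothetical.

```python
PAGE_MAPPING_ANON = 0x1
PAGE_MAPPING_MOVABLE = 0x2
PAGE_MAPPING_FLAGS = PAGE_MAPPING_ANON | PAGE_MAPPING_MOVABLE

MAPPING_KINDS = {
    0x0: "address_space",      # file-backed page cache page
    0x1: "anon_vma",           # ordinary anonymous page
    0x2: "movable_ops",        # non-LRU movable page
    0x3: "ksm_stable_node",    # KSM-merged page
}


def decode_mapping(mapping):
    """Split a tagged page->mapping value into (kind, real pointer)."""
    kind = MAPPING_KINDS[mapping & PAGE_MAPPING_FLAGS]
    pointer = mapping & ~PAGE_MAPPING_FLAGS
    return kind, pointer


def page_mapcount(raw_mapcount):
    """_mapcount is stored minus one, so -1 means 'not mapped anywhere'."""
    return raw_mapcount + 1
```

The tagging works because the pointed-to structures are at least 4-byte aligned, which leaves the two low bits of a real pointer always zero.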

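Compound pages, which back huge pages among other uses, are built from multiple physically contiguous pages. The head page has PG_head set. In the kernel versions this layout describes, the first tail page carries the compound metadata (compound_order, compound_dtor, and the compound mapcount). Every tail page points back to the head through its compound_head field, with the low bit set to mark the page as a tail.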
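The tail-to-head lookup is another low-bit trick: a tail page stores the head's address plus one in compound_head, and a clear low bit means the page is not a tail. A minimal sketch with integers standing in for struct page addresses follows. The addresses in the usage note are made up.

```python
def compound_head(page_addr, compound_head_field):
    """Return the address of the head page for any page of a compound page."""
    if compound_head_field & 1:          # low bit set: this is a tail page
        return compound_head_field - 1   # strip the tag to get the head
    return page_addr                     # head page or ordinary page


def compound_nr(order):
    """Number of base pages in a compound page of the given order."""
    return 1 << order
```

A tail page at 0x1040 whose field holds 0x1001 resolves to the head at 0x1000, and an order-9 compound page (a 2 MiB huge page with 4 KiB base pages) spans 512 pages.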

8. Slab Allocation

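Small kernel objects (e.g., anon_vma, vm_area_struct) are allocated from slab caches. For a page owned by the slab allocator, the page descriptor is reused to hold slab state: a pointer to the owning struct kmem_cache, the freelist of available objects, and the usage counters inuse and objects.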

9. Anonymous Page Reverse Mapping

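Anonymous pages use struct anon_vma and struct anon_vma_chain to map a physical page back to every VMA that may reference it. The anon_vma holds an interval tree (built on a red‑black tree) of anon_vma_chain entries, each linking to one vm_area_struct. When a page must be reclaimed or migrated, the kernel follows page->mapping to the anon_vma, walks the tree to find the candidate VMAs, and uses the page's index to compute the virtual address at which to look up and clear the page table entry in each one.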
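Once the reverse-map walk has found a candidate VMA, the virtual address of the page inside it comes from simple arithmetic on the page's index and the VMA's vm_start and vm_pgoff. The sketch below models that step, assuming 4 KiB pages. Returning None for an out-of-range result is a simplification chosen for this example.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages


def vma_address(vm_start, vm_end, vm_pgoff, page_index):
    """Virtual address of the page with the given index inside one VMA,
    or None if the page does not fall inside this VMA."""
    address = vm_start + ((page_index - vm_pgoff) << PAGE_SHIFT)
    if vm_start <= address < vm_end:
        return address
    return None
```

For a private anonymous VMA, vm_pgoff normally equals vm_start >> PAGE_SHIFT, so a page with index 0x7f0000003 in a VMA starting at 0x7f0000000000 resolves to address 0x7f0000003000.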

10. Practical Tools

Useful commands to inspect the state:
cat /proc/zoneinfo – shows per‑zone free pages, watermarks, and LRU counts.
numactl -H – displays the NUMA node layout and inter‑node distances.
cat /proc/sys/vm/* – shows tunable parameters such as min_free_kbytes, swappiness, and watermark_scale_factor.
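To compare free pages against the watermarks across zones, the relevant lines of /proc/zoneinfo can be pulled out with a small parser. The sketch below assumes the usual layout, in which each zone starts with a "Node N, zone NAME" line followed by "pages free", "min", "low" and "high" lines. The exact format varies between kernel versions, so treat it as a starting point rather than a robust tool.

```python
import re


def parse_zoneinfo(text):
    """Return {(node, zone): {"free": n, "min": n, "low": n, "high": n}}."""
    zones = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"Node (\d+), zone\s+(\S+)", line)
        if header:
            key = (int(header.group(1)), header.group(2))
            current = zones.setdefault(key, {})
            continue
        if current is None:
            continue
        # Per-CPU pageset lines use "high:" with a colon and are skipped.
        field = re.match(r"\s+(?:pages )?(free|min|low|high)\s+(\d+)$", line)
        if field and field.group(1) not in current:
            current[field.group(1)] = int(field.group(2))
    return zones
```

On a Linux machine it can be fed with `parse_zoneinfo(open("/proc/zoneinfo").read())`; a zone whose "free" value is close to "low" is about to wake kswapd.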

Conclusion

The Linux kernel combines hierarchical structures (nodes → zones → pages), flexible memory models, and sophisticated reclaim mechanisms to efficiently manage physical memory on modern NUMA systems while providing fast access for both file‑backed and anonymous pages.
