Understanding Linux Paging Mechanism and Virtual Memory Management
This article explains Linux's paging mechanism, covering the basics of virtual memory, page tables, multi‑level paging structures, virtual memory layout, allocation and reclamation strategies, and the performance and security benefits that paging brings to modern operating systems.
Have you ever wondered how a computer can run multiple programs, play music, edit documents, and download files simultaneously without memory conflicts? The answer lies in the paging mechanism, which divides both virtual address space and physical memory into equal‑sized pages and maps them with precise rules, giving each process the illusion of its own large address space while efficiently sharing physical memory.
1. Paging Overview
1.1 What is Paging?
In Linux, paging is the cornerstone of memory management. It splits physical memory into fixed‑size "rooms" called pages (typically 4 KB or 8 KB) and does the same for virtual memory, allowing each virtual page to be mapped to a physical page.
This approach simplifies kernel design, reduces fragmentation, and enables fast allocation and deallocation of memory pages.
Each process has its own page table, providing isolation and protection; the kernel can also share pages between processes when needed.
1.2 Why Does Paging Exist?
Even without paging, virtual memory can be implemented using segmentation, but variable‑length segments cause external fragmentation. Paging replaces variable‑size segments with fixed‑size pages, eliminating most fragmentation and allowing the processor’s hardware to handle address translation efficiently.
In short, paging was introduced primarily to solve memory‑fragmentation problems, not merely to enable virtual memory.
2. Core Component: Page Tables
2.1 Page Tables – The Bridge Between Virtual and Physical Memory
A page table records the mapping from each virtual page number to a physical page‑frame number, much like a detailed room‑number directory.
When a process accesses virtual address 0x1234, the address is split into a page number and an offset: with a 4 KB page size the low 12 bits are the offset, so 0x1234 decomposes into page number 0x1 and offset 0x234. The page table entry for that page provides the corresponding physical frame number (say, 0x56), and combining the frame with the offset (0x56000 + 0x234 = 0x56234) locates the exact byte.
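The split can be sketched in a few lines of C. This is illustrative user-space code, not kernel code; it assumes 4 KB pages (12 offset bits):

```c
#include <stdint.h>

#define PAGE_SHIFT 12                          /* 4 KB pages: 12 offset bits */
#define PAGE_MASK  ((1ULL << PAGE_SHIFT) - 1)

/* Split a virtual address into page number and in-page offset. */
static uint64_t page_number(uint64_t va) { return va >> PAGE_SHIFT; }
static uint64_t page_offset(uint64_t va) { return va & PAGE_MASK; }

/* Combine a physical frame number with the offset into a physical address. */
static uint64_t phys_addr(uint64_t frame, uint64_t offset)
{
    return (frame << PAGE_SHIFT) | offset;
}

/* Example from the text: va 0x1234 → page 0x1, offset 0x234;
 * frame 0x56 + offset 0x234 → physical address 0x56234. */
```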
Each page‑table entry also contains control bits such as present, read/write, user/supervisor, accessed, dirty, and others that govern access permissions and caching behavior.
In a 4‑level paging scheme the entry types are:
PML4E (level 4 entry)
PDPTE (level 3 entry – page‑directory‑pointer table entry)
PDE (level 2 entry – page‑directory entry)
PTE (level 1 entry – page‑table entry)
Each entry includes a set of flag bits:
P (bit 0) – Present: indicates whether the page or table is in memory.
R/W (bit 1) – Read/Write permission.
U/S (bit 2) – User/Supervisor mode.
PWT (bit 3) – Page‑level Write‑Through.
PCD (bit 4) – Page‑level Cache Disable.
A (bit 5) – Accessed.
D (bit 6) – Dirty.
PS (bit 7) – Page Size (1 for large pages, 0 for pointers to lower‑level tables).
G (bit 8) – Global.
R (bit 11) – Restart: relevant only with HLAT paging; ignored by ordinary paging.
PAT (bit 7 or 12) – Page Attribute Table.
XD (bit 63) – Execute‑Disable.
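Checking these flags is plain bit masking. The sketch below defines masks matching the bit positions listed above for a lowest-level PTE (simplified: the PAT bit and reserved bits are omitted, and the frame-number extraction assumes a 4 KB page without the XD bit set):

```c
#include <stdint.h>

/* Flag masks matching the bit positions listed above (lowest-level PTE). */
#define PTE_P   (1ULL << 0)   /* Present */
#define PTE_RW  (1ULL << 1)   /* Read/Write */
#define PTE_US  (1ULL << 2)   /* User/Supervisor */
#define PTE_PWT (1ULL << 3)   /* Page-level Write-Through */
#define PTE_PCD (1ULL << 4)   /* Page-level Cache Disable */
#define PTE_A   (1ULL << 5)   /* Accessed */
#define PTE_D   (1ULL << 6)   /* Dirty */
#define PTE_G   (1ULL << 8)   /* Global */
#define PTE_XD  (1ULL << 63)  /* Execute-Disable */

/* Bits 12 and up hold the physical frame number (simplified view). */
#define PTE_FRAME(e) (((e) >> 12) & 0xFFFFFFFFFULL)

/* Build a hypothetical PTE: frame 0x56, present, writable, user, accessed. */
static uint64_t make_example_pte(void)
{
    return (0x56ULL << 12) | PTE_P | PTE_RW | PTE_US | PTE_A;
}
```

On a page fault, the kernel inspects exactly these bits: a clear P bit means the page is not resident, while a write to an entry with R/W clear raises a protection fault.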
2.2 Multi‑Level Page Tables
As address spaces grew, a single‑level page table became impractical. Multi‑level page tables break the virtual address into indices for each level (e.g., page‑directory index, page‑table index, offset), allowing the kernel to allocate lower‑level tables only for the portions of address space that are actually used, dramatically reducing memory overhead.
In 64‑bit systems, four levels (PML4 → PDPT → PD → PT, whose entries are the PML4E, PDPTE, PDE, and PTE described above) are common, providing fine‑grained control over huge virtual address spaces.
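With 4 KB pages, a 48-bit virtual address breaks into four 9-bit table indices plus a 12-bit page offset (9 + 9 + 9 + 9 + 12 = 48, and each 9-bit index selects one of 512 entries in a table). A minimal sketch of the index extraction:

```c
#include <stdint.h>

/* Index into the table at `level` (3 = PML4, 2 = PDPT, 1 = PD, 0 = PT).
 * Each level consumes 9 bits above the 12-bit page offset. */
#define PT_IDX(va, level) (((uint64_t)(va) >> (12 + 9 * (level))) & 0x1FF)

/* The page offset is simply the low 12 bits. */
#define PT_OFFSET(va) ((uint64_t)(va) & 0xFFF)
```

Walking the tables means: read the PML4 entry at index `PT_IDX(va, 3)`, follow it to a PDPT, index that with `PT_IDX(va, 2)`, and so on until the PTE yields the physical frame.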
3. Virtual Memory Layout (x86_64)
Under a 4‑level paging scheme, the Linux kernel defines several fixed regions in the virtual address space. The following excerpt from Documentation/x86/x86_64/mm.txt shows the layout:
// file: Documentation/x86/x86_64/mm.txt
Virtual memory map with 4 level page tables:
0000000000000000 - 00007fffffffffff (=47 bits) user space, different per mm
hole caused by [48:63] sign extension
ffff800000000000 - ffff80ffffffffff (=40 bits) guard hole
ffff880000000000 - ffffc7ffffffffff (=64 TB) direct mapping of all phys. memory
ffffc80000000000 - ffffc8ffffffffff (=40 bits) hole
ffffc90000000000 - ffffe8ffffffffff (=45 bits) vmalloc/ioremap space
ffffe90000000000 - ffffe9ffffffffff (=40 bits) hole
ffffea0000000000 - ffffeaffffffffff (=40 bits) virtual memory map (1TB)
... unused hole ...
ffffffff80000000 - ffffffffa0000000 (=512 MB) kernel text mapping, from phys 0
ffffffffa0000000 - ffffffffff5fffff (=1525 MB) module mapping space
ffffffffff600000 - ffffffffffdfffff (=8 MB) vsyscalls
ffffffffffe00000 - ffffffffffffffff (=2 MB) unused hole
The kernel also defines macros for the direct‑mapping region ( __PAGE_OFFSET ) and the kernel text mapping region ( __START_KERNEL_map ), as shown below:
// file: arch/x86/include/asm/page_64_types.h
#define __PAGE_OFFSET _AC(0xffff880000000000, UL)
// file: arch/x86/include/asm/page_types.h
#define PAGE_OFFSET ((unsigned long)__PAGE_OFFSET)
// file: arch/x86/include/asm/page_64_types.h
#define __START_KERNEL_map _AC(0xffffffff80000000, UL)
4. Paging in Practice: Memory Allocation and Reclamation
4.1 Fine‑Grained Allocation Strategies
The kernel allocates memory in page units. For small objects it uses the slab allocator, which pre‑divides a page into equal‑size caches, reducing fragmentation and speeding up allocation.
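The core idea of a slab, carving one page into equal-size objects threaded onto a free list, can be sketched in user space. This toy version is only an illustration; the kernel's slab/SLUB allocators add per-CPU caches, cache coloring, and many other refinements:

```c
#include <stddef.h>
#include <stdlib.h>

#define SLAB_PAGE_SIZE 4096

/* A toy "slab": one page carved into fixed-size objects on a free list. */
struct slab {
    void  *page;      /* backing page (malloc stands in for a real page) */
    void  *free;      /* head of intrusive free list */
    size_t obj_size;  /* size of each object, must be >= sizeof(void *) */
};

static void slab_init(struct slab *s, size_t obj_size)
{
    s->page = malloc(SLAB_PAGE_SIZE);
    s->obj_size = obj_size;
    s->free = NULL;
    /* Thread every object slot onto the free list. */
    for (size_t off = 0; off + obj_size <= SLAB_PAGE_SIZE; off += obj_size) {
        void *obj = (char *)s->page + off;
        *(void **)obj = s->free;
        s->free = obj;
    }
}

static void *slab_alloc(struct slab *s)
{
    void *obj = s->free;
    if (obj)
        s->free = *(void **)obj;   /* pop from free list */
    return obj;
}

static void slab_free(struct slab *s, void *obj)
{
    *(void **)obj = s->free;       /* push back onto free list */
    s->free = obj;
}
```

With 64-byte objects, one 4 KB page yields 64 allocations before the slab is exhausted, with no per-allocation header overhead, which is why slabs suit small kernel objects.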
For larger requests (e.g., loading a big shared library) the kernel allocates contiguous pages directly, ensuring efficient access.
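Such multi-page requests are served in power-of-two blocks of pages, each characterized by an "order". The helper below, analogous in spirit to the kernel's get_order but written as illustrative user-space code, computes the smallest order whose block covers a request:

```c
#include <stddef.h>

#define ALLOC_PAGE_SHIFT 12
#define ALLOC_PAGE_SIZE  (1UL << ALLOC_PAGE_SHIFT)

/* Smallest order k such that (1 << k) pages cover `size` bytes. */
static int alloc_order(size_t size)
{
    size_t pages = (size + ALLOC_PAGE_SIZE - 1) >> ALLOC_PAGE_SHIFT;
    int order = 0;
    while ((1UL << order) < pages)
        order++;
    return order;
}

/* e.g. 1 byte → order 0 (1 page); 5 pages' worth → order 3 (8 pages). */
```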
4.2 Memory Reclamation Trade‑offs
When free memory falls below a threshold, the kernel triggers reclamation. It may also reclaim proactively during suspend or heavy workload switches.
Page‑replacement algorithms such as FIFO, LRU, and Clock decide which pages to evict. FIFO is simple but can suffer from Belady’s anomaly; LRU tracks recent usage for better decisions at higher cost; Clock approximates LRU with a reference‑bit hand, offering a good balance.
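The Clock algorithm described above can be sketched compactly. This is an illustration of the classic textbook algorithm, not the kernel's actual reclaim path (which uses active/inactive LRU lists):

```c
#include <stdbool.h>

#define NFRAMES 4

static int  frame[NFRAMES];   /* which page occupies each frame */
static bool ref[NFRAMES];     /* hardware-style reference bits */
static int  hand;             /* the clock hand */

/* Sweep the hand, clearing reference bits (consuming "second chances"),
 * until an unreferenced frame is found; install `page` there and return
 * the evicted page. */
static int clock_evict(int page)
{
    while (ref[hand]) {
        ref[hand] = false;             /* second chance consumed */
        hand = (hand + 1) % NFRAMES;
    }
    int victim = frame[hand];
    frame[hand] = page;
    ref[hand] = true;                  /* the new page starts referenced */
    hand = (hand + 1) % NFRAMES;
    return victim;
}
```

A recently touched page (reference bit set) survives one sweep of the hand; only pages untouched since the last sweep are evicted, which is how Clock approximates LRU at a fraction of the bookkeeping cost.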
5. Benefits of Paging
5.1 Improved Memory Utilization
Fixed‑size pages eliminate external fragmentation and cap internal fragmentation at less than one page per allocation, allowing the kernel to pack memory tightly and coalesce free pages with the buddy algorithm. For complex server workloads this can translate into substantially more usable memory.
5.2 Process Isolation and Security
Each process has its own page table, giving it a private virtual address space. Even if two processes use the same virtual address, the underlying physical pages differ, preventing accidental or malicious cross‑process memory access.
This isolation is crucial for multi‑tenant servers, where a compromised process cannot corrupt the memory of others.
6. Linux Kernel Paging – Ongoing Evolution
From early simple paging to today’s multi‑level tables and sophisticated replacement policies, paging has continuously adapted to hardware advances and emerging workloads such as AI, big data, and cloud computing.
Future directions may include quantum‑aware paging algorithms or ultra‑lightweight schemes for IoT devices, ensuring that Linux’s memory management remains a cornerstone of modern computing.
Deepin Linux
Research areas: Windows & Linux platforms, C/C++ backend development, embedded systems and Linux kernel, etc.