How Virtual Memory Works: From CPU Addressing to Linux Implementation
This article explains the concepts and mechanisms of virtual memory: CPU virtual addressing, page tables, TLB caching, page faults, multi‑level page tables, Linux's memory‑mapping structures, and dynamic memory allocation, including fragmentation and garbage collection.
Overview
Processes share CPU and memory, so operating systems need robust memory‑management mechanisms. Virtual memory provides each process with a private, contiguous address space, simplifying programming and protecting processes from each other's memory.
CPU Addressing
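With physical addressing, the CPU places the physical address of a memory location directly on the memory bus. Modern CPUs use virtual addressing instead: the CPU issues a virtual address, and the Memory Management Unit (MMU) translates it to a physical address using page tables that the operating system maintains.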
Page Tables
Virtual memory is divided into fixed‑size virtual pages (VP) of size P=2^p bytes, mirrored by physical pages (PP) of the same size. The page table, stored in RAM, maps each VP to a PP via Page Table Entries (PTEs). A PTE’s valid bit indicates whether the virtual page is cached in physical memory.
Page Hit
If the PTE’s valid bit is 1, the virtual page is already in RAM and the MMU obtains the physical address directly.
Page Fault
If the valid bit is 0, the MMU raises a page‑fault exception that transfers control to the kernel's fault handler. If no physical page is free, the handler selects a victim page and writes it back to disk if it is dirty. It then loads the required page from disk, updates the PTE, and restarts the faulting instruction, which now translates successfully as a page hit.
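The fault-handling sequence can be sketched as a small simulation. This is a toy model, not kernel code: the `PagedMemory` class, its counters, and the FIFO victim choice are assumptions made for illustration, and real kernels use more elaborate replacement policies.

```python
from collections import OrderedDict

class PagedMemory:
    """Toy model of demand paging with FIFO replacement (an assumption)."""

    def __init__(self, num_frames):
        self.free_frames = list(range(num_frames))
        self.resident = OrderedDict()   # vpn -> {"frame": int, "dirty": bool}
        self.faults = 0
        self.writebacks = 0

    def access(self, vpn, write=False):
        entry = self.resident.get(vpn)
        if entry is None:               # valid bit is 0: page fault
            self.faults += 1
            if self.free_frames:
                frame = self.free_frames.pop(0)
            else:
                # No free frame: evict the oldest resident page
                _victim_vpn, victim = self.resident.popitem(last=False)
                if victim["dirty"]:
                    self.writebacks += 1    # dirty victim goes back to disk
                frame = victim["frame"]
            entry = {"frame": frame, "dirty": False}
            self.resident[vpn] = entry      # "update the PTE"
        if write:
            entry["dirty"] = True
        return entry["frame"]

mem = PagedMemory(num_frames=2)
mem.access(0, write=True)   # fault, uses a free frame
mem.access(1)               # fault, uses the last free frame
mem.access(0)               # hit
mem.access(2)               # fault, evicts page 0, which is dirty
print(mem.faults, mem.writebacks)
```

Only the last access has to evict anything, and the write-back happens only because the victim was modified while resident.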
Multi‑Level Page Tables
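A single flat page table needs one PTE for every virtual page, whether or not the process uses it. For a 32‑bit address space with 4 KB pages and 4‑byte PTEs, that is 2^20 entries, or 4 MB per process; for a 64‑bit address space a flat table is impractical. Hierarchical page tables split the virtual page number (VPN) into multiple fields, each indexing one level of the table. Lower‑level tables are allocated only for the parts of the address space that are in use, which reduces memory consumption.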
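To make the saving concrete, here is a rough calculation for one possible layout. The parameters are assumptions rather than anything this article specifies: 32-bit addresses, 4 KiB pages, 4-byte PTEs, and a two-level table with 10 index bits per level.

```python
ADDRESS_BITS = 32
PAGE_SHIFT = 12      # assumption: 4 KiB pages
PTE_BYTES = 4        # assumption: 4-byte page table entries
LEVEL_BITS = 10      # assumption: 10 index bits per level

def flat_table_bytes():
    """Size of a single-level table covering the whole address space."""
    return (1 << (ADDRESS_BITS - PAGE_SHIFT)) * PTE_BYTES

def two_level_bytes(mapped_vpns):
    """Size of a two-level table that only allocates the level-2 tables in use."""
    table_bytes = (1 << LEVEL_BITS) * PTE_BYTES
    level2_tables = {vpn >> LEVEL_BITS for vpn in mapped_vpns}
    return table_bytes + len(level2_tables) * table_bytes

def vpn_fields(vaddr):
    """Split a virtual address into its (level-1 index, level-2 index)."""
    vpn = vaddr >> PAGE_SHIFT
    return vpn >> LEVEL_BITS, vpn & ((1 << LEVEL_BITS) - 1)

# A sparse process: a few pages at the bottom and one at the very top
mapped = [0, 1, 2, 0xFFFFF]
print(flat_table_bytes())        # bytes for the flat table
print(two_level_bytes(mapped))   # bytes for the sparse two-level table
print(vpn_fields(0x00403004))
```

With these numbers the flat table costs 4 MiB, while the sparse two-level table needs one top-level table plus two second-level tables, 12 KiB in total.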
Address Translation Process
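An n‑bit virtual address consists of a virtual page number (VPN) in the high n - p bits and a virtual page offset (VPO) in the low p bits. The MMU uses the VPN to locate the PTE, extracts the physical page number (PPN), and concatenates it with the VPO to form the physical address; the offset is unchanged because virtual and physical pages are the same size. With a k‑level page table, the VPN is split into k fields and the MMU reads one PTE per level, so a full table walk takes k memory accesses.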
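The split-and-concatenate step is plain bit arithmetic. A minimal sketch, assuming 4 KiB pages (p = 12) and using a hypothetical dictionary in place of a real page table:

```python
PAGE_SHIFT = 12                      # assumption: 4 KiB pages, so p = 12
PAGE_SIZE = 1 << PAGE_SHIFT
OFFSET_MASK = PAGE_SIZE - 1

def split(vaddr):
    """Return (VPN, VPO) for a virtual address."""
    return vaddr >> PAGE_SHIFT, vaddr & OFFSET_MASK

def translate(vaddr, page_table):
    """Map a virtual address to a physical one via a VPN -> PPN dict."""
    vpn, vpo = split(vaddr)
    ppn = page_table[vpn]            # a KeyError here stands in for a page fault
    return (ppn << PAGE_SHIFT) | vpo

# Hypothetical mapping: virtual page 0x12 lives in physical page 0x5
example_table = {0x12: 0x5}
paddr = translate(0x12ABC, example_table)
print(hex(paddr))
```

The offset 0xABC passes through untouched; only the page number changes.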
TLB (Translation Lookaside Buffer)
To avoid frequent memory accesses for PTEs, the CPU caches recent PTEs in a TLB. On a TLB hit, translation is fast; on a miss, the required PTE is fetched from memory and stored in the TLB.
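A TLB behaves like a small cache keyed by virtual page number. The sketch here models it in software with LRU eviction; the `TLB` class, its capacity, and the LRU policy are illustrative assumptions, since hardware TLBs differ in size, associativity, and replacement policy.

```python
from collections import OrderedDict

class TLB:
    """Toy translation cache in front of a VPN -> PPN page table."""

    def __init__(self, capacity, page_table):
        self.capacity = capacity
        self.page_table = page_table
        self.entries = OrderedDict()    # vpn -> ppn, least recently used first
        self.hits = 0
        self.misses = 0

    def lookup(self, vpn):
        if vpn in self.entries:         # TLB hit: no page-table access needed
            self.hits += 1
            self.entries.move_to_end(vpn)
            return self.entries[vpn]
        self.misses += 1
        ppn = self.page_table[vpn]      # slow path: fetch the PTE from memory
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)    # evict least recently used
        self.entries[vpn] = ppn
        return ppn

tlb = TLB(capacity=2, page_table={0: 7, 1: 3, 2: 9})
results = [tlb.lookup(vpn) for vpn in (0, 1, 0, 2, 1)]
print(results, tlb.hits, tlb.misses)
```

Every lookup returns the same PPN the page table would; the TLB only changes how often the slow path runs.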
Linux Virtual‑Memory System
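Linux gives each process a separate virtual address space divided into a kernel region and a user region; the user region holds code, data, heap, shared libraries, and stack. For each process the kernel maintains an mm_struct that describes the overall state of the address space, plus a set of vm_area_struct descriptors, one per region. These descriptors were traditionally kept in a linked list; recent kernels (6.1 and later) index them with a maple tree.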
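One job of the region descriptors is answering "which region, if any, contains this address?", which is how a fault handler can tell a legitimate access from an invalid one. A simplified model of that lookup follows; the addresses, the tuple layout, and the `lookup_region` helper are invented for illustration and do not mirror the kernel's actual data structures or functions.

```python
from bisect import bisect_right

# Hypothetical regions as (start, end, name), sorted by start; end is exclusive
regions = [
    (0x00400000, 0x00452000, "code"),
    (0x00651000, 0x00654000, "data"),
    (0x01A00000, 0x01A21000, "heap"),
]
starts = [start for start, _end, _name in regions]

def lookup_region(addr):
    """Return the name of the region containing addr, or None if unmapped."""
    i = bisect_right(starts, addr) - 1      # last region starting at or below addr
    if i >= 0:
        _start, end, name = regions[i]
        if addr < end:
            return name
    return None

print(lookup_region(0x01A00010))   # inside the heap region
print(lookup_region(0x00500000))   # in a gap between regions
```

An address that falls in a gap returns None, the case in which a real system would deliver a segmentation fault.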
Memory Mapping
Linux can map a virtual region to a regular file (file‑backed) or to an anonymous file, in which case the kernel supplies zero‑filled pages. Mapping is lazy: physical pages are allocated only when the process first accesses an address in the region.
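From user space, an anonymous mapping can be requested through Python's standard `mmap` module. The sketch shows the zero-fill behaviour; the lazy allocation itself is not visible from this code, which only observes the contents.

```python
import mmap

length = 4 * mmap.PAGESIZE           # four pages of virtual address space
region = mmap.mmap(-1, length)       # fileno -1 requests an anonymous mapping

untouched = bytes(region[:8])        # never written, so it reads back as zeros
region[:5] = b"hello"                # first write to the first page
written = bytes(region[:5])
region.close()

print(untouched, written)
```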
Shared Objects
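An object can be mapped as shared or private. With a shared mapping, multiple processes map the same physical pages and each one sees the others' writes. With a private mapping, the processes also start out sharing the same physical pages, but the first write to a page triggers copy‑on‑write: the kernel copies that page for the writing process, so the change stays invisible to the others.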
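The private, copy-on-write case can be observed with Python's `mmap.ACCESS_COPY` mode, which is documented to change memory without updating the underlying file. A small sketch using a temporary file as the mapped object:

```python
import mmap
import tempfile

with tempfile.TemporaryFile() as f:
    f.write(b"original")
    f.flush()

    # Private mapping of the whole file: writes go to a private copy
    private = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY)
    private[:8] = b"modified"
    view = bytes(private[:8])        # what this mapping sees

    f.seek(0)
    on_disk = f.read(8)              # what the file still contains
    private.close()

print(view, on_disk)
```

The mapping sees its own write while the file is unchanged. A shared mapping (`mmap.ACCESS_WRITE`) would instead write the change through to the file.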
Dynamic Memory Allocation
The heap is managed as a sequence of allocated and free blocks. Allocation strategies include first‑fit, next‑fit, and best‑fit, often organized into size‑class free lists (segregated storage) to speed up searches.
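The difference between placement policies is easiest to see on a small free list. This sketch covers only first-fit and best-fit over a list of `(start, size)` tuples; the representation and helper names are assumptions for illustration, and next-fit and segregated lists are omitted.

```python
def first_fit(free_blocks, request):
    """Index of the first free block large enough, or None."""
    for i, (_start, size) in enumerate(free_blocks):
        if size >= request:
            return i
    return None

def best_fit(free_blocks, request):
    """Index of the smallest free block large enough, or None."""
    candidates = [(size, i) for i, (_start, size) in enumerate(free_blocks)
                  if size >= request]
    return min(candidates)[1] if candidates else None

def allocate(free_blocks, request, policy):
    """Carve `request` bytes out of the chosen block; return its start address."""
    i = policy(free_blocks, request)
    if i is None:
        return None
    start, size = free_blocks[i]
    if size == request:
        del free_blocks[i]                               # block fully consumed
    else:
        free_blocks[i] = (start + request, size - request)   # split the block
    return start

free_list = [(0, 64), (100, 16), (200, 32)]
print(allocate(list(free_list), 16, first_fit))   # takes from the 64-byte block
print(allocate(list(free_list), 16, best_fit))    # takes the exact 16-byte block
```

First-fit stops at the earliest block that works and splits it, while best-fit scans the whole list to find the tightest match. That full scan is the search cost that size-class free lists aim to reduce.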
Fragmentation
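Internal fragmentation occurs when an allocated block is larger than the requested payload, for example because of alignment padding or a minimum block size. External fragmentation occurs when the total free space is large enough to satisfy a request but is split into blocks that are each too small.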
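Both kinds of fragmentation reduce to simple arithmetic. A sketch assuming a hypothetical allocator that rounds every block up to a 16-byte multiple and ignores header overhead:

```python
ALIGNMENT = 16   # assumption: blocks are rounded up to a multiple of 16 bytes

def block_size(payload):
    """Block size the allocator would hand out for a payload."""
    return (payload + ALIGNMENT - 1) // ALIGNMENT * ALIGNMENT

def internal_fragmentation(payload):
    """Bytes inside the allocated block that the payload does not use."""
    return block_size(payload) - payload

def externally_fragmented(free_sizes, request):
    """True when total free space suffices but no single block does."""
    return sum(free_sizes) >= request and max(free_sizes, default=0) < request

print(internal_fragmentation(13))                # padding bytes wasted
print(externally_fragmented([16, 16, 16], 32))   # 48 bytes free, none contiguous
```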
Garbage Collection
Automatic memory management uses techniques such as reference counting or reachability analysis, with algorithms like mark‑sweep, mark‑compact, copying, and generational collection to reclaim unused heap objects.
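As one instance of reachability analysis, mark-sweep can be sketched over an object graph. The heap here is a set of string labels and `references` is a hypothetical adjacency map; a real collector walks actual object headers and pointers.

```python
def mark(roots, references):
    """Return every object reachable from the roots."""
    marked = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj in marked:
            continue
        marked.add(obj)
        stack.extend(references.get(obj, ()))
    return marked

def sweep(heap, marked):
    """Return the unmarked objects, which are the ones to reclaim."""
    return {obj for obj in heap if obj not in marked}

heap = {"a", "b", "c", "d", "e"}
references = {"a": ["b"], "b": ["c"], "d": ["e"], "e": ["d"]}
roots = ["a"]

live = mark(roots, references)
garbage = sweep(heap, live)
print(sorted(live), sorted(garbage))
```

Objects `d` and `e` reference each other but are unreachable from any root, so they are reclaimed. Pure reference counting would leave such a cycle in place because neither count ever drops to zero.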
Summary
Virtual memory abstracts physical memory, requiring cooperation between CPU, MMU, and OS. Address translation involves TLB caching, page‑table walks, and handling page faults. Linux implements these concepts with per‑process page tables, memory‑mapped regions, and dynamic allocation mechanisms, while modern languages add automatic garbage collection to manage heap memory.
Liangxu Linux
Liangxu, a self‑taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge—fundamentals, applications, tools, plus Git, databases, Raspberry Pi, etc. (Reply “Linux” to receive essential resources.)
