How Virtual Memory Works: From CPU Addressing to Linux Implementation

This article explains the concepts and mechanisms of virtual memory, covering CPU virtual addressing, page tables, TLB caching, page faults, multi‑level page tables, Linux's memory‑mapping structures, and dynamic memory allocation, including fragmentation and garbage collection.

Liangxu Linux

May 16, 2023

Overview

Processes share CPU and memory, so operating systems need robust memory‑management mechanisms. Virtual memory provides each process with a private, contiguous address space, simplifying programming and protecting processes from each other's memory.

CPU Addressing

With physical addressing, the CPU places a physical address directly on the memory bus. Modern CPUs instead use virtual addressing: the CPU issues a virtual address, and the Memory Management Unit (MMU) translates it to a physical address using page tables maintained by the operating system.

Page Tables

Virtual memory is divided into fixed‑size virtual pages (VP) of size P=2^p bytes, mirrored by physical pages (PP) of the same size. The page table, stored in RAM, maps each VP to a PP via Page Table Entries (PTEs). A PTE’s valid bit indicates whether the virtual page is cached in physical memory.

Page Hit

If the PTE’s valid bit is 1, the virtual page is already in RAM and the MMU obtains the physical address directly.

Page Fault

If the valid bit is 0, the MMU raises a page‑fault exception that transfers control to the kernel. The fault handler finds a physical page to use, evicting a victim page when no free page is available (and writing it back to disk first if it is dirty). It then loads the required page from disk, updates the PTE, and restarts the faulting instruction, which now translates successfully.
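The fault-handling sequence above can be sketched as a small simulation. This is a toy model, not kernel code: the class name `DemandPager`, its fields, and the FIFO victim policy are all assumptions made for illustration (the text does not say how the victim is chosen).

```python
from collections import OrderedDict

class DemandPager:
    """Toy demand pager with FIFO victim selection (an assumed policy)."""

    def __init__(self, num_frames):
        self.free_frames = list(range(num_frames))
        # vpn -> {"frame": int, "dirty": bool}, kept in load order
        self.resident = OrderedDict()
        self.faults = 0
        self.writebacks = 0

    def access(self, vpn, write=False):
        entry = self.resident.get(vpn)
        if entry is None:                      # valid bit is 0: page fault
            self.faults += 1
            if self.free_frames:
                frame = self.free_frames.pop(0)
            else:
                # No free frame: evict the page that was loaded first.
                _, victim = self.resident.popitem(last=False)
                if victim["dirty"]:
                    self.writebacks += 1       # write the victim back to disk
                frame = victim["frame"]
            entry = {"frame": frame, "dirty": False}
            self.resident[vpn] = entry         # "update the PTE"
        if write:
            entry["dirty"] = True
        return entry["frame"]
```

With two frames, touching pages 1, 2, 1, 3 in that order causes three faults, and the last one evicts page 1 because it was loaded first.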

Multi‑Level Page Tables

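A single flat page table scales poorly with the size of the address space: with 32‑bit addresses, 4 KB pages, and 4‑byte PTEs it needs 2^20 entries, or 4 MB per process, and a 64‑bit address space makes a flat table impractical. Hierarchical page tables split the VPN into multiple fields, each indexing one level of the table. Lower‑level tables are created only for regions that are actually in use, which reduces memory consumption.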
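A little arithmetic shows where the saving comes from. The sketch below assumes a 32-bit address, 4 KiB pages, 4-byte PTEs, and a 10/10/12 split for the two-level case; the function names are illustrative.

```python
def flat_table_bytes(va_bits, page_bits, pte_bytes):
    """Size of a single-level table covering the whole address space."""
    return (1 << (va_bits - page_bits)) * pte_bytes

def two_level_table_bytes(l1_bits, l2_bits, pte_bytes, l2_tables_in_use):
    """Size of a two-level table when only some second-level tables exist."""
    l1_size = (1 << l1_bits) * pte_bytes
    l2_size = (1 << l2_bits) * pte_bytes
    return l1_size + l2_tables_in_use * l2_size

flat = flat_table_bytes(32, 12, 4)             # 4 MiB, always resident
sparse = two_level_table_bytes(10, 10, 4, 3)   # 16 KiB for three mapped 4 MiB regions
```

The two-level layout only approaches the flat table's cost when nearly the whole address space is mapped, which is rare for a typical process.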

Address Translation Process

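An n‑bit virtual address consists of a virtual page number (VPN) in its upper n − p bits and a virtual page offset (VPO) in its lower p bits. The MMU uses the VPN to locate the appropriate PTE, extracts the physical page number (PPN), and concatenates it with the VPO to form the physical address. With a k‑level page table, the VPN is split into k fields and the MMU reads one PTE per level, so a full walk touches k PTEs.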
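The translation steps can be sketched in a few lines. This is a simplified model under stated assumptions: p = 12 (4 KiB pages), a page table represented as a dictionary that holds only valid pages, and a nested dictionary standing in for a multi-level table. The names `split`, `translate`, and `walk` are illustrative.

```python
PAGE_BITS = 12                      # assumption: p = 12, i.e. 4 KiB pages
PAGE_SIZE = 1 << PAGE_BITS

def split(va):
    """Return (VPN, VPO) for a virtual address."""
    return va >> PAGE_BITS, va & (PAGE_SIZE - 1)

def translate(va, page_table):
    """Single-level lookup: page_table maps VPN -> PPN for valid pages."""
    vpn, vpo = split(va)
    if vpn not in page_table:
        raise LookupError(f"page fault: VPN {vpn:#x} is not resident")
    return (page_table[vpn] << PAGE_BITS) | vpo

def walk(va, root, level_bits=(10, 10)):
    """k-level walk: root is a nested dict, one dict per level, leaves are PPNs."""
    vpn, vpo = split(va)
    indices = []
    for bits in reversed(level_bits):        # peel fields off the low end of the VPN
        indices.append(vpn & ((1 << bits) - 1))
        vpn >>= bits
    node = root
    for index in reversed(indices):          # visit the highest-order field first
        if index not in node:
            raise LookupError("page fault during table walk")
        node = node[index]
    return (node << PAGE_BITS) | vpo
```

For example, virtual address 0x12345 has VPN 0x12 and VPO 0x345; if that page maps to PPN 0x7, both functions produce physical address 0x7345.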

TLB (Translation Lookaside Buffer)

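To avoid an extra memory access for every PTE lookup, the MMU caches recently used PTEs in a small hardware cache called the TLB. On a TLB hit, translation completes without reading the page table in memory; on a miss, the required PTE is fetched from memory and stored in the TLB, possibly replacing an existing entry.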
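The hit and miss paths can be modeled with a small cache in front of a page table. This sketch assumes a fully associative TLB with least-recently-used replacement; real hardware differs in organization and replacement policy, and the class name `TLB` and its counters are illustrative.

```python
from collections import OrderedDict

class TLB:
    """Toy TLB: a bounded VPN -> PPN cache with LRU replacement (assumed)."""

    def __init__(self, capacity, page_table):
        self.capacity = capacity
        self.page_table = page_table      # stands in for the table in RAM
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def lookup(self, vpn):
        if vpn in self.entries:
            self.hits += 1
            self.entries.move_to_end(vpn)     # mark as most recently used
            return self.entries[vpn]
        self.misses += 1
        ppn = self.page_table[vpn]            # slow path: read the PTE from memory
        self.entries[vpn] = ppn
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # drop the least recently used entry
        return ppn
```

Repeated accesses to the same page hit in the cache, which is why programs with good locality pay the page-table cost only occasionally.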

Linux Virtual‑Memory System

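Linux gives each process a separate virtual address space divided into a kernel region and a user region; the user region holds the code, data, heap, shared libraries, and stack. For each process the kernel maintains an mm_struct, which describes the overall state of the address space, and a set of vm_area_struct objects, one per region. These were classically organized as a linked list; recent kernels store them in a maple tree.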
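The role of the per-region descriptors can be illustrated with a simplified lookup: given a faulting address, find the region that contains it, or conclude that the access is invalid. This is a conceptual sketch, not the kernel's data structure or API; `Area` and `find_area` are hypothetical names.

```python
from bisect import bisect_right
from dataclasses import dataclass

@dataclass
class Area:
    start: int   # first address in the region (inclusive)
    end: int     # one past the last address (exclusive)
    prot: str    # permissions, e.g. "r-x"
    name: str

def find_area(areas, addr):
    """Return the area containing addr, or None.

    areas must be sorted by start address and must not overlap.
    """
    starts = [area.start for area in areas]
    i = bisect_right(starts, addr) - 1
    if i >= 0 and addr < areas[i].end:
        return areas[i]
    return None   # no region covers addr: the access would be a fault

areas = [
    Area(0x400000, 0x401000, "r-x", "code"),
    Area(0x600000, 0x602000, "rw-", "data"),
]
```

An address that falls in the gap between two regions returns None, which corresponds to the case where the kernel would treat the access as illegal.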

Memory Mapping

Linux can map a virtual region to a file (file‑backed) or to an anonymous zero‑filled region. Mapping is lazy: physical pages are allocated only when the process first accesses the virtual address.
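An anonymous mapping can be requested from user space with Python's standard `mmap` module, which wraps the underlying system call. The function name and the four-page length below are arbitrary choices for illustration. The lazy allocation itself is not observable from this code; it only shows that the region reads as zeros before anything is written.

```python
import mmap

def anonymous_mapping_demo(length=mmap.PAGESIZE * 4):
    """Create an anonymous mapping, read it, then write to it."""
    # fileno -1 requests memory that is not backed by any file.
    with mmap.mmap(-1, length) as region:
        before = bytes(region[:8])        # untouched anonymous memory reads as zeros
        region[:5] = b"hello"             # first write touches the first page
        after = bytes(region[:5])
    return before, after
```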

Shared Objects

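Memory‑mapped shared objects let multiple processes map the same physical pages. Writes to a shared mapping are visible to every process that maps the object. A private mapping uses copy‑on‑write instead: the pages stay shared until a process writes to one, at which point that process receives its own copy of the page.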
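The effect of a private mapping can be observed with Python's `mmap` module and its `ACCESS_COPY` mode, in which writes affect the in-memory view but not the underlying file. The function name and the temporary file are illustrative scaffolding.

```python
import mmap
import os
import tempfile

def private_write_demo():
    """Write through a private (copy-on-write) mapping and compare with the file."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"original")
        path = f.name
    try:
        with open(path, "r+b") as f:
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY) as view:
                view[:8] = b"modified"        # changes only this mapping's copy
                in_memory = bytes(view[:8])
        with open(path, "rb") as f:
            on_disk = f.read()                # the file itself is untouched
    finally:
        os.remove(path)
    return in_memory, on_disk
```

The mapping sees the new bytes while the file on disk keeps its original contents, which is the behavior copy-on-write provides for private writes.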

Dynamic Memory Allocation

The heap is managed as a sequence of allocated and free blocks. Placement policies for choosing a free block include first‑fit, next‑fit, and best‑fit. Allocators often keep separate free lists per size class (segregated storage) to speed up the search.
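A first-fit policy can be sketched with a free list of (offset, length) pairs. This is a deliberately minimal model: it has no headers, no alignment, and coalesces adjacent free blocks by sorting the whole list. The class and method names are illustrative.

```python
class FirstFitHeap:
    """Toy allocator over a range of `size` bytes using first-fit placement."""

    def __init__(self, size):
        self.free_list = [(0, size)]   # (offset, length), sorted by offset
        self.allocated = {}            # offset -> length

    def malloc(self, size):
        for i, (offset, length) in enumerate(self.free_list):
            if length >= size:                      # first block that fits
                if length == size:
                    del self.free_list[i]
                else:                               # split: keep the remainder free
                    self.free_list[i] = (offset + size, length - size)
                self.allocated[offset] = size
                return offset
        return None                                 # no single block is large enough

    def free(self, offset):
        length = self.allocated.pop(offset)
        self.free_list.append((offset, length))
        self.free_list.sort()
        merged = []
        for off, ln in self.free_list:              # coalesce adjacent free blocks
            if merged and merged[-1][0] + merged[-1][1] == off:
                merged[-1] = (merged[-1][0], merged[-1][1] + ln)
            else:
                merged.append((off, ln))
        self.free_list = merged
```

Best-fit would differ only in the search loop, scanning the whole list for the smallest block that fits instead of stopping at the first one.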

Fragmentation

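Internal fragmentation occurs when an allocated block is larger than the requested payload, for example because of alignment padding or allocator bookkeeping. External fragmentation occurs when there is enough free memory in aggregate, but it is split into blocks that are each too small to satisfy a request.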
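Both kinds of fragmentation can be made concrete with a small calculation. The 16-byte alignment and 8-byte header below are assumed values chosen for illustration, not properties of any particular allocator.

```python
def internal_fragmentation(payload, alignment=16, header=8):
    """Bytes in the block that are not payload (header plus padding)."""
    units = (payload + header + alignment - 1) // alignment   # round up
    block = units * alignment
    return block - payload

def can_satisfy(free_blocks, request):
    """A request needs one contiguous free block at least as large as itself."""
    return any(size >= request for size in free_blocks)

free_blocks = [8, 16]           # 24 bytes free in total, in two separate pieces
wasted = internal_fragmentation(20)
fits = can_satisfy(free_blocks, 24)
```

A 20-byte request occupies a 32-byte block under these assumptions, so 12 bytes are lost to internal fragmentation. The 24-byte request fails even though 24 bytes are free, which is external fragmentation.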

Garbage Collection

Automatic memory management uses techniques such as reference counting or reachability analysis, with algorithms like mark‑sweep, mark‑compact, copying, and generational collection to reclaim unused heap objects.
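Reachability-based collection can be sketched with a mark-sweep pass over a toy heap. The heap here is a dictionary from object id to the ids it references, which is an assumed representation for illustration only; a real collector works on raw memory blocks.

```python
def mark_sweep(heap, roots):
    """Reclaim every object in heap that is not reachable from roots.

    heap: dict mapping object id -> list of referenced object ids.
    Returns the set of reclaimed ids and removes them from heap.
    """
    marked = set()
    stack = list(roots)
    while stack:                          # mark phase: traverse from the roots
        obj = stack.pop()
        if obj in marked or obj not in heap:
            continue
        marked.add(obj)
        stack.extend(heap[obj])
    garbage = set(heap) - marked          # sweep phase: everything left unmarked
    for obj in garbage:
        del heap[obj]
    return garbage

heap = {"a": ["b"], "b": [], "c": ["d"], "d": ["c"]}
reclaimed = mark_sweep(heap, roots=["a"])
```

Objects "c" and "d" reference each other but are unreachable from the root, so they are reclaimed. Plain reference counting would not free such a cycle, which is one reason tracing collectors are used.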

Summary

Virtual memory abstracts physical memory, requiring cooperation between CPU, MMU, and OS. Address translation involves TLB caching, page‑table walks, and handling page faults. Linux implements these concepts with per‑process page tables, memory‑mapped regions, and dynamic allocation mechanisms, while modern languages add automatic garbage collection to manage heap memory.


