How Virtual Memory Works: From CPU Addressing to Linux Implementation

Virtual memory abstracts physical memory and gives each process a private, contiguous address space. This article covers CPU virtual addressing, MMU translation, page tables, TLB caching, and multi-level paging, then Linux's memory mapping, dynamic allocation, and garbage collection, which build on that abstraction to manage memory efficiently and isolate processes.

Open Source Linux

May 22, 2023

Overview

Virtual memory provides each process with a consistent, private address space, creating the illusion that the process has main memory to itself. It is more than a way to extend RAM with disk space: it defines a contiguous virtual address space that simplifies programming and protects processes from one another.

It caches active regions of a virtual address space in main memory, loading pages on demand.

It gives each process a uniform address space, reducing programmer burden.

It prevents one process from corrupting another's address space.

The following sections describe how virtual memory operates in hardware and how Linux implements it.

CPU Addressing

Physical memory is an array of M bytes, each with a unique physical address (PA). The simplest method is physical addressing, where the CPU uses PA directly.

Modern processors use virtual addressing: the CPU issues a virtual address (VA), which must be translated to a physical address before memory is accessed.

The translation is performed by the Memory Management Unit (MMU), which uses page tables stored in memory.

Page Table

Virtual memory is divided into fixed‑size virtual pages (VP) of P=2^p bytes, matching the size of physical pages (PP).

The page table, residing in physical memory, maps each virtual page to a physical page. Each entry (PTE) contains a valid bit indicating whether the page is cached in RAM.

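When a program allocates memory (e.g., via malloc() or new), the allocator may ask the OS to extend the process's virtual address space. In the simplified model used here, the OS reserves backing space on disk for the new virtual page and adds a PTE that points to it; no physical page is assigned until the page is first accessed. (On Linux, such anonymous memory is zero-filled on first touch and reaches disk only if it is later swapped out.)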

Additional permission bits can be added to PTEs; violations trigger a protection fault (segmentation fault).
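
The permission check can be sketched as follows. This is a minimal illustration, not how any particular MMU encodes its bits: the PTE fields (readable, writable, user) and the ProtectionFault exception are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


class ProtectionFault(Exception):
    """Raised when an access violates the PTE permission bits."""


@dataclass
class PTE:
    # Field names are illustrative assumptions, not a real hardware layout.
    valid: bool
    ppn: int
    readable: bool = True
    writable: bool = False
    user: bool = True  # False means the page is kernel-only


def check_access(pte: PTE, write: bool, user_mode: bool) -> None:
    """Raise ProtectionFault if the requested access is not permitted."""
    if user_mode and not pte.user:
        raise ProtectionFault("user-mode access to a kernel-only page")
    if write and not pte.writable:
        raise ProtectionFault("write to a read-only page")
    if not write and not pte.readable:
        raise ProtectionFault("read from a non-readable page")


# A read-only code page: reads pass, writes would raise ProtectionFault.
code_page = PTE(valid=True, ppn=7, readable=True, writable=False)
check_access(code_page, write=False, user_mode=True)
```

On Linux, a user process that triggers such a fault typically receives SIGSEGV, which is why the failure is reported as a segmentation fault.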

Page Hit

If the PTE’s valid bit is 1, the virtual page is cached in RAM and the MMU obtains the physical address.

Page Fault

If the valid bit is 0, the access triggers a page fault and control passes to the OS. If no physical page is free, the OS selects a victim page and writes it back to disk if it is dirty. It then loads the required virtual page into RAM, updates the PTE, and restarts the faulting instruction.

This on‑demand loading is called demand paging.
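
A small simulation can make the hit and fault paths concrete. The sketch below assumes least-recently-used (LRU) victim selection, which the text does not specify; the DemandPager class and its counters are hypothetical names for illustration only.

```python
from collections import OrderedDict


class DemandPager:
    """Toy demand pager. LRU replacement is an assumption of this sketch."""

    def __init__(self, num_frames):
        self.num_frames = num_frames
        self.resident = OrderedDict()  # vpn -> dirty flag, least recent first
        self.faults = 0
        self.writebacks = 0

    def access(self, vpn, write=False):
        if vpn in self.resident:  # page hit: valid bit is 1
            self.resident.move_to_end(vpn)
            self.resident[vpn] = self.resident[vpn] or write
            return "hit"
        self.faults += 1  # page fault: valid bit is 0
        if len(self.resident) >= self.num_frames:
            # No free frame: evict the least recently used page.
            _victim, dirty = self.resident.popitem(last=False)
            if dirty:
                self.writebacks += 1  # dirty victim goes back to disk
        self.resident[vpn] = write  # page is loaded, PTE now valid
        return "fault"


pager = DemandPager(num_frames=2)
trace = [(0, False), (1, True), (0, False), (2, False), (1, False)]
results = [pager.access(vpn, write) for vpn, write in trace]
```

With two frames, only the third access in this trace is a hit; evicting the dirty page 1 costs one write-back.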

Multi‑Level Page Tables

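For large address spaces, a single flat page table is wasteful: with 4 KB pages and 4-byte PTEs, a 32-bit space needs 2^20 entries (4 MB) per process, and a 64-bit space would need far more, even though most of the address space is unused. Hierarchical page tables split the virtual page number (VPN) into multiple fields, each indexing one level of the table.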

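In a two-level scheme for a 32-bit space, each of the 1024 first-level entries covers a 4 MB chunk and points to a second-level table, whose 1024 entries each map one 4 KB page. Second-level tables are needed only for chunks that are actually in use.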
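
The two-level layout can be illustrated with a little bit arithmetic. The sketch assumes 4-byte PTEs (the text gives only the entry count and page size), and split_two_level is a hypothetical helper name.

```python
PAGE_SHIFT = 12   # 4 KB pages -> 12 offset bits
LEVEL_BITS = 10   # 1024 entries per table -> 10 index bits per level
PTE_SIZE = 4      # bytes per entry; an assumption of this sketch


def split_two_level(vaddr):
    """Split a 32-bit virtual address into (level-1 index, level-2 index, offset)."""
    offset = vaddr & ((1 << PAGE_SHIFT) - 1)
    l2 = (vaddr >> PAGE_SHIFT) & ((1 << LEVEL_BITS) - 1)
    l1 = vaddr >> (PAGE_SHIFT + LEVEL_BITS)
    return l1, l2, offset


# A flat table needs one PTE for every 4 KB page in the 4 GB space.
flat_table_bytes = (1 << (32 - PAGE_SHIFT)) * PTE_SIZE

# A process using a single 4 MB chunk needs only the first-level table
# plus one second-level table.
two_level_bytes = 2 * (1 << LEVEL_BITS) * PTE_SIZE

l1, l2, offset = split_two_level(0x00403004)
```

Under these assumptions the flat table costs 4 MB, while the sparse two-level layout costs 8 KB.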

Address Translation Process

A virtual address consists of a virtual page number (VPN) and a page offset (VPO). The MMU uses the VPN to locate the appropriate PTE, extracts the physical page number (PPN), and concatenates it with the VPO to form the physical address.
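
As a rough sketch, single-level translation with 4 KB pages looks like this. The dictionary-based page table and the PageFault exception are illustrative assumptions; real hardware reads PTEs from physical memory.

```python
PAGE_SHIFT = 12                       # 4 KB pages (assumed for the example)
PAGE_MASK = (1 << PAGE_SHIFT) - 1


class PageFault(Exception):
    """Raised when the PTE is missing or its valid bit is 0."""


def translate(page_table, vaddr):
    """Translate a virtual address using a {vpn: {"valid", "ppn"}} table."""
    vpn = vaddr >> PAGE_SHIFT         # virtual page number
    vpo = vaddr & PAGE_MASK           # page offset, unchanged by translation
    entry = page_table.get(vpn)
    if entry is None or not entry["valid"]:
        raise PageFault(hex(vaddr))
    return (entry["ppn"] << PAGE_SHIFT) | vpo   # concatenate PPN and offset


page_table = {
    0x5: {"valid": True, "ppn": 0x2A},
    0x6: {"valid": False, "ppn": 0},
}
paddr = translate(page_table, 0x5123)
```

Here virtual address 0x5123 lies in virtual page 0x5 at offset 0x123, so it maps to physical address 0x2A123.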

With k‑level paging, the VPN is split into k parts, each indexing a level of the page table.
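
The k-way split generalizes naturally. The sketch below uses nested dictionaries as stand-ins for the page-table levels, and the example layout (four 9-bit levels with a 12-bit offset) is an assumption chosen to resemble common 64-bit configurations; split_vpn and walk are hypothetical names.

```python
def split_vpn(vaddr, bits_per_level, page_shift):
    """Return ([index_1, ..., index_k], offset), top level first."""
    offset = vaddr & ((1 << page_shift) - 1)
    vpn = vaddr >> page_shift
    indices = []
    for bits in reversed(bits_per_level):   # peel off the lowest level first
        indices.append(vpn & ((1 << bits) - 1))
        vpn >>= bits
    indices.reverse()
    return indices, offset


def walk(root, vaddr, bits_per_level, page_shift):
    """Walk nested dicts level by level; the last level stores the PPN."""
    indices, offset = split_vpn(vaddr, bits_per_level, page_shift)
    node = root
    for idx in indices:
        node = node.get(idx)
        if node is None:
            return None                      # unmapped: would raise a page fault
    return (node << page_shift) | offset


LEVELS = [9, 9, 9, 9]                        # assumed 4-level layout
vaddr = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 0x5
root = {1: {2: {3: {4: 0x77}}}}
paddr = walk(root, vaddr, LEVELS, 12)
```

Only the tables along paths that are actually mapped need to exist, which is where the space saving comes from.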

TLB

Because accessing the page table in memory is costly, CPUs use a Translation Lookaside Buffer (TLB) to cache recent PTEs.

On a TLB hit, the MMU obtains the PTE directly; on a miss, it walks the page table in memory to fetch the PTE and caches it in the TLB.
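
The hit and miss paths can be modeled with a small cache in front of the page table. This sketch assumes a fully associative TLB with LRU replacement, which is a simplification; the TLB class and its counters are hypothetical.

```python
from collections import OrderedDict


class TLB:
    """Toy TLB: fully associative with LRU eviction (both are assumptions)."""

    def __init__(self, capacity, page_table):
        self.capacity = capacity
        self.page_table = page_table      # vpn -> ppn, stands in for memory
        self.entries = OrderedDict()      # cached translations, oldest first
        self.hits = 0
        self.misses = 0

    def lookup(self, vpn):
        if vpn in self.entries:           # TLB hit: no memory access needed
            self.hits += 1
            self.entries.move_to_end(vpn)
            return self.entries[vpn]
        self.misses += 1                  # TLB miss: walk the page table
        ppn = self.page_table[vpn]
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)   # evict least recently used
        self.entries[vpn] = ppn
        return ppn


tlb = TLB(capacity=2, page_table={0: 10, 1: 11, 2: 12})
ppns = [tlb.lookup(vpn) for vpn in [0, 0, 1, 2, 0]]
```

In this trace only the second lookup hits; the final lookup of page 0 misses because it was evicted when page 2 was cached.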

Linux Virtual Memory System

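Linux gives each process its own virtual address space, split into kernel and user regions. The kernel maintains an mm_struct describing the current state of that address space. It holds a pointer to the top-level page table (pgd) and a list of vm_area_struct entries, one per memory region. (Older kernels kept these in a linked list; recent kernels use a tree structure instead.) Each vm_area_struct records:

vm_start: start address of the region.

vm_end: end address of the region.

vm_prot: read/write protection flags for the pages in the region.

vm_flags: whether the pages are shared or private, plus other attributes.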
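
To illustrate how the kernel can use these regions when handling a fault, the sketch below looks up the area containing an address. It is a user-space model only: the VMA dataclass mirrors the field names above, lookup_vma is a hypothetical helper rather than a kernel function, and treating vm_end as exclusive is an assumption of the sketch.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class VMA:
    vm_start: int   # first address in the region
    vm_end: int     # first address past the region (exclusive, assumed)
    vm_prot: str    # e.g. "r-x"; string form is illustrative only
    vm_flags: str   # e.g. "private" or "shared"


def lookup_vma(vmas, addr):
    """Return the VMA containing addr, or None.

    vmas must be sorted by vm_start and must not overlap.
    """
    i = bisect_right([v.vm_start for v in vmas], addr) - 1
    if i >= 0 and addr < vmas[i].vm_end:
        return vmas[i]
    return None      # no region covers addr: the access is illegal


vmas = [
    VMA(0x400000, 0x401000, "r-x", "private"),   # code
    VMA(0x600000, 0x602000, "rw-", "private"),   # data
]
hit = lookup_vma(vmas, 0x400ABC)
miss = lookup_vma(vmas, 0x500000)
```

An address that falls in no region indicates an invalid access, whereas an address inside a region can be checked against that region's protection flags.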

Memory Mapping

Linux can map a virtual region to a file on disk (file‑backed mapping) or to an anonymous zero‑filled region. The mapping is created lazily: physical pages are allocated only when the process first accesses the virtual address.
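
Both kinds of mapping can be tried from user space. The sketch below uses Python's standard mmap module, which wraps the underlying system call; the temporary file and its contents are made up for the example.

```python
import mmap
import os
import tempfile

# Anonymous mapping: not backed by any file, and initially zero-filled.
anon = mmap.mmap(-1, mmap.PAGESIZE)
anon_initial = bytes(anon[:4])
anon[:5] = b"hello"

# File-backed mapping: create a small file, then map its full length.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"file-backed data")
    path = f.name

with open(path, "r+b") as f:
    mapped = mmap.mmap(f.fileno(), 0)   # length 0 maps the whole file
    first_word = bytes(mapped[:4])      # reading pulls the page in on demand
    mapped[:4] = b"FILE"                # writes go through the mapping
    mapped.flush()                      # push the change back to the file
    mapped.close()

with open(path, "rb") as f:
    on_disk = f.read()

os.remove(path)
anon.close()
```

The write through the mapping shows up in the file contents afterwards, without any explicit write() call on the file.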

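Under memory pressure, the kernel can reclaim mapped pages: anonymous pages are written to swap space (a swap partition or swap file), while pages of file-backed mappings are written back to their file if dirty or simply dropped if clean.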

Dynamic Memory Allocation

The heap is a region of virtual memory that the allocator manages as a collection of contiguous blocks of various sizes, each either allocated or free. Allocators track free blocks with free lists (implicit or explicit) and use placement strategies such as first-fit, next-fit, or best-fit to locate a free block for each request.
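
A first-fit allocator over an explicit free list might be sketched as below. It models the heap as byte offsets rather than real memory, omits headers and alignment, and the FirstFitHeap class is a hypothetical illustration, not the design of any real malloc.

```python
class FirstFitHeap:
    """Toy first-fit allocator over offsets in a fixed-size heap."""

    def __init__(self, size):
        self.free = [(0, size)]   # (offset, length), kept sorted by offset
        self.allocated = {}       # offset -> length of each live block

    def malloc(self, size):
        for i, (off, length) in enumerate(self.free):
            if length >= size:                 # first block that fits
                if length == size:
                    del self.free[i]
                else:                          # split: keep the remainder free
                    self.free[i] = (off + size, length - size)
                self.allocated[off] = size
                return off
        return None                            # no free block is large enough

    def free_block(self, off):
        size = self.allocated.pop(off)
        self.free.append((off, size))
        self.free.sort()
        merged = []                            # coalesce adjacent free blocks
        for o, length in self.free:
            if merged and merged[-1][0] + merged[-1][1] == o:
                merged[-1] = (merged[-1][0], merged[-1][1] + length)
            else:
                merged.append((o, length))
        self.free = merged


heap = FirstFitHeap(64)
a, b, c = heap.malloc(16), heap.malloc(16), heap.malloc(16)
heap.free_block(a)
heap.free_block(b)        # coalesces with a's block into one 32-byte block
d = heap.malloc(24)       # fits in the coalesced block at offset 0
```

Coalescing on free is what allows the 24-byte request to succeed; without it, the two adjacent 16-byte free blocks could not satisfy it.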

Segregated storage keeps multiple free‑list bins for different size classes to improve allocation speed.
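
The idea can be sketched with one free list per size class. The size classes below are arbitrary assumptions, and SegregatedBins is a hypothetical name; real allocators choose their classes and fallback behavior differently.

```python
SIZE_CLASSES = [16, 32, 64, 128, 256]   # assumed class limits, in bytes


def size_class(request):
    """Index of the smallest class that can hold the request, or None."""
    for i, limit in enumerate(SIZE_CLASSES):
        if request <= limit:
            return i
    return None                          # too large for any bin


class SegregatedBins:
    """One free list per size class; blocks are identified by address."""

    def __init__(self):
        self.bins = [[] for _ in SIZE_CLASSES]

    def release(self, addr, size):
        # Assumes size is one of the class sizes handed out by allocate().
        self.bins[size_class(size)].append(addr)

    def allocate(self, request):
        idx = size_class(request)
        if idx is None:
            return None
        # Search the matching bin first, then progressively larger bins.
        for i in range(idx, len(self.bins)):
            if self.bins[i]:
                return self.bins[i].pop(), SIZE_CLASSES[i]
        return None


bins = SegregatedBins()
bins.release(0x1000, 32)
bins.release(0x2000, 128)
first = bins.allocate(20)    # served from the 32-byte bin
second = bins.allocate(20)   # 32-byte bin is empty, falls back to 128
third = bins.allocate(20)    # nothing left
```

Because a request goes straight to the bin for its size class, the allocator avoids scanning blocks that are far too small or too large.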

Memory Fragmentation

Internal fragmentation occurs when an allocated block is larger than the requested payload; external fragmentation occurs when free space is split into many small blocks that cannot satisfy a request.
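
Both kinds of fragmentation can be quantified with simple arithmetic. The sketch assumes 16-byte alignment and an 8-byte block header, neither of which comes from the text, and counts everything in a block beyond the payload as internal fragmentation.

```python
def internal_fragmentation(requests, alignment=16, header=8):
    """Total bytes inside allocated blocks that are not payload.

    alignment and header are assumed values for this sketch.
    """
    wasted = 0
    for payload in requests:
        # Round payload + header up to the next multiple of alignment.
        block = -(-(payload + header) // alignment) * alignment
        wasted += block - payload
    return wasted


def externally_fragmented(free_blocks, request):
    """True if total free space suffices but no single block does."""
    return sum(free_blocks) >= request and max(free_blocks, default=0) < request


waste = internal_fragmentation([1, 24, 40])
stuck = externally_fragmented([16, 16, 16], 32)
fine = externally_fragmented([64], 32)
```

In the second check, 48 bytes are free in total, yet a 32-byte request fails because the largest free block is only 16 bytes.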

Garbage Collection

Languages with automatic memory management use garbage collectors to reclaim unreachable heap objects. Common techniques include reference counting and reachability analysis, with algorithms such as mark‑sweep, mark‑compact, copying, and generational collection.
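
Mark-sweep, the simplest of these algorithms, can be sketched over a toy object graph. The heap here is a dictionary from object IDs to the IDs they reference, which is an illustrative stand-in for real heap objects and pointers.

```python
def mark_sweep(heap, roots):
    """Collect unreachable objects from heap and return their IDs.

    heap maps each object ID to the list of object IDs it references.
    """
    # Mark phase: traverse everything reachable from the roots.
    marked = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj in marked or obj not in heap:
            continue
        marked.add(obj)
        stack.extend(heap[obj])

    # Sweep phase: free every object that was not marked.
    garbage = set(heap) - marked
    for obj in garbage:
        del heap[obj]
    return garbage


heap = {
    "a": ["b"],
    "b": [],
    "c": ["d"],   # c and d reference each other but nothing reaches them
    "d": ["c"],
    "e": ["a"],   # e points into live data but is itself unreachable
}
freed = mark_sweep(heap, roots=["a"])
```

The cycle between c and d is reclaimed because reachability ignores reference counts; plain reference counting would leave it in place.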

Summary

Virtual memory abstracts physical memory, giving each process a private address space. The CPU issues virtual addresses, and the MMU translates them via the TLB and page tables, handling hits, misses, and page faults. Linux builds on this with memory mapping for files and anonymous regions. On top of the resulting heap, allocators manage dynamic memory, and language runtimes use garbage collection to reclaim unreachable objects.

TLB hit → fast PTE retrieval.

TLB miss → page‑table walk.

Page fault → load page from disk, update PTE.

References

CS:APP3e, Bryant and O'Hallaron

Virtual memory – Wikipedia

Garbage collection (computer science) – Wikipedia
