Why Page Faults Occur and How MMU & TLB Resolve Them
This article explains the concepts of virtual and physical addresses, lazy memory allocation, the roles of the MMU, page tables, and TLB in address translation, and details the causes, classifications, and handling mechanisms of page faults in Linux systems.
1. Introduction
The author greets the readers and uses a sample interview question to introduce the main topic: the page‑fault exception.
2. Terminology
VA : Virtual Address
PA : Physical Address
MMU : Memory Management Unit
TLB : Translation Lookaside Buffer
PTE : Page Table Entry
3. Lazy Allocation of Memory
On a 32-bit Linux system each process sees a 4 GB virtual address space (part of which is reserved for the kernel), but only a small portion of it is backed by physical memory at any moment. The same principle applies to 64-bit systems, where the much larger virtual space is likewise divided into regions. A physical page is allocated only when the program first accesses the corresponding virtual page; this is the lazy (deferred) allocation mechanism.
1. The Linux virtual address space is a "blank check": large in size, but only a tiny fraction maps to real physical memory.
2. Lazy allocation improves memory and server utilization.
3. Quickly and accurately translating many virtual addresses to physical addresses is a challenging problem.
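Lazy allocation can be observed from user space. The following is a minimal sketch, assuming a Unix-like system where Python's resource module reports minor faults in ru_minflt. The helper name faults_while_touching is hypothetical, and the number it returns depends on the kernel, the page size, and features such as huge pages, so no specific value should be expected.

```python
import mmap
import resource

PAGE_SIZE = mmap.PAGESIZE  # page size reported by the OS, commonly 4096


def minor_faults() -> int:
    """Minor (soft) page faults charged to this process so far."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_minflt


def faults_while_touching(n_pages: int) -> int:
    """Map n_pages of anonymous memory, write one byte to each page,
    and return how many minor faults were recorded while doing so."""
    # Creating the mapping only reserves virtual addresses.
    region = mmap.mmap(-1, n_pages * PAGE_SIZE)
    before = minor_faults()
    for i in range(n_pages):
        region[i * PAGE_SIZE] = 1  # first touch of this page
    after = minor_faults()
    region.close()
    return after - before


if __name__ == "__main__":
    print("minor faults while touching 256 pages:",
          faults_while_touching(256))
```

On a typical Linux machine the reported count is roughly one fault per touched page, because each first write forces the kernel to back that page with a physical frame.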
4. How the CPU Retrieves Data
Programs running on the CPU issue virtual addresses, not physical ones. Translation is delegated to the MMU, a fast hardware unit that converts each virtual address into a physical address and checks access permissions before memory is touched.
4.1 MMU and Page Table
Each process has its own page table that stores its VA-to-PA mappings. When the MMU receives a virtual address, it looks up the page table to find the mapping and verify the access permissions. For a 4 GB address space with 4 KB pages, a single-level page table would contain 2²⁰ (about one million) entries, and every process would need that table as one large contiguous block of memory. Multi-level page tables reduce memory consumption, because lower-level tables are needed only for the regions actually in use, but each translation then takes more lookup steps.
1. Page‑table levels reduce contiguous memory requirements but add lookup overhead.
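The trade-off above can be made concrete with a little arithmetic. This is a sketch under stated assumptions: 4-byte page-table entries and a 10 + 10 + 12 bit split of a 32-bit virtual address, which is one common two-level layout and is not given in the article. The nested dictionaries are a hypothetical stand-in for the in-memory tables.

```python
PAGE_SHIFT = 12   # 4 KB pages -> 12 offset bits
LEVEL_BITS = 10   # assumed: 10 bits per level (10 + 10 + 12 = 32)
ENTRY_SIZE = 4    # assumed: bytes per page-table entry


def single_level_table_bytes(va_bits: int = 32) -> int:
    """Size of a flat page table covering the whole address space."""
    entries = 1 << (va_bits - PAGE_SHIFT)  # 2**20 for 32-bit VAs
    return entries * ENTRY_SIZE


def split_va(va: int) -> tuple[int, int, int]:
    """Split a 32-bit VA into (directory index, table index, offset)."""
    mask = (1 << LEVEL_BITS) - 1
    offset = va & ((1 << PAGE_SHIFT) - 1)
    table_index = (va >> PAGE_SHIFT) & mask
    dir_index = (va >> (PAGE_SHIFT + LEVEL_BITS)) & mask
    return dir_index, table_index, offset


def translate(page_directory: dict, va: int):
    """Walk the two-level table; return the PA, or None if unmapped."""
    dir_index, table_index, offset = split_va(va)
    page_table = page_directory.get(dir_index)
    if page_table is None:
        return None  # no second-level table exists for this region
    frame = page_table.get(table_index)
    if frame is None:
        return None  # the page has no mapping
    return (frame << PAGE_SHIFT) | offset


if __name__ == "__main__":
    print(single_level_table_bytes() // (1024 * 1024), "MB flat table")
    directory = {1: {3: 0x80}}  # only one second-level table is populated
    print(hex(translate(directory, 0x00403ABC)))
```

With these assumptions the flat table costs 4 MB per process, while the two-level version allocates second-level tables only where mappings exist, at the price of two lookups per translation instead of one.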
4.2 MMU and TLB
The TLB is a small cache that stores recent page‑table entries. On a memory access, the MMU first checks the TLB; a hit yields the physical address immediately. On a miss, the MMU falls back to the full page table, obtains the mapping, and updates the TLB. When the TLB becomes full, it evicts older entries using a replacement policy.
1. The TLB accelerates address translation; a miss triggers a page‑table lookup and TLB update.
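The hit, miss, and eviction behaviour described above can be sketched as a small simulation. The article does not name the replacement policy, so least-recently-used (LRU) eviction is an assumption here, and real TLBs are hardware structures whose policies vary by processor. The class and its names are hypothetical.

```python
from collections import OrderedDict


class TLB:
    """Toy TLB mapping virtual page numbers to physical frame numbers."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.entries = OrderedDict()  # least recently used entry first
        self.hits = 0
        self.misses = 0

    def lookup(self, vpn: int, page_table: dict) -> int:
        if vpn in self.entries:
            self.hits += 1
            self.entries.move_to_end(vpn)  # mark as most recently used
            return self.entries[vpn]
        # Miss: fall back to the page table. A KeyError here would
        # correspond to a page fault in a real system.
        self.misses += 1
        pfn = page_table[vpn]
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict the LRU entry
        self.entries[vpn] = pfn
        return pfn


if __name__ == "__main__":
    page_table = {0: 10, 1: 11, 2: 12}
    tlb = TLB(capacity=2)
    for vpn in [0, 1, 0, 2, 1]:
        tlb.lookup(vpn, page_table)
    print("hits:", tlb.hits, "misses:", tlb.misses)
```

In this access pattern only the third access hits; the other four go to the page table, which shows how a small TLB relies on locality to be effective.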
5. Page‑Fault Deep Dive
5.1 What Is a Page Fault?
A page fault is an exception raised by the hardware when the MMU cannot find a valid physical page for a given virtual address, or when the access violates the page's permissions. The CPU then switches to kernel mode and invokes the kernel's page-fault handler.
5.2 Classification of Page Faults
Hard (Major) Page Fault : The page's contents are not in RAM; the OS must read them from disk (a file or the swap area) into a physical frame before the access can proceed.
Soft (Minor) Page Fault : The required frame exists (e.g., shared memory) but is not yet mapped for the faulting process; the OS only needs to establish the mapping.
Invalid Page Fault : The access is illegal, such as out‑of‑bounds or null‑pointer dereference, leading to a segmentation fault.
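The three categories can be sketched as a decision procedure. This is a simplified model, not the kernel's actual logic: Region, the on_disk set, and classify_fault are hypothetical names, and cases such as copy-on-write (a legitimate write to a read-only page-table entry) are deliberately ignored.

```python
from enum import Enum
from typing import NamedTuple

PAGE_SHIFT = 12  # 4 KB pages


class Region(NamedTuple):
    """A mapped range of the address space (end is exclusive)."""
    start: int
    end: int
    writable: bool


class Fault(Enum):
    MINOR = "minor"
    MAJOR = "major"
    INVALID = "invalid"


def classify_fault(address: int, is_write: bool,
                   regions: list, on_disk: set) -> Fault:
    """Classify a faulting access.

    regions: the process's mapped regions.
    on_disk: page numbers whose contents currently live only on disk.
    """
    region = next((r for r in regions if r.start <= address < r.end), None)
    if region is None:
        return Fault.INVALID   # address is not mapped at all
    if is_write and not region.writable:
        return Fault.INVALID   # permission violation
    if (address >> PAGE_SHIFT) in on_disk:
        return Fault.MAJOR     # needs disk I/O to bring the page in
    return Fault.MINOR         # only a mapping has to be set up


if __name__ == "__main__":
    regions = [Region(0x1000, 0x5000, True), Region(0x8000, 0x9000, False)]
    on_disk = {3}
    print(classify_fault(0x0, False, regions, on_disk))     # null pointer
    print(classify_fault(0x3004, False, regions, on_disk))  # swapped out
    print(classify_fault(0x2000, True, regions, on_disk))   # first touch
```

The order of the checks mirrors the classification: legality is decided first, and only a legal access is then sorted into major or minor according to whether disk I/O is needed.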
5.3 Common Causes
Illegal out‑of‑bounds access or permission violation.
Lazy allocation via malloc where physical memory is only allocated upon first use.
Accessing pages that have been swapped out to disk.
1. Soft faults arise when the page is already in RAM but not mapped in the faulting process's page table; the OS only needs to update the page table (and the TLB is filled on the next access).
2. Hard faults require disk I/O, which is slow; using fast SSDs can mitigate the latency.
3. Invalid faults raise SIGSEGV, which by default terminates the process.
6. Full Summary
The article provides an overview of page‑fault mechanisms, covering the relationship between virtual and physical addresses in Linux, the CPU’s reliance on the MMU, the cooperation between page tables and the TLB, and the reasons and classifications of page faults. It does not delve into kernel‑mode handling details or the internal design of the MMU, focusing instead on the high‑level concepts.
Liangxu Linux
Liangxu, a self-taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge: fundamentals, applications, tools, plus Git, databases, Raspberry Pi, etc.
