Why Linux Can Run Large Programs on Small Devices: A Deep Dive into Virtual Memory
This article explains how Linux virtual memory gives each process the appearance of a large, independent address space through paging, page tables, swap, and the MMU, covering everything from basic principles to page replacement algorithms and practical inspection tools.
Linux virtual memory allows a process to think it has a huge, contiguous memory space even when physical RAM is limited, preventing crashes on old devices and enabling multitasking.
What Is Virtual Memory?
Virtual memory treats a portion of the disk as an extension of RAM. When physical memory fills up, inactive pages are moved to a swap area on disk, freeing RAM for active data. This "memory extender" works like a smart bookshelf that stores rarely used books in a basement and retrieves them when needed.
Why It Is Needed
A process needs its code and data available far faster than disk access allows, yet physical memory is often much smaller than the processor's addressable space (e.g., a 32-bit CPU can address 4 GB, but a system may have only 256 MB of RAM). Virtual memory bridges this gap by giving every process its own private address space (on 32-bit Linux, typically 3 GB of user space plus a shared 1 GB kernel region). Because mapping is done per page, it also eliminates external fragmentation and lets a program be larger than the installed RAM.
Address Translation
The Memory Management Unit (MMU) translates virtual addresses to physical addresses using page tables. When a process accesses a page that is not in RAM, a page fault occurs: the kernel loads the page from swap or from a file on disk, updates the page table, and the faulting instruction is restarted.
Physical vs. Virtual Memory
Physical memory is the actual RAM modules on the motherboard; virtual memory is an abstraction managed by the OS. The CPU always works with virtual addresses, while the MMU and page tables map them to real RAM locations.
Paging and Segmentation
Linux primarily uses paging: memory is divided into fixed‑size pages (commonly 4 KB). The page table records the mapping from virtual pages to physical frames. Some architectures also support segmentation, but Linux simplifies it to a few segments (code and data) and relies on paging for most management.
Page tables are hierarchical (e.g., PGD → PUD → PMD → PTE on x86_64). A virtual address is split into indices that walk these levels to locate the physical frame.
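As a concrete illustration, the index split of a 48-bit x86_64 virtual address (with 4 KB pages) can be sketched in C. `split_vaddr` and the field names are hypothetical helpers for illustration, not kernel code:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not kernel code): split a 48-bit x86_64
 * virtual address into the four 9-bit table indices and the
 * 12-bit page offset used by a 4 KB-page, 4-level walk. */
typedef struct {
    unsigned pgd, pud, pmd, pte, offset;
} vaddr_parts;

vaddr_parts split_vaddr(uint64_t va) {
    vaddr_parts p;
    p.offset = va & 0xFFF;         /* bits 0-11:  byte within the 4 KB page */
    p.pte    = (va >> 12) & 0x1FF; /* bits 12-20: page-table entry index    */
    p.pmd    = (va >> 21) & 0x1FF; /* bits 21-29: page middle directory     */
    p.pud    = (va >> 30) & 0x1FF; /* bits 30-38: page upper directory      */
    p.pgd    = (va >> 39) & 0x1FF; /* bits 39-47: page global directory     */
    return p;
}
```

Each 9-bit index selects one of 512 entries at its level; the final PTE yields the physical frame number, to which the 12-bit offset is appended.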
Memory Alignment
Data structures are aligned to their size (e.g., 4‑byte ints on 32‑bit systems) to improve access speed and avoid hardware faults. Misaligned accesses can cause extra memory reads and performance penalties.
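A quick C sketch of the effect. The 3 bytes of padding assume a typical ABI such as x86_64 System V, and `struct padded` is just an illustrative name:

```c
#include <assert.h>
#include <stddef.h>

/* On typical ABIs the compiler inserts 3 bytes of padding after
 * `flag` so that `value` starts on its natural 4-byte boundary. */
struct padded {
    char flag;   /* 1 byte                       */
                 /* 3 bytes of padding (typical) */
    int  value;  /* 4 bytes, 4-byte aligned      */
};
/* sizeof(struct padded) is then 8, not 5. */
```

Reordering members from largest to smallest is the usual way to shrink such structs without changing their contents.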
Page Replacement Algorithms
When RAM is full, the kernel must evict pages. Common algorithms include:
LRU (Least Recently Used): evicts the page not accessed for the longest time.
FIFO (First‑In‑First‑Out): evicts the oldest loaded page.
LFU (Least Frequently Used): evicts the page with the lowest access count.
Implementation typically uses a doubly‑linked list and a hash table to achieve O(1) look‑ups and updates.
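A minimal sketch of that structure, assuming integer page numbers and a fixed capacity (illustrative only; the kernel's real LRU lists are approximate and far more elaborate):

```c
#include <assert.h>
#include <stdlib.h>

/* LRU sketch: a doubly-linked list ordered most- to least-recently
 * used, plus a chained hash table for O(1) lookup by page number. */
#define HASH_SIZE 64

typedef struct node {
    int page;
    struct node *prev, *next;   /* recency-list links */
    struct node *hnext;         /* hash-chain link    */
} node;

typedef struct {
    int capacity, size;
    node *head, *tail;          /* head = most recently used */
    node *buckets[HASH_SIZE];
} lru;

static node *lookup(lru *c, int page) {
    for (node *n = c->buckets[page % HASH_SIZE]; n; n = n->hnext)
        if (n->page == page) return n;
    return NULL;
}

static void unlink_node(lru *c, node *n) {
    if (n->prev) n->prev->next = n->next; else c->head = n->next;
    if (n->next) n->next->prev = n->prev; else c->tail = n->prev;
}

static void push_front(lru *c, node *n) {
    n->prev = NULL; n->next = c->head;
    if (c->head) c->head->prev = n; else c->tail = n;
    c->head = n;
}

/* Touch a page; returns the evicted page number, or -1 if none. */
int lru_access(lru *c, int page) {
    node *n = lookup(c, page);
    if (n) {                        /* hit: move to front */
        unlink_node(c, n);
        push_front(c, n);
        return -1;
    }
    int evicted = -1;
    if (c->size == c->capacity) {   /* full: evict least recent */
        node *victim = c->tail;
        unlink_node(c, victim);
        node **b = &c->buckets[victim->page % HASH_SIZE];
        while (*b != victim) b = &(*b)->hnext;
        *b = victim->hnext;
        evicted = victim->page;
        free(victim);
        c->size--;
    }
    n = calloc(1, sizeof *n);       /* miss: insert at front */
    n->page = page;
    n->hnext = c->buckets[page % HASH_SIZE];
    c->buckets[page % HASH_SIZE] = n;
    push_front(c, n);
    c->size++;
    return evicted;
}
```

For example, with capacity 2, accessing pages 1, 2, 1, 3 evicts page 2, because touching page 1 moved it to the front and left 2 as the least recently used.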
Swap Space
Swap can be a dedicated partition or a swap file. It provides overflow storage for pages that haven’t been used recently. While swap prevents out‑of‑memory crashes, excessive swapping degrades performance because disk I/O is much slower than RAM.
The swappiness kernel parameter (0‑100) controls how aggressively the kernel swaps; low values keep data in RAM, high values use swap sooner.
Process Virtual Address Layout
Each Linux process has a 4 GB (on 32‑bit) virtual address space divided into segments:
Text segment – read‑only executable code.
Data segment – initialized global/static variables.
BSS segment – zero‑filled uninitialized globals.
Heap – grows upward via malloc / brk / mmap.
Stack – grows downward, holds locals and return addresses.
Memory‑mapped region – files, shared libraries, anonymous mappings.
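The layout above can be observed directly. This sketch prints one representative address from each region; exact values vary from run to run because of ASLR, but the relative ordering (code lowest, then data/BSS, heap growing up, stack near the top growing down) typically holds on Linux:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int global_init = 42;   /* data segment: initialized global   */
int global_zero;        /* BSS segment: zero-filled global    */

/* Print a representative address from each region of the
 * process's virtual address space. */
void show_layout(void) {
    int local = 0;                     /* stack */
    int *heap = malloc(sizeof *heap);  /* heap  */
    printf("text  (code)      : %p\n", (void *)show_layout);
    printf("data  (init'd)    : %p\n", (void *)&global_init);
    printf("bss   (zeroed)    : %p\n", (void *)&global_zero);
    printf("heap  (malloc)    : %p\n", (void *)heap);
    printf("stack (local var) : %p\n", (void *)&local);
    free(heap);
}
```

Comparing this output against `/proc/<pid>/maps` shows which mapping each address falls in.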
Memory Allocation Internals
The C library (glibc) maintains a memory pool. Small allocations are served from this pool, which grows the heap with brk; large allocations use mmap to obtain their own contiguous virtual mapping, which avoids fragmenting the heap.
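A small demonstration of the split, assuming glibc's default M_MMAP_THRESHOLD of 128 KB (the function name is ours):

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Illustrative only: with glibc defaults, the 64-byte request is
 * served from the brk-grown heap while the 1 MB request gets its
 * own mmap mapping, so the two addresses are usually far apart. */
void compare_allocations(void) {
    char *small = malloc(64);        /* heap (brk)  */
    char *large = malloc(1 << 20);   /* mmap region */
    printf("small (64 B) : %p\n", (void *)small);
    printf("large (1 MB) : %p\n", (void *)large);
    free(small);
    free(large);
}
```

glibc's `mallopt(M_MMAP_THRESHOLD, ...)` can move this cutoff if a workload warrants it.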
MMU and TLB
The MMU performs address translation and enforces protection bits (read/write/execute). To speed up translation, a Translation Lookaside Buffer (TLB) caches recent page‑table entries. A TLB hit yields the physical address instantly; a miss triggers a page‑table walk.
Tools for Memory Inspection
Common commands:
free – shows total, used, free, and swap memory.
top / htop – interactive views of CPU, memory, and per‑process usage.
Debugging tools:
Valgrind – detects leaks, invalid reads/writes, and other memory errors in C/C++ programs.
Example Valgrind output highlights a lost 40‑byte allocation in a simple program, demonstrating how to locate and fix leaks.
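A minimal program reproducing that kind of report might look like this (our own example, not the article's original). Compiling with -g and running `valgrind --leak-check=full ./a.out` flags the allocation as "definitely lost":

```c
#include <stdlib.h>

/* Deliberately leaky: the malloc'd array is never freed, so
 * Valgrind reports 40 bytes definitely lost (10 * 4-byte ints,
 * on platforms where int is 4 bytes). */
int *leak_forty_bytes(void) {
    int *data = malloc(10 * sizeof(int));
    for (int i = 0; i < 10; i++)
        data[i] = i;
    return data;   /* no caller frees this: 40 bytes leaked at exit */
}
```

The fix is simply to free the returned pointer once it is no longer needed; Valgrind's stack trace points at the malloc call that produced the leaked block.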
Out‑of‑Memory (OOM) Handling
When memory is exhausted, the kernel invokes the OOM killer (or, if configured to, panics). The killer selects a victim process based on its oom_score, which considers memory usage, swap usage, and page‑table size. A process's oom_score_adj can be tuned to protect critical services.
Preventive measures include optimizing memory usage, monitoring with top / htop, and adjusting swappiness and /proc/sys/vm/overcommit_memory to control allocation policies.
Deepin Linux
Research areas: Windows & Linux platforms, C/C++ backend development, embedded systems and Linux kernel, etc.