How Linux Translates Memory Addresses: Segmentation and Paging Explained
This article explains Linux memory addressing by covering logical, virtual, and physical addresses, the role of the MMU, segmentation and paging mechanisms, hardware and Linux-specific segment structures, and the 4‑level page‑table system that maps virtual memory to physical memory.
Memory Addresses
Logical address
A logical address is the address used inside machine‑code instructions. It consists of a segment selector and an offset within that segment.
Segment selector – a 16‑bit value that identifies a descriptor entry in a descriptor table.
Offset – the distance from the start of the segment to the target byte.
Virtual address
A virtual address (also called a linear address) is a single unsigned integer. On a 32-bit CPU it is 32 bits wide, so the virtual address space spans 4 GiB. Addresses are usually shown in hexadecimal, ranging from 0x00000000 to 0xffffffff.
Physical address
The physical address is the location on the memory chips, expressed as a 32‑ or 36‑bit unsigned integer.
Memory Management Unit (MMU)
The MMU, integrated in the CPU, performs a two-stage translation:
Segmentation: logical → virtual.
Paging: virtual → physical.
Memory Arbiter (MA)
In multicore systems the MA serialises memory accesses because the memory bus can service only one read/write at a time.
If the bus is idle, the access proceeds immediately.
If the bus is busy, the CPU request is delayed until the bus becomes free.
Purpose of segmentation and paging
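Segmentation can give each process a different virtual address space, while paging maps a virtual address space onto physical page frames. The two mechanisms overlap in what they can isolate, so Linux keeps segmentation minimal and relies mainly on paging.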
Memory Segmentation
Hardware segmentation
Real mode and protected mode
Real mode – provides backward compatibility with early 8086 hardware. Physical address is calculated as (segment << 4) + offset, yielding a 20‑bit address (max 1 MiB).
Protected mode – uses 32/64‑bit registers, adds privilege levels, and enables the full segmentation and paging mechanisms.
Segment selector and segment registers
A logical address pairs a 16-bit selector with a 32-bit offset; the two are not added together. The selector fields are:
Index – descriptor table entry index (13 bits).
TI – Table Indicator, 1 bit (0 = GDT, 1 = LDT).
RPL – Requested Privilege Level, 2 bits (0-3).
Segment registers (cs, ds, es, fs, gs, ss) hold selectors; the two low-order bits of cs also encode the current privilege level (CPL).
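As a rough sketch of the selector layout described above, the following Python snippet unpacks a 16-bit selector into its three fields. The names Selector and decode_selector are illustrative only, and the sample value is arbitrary.

```python
from typing import NamedTuple


class Selector(NamedTuple):
    index: int  # 13-bit descriptor table index
    ti: int     # table indicator: 0 = GDT, 1 = LDT
    rpl: int    # requested privilege level, 0-3


def decode_selector(selector: int) -> Selector:
    """Unpack a 16-bit segment selector into Index, TI, and RPL."""
    if not 0 <= selector <= 0xFFFF:
        raise ValueError("a selector is a 16-bit value")
    return Selector(
        index=selector >> 3,      # bits 3-15
        ti=(selector >> 2) & 1,   # bit 2
        rpl=selector & 0b11,      # bits 0-1
    )


print(decode_selector(0x0F))
```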
Segment descriptor
Each segment is described by an 8‑byte descriptor stored in the Global Descriptor Table (GDT) or Local Descriptor Table (LDT). Key fields:
Base – linear address of the segment start.
Limit – size of the segment; interpreted in bytes if G = 0, otherwise in 4 KiB units.
G – granularity flag.
S – system flag (0 = system segment, 1 = code/data segment).
Type – segment type and access rights.
DPL – descriptor privilege level; in general, the least-privileged (numerically highest) CPL that is still allowed to access the segment.
P – present flag (must be 1 for a valid segment).
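D/B – default operation size for code segments (0 = 16-bit, 1 = 32-bit operands and addresses); for stack and data segments it selects the stack-pointer width and the upper bound of expand-down segments. Granularity is controlled by G, not by this flag.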
AVL – available for OS use (ignored by Linux).
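To make the field list concrete, here is a minimal sketch that decodes an 8-byte descriptor held as a 64-bit integer. The bit positions follow the x86 descriptor layout; the function name decode_descriptor and the sample value FLAT_CODE (a commonly shown flat 32-bit code segment) are assumptions chosen for illustration, not taken from the article.

```python
def decode_descriptor(raw: int) -> dict:
    """Split a 64-bit segment descriptor into the fields listed above."""
    # The limit and base are scattered across the descriptor for
    # backward compatibility, so they have to be reassembled.
    limit = (raw & 0xFFFF) | (((raw >> 48) & 0xF) << 16)            # 20 bits
    base = ((raw >> 16) & 0xFFFFFF) | (((raw >> 56) & 0xFF) << 24)  # 32 bits
    g = (raw >> 55) & 1
    # With G = 1 the limit counts 4 KiB units, so the last valid offset
    # is the scaled limit with the low 12 bits set.
    last_offset = ((limit << 12) | 0xFFF) if g else limit
    return {
        "base": base,
        "limit": limit,
        "last_offset": last_offset,
        "type": (raw >> 40) & 0xF,
        "s": (raw >> 44) & 1,
        "dpl": (raw >> 45) & 0b11,
        "p": (raw >> 47) & 1,
        "avl": (raw >> 52) & 1,
        "db": (raw >> 54) & 1,
        "g": g,
    }


FLAT_CODE = 0x00CF9A000000FFFF  # illustrative flat 32-bit code segment
fields = decode_descriptor(FLAT_CODE)
print(hex(fields["base"]), hex(fields["last_offset"]))
```

For this sample the base is 0 and the last valid offset is 0xffffffff, that is, a segment covering the whole 4 GiB space.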
Fast descriptor access
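The selector index is 13 bits, so a descriptor table can hold up to 2^13 = 8192 entries. Entry 0 of the GDT is reserved as the null descriptor, which leaves at most 2^13-1 = 8191 usable descriptors there. Each descriptor is 8 bytes, so when TI = 0 the descriptor address is computed as GDT_base + (index << 3); when TI = 1 the LDT base is used instead.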
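The address computation can be sketched as follows. The table base 0x1000 is a made-up value, and descriptor_address is a hypothetical helper name.

```python
DESCRIPTOR_SIZE = 8  # bytes per descriptor


def descriptor_address(table_base: int, selector: int) -> int:
    """Return the address of the descriptor that a selector refers to."""
    index = selector >> 3  # drop the TI and RPL bits
    # index << 3 is the same as index * DESCRIPTOR_SIZE.
    return table_base + (index << 3)


# Hypothetical GDT base; selector 0x73 has index 14.
print(hex(descriptor_address(0x1000, 0x73)))
```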
Linux segmentation
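On x86 Linux the descriptors for the user and kernel code and data segments are set up with a base of 0x00000000 and a limit that covers the whole address space. Consequently the offset of a logical address is identical to the corresponding virtual address.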
Selectors are defined by macros:
__USER_CS, __USER_DS – user-mode code and data selectors.
__KERNEL_CS, __KERNEL_DS – kernel-mode code and data selectors.
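The Current Privilege Level (CPL) is held in the two low-order bits of cs, the same position as the RPL field of a selector. Whenever the CPL changes, for example on entry to the kernel, the data and stack segment registers must be reloaded with selectors for the new privilege level.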
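The following sketch shows how such selectors could be composed and how the CPL falls out of cs. The GDT slots 12 to 15 are an assumption based on the layout used by 32-bit x86 kernels; other kernel versions and 64-bit builds use different slots, so treat the numbers as illustrative. The helper names make_selector and cpl_from_cs are hypothetical.

```python
def make_selector(index: int, rpl: int, ti: int = 0) -> int:
    """Compose a selector from a table index, an RPL, and a table indicator."""
    return (index << 3) | (ti << 2) | rpl


# Assumed GDT slots (illustrative, see the note above).
KERNEL_CS = make_selector(12, rpl=0)
KERNEL_DS = make_selector(13, rpl=0)
USER_CS = make_selector(14, rpl=3)
USER_DS = make_selector(15, rpl=3)


def cpl_from_cs(cs: int) -> int:
    """The CPL is the low two bits of the selector loaded in cs."""
    return cs & 0b11


for name, sel in [("KERNEL_CS", KERNEL_CS), ("USER_CS", USER_CS)]:
    print(name, hex(sel), "CPL", cpl_from_cs(sel))
```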
Memory Paging
Hardware paging
The paging unit translates a virtual address to a physical address, checks access rights, and raises a page‑fault on illegal accesses.
A page is a fixed‑size block of virtual addresses (commonly 4 KiB). A page frame is the corresponding block of physical memory. Page tables map virtual pages to physical frames.
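A single-level model is enough to illustrate the page/frame relationship. In this sketch the page table is a plain Python dict from virtual page number to frame number, and PageFault is a stand-in exception; both are simplifications for illustration rather than how the hardware stores its tables.

```python
PAGE_SHIFT = 12                # 4 KiB pages
PAGE_SIZE = 1 << PAGE_SHIFT


class PageFault(Exception):
    """Raised when a virtual page has no mapping (stand-in for the CPU fault)."""


def split(vaddr: int) -> tuple:
    """Return (virtual page number, offset within the page)."""
    return vaddr >> PAGE_SHIFT, vaddr & (PAGE_SIZE - 1)


def translate(vaddr: int, page_table: dict) -> int:
    """Map a virtual address to a physical one through a toy page table."""
    vpn, offset = split(vaddr)
    if vpn not in page_table:
        raise PageFault(hex(vaddr))
    # The frame number replaces the page number; the offset is unchanged.
    return (page_table[vpn] << PAGE_SHIFT) | offset


toy_table = {0x12345: 0x00ABC}  # one mapped page, made-up numbers
print(hex(translate(0x12345678, toy_table)))
```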
Linux paging
Linux uses a four-level page-table hierarchy. The CR3 register holds the physical address of the top-level table, and each level points to the next:
Page Global Directory (PGD)
Page Upper Directory (PUD)
Page Middle Directory (PMD)
Page Table (PT)
On x86-64 with 4 KiB pages, the lowest 12 bits of the virtual address are the offset within the page, and each of the four levels contributes 9 bits of index (512 entries per table), for 48 translated bits in total. Translation reads the entry selected at each level to locate the next table; the final Page Table entry yields a physical page number (PPN), which is combined with the offset to form the physical address.
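The 9/9/9/9/12 split and the walk can be sketched as below. Nested dicts stand in for the in-memory tables, a missing key plays the role of a page fault, and the sample address and frame number are made up; split_virtual_address and walk are illustrative names, not kernel functions.

```python
LEVEL_BITS = 9                       # index bits per level
PAGE_SHIFT = 12                      # offset bits for 4 KiB pages
INDEX_MASK = (1 << LEVEL_BITS) - 1


def split_virtual_address(va: int) -> tuple:
    """Return (pgd, pud, pmd, pt, offset) for a 48-bit virtual address."""
    offset = va & ((1 << PAGE_SHIFT) - 1)
    pt = (va >> PAGE_SHIFT) & INDEX_MASK
    pmd = (va >> (PAGE_SHIFT + LEVEL_BITS)) & INDEX_MASK
    pud = (va >> (PAGE_SHIFT + 2 * LEVEL_BITS)) & INDEX_MASK
    pgd = (va >> (PAGE_SHIFT + 3 * LEVEL_BITS)) & INDEX_MASK
    return pgd, pud, pmd, pt, offset


def walk(pgd_table: dict, va: int) -> int:
    """Follow the four levels and return the physical address."""
    pgd, pud, pmd, pt, offset = split_virtual_address(va)
    table = pgd_table
    for index in (pgd, pud, pmd):
        table = table[index]         # KeyError here models a page fault
    ppn = table[pt]
    return (ppn << PAGE_SHIFT) | offset


va = 0x7F1234567ABC
tables = {254: {72: {418: {359: 0x1234}}}}   # toy mapping for this one page
print(split_virtual_address(va))
print(hex(walk(tables, va)))
```

Only the tables on the path to a mapped page exist in this toy structure, which is the same property that lets the kernel allocate lower-level tables on demand.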
Advantages of paging
Isolates each process's address space, preventing accidental overwrites.
Allows pages to reside in any physical frame, enabling virtual‑memory techniques such as swapping.
Reduces the amount of memory needed for page tables by allocating lower‑level tables lazily.
During a context switch the kernel saves the current process's CR3 value in its task descriptor and loads the next process's page‑global‑directory base into CR3.
Process page tables
In a typical 32‑bit Linux system:
User‑space addresses: 0x00000000 – 0xbfffffff.
Kernel‑space addresses: 0xc0000000 – 0xffffffff.
When running in user mode, generated linear addresses are below 0xc0000000; kernel mode can use the full range.
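The 3 GiB / 1 GiB split above can be expressed as a simple boundary check. PAGE_OFFSET is the conventional name for the boundary; the helper is_user_address is a hypothetical name for this sketch.

```python
PAGE_OFFSET = 0xC0000000             # start of kernel space in this layout
ADDRESS_SPACE = 1 << 32

USER_SPACE_SIZE = PAGE_OFFSET                     # bytes below the boundary
KERNEL_SPACE_SIZE = ADDRESS_SPACE - PAGE_OFFSET   # bytes at or above it


def is_user_address(addr: int) -> bool:
    """True if a 32-bit virtual address lies in the user-space range."""
    return 0 <= addr < PAGE_OFFSET


print(USER_SPACE_SIZE // 2**30, "GiB user,", KERNEL_SPACE_SIZE // 2**30, "GiB kernel")
```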
Kernel page tables
The kernel maintains its own set of page tables, rooted at a master kernel Page Global Directory. The highest entries of this directory serve as a reference model for the corresponding entries in the Page Global Directory of every user process, which gives all processes a shared mapping of kernel code and data.
Why multi‑level page tables
A flat page table for a 4 GiB address space with 4 KiB pages would require 2^20 (1 Mi) entries, or 4 MiB at 4 bytes per entry, for every process. Multi-level tables reduce memory consumption because only the portions of the address space that are actually used need to be allocated.
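The arithmetic behind that estimate, and the saving from a second level, can be checked with a short calculation. The two-level layout assumed here (1024-entry tables, 4-byte entries, each second-level table covering 4 MiB) is the classic 32-bit x86 arrangement and is an assumption added for the comparison; the function names are illustrative.

```python
def flat_table_bytes(address_bits: int = 32, page_shift: int = 12,
                     entry_size: int = 4) -> int:
    """Size of a single-level table that covers the whole address space."""
    entries = 1 << (address_bits - page_shift)
    return entries * entry_size


def two_level_bytes(used_regions: int, entries_per_table: int = 1024,
                    entry_size: int = 4) -> int:
    """Size of one directory plus one second-level table per used 4 MiB region."""
    table_size = entries_per_table * entry_size
    return table_size * (1 + used_regions)


print(flat_table_bytes() // 2**20, "MiB for a flat table")
print(two_level_bytes(used_regions=3) // 2**10, "KiB for a process touching 3 regions")
```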
Linux selects a 4‑level hierarchy on 64‑bit CPUs to balance lookup speed, memory usage, and support for large address spaces.
ITPUB
