Understanding Linux Kernel Reverse Mapping (RMAP): Concepts, Data Structures, and Implementation
This article explains the Linux kernel reverse-mapping (RMAP) mechanism: its historical background, core concepts, the key data structures anon_vma and anon_vma_chain, and the workflow for page creation, fork handling, and page reclaim or migration, illustrated with abridged kernel code excerpts.
1. The Past and Present of Linux Memory Management
Memory management has been a core function of operating systems, evolving from simple contiguous physical allocation to sophisticated virtual memory schemes in Linux. Early systems suffered from security risks, low utilization, and fragmentation, which led to the introduction of virtual memory, segmentation, and paging to improve isolation and efficiency.
As Linux matured, the need to quickly locate the mapping between physical and virtual pages became critical, especially in multi‑process environments. In the 2.4 kernel, determining whether a physical page was mapped required scanning every process's page tables, which was extremely inefficient.
During the development of the 2.5 kernel, the Reverse Mapping (RMAP) concept was introduced to dramatically improve this lookup performance.
2. What is RMAP?
RMAP (Reverse Mapping) is a kernel mechanism that provides a fast way to find all virtual addresses that map to a given physical page, the opposite direction of the usual virtual‑to‑physical mapping.
In page‑reclaim scenarios, RMAP allows the kernel to quickly locate and unmap all user PTEs that reference a particular anonymous page, greatly speeding up memory reclamation.
Similarly, during page migration (e.g., in NUMA systems), RMAP enables the kernel to find all affected PTEs, invalidate them, allocate a new page on the target node, and update the mappings.
2.1 Forward Mapping
When a process allocates memory and triggers a page fault, the kernel creates a virtual‑to‑physical mapping (forward mapping).
2.2 Reverse Mapping
Reverse mapping finds all virtual pages that map to a specific physical page.
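To make the two directions concrete, here is a minimal user-space sketch (a toy model, not kernel code). The names page_table, forward_lookup, and reverse_lookup_scan, and the sizes NPROC and NVPAGES, are assumptions made up for illustration. It shows that a forward lookup is a single table access, while a reverse lookup without any RMAP structure has to scan every process's table, which is the cost the 2.4 kernel paid.

```c
#include <assert.h>
#include <stddef.h>

#define NPROC    3     /* toy number of processes (assumption) */
#define NVPAGES  8     /* toy number of virtual pages per process (assumption) */
#define NO_FRAME (-1)  /* marks an unmapped virtual page */

/* Forward mapping: page_table[p][v] is the frame backing virtual page v of process p. */
static int page_table[NPROC][NVPAGES];

struct mapping {
    int proc;
    int vpage;
};

static void init_tables(void)
{
    for (int p = 0; p < NPROC; p++)
        for (int v = 0; v < NVPAGES; v++)
            page_table[p][v] = NO_FRAME;
}

/* Forward lookup: one array access. */
static int forward_lookup(int proc, int vpage)
{
    return page_table[proc][vpage];
}

/*
 * Reverse lookup without RMAP: visit every entry of every table.
 * Stores at most max results in out and returns how many were stored.
 */
static size_t reverse_lookup_scan(int frame, struct mapping *out, size_t max)
{
    size_t n = 0;

    for (int p = 0; p < NPROC; p++) {
        for (int v = 0; v < NVPAGES; v++) {
            if (page_table[p][v] == frame && n < max) {
                out[n].proc = p;
                out[n].vpage = v;
                n++;
            }
        }
    }
    return n;
}
```

The scan costs time proportional to the total number of page-table entries in the system, regardless of how many mappings the frame actually has. RMAP replaces it with a walk over only the VMAs that can map the page.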
2.3 Background of RMAP
Each physical page keeps a reference count (_mapcount) indicating how many user PTEs map to it. Anonymous pages, which are not backed by files, rely on RMAP to efficiently locate and manage these mappings.
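The counter has one convention worth spelling out: in the kernel, _mapcount starts at -1 for an unmapped page, so the number of mappings is the stored value plus one. The sketch below models only that convention in user space; struct toy_page and the toy_* helpers are hypothetical names, not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the counter is modelled. */
struct toy_page {
    int mapcount_raw;  /* plays the role of _mapcount: -1 means no PTE maps the page */
};

static void toy_page_init(struct toy_page *p)
{
    p->mapcount_raw = -1;
}

/* Called when a PTE starts mapping the page. */
static void toy_add_rmap(struct toy_page *p)
{
    p->mapcount_raw++;
}

/* Called when a PTE stops mapping the page. */
static void toy_remove_rmap(struct toy_page *p)
{
    assert(p->mapcount_raw >= 0);
    p->mapcount_raw--;
}

/* Number of PTEs mapping the page: stored value plus one. */
static int toy_page_mapcount(const struct toy_page *p)
{
    return p->mapcount_raw + 1;
}

static bool toy_page_mapped(const struct toy_page *p)
{
    return p->mapcount_raw >= 0;
}
```

This offset is why code that maps a brand-new page sets the counter to 0 rather than 1: a stored 0 already means "mapped once".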
3. Key Data Structures of RMAP
RMAP relies on two main structures: anon_vma (AV) and anon_vma_chain (AVC), plus the standard vm_area_struct (VMA).
3.1 struct anon_vma (AV)
The anon_vma structure links a physical page to the virtual memory areas (VMAs) that reference it.
struct anon_vma {
    struct anon_vma *root;          /* Root of this anon_vma tree */
    struct rw_semaphore rwsem;      /* W: modification, R: walking the list */
    atomic_t refcount;              /* References that keep this anon_vma alive */
    unsigned degree;                /* Count of child anon_vmas and VMAs */
    struct anon_vma *parent;        /* Parent of this anon_vma */
    struct rb_root_cached rb_root;  /* Interval tree of related VMAs */
};
The refcount field counts the references that keep the anon_vma alive, such as those taken with get_anon_vma by code that must pin it while the owning VMA may disappear; when it reaches zero the anon_vma can be freed.
3.2 struct anon_vma_chain (AVC)
The anon_vma_chain bridges a VMA and its anon_vma, allowing fast traversal from either side.
struct anon_vma_chain {
    struct vm_area_struct *vma;     // Pointer to the VMA
    struct anon_vma *anon_vma;      // Pointer to the anon_vma
    struct list_head same_vma;      // Links all chains for the same VMA
    struct rb_node rb;              // Node for insertion into anon_vma's RB tree
    unsigned long rb_subtree_last;
#ifdef CONFIG_DEBUG_VM_RB
    unsigned long cached_vma_start, cached_vma_last;
#endif
};
3.3 struct vm_area_struct (VMA)
The VMA describes a contiguous virtual memory region within a process and holds pointers to its anon_vma and the chain list.
struct vm_area_struct {
    unsigned long vm_start;          // Start address of the region
    unsigned long vm_end;            // End address of the region
    struct mm_struct *vm_mm;         // Owning memory descriptor
    struct vm_area_struct *vm_next;  // Next VMA in the list
    pgprot_t vm_page_prot;           // Page protection flags
    unsigned long vm_flags;          // VMA flags
    struct list_head anon_vma_chain; // List of anon_vma_chain nodes
    struct anon_vma *anon_vma;       // Pointer to the associated anon_vma
    // ... other fields ...
};
When a process allocates anonymous memory (e.g., via malloc), the kernel creates or extends a VMA. On the first page fault in that VMA it allocates an anon_vma and links the two through an anon_vma_chain. This linkage enables later operations such as reclaim or migration to quickly locate all related VMAs.
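The relationship between the three structures can be sketched in user space as follows. This is a simplified model under stated assumptions: the toy_* types are hypothetical, a plain linked list replaces the kernel's interval tree, and locking is ignored. The one detail taken from the kernel is that an anonymous page stores its anon_vma pointer in page->mapping with the low bit set as a tag (PAGE_MAPPING_ANON), which TOY_MAPPING_ANON imitates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAPPING_ANON ((uintptr_t)1)  /* low tag bit, modelled on PAGE_MAPPING_ANON */

struct toy_vma;
struct toy_anon_vma;

/* One chain node connects exactly one VMA to one anon_vma. */
struct toy_avc {
    struct toy_vma      *vma;
    struct toy_anon_vma *anon_vma;
    struct toy_avc      *next_same_av;  /* simplification: list instead of interval tree */
};

struct toy_anon_vma {
    struct toy_avc *head;  /* all chain nodes attached to this anon_vma */
};

struct toy_vma {
    unsigned long vm_start, vm_end;
    struct toy_anon_vma *anon_vma;
};

struct toy_page {
    uintptr_t mapping;  /* tagged pointer to the anon_vma, or 0 */
};

/* Attach vma to av through avc (the role of anon_vma_chain_link). */
static void toy_link(struct toy_vma *vma, struct toy_avc *avc, struct toy_anon_vma *av)
{
    avc->vma = vma;
    avc->anon_vma = av;
    avc->next_same_av = av->head;
    av->head = avc;
}

/* Record the reverse mapping in the page. */
static void toy_set_anon(struct toy_page *page, struct toy_anon_vma *av)
{
    page->mapping = (uintptr_t)av | TOY_MAPPING_ANON;
}

/* Recover the anon_vma from the page, or NULL if the page is not anonymous. */
static struct toy_anon_vma *toy_page_anon_vma(const struct toy_page *page)
{
    if (!(page->mapping & TOY_MAPPING_ANON))
        return NULL;
    return (struct toy_anon_vma *)(page->mapping & ~TOY_MAPPING_ANON);
}

/* Walk page -> anon_vma -> chain nodes and count the VMAs that may map the page. */
static size_t toy_count_vmas(const struct toy_page *page)
{
    struct toy_anon_vma *av = toy_page_anon_vma(page);
    size_t n = 0;

    for (struct toy_avc *avc = av ? av->head : NULL; avc; avc = avc->next_same_av)
        n++;
    return n;
}
```

The chain node exists because the relationship is many-to-many: after fork, one VMA can be attached to several anon_vmas and one anon_vma can reach several VMAs, so neither structure can simply hold a single pointer to the other.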
4. RMAP Workflow and Principles
4.1 Anonymous Page Creation and RMAP Initialization
When a page fault occurs for an unmapped virtual address, the kernel calls do_anonymous_page . This function prepares the anon_vma, allocates a zero‑filled physical page, and registers the reverse mapping.
static vm_fault_t do_anonymous_page(struct vm_fault *vmf) {
    struct vm_area_struct *vma = vmf->vma;
    struct page *page;
    vm_fault_t ret = 0;
    pte_t entry;
    // Prepare anon_vma
    if (unlikely(anon_vma_prepare(vma)))
        goto oom;
    // Allocate physical page
    page = alloc_zeroed_user_highpage_movable(vma, vmf->address);
    if (!page)
        goto oom;
    // Add new anonymous reverse mapping
    page_add_new_anon_rmap(page, vma, vmf->address, false);
    // ... other handling ...
    return ret;
oom:
    return VM_FAULT_OOM;
}
The helper anon_vma_prepare returns immediately if the VMA already has an anon_vma; otherwise it calls __anon_vma_prepare, which finds or creates an appropriate anon_vma and links it to the VMA through an anon_vma_chain.
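int __anon_vma_prepare(struct vm_area_struct *vma) {
    struct mm_struct *mm = vma->vm_mm;
    struct anon_vma *anon_vma, *allocated;
    struct anon_vma_chain *avc;
    // Allocate anon_vma_chain
    avc = anon_vma_chain_alloc(GFP_KERNEL);
    if (!avc)
        return -ENOMEM;
    // Reuse the anon_vma of a neighbouring VMA if one is mergeable
    anon_vma = find_mergeable_anon_vma(vma);
    allocated = NULL;
    if (!anon_vma) {
        // Allocate a new anon_vma
        anon_vma = anon_vma_alloc();
        if (unlikely(!anon_vma)) {
            anon_vma_chain_free(avc);
            return -ENOMEM;
        }
        allocated = anon_vma;
    }
    anon_vma_lock_write(anon_vma);
    // page_table_lock guards against a racing fault that sets vma->anon_vma first
    spin_lock(&mm->page_table_lock);
    if (likely(!vma->anon_vma)) {
        vma->anon_vma = anon_vma;
        // Link the chain
        anon_vma_chain_link(vma, avc, anon_vma);
    }
    // Unlock
    spin_unlock(&mm->page_table_lock);
    anon_vma_unlock_write(anon_vma);
    // ... freeing of an unused avc/anon_vma after a lost race omitted ...
    return 0;
}
The function page_add_new_anon_rmap sets the new page's initial map count, updates the anonymous-page statistics, and calls __page_set_anon_rmap, which points page->mapping at the anon_vma and records the page's index within the VMA.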
void page_add_new_anon_rmap(struct page *page, struct vm_area_struct *vma,
                            unsigned long address, bool compound) {
    int nr = compound ? hpage_nr_pages(page) : 1;
    VM_BUG_ON_VMA(address < vma->vm_start || address >= vma->vm_end, vma);
    __SetPageSwapBacked(page);
    if (compound) {
        VM_BUG_ON_PAGE(!PageTransHuge(page), page);
        atomic_set(compound_mapcount_ptr(page), 0);
        __inc_node_page_state(page, NR_ANON_THPS);
    } else {
        VM_BUG_ON_PAGE(PageTransCompound(page), page);
        // _mapcount starts at -1, so 0 means "mapped once"
        atomic_set(&page->_mapcount, 0);
    }
    __mod_node_page_state(page_pgdat(page), NR_ANON_MAPPED, nr);
    __page_set_anon_rmap(page, vma, address, 1);
}
4.2 RMAP Handling During Child Process Creation
When fork creates a child, the kernel duplicates the parent's VMAs. The function dup_mmap iterates over the parent's VMA list, creates a copy for the child, and invokes anon_vma_fork to share or clone the anon_vma.
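static __latent_entropy int dup_mmap(struct mm_struct *mm, struct mm_struct *oldmm) {
    struct vm_area_struct *mpnt, *tmp;
    struct rb_node **rb_link = &mm->mm_rb.rb_node, *rb_parent = NULL;
    int retval = 0;
    for (mpnt = oldmm->mmap; mpnt; mpnt = mpnt->vm_next) {
        // Duplicate VMA
        tmp = vm_area_dup(mpnt);
        if (!tmp) {
            retval = -ENOMEM;
            goto out;
        }
        tmp->vm_mm = mm;
        // Set up the child's anon_vma linkage; a non-zero return is an error
        if (anon_vma_fork(tmp, mpnt)) {
            retval = -ENOMEM;
            goto free_vma;
        }
        // Insert the child VMA into the child's red-black tree
        __vma_link_rb(mm, tmp, rb_link, rb_parent);
        rb_link = &tmp->vm_rb.rb_right;
        rb_parent = &tmp->vm_rb;
        // Copy PTEs unless VM_WIPEONFORK is set
        if (!(tmp->vm_flags & VM_WIPEONFORK)) {
            retval = copy_page_range(tmp, mpnt);
            if (retval)
                goto out;
        }
    }
    // ... locking, accounting, and file-backed VMA handling omitted ...
out:
    return retval;
free_vma:
    vm_area_free(tmp);
    return retval;
}
The helper anon_vma_fork first calls anon_vma_clone to attach the child VMA to every anon_vma the parent VMA is linked to. Then, unless an existing anon_vma was reused, it allocates a new anon_vma for the child's own pages and links it via an anon_vma_chain, preserving reference counts and parent/child relationships.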
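int anon_vma_fork(struct vm_area_struct *vma, struct vm_area_struct *pvma) {
    struct anon_vma_chain *avc;
    struct anon_vma *anon_vma;
    int error;
    // Nothing to do if the parent VMA has no anon_vma
    if (!pvma->anon_vma)
        return 0;
    // Drop the pointer copied from the parent; anon_vma_clone may set it again
    vma->anon_vma = NULL;
    // Attach the child VMA to all anon_vmas of the parent VMA
    error = anon_vma_clone(vma, pvma);
    if (error)
        return error;
    // An existing anon_vma was reused
    if (vma->anon_vma)
        return 0;
    // Otherwise allocate an anon_vma for the child's own pages
    anon_vma = anon_vma_alloc();
    avc = anon_vma_chain_alloc(GFP_KERNEL);
    // ... allocation-failure handling omitted ...
    anon_vma->root = pvma->anon_vma->root;
    anon_vma->parent = pvma->anon_vma;
    // Keep the root alive for as long as this descendant exists
    get_anon_vma(anon_vma->root);
    vma->anon_vma = anon_vma;
    anon_vma_lock_write(anon_vma);
    // Link the chain
    anon_vma_chain_link(vma, avc, anon_vma);
    anon_vma->parent->degree++;
    anon_vma_unlock_write(anon_vma);
    return 0;
}
4.3 RMAP Usage in Page Reclaim and Migration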
When memory pressure triggers the kswapd daemon, it calls try_to_unmap to unmap all user PTEs of an anonymous page. The function walks the reverse‑mapping structures and invokes try_to_unmap_one for each VMA.
int try_to_unmap(struct page *page, enum ttu_flags flags) {
    int ret;
    struct rmap_walk_control rwc = {
        .rmap_one = try_to_unmap_one,
        .arg = (void *)flags,
        .done = page_not_mapped,
        .anon_lock = page_lock_anon_vma_read,
    };
    VM_BUG_ON_PAGE(!PageHuge(page) && PageTransHuge(page), page);
    if ((flags & TTU_MIGRATION) && !PageKsm(page) && PageAnon(page))
        rwc.invalid_vma = invalid_migration_vma;
    ret = rmap_walk(page, &rwc);
    if (ret != SWAP_MLOCK && !page_mapped(page))
        ret = SWAP_SUCCESS;
    return ret;
}
For page migration (e.g., NUMA balancing), the same RMAP walk finds all PTEs, invalidates them, allocates a new page on the target node, and updates the mappings.
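The callback-driven walk used by try_to_unmap can be sketched in user space as below. This is a loose model, not the kernel's implementation: the toy_* names are hypothetical, each VMA holds at most one toy PTE for the page, a linked list stands in for the interval tree, and the callback returns a plain bool where the excerpt above uses SWAP_* codes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_page {
    int mapcount;  /* number of toy PTEs currently mapping the page */
};

struct toy_vma {
    bool pte_present;      /* does this VMA currently map the page? */
    bool locked;           /* toy stand-in for a VMA whose pages must stay mapped */
    struct toy_vma *next;  /* next VMA reachable from the same anon_vma */
};

/* Loosely modelled on rmap_walk_control: a per-VMA callback plus an early-exit test. */
struct toy_walk_control {
    bool (*rmap_one)(struct toy_page *page, struct toy_vma *vma, void *arg);
    bool (*done)(struct toy_page *page);
    void *arg;
};

/* Visit every candidate VMA until the callback fails or the work is done. */
static void toy_rmap_walk(struct toy_page *page, struct toy_vma *vmas,
                          struct toy_walk_control *rwc)
{
    for (struct toy_vma *vma = vmas; vma; vma = vma->next) {
        if (!rwc->rmap_one(page, vma, rwc->arg))
            break;  /* callback asked to stop */
        if (rwc->done && rwc->done(page))
            break;  /* no mappings left */
    }
}

/* Clear this VMA's toy PTE; returns false to abort the walk. */
static bool toy_unmap_one(struct toy_page *page, struct toy_vma *vma, void *arg)
{
    (void)arg;
    if (!vma->pte_present)
        return true;   /* this VMA does not map the page, keep walking */
    if (vma->locked)
        return false;  /* refuse to unmap and stop */
    vma->pte_present = false;
    page->mapcount--;
    return true;
}

static bool toy_page_not_mapped(struct toy_page *page)
{
    return page->mapcount == 0;
}

/* Returns true if the page ended up with no mappings. */
static bool toy_try_to_unmap(struct toy_page *page, struct toy_vma *vmas)
{
    struct toy_walk_control rwc = {
        .rmap_one = toy_unmap_one,
        .done = toy_page_not_mapped,
        .arg = NULL,
    };

    toy_rmap_walk(page, vmas, &rwc);
    return page->mapcount == 0;
}
```

Separating the walk from the per-VMA action is what lets reclaim and migration share the traversal: each supplies its own callback while the walk itself stays the same.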
5. Applications of Reverse Mapping RMAP
The kswapd thread reclaims pages by unmapping all user PTEs that reference an anonymous page.
During page migration, the kernel must unmap all user PTEs of the source page before moving it.
The core function try_to_unmap is invoked by various kernel subsystems to perform these unmap operations.
6. Additional Reverse Mapping Types
6.1 KSM Reverse Mapping
KSM (Kernel Samepage Merging) merges identical anonymous pages across processes. It maintains an rmap_item list inside a stable_node to track each virtual address that maps to the shared physical page.
6.2 File Page Reverse Mapping
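File‑backed pages also have reverse mappings. The physical page's mapping field points to the file's address_space, whose i_mmap interval tree contains all VMAs that map the file. Because page->index and vma->vm_pgoff are both measured in pages, the virtual address for a given VMA is computed by shifting their difference by PAGE_SHIFT: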
virtual_address = ((page->index - vma->vm_pgoff) << PAGE_SHIFT) + vma->vm_start;
With this address, the kernel can locate and clear the corresponding PTEs during reclaim or migration.
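As a worked example of this address computation, the sketch below evaluates it in user space. It assumes 4 KiB pages (TOY_PAGE_SHIFT of 12); struct toy_vma and toy_vma_address are hypothetical names, and returning 0 for "not mapped in this VMA" is a simplification chosen for the example.

```c
#include <assert.h>

#define TOY_PAGE_SHIFT 12  /* assumes 4 KiB pages */

struct toy_vma {
    unsigned long vm_start;  /* first virtual address of the region */
    unsigned long vm_end;    /* one past the last virtual address */
    unsigned long vm_pgoff;  /* file offset of vm_start, in pages */
};

/*
 * Virtual address at which file page number pgoff appears in vma,
 * or 0 if that page lies outside the range the VMA covers.
 */
static unsigned long toy_vma_address(const struct toy_vma *vma, unsigned long pgoff)
{
    unsigned long addr;

    if (pgoff < vma->vm_pgoff)
        return 0;  /* page precedes the mapped window */
    addr = vma->vm_start + ((pgoff - vma->vm_pgoff) << TOY_PAGE_SHIFT);
    return addr < vma->vm_end ? addr : 0;
}
```

For a VMA covering 0x40000000 to 0x40004000 that maps a file starting at page 16, page 18 sits two pages into the window, so its address is 0x40000000 + (2 << 12) = 0x40002000. Without the shift, the result would be off by a factor of the page size.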
Deepin Linux
Research areas: Windows & Linux platforms, C/C++ backend development, embedded systems and Linux kernel, etc.