
Linux Page Reclaim Mechanism and PFRA Design Overview

This article explains Linux's page reclamation process, describing how the kernel recovers memory under pressure using PFRA, LRU algorithms, direct reclaim, compaction, and OOM handling, and provides detailed source code analysis of the involved kernel functions.


1. Overview

As Linux continuously allocates memory, increasing memory pressure triggers reclamation of both anonymous and file pages. Anonymous pages are swapped out to free frames, while clean file pages are released directly; dirty file pages are written back before being freed, increasing the pool of free page frames.

Page allocation first tries the low‑watermark fast path; if it fails, the allocator wakes kswapd to perform asynchronous reclamation and then attempts the minimum‑watermark path. Persistent shortage leads to direct reclamation, handling both swap‑backed anonymous pages and file‑backed pages.

When memory is tight, the kernel employs three main reclamation strategies:

Cache reclamation using LRU (Least Recently Used) to free rarely used pages.

Swapping out infrequently accessed pages to the swap partition.

Invoking the OOM killer to terminate memory‑hungry processes.

2. Page Reclaim Mechanism

The allocator attempts low‑watermark allocation; on failure it wakes kswapd and retries with the minimum watermark. If that also fails, it performs direct reclamation.
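The fallback order above can be sketched as a small userland model. The threshold values and the helper name `alloc_path()` are made up for illustration; real watermarks are per-zone values derived from `min_free_kbytes`, not constants:

```c
#include <string.h>

/* Illustrative model of the watermark fallback order. The thresholds
 * and the helper alloc_path() are invented for this sketch. */
enum { WMARK_MIN = 64, WMARK_LOW = 128 };

static const char *alloc_path(unsigned long nr_free)
{
    if (nr_free >= WMARK_LOW)
        return "fast path";        /* low watermark satisfied */
    if (nr_free >= WMARK_MIN)
        return "kswapd + retry";   /* wake kswapd, retry at min watermark */
    return "direct reclaim";       /* synchronous reclaim in the caller */
}
```

The point of the ordering is that asynchronous reclaim (kswapd) is always preferred; the allocating task only pays the latency of direct reclaim when free memory has fallen below the minimum watermark.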

2.1 LRU Data Structure

Memory is organized as nodes, zones, and pages, represented by struct pglist_data , struct zone , and struct page . Each node contains an lruvec with five LRU lists:

typedef struct pglist_data {
    spinlock_t        lru_lock; // LRU lock
    struct lruvec     lruvec;   // LRU descriptor with 5 lists
    ...
} pg_data_t;

struct lruvec {
    struct list_head  lists[NR_LRU_LISTS]; // 5 LRU double‑linked lists
    ...
};

enum lru_list {
    LRU_INACTIVE_ANON = LRU_BASE,
    LRU_ACTIVE_ANON   = LRU_BASE + LRU_ACTIVE,
    LRU_INACTIVE_FILE = LRU_BASE + LRU_FILE,
    LRU_ACTIVE_FILE   = LRU_BASE + LRU_FILE + LRU_ACTIVE,
    LRU_UNEVICTABLE,
    NR_LRU_LISTS
};

The five lists are:

Inactive anonymous pages (rarely used anonymous pages).

Active anonymous pages (frequently used anonymous pages).

Inactive file pages (rarely used file‑backed pages).

Active file pages (frequently used file‑backed pages).

Unevictable pages (mlocked or otherwise non‑reclaimable).
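The list structure above can be mocked in userland with a minimal doubly linked list. Field names mirror the kernel's, but the list implementation and the helper `lru_move()` are ours, not kernel API:

```c
#include <stddef.h>

/* Minimal mock of list_head and the five-list lruvec. */
struct list_head { struct list_head *prev, *next; };

static void list_init(struct list_head *h) { h->prev = h->next = h; }
static int  list_empty(const struct list_head *h) { return h->next == h; }

static void list_del(struct list_head *e)
{
    e->prev->next = e->next;
    e->next->prev = e->prev;
}

static void list_add(struct list_head *e, struct list_head *h)
{
    e->next = h->next;
    e->prev = h;
    h->next->prev = e;
    h->next = e;
}

enum lru_list {
    LRU_INACTIVE_ANON, LRU_ACTIVE_ANON,
    LRU_INACTIVE_FILE, LRU_ACTIVE_FILE,
    LRU_UNEVICTABLE, NR_LRU_LISTS
};

struct lruvec    { struct list_head lists[NR_LRU_LISTS]; };
struct mock_page { struct list_head lru; };

/* Move a page between lists, e.g. inactive anon -> active anon. */
static void lru_move(struct mock_page *p, struct lruvec *v, enum lru_list to)
{
    list_del(&p->lru);
    list_add(&p->lru, &v->lists[to]);
}
```

Because a page's `lru` field is embedded in the page itself, promotion or demotion is just an unlink and a relink; no allocation is needed on the reclaim path.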

2.2 Source‑level Reclaim Flow

The allocation path calls alloc_page → alloc_pages_current → __alloc_pages_nodemask . The slow‑path function __alloc_pages_slowpath (in mm/page_alloc.c ) performs numerous checks, wakes kswapd, attempts direct compaction, and finally invokes direct reclaim via __alloc_pages_direct_reclaim if needed.

static inline struct page *
__alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
                       struct alloc_context *ac)
{ ... }

Direct reclaim calls __perform_reclaim , which invokes try_to_free_pages to scan zones and free pages.

static int
__perform_reclaim(gfp_t gfp_mask, unsigned int order,
                   const struct alloc_context *ac)
{ ...
  progress = try_to_free_pages(ac->zonelist, order, gfp_mask, ac->nodemask);
  ...
}

try_to_free_pages sets up a scan_control structure and calls do_try_to_free_pages , which invokes shrink_zones to iterate over the zonelist and perform the actual reclamation.

unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
                                 gfp_t gfp_mask, nodemask_t *nodemask)
{ ...
  nr_reclaimed = do_try_to_free_pages(zonelist, &sc);
  ...
}

The core reclamation work happens in shrink_node (reached from shrink_zones), which walks the LRU lists, isolates pages, attempts to unmap them, writes back dirty pages, and finally frees or re‑activates them.

static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
{ ...
  shrink_node_memcg(pgdat, memcg, sc, &lru_pages);
  ...
}
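The per-page decision inside that shrink loop can be modeled like this. The states and names below are illustrative, not kernel API:

```c
/* Toy model of the per-page decision in the shrink loop. */
enum page_fate { PAGE_RECLAIMED, PAGE_WRITTEN_BACK, PAGE_REACTIVATED };

struct scan_page { int referenced; int dirty; };

static enum page_fate shrink_one(struct scan_page *p)
{
    if (p->referenced) {
        p->referenced = 0;          /* clear and give a second chance */
        return PAGE_REACTIVATED;    /* back to the active list */
    }
    if (p->dirty) {
        p->dirty = 0;               /* queue writeback first */
        return PAGE_WRITTEN_BACK;   /* freed on a later pass */
    }
    return PAGE_RECLAIMED;          /* clean + unreferenced: free it */
}
```

Note that a dirty page costs two passes: one to start writeback, and a later one to free it once clean, which is why dirty file pages are more expensive to reclaim than clean ones.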

2.3 Direct Page Reclaim

When asynchronous reclamation fails, __alloc_pages_direct_reclaim performs synchronous reclaim by calling __perform_reclaim and then allocating from the freed list.

static inline struct page *
__alloc_pages_direct_reclaim(gfp_t gfp_mask, unsigned int order,
                              unsigned int alloc_flags, const struct alloc_context *ac,
                              unsigned long *did_some_progress)
{ ... }

3. PFRA Design

3.1 Design Principles

PFRA (Page Frame Reclaiming Algorithm) aims to steal pages from user processes and kernel caches, reclaim shared pages by dropping all page‑table references, and use LRU to select the least recently used pages based on accessed and referenced bits.

3.2 Reverse Mapping

Reverse mapping allows the kernel to locate all page‑table entries that reference a given physical page. For anonymous pages, the kernel follows the _mapcount and mapping fields to the anon_vma chain; for file‑backed pages it uses the address space's i_mmap tree (an interval tree in current kernels, a priority search tree in older ones) to find all vm_area_struct objects.

struct page {
    atomic_t _mapcount;
    union {
        struct address_space *mapping; // file‑backed or anon_vma pointer
        ...
    };
    ...
};

Functions try_to_unmap_anon and try_to_unmap_file walk these structures and invoke try_to_unmap_one to clear each PTE.

static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
                enum ttu_flags flags)
{ ... }
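The essence of that walk can be sketched in userland. Each toy page records the PTE slots that reference it, standing in for what the anon_vma / i_mmap walks recover; the structures and helper names below are illustrative only:

```c
#define MAX_MAPPERS 4

/* Toy reverse map: a page remembers which PTEs point at it. */
struct rmap_page {
    int _mapcount;                      /* PTEs currently mapping the page */
    unsigned long *ptes[MAX_MAPPERS];   /* back-pointers to those PTEs */
};

/* Clear one PTE, as try_to_unmap_one does for a real mapping. */
static void rmap_unmap_one(struct rmap_page *p, int i)
{
    *p->ptes[i] = 0;
    p->_mapcount--;
}

/* Walk every known mapper; the page is reclaimable only when no
 * page table references it any more. */
static int rmap_try_to_unmap(struct rmap_page *p)
{
    int nr = p->_mapcount;
    for (int i = 0; i < nr; i++)
        rmap_unmap_one(p, i);
    return p->_mapcount == 0;
}
```

This is why reverse mapping matters for reclaim: a shared page can only be freed after every one of its mappings has been severed, and without the reverse map the kernel would have to scan all page tables to find them.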

3.3 PFRA Implementation Details

PFRA relies on the active/inactive LRU lists. (The per‑zone layout shown below comes from older kernels; newer kernels keep these lists in the per‑node lruvec introduced earlier.) Pages move between lists based on the PG_active and PG_referenced flags: accessed pages are promoted to the active list, while pages that remain unreferenced are demoted to inactive and eventually reclaimed.

struct zone {
    spinlock_t lru_lock;
    struct list_head active_list;
    struct list_head inactive_list;
    unsigned long nr_active;
    unsigned long nr_inactive;
    ...
};
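The two-flag aging can be sketched as a small state machine, modeled on the behavior of mark_page_accessed() (simplified: no per-CPU batching, no unevictable handling):

```c
/* Sketch of PG_active / PG_referenced aging on access. */
struct aged_page { int active; int referenced; };

static void page_accessed(struct aged_page *p)
{
    if (!p->active && p->referenced) {
        p->active = 1;          /* second recent access: promote */
        p->referenced = 0;
    } else {
        p->referenced = 1;      /* first access only sets the flag */
    }
}
```

Requiring two accesses before promotion keeps one-shot streaming I/O from flushing genuinely hot pages off the active list.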

To reduce lock contention, Linux batches page insertions using the LRU cache (a pagevec structure). Functions lru_cache_add() and lru_cache_add_active() accumulate pages in a pagevec and flush them to the appropriate LRU list once the vector is full.

struct pagevec {
    unsigned long nr;
    unsigned long cold;
    struct page *pages[PAGEVEC_SIZE];
};
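The batching idea can be sketched as follows. The names are ours, not kernel API, and the size constant is the historical kernel value, used here only for illustration:

```c
#define TOY_PAGEVEC_SIZE 14   /* historical kernel value; illustrative */

/* Userland sketch of pagevec batching: accumulate pages locklessly,
 * then drain them to the LRU in one go, so the lock is taken once
 * per batch instead of once per page. */
struct toy_pagevec {
    unsigned long nr;
    void *pages[TOY_PAGEVEC_SIZE];
};

static unsigned long lru_additions;   /* pages drained to the "LRU" */

static void toy_pagevec_drain(struct toy_pagevec *pv)
{
    /* the kernel would take lru_lock here, splice in all pv->nr
     * pages, then drop the lock */
    lru_additions += pv->nr;
    pv->nr = 0;
}

static void toy_lru_cache_add(struct toy_pagevec *pv, void *page)
{
    pv->pages[pv->nr++] = page;
    if (pv->nr == TOY_PAGEVEC_SIZE)
        toy_pagevec_drain(pv);        /* flush when the vector fills */
}
```

Amortizing the lock over a full vector is what makes bulk LRU insertion cheap even under heavy allocation.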

Overall, PFRA combines careful selection of cache pages, age‑based ordering, and differentiated handling of page states to efficiently reclaim memory under pressure.

For further reading, see the linked articles on Linux kernel source analysis and advanced memory‑management techniques.

Written by Deepin Linux. Research areas: Windows & Linux platforms, C/C++ backend development, embedded systems and the Linux kernel.