Fundamentals 50 min read

How Linux’s Buddy Allocator Powers Efficient Memory Management

This article explains the Linux kernel’s buddy memory allocation algorithm, covering its core principles, data structures, allocation and free workflows, performance advantages and drawbacks, monitoring tools, tuning parameters, and real‑world use cases such as Kubernetes, embedded IoT devices, and real‑time operating systems.

Deepin Linux
Deepin Linux
Deepin Linux
How Linux’s Buddy Allocator Powers Efficient Memory Management

In the complex Linux kernel ecosystem, memory management is the "heart" of system stability; the buddy algorithm acts as a precise "memory steward" that dynamically splits and merges memory blocks to keep allocation efficient and fragmentation low.

1. Buddy Algorithm Overview

The algorithm mitigates external fragmentation by repeatedly halving larger free blocks until a block of the required size is obtained, and by merging free "buddy" blocks during reclamation.

2. Core Principles

2.1 What is a "buddy"?

A buddy pair consists of two blocks of equal size that are contiguous in physical memory and originate from the same larger block, similar to cutting a cake into equal pieces.

2.2 Data structures for block classification

The kernel uses an array indexed by order, where order represents block size as 2^order pages (e.g., order 0 = 1 page, order 10 = 1024 pages). Each entry holds a doubly‑linked free_list of blocks and a counter nr_free of available blocks.

2.3 Buddy management structures

#define MAX_ORDER 11
struct zone {
    ...
    struct free_area free_area[MAX_ORDER];
    ...
}
struct free_area {
    struct list_head free_list;
    unsigned long nr_free;
};

The kernel maintains a free_area array for each memory zone, allowing fast lookup of free blocks of a given order.

Buddy algorithm management structure
Buddy algorithm management structure

3. Allocation and Reclamation Workflow

3.1 Allocation process

When a request for, say, 256 pages arrives (order 8), the allocator first checks the corresponding free list. If empty, it searches higher orders (order 9, order 10). Upon finding a larger block, it splits it repeatedly until a block of the desired size is produced, inserting the remaining halves back into the appropriate free lists.

3.2 Reclamation process

When a block is freed, the allocator checks whether its buddy is also free. If so, the two are merged into a higher‑order block, and the process repeats recursively, reducing external fragmentation.

3.3 Initialization and release

During kernel boot,

start_kernel → mem_init → free_all_bootmem_node → __free_pages

walks every page and inserts it into the buddy system, establishing the initial free lists.

static inline void __free_one_page(struct page *page,
                                 struct zone *zone, unsigned int order)
{
    unsigned long page_idx;
    int order_size = 1 << order;
    int migratetype = get_pageblock_migratetype(page);
    page_idx = page_to_pfn(page) & ((1 << MAX_ORDER) - 1);
    __mod_zone_page_state(zone, NR_FREE_PAGES, order_size);
    while (order < MAX_ORDER-1) {
        struct page *buddy = __page_find_buddy(page, page_idx, order);
        if (!page_is_buddy(page, buddy, order))
            break;
        list_del(&buddy->lru);
        zone->free_area[order].nr_free--;
        rmv_page_order(buddy);
        unsigned long combined_idx = __find_combined_index(page_idx, order);
        page = page + (combined_idx - page_idx);
        page_idx = combined_idx;
        order++;
    }
    set_page_order(page, order);
    list_add(&page->lru, &zone->free_area[order].free_list[migratetype]);
    zone->free_area[order].nr_free++;
}

4. Advantages and Drawbacks

4.1 Advantages

Eliminates external fragmentation by keeping blocks in powers‑of‑two sizes.

Allocation and free are fast, using simple bit operations and linked‑list manipulations.

Strong adaptability to modern architectures, including NUMA systems where each node manages its own buddy lists.

4.2 Disadvantages

Internal fragmentation can be significant because allocations are rounded up to the nearest power of two.

Strict buddy‑pair requirements may leave free memory unusable for large contiguous requests, leading to external fragmentation in practice.

Splitting and merging involve many list and bitmap updates, adding overhead.

5. Observation and Optimization

5.1 System status

Reading /proc/buddyinfo shows the number of free blocks per order for each zone, helping to assess fragmentation levels.

Node 0, zone DMA      399 260 123 86 34 31 16 14 7 9 606
Node 0, zone Normal 1491 1005 233 23 53 28 12 25 9

Low counts in high orders indicate severe fragmentation.

5.2 Kernel parameters

vm.max_order : controls the maximum block size (default 11 → 4 MiB). Raising it allows larger allocations at the cost of more metadata.

Migration type strategy : prioritising movable pages reduces fragmentation; adjusting /sys/devices/virtual/memory/zone*/pagesize can influence behaviour.

6. Application Scenarios

6.1 High‑concurrency servers (Kubernetes)

Containers frequently allocate and free memory; the buddy allocator, combined with cgroups, ensures fair distribution and prevents any single container from starving others.

#include <iostream>
#include <vector>
#include <unordered_map>
#include <algorithm>
#include <string>
#include <cmath>
#include <memory>

// Simplified buddy allocator and Kubernetes‑style manager (code omitted for brevity)

6.2 Embedded IoT devices

On memory‑constrained boards (few MiB), the deterministic behaviour of the buddy system guarantees low latency and predictable power consumption.

#include <cstddef>
#include <cstdint>
#include <cstring>
#include <algorithm>
#include <new>
#include <stdexcept>

// Embedded buddy allocator implementation (code omitted for brevity)

6.3 Real‑time operating systems

Deterministic allocation time makes the buddy algorithm suitable for safety‑critical control loops where millisecond‑level predictability is required.

7. Summary

The Linux kernel’s buddy memory allocator provides a fast, power‑of‑two based mechanism that balances allocation speed, fragmentation control, and architectural adaptability. Proper monitoring ( /proc/buddyinfo, vmstat, sar) and tuning of vm.max_order and migration policies can further optimise performance for workloads ranging from cloud‑native Kubernetes clusters to tiny embedded sensors.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

Memory ManagementLinuxfragmentationbuddy allocator
Deepin Linux
Written by

Deepin Linux

Research areas: Windows & Linux platforms, C/C++ backend development, embedded systems and Linux kernel, etc.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.