
How Linux’s kmalloc Memory Pool Works: From Slab Design to Allocation

This article explains the architecture and implementation of the Linux kernel kmalloc memory pool, detailing slab cache creation, size selection rules, cache initialization, allocation and free paths, and how different memory zones are handled, with code examples and diagrams.

Bin's Tech Cabin

This is the final article of the author’s slab series; for quick reference the previous four articles are listed:

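"Full Detail: 80 Diagrams Walking Step by Step Through the Design and Implementation of the slab Memory Pool"

"The Creation and Initialization Flow of the slab Memory Pool, as Seen in the Kernel Source"

"A Deep Dive into the Full slab cache Memory Allocation Path"

"An In-Depth Analysis of How the slab Memory Pool Reclaims Memory and Is Destroyed"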

These articles introduced the evolution of the slab memory pool architecture and the complete workflow of slab allocator creation, allocation, release and destruction in the kernel.

The slab memory pool was created to satisfy the kernel’s need for frequent small-object allocations. Core kernel data structures such as task_struct, mm_struct, struct file, and socket each get a dedicated slab cache that manages objects of one fixed size.

Beyond these dedicated caches, the kernel also needs generic small allocations (8 B, 16 B, 32 B, …). The kmalloc subsystem provides a set of generic slab caches for this purpose, created at boot time. As with kmem_cache_create, the object size of each cache is set by a size argument.

struct kmem_cache *kmem_cache_create(const char *name, unsigned int size,
                                     unsigned int align, slab_flags_t flags,
                                     void (*ctor)(void *));

The kernel defines the available cache sizes in the kmalloc_info[] array. Each entry contains a cache name (e.g., kmalloc-32) and the object size in bytes.

const struct kmalloc_info_struct kmalloc_info[] = {
    {NULL,                  0},    {"kmalloc-96",         96},
    {"kmalloc-192",       192},    {"kmalloc-8",           8},
    {"kmalloc-16",         16},    {"kmalloc-32",         32},
    {"kmalloc-64",         64},    {"kmalloc-128",       128},
    {"kmalloc-256",       256},    {"kmalloc-512",       512},
    {"kmalloc-1k",       1024},    {"kmalloc-2k",       2048},
    {"kmalloc-4k",       4096},    {"kmalloc-8k",       8192},
    {"kmalloc-16k",     16384},    {"kmalloc-32k",     32768},
    {"kmalloc-64k",     65536},    {"kmalloc-128k",   131072},
    {"kmalloc-256k",   262144},    {"kmalloc-512k",   524288},
    {"kmalloc-1M",    1048576},    {"kmalloc-2M",    2097152},
    {"kmalloc-4M",    4194304},    {"kmalloc-8M",    8388608},
    {"kmalloc-16M",  16777216},    {"kmalloc-32M",  33554432},
    {"kmalloc-64M",  67108864}
};

For allocations up to 192 B the kernel uses a lookup table size_index[24] to map a requested size to the index of the appropriate cache in kmalloc_info. The table also contains special entries for the non‑power‑of‑two sizes 96 B and 192 B to reduce internal fragmentation.

static u8 size_index[24] = { 3,4,5,5,6,6,6,6,1,1,1,1,7,7,7,7,2,2,2,2,2,2,2,2 };
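To see why the 96 B and 192 B caches are worth their extra table entries, the following user-space sketch compares the bytes wasted by pure power-of-two rounding with the bytes wasted when the two extra sizes are available. The helper names (pow2_bucket, kmalloc_bucket, slack) are hypothetical and exist only for this illustration; they are not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

/* Round size up to the next power of two, starting at the 8-byte minimum. */
static size_t pow2_bucket(size_t size)
{
    size_t bucket = 8;

    assert(size > 0);
    while (bucket < size)
        bucket <<= 1;
    return bucket;
}

/* Same as pow2_bucket(), but with the extra 96 B and 192 B buckets. */
static size_t kmalloc_bucket(size_t size)
{
    assert(size > 0);
    if (size > 64 && size <= 96)
        return 96;
    if (size > 128 && size <= 192)
        return 192;
    return pow2_bucket(size);
}

/* Bytes left unused inside the object when size is served from bucket. */
static size_t slack(size_t size, size_t bucket)
{
    return bucket - size;
}
```

For a 72-byte request, power-of-two rounding alone would land in a 128-byte object and waste 56 bytes, while the 96-byte bucket wastes 24.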

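Each slot of size_index covers an 8-byte range of request sizes; size_index_elem() converts a size in bytes to its slot.

static inline unsigned int size_index_elem(unsigned int bytes)
{
    return (bytes - 1) / 8;
}

When the requested size exceeds 192 B, the kernel instead computes fls(size - 1), the 1-based position of the highest set bit of size - 1, and uses that value directly as the cache index.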
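The two lookup rules, the size_index table for small sizes and fls(size - 1) for larger ones, can be combined in a small user-space sketch. The table is copied from the article; fls_sketch() is a portable stand-in for the kernel's fls(), and kmalloc_index_sketch() is a hypothetical name for the combined lookup, not the kernel's own function.

```c
#include <assert.h>
#include <stddef.h>

/* Copy of the size_index table: slot i covers sizes 8*i+1 .. 8*(i+1). */
static const unsigned char size_index[24] = {
    3, 4, 5, 5, 6, 6, 6, 6, 1, 1, 1, 1,
    7, 7, 7, 7, 2, 2, 2, 2, 2, 2, 2, 2
};

/* Portable stand-in for fls(): 1-based position of the highest set bit,
 * or 0 when x is 0. */
static unsigned int fls_sketch(unsigned int x)
{
    unsigned int pos = 0;

    while (x) {
        pos++;
        x >>= 1;
    }
    return pos;
}

static unsigned int size_index_elem(unsigned int bytes)
{
    return (bytes - 1) / 8;
}

/* Hypothetical helper: map a request size to its kmalloc_info index. */
static unsigned int kmalloc_index_sketch(unsigned int size)
{
    assert(size > 0);
    if (size <= 192)
        return size_index[size_index_elem(size)];
    return fls_sketch(size - 1);
}
```

Under these assumptions a 65-byte request maps to index 1 (kmalloc-96), a 97-byte request to index 7 (kmalloc-128), and a 193-byte request to index 8 (kmalloc-256).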

The two-dimensional array kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_SHIFT_HIGH+1] stores pointers to the actual kmem_cache objects. The first dimension selects the cache type (KMALLOC_NORMAL, KMALLOC_RECLAIM, or KMALLOC_DMA), derived from the gfp_t flags; the second dimension is the size index into kmalloc_info.

struct kmem_cache *kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_SHIFT_HIGH+1];

During kernel boot, after the slab allocator is initialized, setup_kmalloc_cache_index_table() adjusts size_index for the architecture’s minimum kmalloc size, and create_kmalloc_caches(0) populates kmalloc_caches by calling new_kmalloc_cache() for each size and each cache type. The 96 B and 192 B caches, at indices 1 and 2, are created as special cases.

static void __init new_kmalloc_cache(int idx, int type, slab_flags_t flags)
{
    const char *name;

    if (type == KMALLOC_RECLAIM) {
        flags |= SLAB_RECLAIM_ACCOUNT;
        name = kmalloc_cache_name("kmalloc-rcl", kmalloc_info[idx].size);
    } else {
        name = kmalloc_info[idx].name;
    }

    kmalloc_caches[type][idx] = create_kmalloc_cache(name,
                    kmalloc_info[idx].size, flags, 0,
                    kmalloc_info[idx].size);
}
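The order in which the caches come into existence can be illustrated with a user-space mock of the boot loop for a single cache type. This is a sketch under stated assumptions: the shift bounds (3 and 13) assume an 8 B minimum and an 8 KB maximum, the rule that the 96 B and 192 B caches are created right after the 64 B and 128 B caches reflects my reading of the 5.x-era create_kmalloc_caches(), and every name ending in _sketch, along with info_size, caches, and order_log, is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SHIFT_LOW   3   /* assumed: smallest cache is 2^3 = 8 B */
#define SHIFT_HIGH  13  /* assumed: largest cache is 2^13 = 8 KB */

/* Object size for each index, mirroring the start of kmalloc_info[]. */
static const unsigned int info_size[SHIFT_HIGH + 1] = {
    0, 96, 192, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192
};

/* Mock cache table for one type: 0 means "not created yet". */
static unsigned int caches[SHIFT_HIGH + 1];

/* Creation order, recorded so the sequence can be inspected. */
static unsigned int order_log[SHIFT_HIGH + 1];
static size_t order_len;

static void new_cache_sketch(int idx)
{
    assert(order_len < sizeof(order_log) / sizeof(order_log[0]));
    caches[idx] = info_size[idx];
    order_log[order_len++] = info_size[idx];
}

static void create_caches_sketch(void)
{
    for (int i = SHIFT_LOW; i <= SHIFT_HIGH; i++) {
        if (!caches[i])
            new_cache_sketch(i);
        /* The odd sizes follow the power of two just below them. */
        if (i == 6 && !caches[1])
            new_cache_sketch(1);    /* 96 B follows 64 B */
        if (i == 7 && !caches[2])
            new_cache_sketch(2);    /* 192 B follows 128 B */
    }
}
```

After one call to create_caches_sketch(), order_log holds the sizes in ascending order (8, 16, 32, 64, 96, 128, 192, 256, and so on up to 8192), even though 96 and 192 sit at indices 1 and 2 of the table.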

The allocation path is:

1. Determine the cache type from gfp_t via kmalloc_type() (NORMAL, DMA or RECLAIM).
2. Find the cache index: use size_index for sizes ≤ 192 B, otherwise fls(size - 1).
3. Retrieve the cache from kmalloc_caches[type][index].
4. Allocate the object with slab_alloc().

static inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
{
#ifdef CONFIG_ZONE_DMA
    if (likely((flags & (__GFP_DMA | __GFP_RECLAIMABLE)) == 0))
        return KMALLOC_NORMAL;
    return flags & __GFP_DMA ? KMALLOC_DMA : KMALLOC_RECLAIM;
#else
    return flags & __GFP_RECLAIMABLE ? KMALLOC_RECLAIM : KMALLOC_NORMAL;
#endif
}
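The flag-to-type decision can be exercised outside the kernel with a small mock. This sketch mirrors only the CONFIG_ZONE_DMA branch; the flag bit values, the enum, and the function names are illustrative stand-ins rather than the kernel's definitions, and the "dma-kmalloc" prefix is an assumption based on the cache names commonly seen in /proc/slabinfo.

```c
#include <assert.h>

/* Illustrative flag bits; the real values live in the kernel's gfp headers. */
#define GFP_DMA_SKETCH          0x01u
#define GFP_RECLAIMABLE_SKETCH  0x10u

enum cache_type_sketch { TYPE_NORMAL = 0, TYPE_RECLAIM, TYPE_DMA, NR_TYPES };

static enum cache_type_sketch type_from_flags(unsigned int flags)
{
    /* Common case: neither bit set, decided with a single test. */
    if ((flags & (GFP_DMA_SKETCH | GFP_RECLAIMABLE_SKETCH)) == 0)
        return TYPE_NORMAL;
    /* At least one bit is set; DMA takes priority over RECLAIM. */
    return (flags & GFP_DMA_SKETCH) ? TYPE_DMA : TYPE_RECLAIM;
}

/* Name prefix of the cache family that serves each type. */
static const char *type_prefix(enum cache_type_sketch type)
{
    static const char *const prefix[NR_TYPES] = {
        "kmalloc", "kmalloc-rcl", "dma-kmalloc"
    };

    assert(type < NR_TYPES);
    return prefix[type];
}
```

Note that a request carrying both bits resolves to the DMA type, because the DMA test is evaluated first in the final return statement.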

If the requested size exceeds KMALLOC_MAX_CACHE_SIZE (8 KB for the SLUB implementation with 4 KB pages), kmalloc() bypasses the slab caches and allocates pages directly from the buddy system via kmalloc_large().

#define KMALLOC_MAX_CACHE_SIZE (1UL << KMALLOC_SHIFT_HIGH) /* 8 KB */
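A short sketch of the threshold check and of the page order a large request would need, assuming 4 KB pages and the SLUB convention that KMALLOC_SHIFT_HIGH is PAGE_SHIFT + 1. The macro and function names here are hypothetical; the order loop computes the smallest power-of-two page count that covers the request, which is what the buddy system hands out.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT_SKETCH   12                       /* assumed 4 KB pages */
#define PAGE_SIZE_SKETCH    (1UL << PAGE_SHIFT_SKETCH)
#define SHIFT_HIGH_SKETCH   (PAGE_SHIFT_SKETCH + 1)  /* SLUB-style: two pages */
#define MAX_CACHE_SIZE      (1UL << SHIFT_HIGH_SKETCH)

/* Returns 1 when the request is too big for the kmalloc caches. */
static int goes_to_buddy(size_t size)
{
    return size > MAX_CACHE_SIZE;
}

/* Smallest order such that (PAGE_SIZE << order) >= size. */
static unsigned int page_order_for(size_t size)
{
    unsigned int order = 0;

    assert(size > 0);
    while ((PAGE_SIZE_SKETCH << order) < size)
        order++;
    return order;
}
```

With these assumptions, an 8192-byte request still fits the largest cache, while an 8193-byte request goes to the buddy system and needs an order-2 allocation (four pages).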

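Freeing memory uses kfree(). The function translates the virtual address to the owning struct page. If the page is not managed by a slab (!PageSlab(page)), the allocation came from kmalloc_large(), so the compound page is returned to the buddy system using its order; otherwise the object is returned to its slab cache via slab_free().

void kfree(const void *x)
{
    struct page *page;
    void *object = (void *)x;

    /* kfree(NULL) and zero-size pointers are no-ops. */
    if (unlikely(ZERO_OR_NULL_PTR(x)))
        return;

    page = virt_to_head_page(x);
    if (unlikely(!PageSlab(page))) {
        /* Large allocation: recover the order from the compound page. */
        unsigned int order = compound_order(page);

        __free_pages(page, order);
        return;
    }
    slab_free(page->slab_cache, page, object, NULL, 1, _RET_IP_);
}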
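The two free paths can be mimicked in user space with a mock page descriptor and a pair of counters. Everything here is illustrative: struct page_sketch, its fields, and the counters are hypothetical stand-ins for struct page, PageSlab(), and compound_order(), and no memory is actually released.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for struct page; the fields are illustrative only. */
struct page_sketch {
    int is_slab;            /* plays the role of PageSlab(page) */
    unsigned int order;     /* plays the role of compound_order(page) */
};

/* Counters that record which path each free took. */
static unsigned long pages_returned_to_buddy;
static unsigned long objects_returned_to_slab;

static void kfree_sketch(const struct page_sketch *page)
{
    if (page == NULL)       /* freeing NULL is a no-op */
        return;
    if (!page->is_slab) {
        /* Guard the shift below against an out-of-range order. */
        assert(page->order < sizeof(unsigned long) * 8);
        /* Large allocation: hand back 2^order pages. */
        pages_returned_to_buddy += 1UL << page->order;
        return;
    }
    /* Small allocation: the object goes back to its slab cache. */
    objects_returned_to_slab++;
}
```

Freeing one order-2 mock page and two slab-backed objects leaves the counters at 4 pages and 2 objects respectively.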

In summary, the kmalloc subsystem builds a set of generic slab caches covering object sizes from 8 B up to KMALLOC_MAX_CACHE_SIZE (8 KB under SLUB with 4 KB pages; the kmalloc_info table itself lists sizes up to 64 MB). The caches are indexed by cache type and size, created during early boot, and used by kmalloc() and kfree() to provide fast, low-fragmentation allocation for kernel code.

