
Understanding Memory Pools: Concepts, Implementations, and Practical Use Cases

This article explains what memory pools are, how they reduce allocation overhead and fragmentation compared with plain malloc/new, surveys common pool designs and the Linux kernel's slab and mempool APIs, walks through C and C++ implementations, and discusses typical application scenarios such as servers, real‑time systems, and embedded devices.

Deepin Linux

Imagine a busy restaurant where waiters constantly arrange tables for new customers; similarly, in a computer program memory is a critical resource that must be allocated for new objects and released when they are no longer needed.

Traditional allocation functions such as malloc or new interact with the operating system on every request, which is analogous to fetching a new set of tables from a warehouse each time a customer arrives—slow and prone to creating fragmented, inefficient layouts. Frequent allocation and deallocation also cause internal and external memory fragmentation, degrading program performance.

1. Memory Pool Overview

1.1 Pooling Technique

Pooling is a common design pattern that pre‑allocates core resources (memory, threads, connections) and lets the program manage them internally, improving utilization and predictability. The most widely used pools are memory pools, thread pools, and connection pools.

1.2 What Is a Memory Pool?

A memory pool allocates a large contiguous block of memory at program start, then subdivides it into many small fixed‑size blocks (or pages). When the program needs memory it takes a block directly from the pool, and when the block is freed it is returned to the pool for later reuse. This reduces OS interaction, speeds up allocation/deallocation, and limits fragmentation.

2. Why Use a Memory Pool?

2.1 Fragmentation

Fragmentation occurs when free memory is split into many small, non‑contiguous pieces. Internal fragmentation wastes space inside allocated blocks; external fragmentation leaves unusable gaps between blocks. A memory pool mitigates both by managing fixed‑size chunks and optionally coalescing adjacent free chunks.
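The internal‑fragmentation cost of a fixed‑size pool is easy to quantify: each request is rounded up to the block size, and the unused tail is the waste. A minimal sketch (the function names are illustrative):

```cpp
#include <cassert>
#include <cstddef>

// Round a request up to the next multiple of the pool's block size.
std::size_t roundUp(std::size_t request, std::size_t blockSize) {
    return ((request + blockSize - 1) / blockSize) * blockSize;
}

// Internal fragmentation: bytes allocated but never used by the request.
std::size_t internalWaste(std::size_t request, std::size_t blockSize) {
    return roundUp(request, blockSize) - request;
}
```

Choosing block sizes close to the program's actual object sizes keeps this waste small, which is why many pools maintain several size classes rather than a single block size.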

2.2 Allocation Efficiency

Repeatedly requesting memory from the OS is like asking parents for allowance each time you need cash—high overhead and latency. A pool draws from a pre‑reserved region, allowing fast, near constant‑time allocation.
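The constant‑time claim can be sketched concretely: with all slots carved out up front, allocation and deallocation are just a stack pop and push, with no system call on the hot path. `SlotPool` below is an illustrative name, not a real library class:

```cpp
#include <cstddef>
#include <vector>

// Minimal illustration of constant-time pool allocation: a stack of
// pre-reserved, fixed-size slots.
class SlotPool {
public:
    SlotPool(std::size_t slots, std::size_t slotSize)
        : storage_(slots * slotSize) {
        free_.reserve(slots);
        for (std::size_t i = 0; i < slots; ++i)
            free_.push_back(storage_.data() + i * slotSize);  // pre-carve slots
    }
    void* allocate() {                      // O(1): pop one free slot
        if (free_.empty()) return nullptr;
        void* p = free_.back();
        free_.pop_back();
        return p;
    }
    void deallocate(void* p) {              // O(1): push the slot back
        free_.push_back(static_cast<char*>(p));
    }
    std::size_t available() const { return free_.size(); }
private:
    std::vector<char>  storage_;  // one up-front reservation
    std::vector<char*> free_;     // stack of free slots
};
```

The single up‑front reservation is the whole trick: after construction, no path through `allocate` or `deallocate` touches the OS.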

2.3 Common Implementations

Fixed‑size buffer pool – for objects of a single size.

dlmalloc – Doug Lea’s general‑purpose allocator.

SGI STL allocator – uses multiple free‑lists for different size classes.

Boost object_pool – maintains a free‑node list and grows by powers of two.

TCMalloc – Google’s high‑performance allocator.

3. Core Principles of a Memory Pool

Initialization : At start‑up the pool requests a large block from the OS and splits it into chunks, linking them in a data structure such as a linked list or bitmap.

Allocation : When a request arrives, the pool selects a free chunk (for fixed‑size pools the head of the free list) and returns its address.

Deallocation : The returned chunk is marked free and re‑inserted into the free list; adjacent free chunks may be merged to reduce external fragmentation.

Below is a simple C++ example that demonstrates these steps:

#include <iostream>
#include <cstdlib>

// Header stored at the start of every block (only meaningful while the
// block sits on the free list)
struct MemoryBlock {
    size_t size;        // block size
    bool   isFree;      // free flag
    MemoryBlock* next;  // next block in the free list
};

// Simple fixed-size memory pool class
class MemoryPool {
public:
    MemoryPool(size_t poolSize, size_t blockSize)
        : pool(nullptr), freeList(nullptr),
          poolSize(poolSize), blockSize(blockSize) {
        pool = static_cast<MemoryBlock*>(std::malloc(poolSize));
        if (!pool) { std::cerr << "Pool init failed" << std::endl; return; }
        // Carve the region into blockSize-byte blocks and chain them.
        // Blocks are blockSize bytes apart, so advance with char*
        // arithmetic ("cur + 1" would only step sizeof(MemoryBlock) bytes).
        size_t numBlocks = poolSize / blockSize;
        MemoryBlock* cur = pool;
        for (size_t i = 0; i < numBlocks - 1; ++i) {
            cur->size = blockSize;
            cur->isFree = true;
            cur->next = reinterpret_cast<MemoryBlock*>(
                reinterpret_cast<char*>(cur) + blockSize);
            cur = cur->next;
        }
        cur->size = blockSize; cur->isFree = true; cur->next = nullptr;
        freeList = pool;
    }
    ~MemoryPool() { std::free(pool); }

    // Pop the head of the free list: O(1)
    void* allocate() {
        if (!freeList) { std::cerr << "No free blocks" << std::endl; return nullptr; }
        MemoryBlock* blk = freeList;
        freeList = freeList->next;
        blk->isFree = false;
        return blk;
    }

    // Push the block back onto the free list: O(1)
    void deallocate(void* p) {
        if (!p) return;
        MemoryBlock* blk = static_cast<MemoryBlock*>(p);
        blk->isFree = true;
        blk->next = freeList;
        freeList = blk;
    }
private:
    MemoryBlock* pool;     // start of the pool
    MemoryBlock* freeList; // head of the free list
    size_t poolSize;
    size_t blockSize;
};

int main() {
    MemoryPool pool(1024, 64);           // 16 blocks of 64 bytes
    void* a = pool.allocate();
    void* b = pool.allocate();
    pool.deallocate(a);                  // freed block becomes the new head
    void* c = pool.allocate();
    std::cout << (a == c) << std::endl;  // prints 1: LIFO reuse
    pool.deallocate(b);
    pool.deallocate(c);
    return 0;
}

4. Linux Kernel Memory Pools (kmem_cache and mempool_t)

The kernel's slab allocator manages caches of same‑sized objects through struct kmem_cache, declared in <linux/slab.h>; mempool_t (<linux/mempool.h>) builds on such caches to guarantee a minimum number of pre‑allocated objects. The main slab API functions are:

struct kmem_cache *kmem_cache_create(const char *name, size_t size,
                                      size_t align, unsigned long flags,
                                      void (*ctor)(void *));
void *kmem_cache_alloc(struct kmem_cache *cache, gfp_t flags);
void kmem_cache_free(struct kmem_cache *cache, void *obj);
void kmem_cache_destroy(struct kmem_cache *cache);

Typical usage in a kernel module:

#include <linux/slab.h>
struct my_struct { int data; };
struct kmem_cache *my_cache;
void init_my_pool(void) { my_cache = kmem_cache_create("my_pool", sizeof(struct my_struct), 0, 0, NULL); }
void destroy_my_pool(void) { kmem_cache_destroy(my_cache); }
struct my_struct *alloc_from_my_pool(void) { return kmem_cache_alloc(my_cache, GFP_KERNEL); }
void free_to_my_pool(struct my_struct *p) { kmem_cache_free(my_cache, p); }

5. Advanced Designs

Various sophisticated pool designs exist:

Simple free‑list allocator – a linked list of free blocks; easy to implement but may have poor search performance.

Fixed‑size allocator – maintains separate free lists for each size class (e.g., 8, 16, 32 bytes) and can quickly allocate/deallocate.

Hash‑mapped free‑list pool – chooses a size class via a hash table, stores a cookie in the block header to know which allocator owns it.

Buddy system – splits memory into power‑of‑two blocks; suitable for page‑level allocation.

Slab allocator – pre‑creates caches of objects of the same size; variants include SLAB, SLUB, and SLOB.
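The segregated fixed‑size design (and the free‑list idea behind the SGI STL allocator) can be sketched in a few dozen lines. `SizeClassPool` is a hypothetical illustration, not a real library API: it keeps one free list per power‑of‑two class and falls back to the general allocator for oversized requests.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Sketch of a segregated free-list allocator: one free list per
// power-of-two size class (8, 16, 32, 64 bytes).
class SizeClassPool {
    static constexpr std::size_t kClasses = 4;       // classes 8..64
    struct Node { Node* next; };
public:
    void* allocate(std::size_t n) {
        std::size_t c = classFor(n);
        if (c >= kClasses) return ::operator new(n); // too big: fall back
        if (!heads_[c]) refill(c);                   // grow this class
        Node* node = heads_[c];
        heads_[c] = node->next;
        return node;
    }
    void deallocate(void* p, std::size_t n) {
        std::size_t c = classFor(n);
        if (c >= kClasses) { ::operator delete(p); return; }
        Node* node = static_cast<Node*>(p);          // push back on its list
        node->next = heads_[c];
        heads_[c] = node;
    }
    static std::size_t classFor(std::size_t n) {     // 1..8 -> 0, 9..16 -> 1, ...
        std::size_t c = 0, sz = 8;
        while (sz < n) { sz <<= 1; ++c; }
        return c;
    }
private:
    void refill(std::size_t c) {                     // carve 16 nodes at once
        std::size_t sz = std::size_t(8) << c;
        chunks_.emplace_back(16 * sz);
        char* base = chunks_.back().data();
        for (int i = 0; i < 16; ++i)
            deallocate(base + i * sz, sz);
    }
    std::array<Node*, kClasses> heads_{};            // one list per class
    std::vector<std::vector<char>> chunks_;          // owned backing memory
};
```

Rounding requests to the nearest class trades a little internal fragmentation for O(1) allocation in every class, which is the same trade‑off the slab and TCMalloc designs make at larger scale.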

6. Concurrent Memory Pool Example (C++)

A thread‑safe, growable pool based on a doubly‑linked list of elements:

#ifndef PPX_BASE_MEMORY_POOL_H_
#define PPX_BASE_MEMORY_POOL_H_
#include <climits>
#include <cstddef>
#include <mutex>
namespace ppx { namespace base {
    template<typename T, size_t BlockSize = 4096, bool ZeroOnDeallocate = true>
    class MemoryPool {
    public:
        using value_type = T;
        using pointer = T*;
        using size_type = size_t;
        MemoryPool() noexcept;
        ~MemoryPool() noexcept;
        pointer allocate(size_type n = 1, const void* hint = nullptr);
        void deallocate(pointer p, size_type n = 1);
        // ... other std::allocator members ...
    private:
        struct Element_ { Element_* pre; Element_* next; };
        Element_* data_element_ = nullptr;
        Element_* free_element_ = nullptr;
        std::recursive_mutex m_;
        void allocateBlock();
        // implementation details omitted for brevity
    };
    // definitions of constructors, allocate, deallocate, etc.
}}
#endif // PPX_BASE_MEMORY_POOL_H_
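The omitted allocate/deallocate bodies presumably guard every free‑list operation with the mutex declared above. A minimal self‑contained sketch of that locking pattern (`LockedPool` is an illustrative stand‑in, not the real ppx implementation):

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// Minimal sketch of a mutex-guarded, growable pool: a singly-linked
// free list where every operation is serialized by one lock.
template <typename T>
class LockedPool {
    struct Node { Node* next; };
    // Each slot must be able to hold a free-list pointer.
    static constexpr std::size_t kSlot =
        sizeof(T) < sizeof(Node) ? sizeof(Node) : sizeof(T);
public:
    T* allocate() {
        std::lock_guard<std::mutex> lock(m_);   // serialize list access
        if (!free_) grow();
        Node* n = free_;
        free_ = n->next;
        return reinterpret_cast<T*>(n);
    }
    void deallocate(T* p) {
        std::lock_guard<std::mutex> lock(m_);
        Node* n = reinterpret_cast<Node*>(p);   // reuse the slot as a node
        n->next = free_;
        free_ = n;
    }
private:
    void grow() {                               // called with the lock held
        blocks_.emplace_back(64 * kSlot);       // one block of 64 slots
        char* base = blocks_.back().data();
        for (int i = 0; i < 64; ++i) {
            Node* n = reinterpret_cast<Node*>(base + i * kSlot);
            n->next = free_;
            free_ = n;
        }
    }
    std::mutex m_;
    Node* free_ = nullptr;
    std::vector<std::vector<char>> blocks_;     // owned backing memory
};
```

A single coarse lock is simple and correct, but it becomes the bottleneck under heavy contention; per‑thread caches (as in TCMalloc) or lock‑free lists are the usual next step.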

Usage example with multiple threads:

#include <iostream>
#include <thread>
using namespace std;
class Apple {
public:
    Apple()            { cout << "Apple()" << endl; }
    Apple(int id) : id_(id) { cout << "Apple(int)" << endl; }
    ~Apple()           { cout << "~Apple()" << endl; }
    void SetId(int id) { id_ = id; }
private:
    int id_ = 0;
};

void ThreadProc(ppx::base::MemoryPool<char>* mp) {
    for (int i = 0; i < 100000; ++i) {
        char* p0 = (char*)mp->allocate();
        char* p1 = (char*)mp->allocate();
        mp->deallocate(p0);
        char* p2 = (char*)mp->allocate();
        mp->deallocate(p1);
        mp->deallocate(p2);
    }
}

int main() {
    ppx::base::MemoryPool<char> mp;
    thread th0(ThreadProc, &mp), th1(ThreadProc, &mp), th2(ThreadProc, &mp);
    th0.join(); th1.join(); th2.join();

    {
        ppx::base::MemoryPool<Apple> mp2;
        Apple* a = mp2.newElement(10);  // construct an Apple in pool memory
        a->SetId(10);
        mp2.deleteElement(a);           // destroy it and return the block
        // `a` must not be used after deleteElement: it now dangles.
    }
    return 0;
}

7. Application Scenarios

High‑frequency allocation servers – web servers that pool request/response objects have reported throughput gains on the order of 30% and latency reductions of roughly 20%.

Real‑time systems – flight‑control loops can see worst‑case allocation latency drop from tens of milliseconds to a few milliseconds.

Embedded devices – memory utilization can improve by roughly 25%, with better long‑run stability.

Game development – replacing frequent new/delete with object pools can lift frame rates from ~40 FPS to above 60 FPS.

8. Pitfalls and Best Practices

Common issues include memory leaks (forgetting to return blocks), overflow (pool too small), and performance bottlenecks caused by inefficient search or overly heavy locking. Mitigation strategies are to use smart pointers, monitor pool usage, allow dynamic growth, and choose lock‑free or fine‑grained lock designs.
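The "use smart pointers" advice can be made concrete: wrap each pool allocation in a std::unique_ptr whose deleter returns the block to the pool, so a forgotten deallocate cannot leak. `TinyPool` and `PoolReturner` below are hypothetical names for illustration:

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Tiny stand-in pool: hands out pre-allocated ints and takes them back.
class TinyPool {
public:
    explicit TinyPool(std::size_t slots) {
        for (std::size_t i = 0; i < slots; ++i) free_.push_back(new int(0));
    }
    ~TinyPool() { for (int* p : free_) delete p; }
    int* take() {
        if (free_.empty()) return nullptr;
        int* p = free_.back();
        free_.pop_back();
        return p;
    }
    void give(int* p) { free_.push_back(p); }
    std::size_t available() const { return free_.size(); }
private:
    std::vector<int*> free_;
};

// Deleter that returns the block to its pool instead of calling delete.
struct PoolReturner {
    TinyPool* pool;
    void operator()(int* p) const { pool->give(p); }
};
using PoolPtr = std::unique_ptr<int, PoolReturner>;

// Scoped allocation: the block goes back to the pool automatically
// when the unique_ptr is destroyed.
PoolPtr takeScoped(TinyPool& pool) {
    return PoolPtr(pool.take(), PoolReturner{&pool});
}
```

The same pattern works for any pool API: the deleter captures the pool pointer, so ownership and the return path travel together with the block.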

Written by Deepin Linux

Research areas: Windows & Linux platforms, C/C++ backend development, embedded systems and Linux kernel, etc.