Inside CPython’s Memory Manager: How PyMalloc Handles Small Object Allocation
This article explains CPython’s three‑layer memory allocator, PyMalloc, detailing how it organizes blocks, pools, and arenas to efficiently serve thousands of tiny object allocations while minimizing fragmentation and falling back to the system malloc for larger requests.
CPython Memory Manager
CPython 3.9 implements its own memory allocator, PyMalloc, to efficiently handle the large number of small object allocations typical of a dynamic, object-oriented language.
Memory architecture
Memory requests ≤ 512 bytes are served by a three-level hierarchy: block → pool → arena. A block is the smallest unit and holds a single allocation, a pool (4 KB, page-aligned) manages blocks of a single size class, and an arena (256 KB) contains multiple pools.
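The arithmetic behind this hierarchy can be sketched in a few lines of C. This is only an illustration using the sizes quoted above. The `sketch_` names are hypothetical, and the pool header size is passed in as a parameter because it depends on the platform (48 bytes is a plausible value on a 64-bit build, but treat it as an assumption).

```c
#include <assert.h>
#include <stddef.h>

/* Sizes quoted in the text (CPython 3.9 defaults). */
#define SKETCH_POOL_SIZE  ((size_t)4 * 1024)     /* one pool: 4 KB    */
#define SKETCH_ARENA_SIZE ((size_t)256 * 1024)   /* one arena: 256 KB */

/* Upper bound on pools per arena, assuming the arena memory happens to
 * start on a pool boundary (otherwise one pool's worth is lost). */
static size_t sketch_pools_per_arena(void)
{
    return SKETCH_ARENA_SIZE / SKETCH_POOL_SIZE;
}

/* Number of blocks of `block_size` bytes that fit in one pool once
 * `header_size` bytes are reserved for the pool header.
 * `header_size` is a hypothetical parameter, not a CPython constant. */
static size_t sketch_blocks_per_pool(size_t block_size, size_t header_size)
{
    assert(block_size > 0 && header_size < SKETCH_POOL_SIZE);
    return (SKETCH_POOL_SIZE - header_size) / block_size;
}
```

With these numbers an arena holds at most 64 pools, and a pool of 32-byte blocks with a 48-byte header holds 126 blocks.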
Key thresholds
#define SMALL_REQUEST_THRESHOLD 512 – requests larger than this fall back to the system malloc.
#define SMALL_MEMORY_LIMIT (64*1024*1024) – optional compile-time limit on total pool memory, active only when WITH_MEMORY_LIMITS is defined.
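As a rough sketch of how the threshold is applied, the functions below map a request size to a size-class index and reject anything the small-object allocator should not handle. The `sketch_`/`SKETCH_` names are hypothetical, and the 16-byte class spacing is an assumption that matches a 64-bit build; a 32-bit build would use 8-byte spacing.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_SMALL_REQUEST_THRESHOLD 512
#define SKETCH_ALIGNMENT_SHIFT 4   /* assumption: 16-byte size classes */

/* Returns the size-class index for a small request, or -1 when the
 * request (zero bytes, or above the threshold) should be passed on to
 * the system allocator instead. */
static int sketch_size_class(size_t nbytes)
{
    if (nbytes == 0 || nbytes > SKETCH_SMALL_REQUEST_THRESHOLD)
        return -1;
    return (int)((nbytes - 1) >> SKETCH_ALIGNMENT_SHIFT);
}

/* Block size actually handed out for a given size-class index. */
static size_t sketch_class_block_size(int idx)
{
    assert(idx >= 0);
    return ((size_t)idx + 1) << SKETCH_ALIGNMENT_SHIFT;
}
```

Under these assumptions a 17-byte request lands in class 1 and receives a 32-byte block, while a 513-byte request is rejected and goes to the system allocator.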
Core data structures
struct arena_object {
    uintptr_t address;               // arena memory address
    block *pool_address;             // pool-aligned pointer to the next pool to carve off
    uint nfreepools;                 // available pools in this arena (freed + never used)
    uint ntotalpools;                // total pools in this arena
    struct pool_header *freepools;   // singly linked list of freed pools
    struct arena_object *nextarena;  // links in the arena lists
    struct arena_object *prevarena;
};
struct pool_header {
    union { block *_padding; uint count; } ref;  // number of allocated blocks
    block *freeblock;                // first free block
    struct pool_header *nextpool;    // next pool of this size class
    struct pool_header *prevpool;    // previous pool of this size class
    uint arenaindex;                 // index of the owning arena
    uint szidx;                      // size class index
    uint nextoffset;                 // offset of next unused block
    uint maxnextoffset;              // largest valid nextoffset
};
Allocation flow
The function pymalloc_alloc first checks the request size. For small requests it computes a size-class index and uses it to index the usedpools array directly. If a pool with free blocks exists for that size class, a block is popped from its freeblock list; otherwise a free pool is taken from a usable arena (or a new arena is allocated if none has room), initialized for the size class, and linked into usedpools.
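The per-pool part of this flow can be sketched as a tiny fixed-size block allocator. This is a simplified model, not CPython's code: the `sk_pool` type and its functions are hypothetical, the header lives outside the 4 KB storage, and a full pool simply returns NULL where the real allocator would fetch another pool.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SK_POOL_BYTES 4096u

typedef struct sk_pool {
    unsigned count;                  /* blocks currently allocated       */
    unsigned char *freeblock;        /* head of the free list, or NULL   */
    unsigned block_size;             /* size class served by this pool   */
    unsigned nextoffset;             /* offset of next never-used block  */
    unsigned char storage[SK_POOL_BYTES];
} sk_pool;

static void sk_pool_init(sk_pool *pool, unsigned block_size)
{
    /* A free block must be able to store the link to the next one. */
    assert(block_size >= sizeof(unsigned char *));
    pool->count = 0;
    pool->freeblock = NULL;
    pool->block_size = block_size;
    pool->nextoffset = 0;
}

static void *sk_pool_alloc(sk_pool *pool)
{
    unsigned char *block = pool->freeblock;
    if (block != NULL) {
        /* Reuse a freed block: its first bytes hold the next link. */
        memcpy(&pool->freeblock, block, sizeof pool->freeblock);
    } else if (pool->nextoffset + pool->block_size <= SK_POOL_BYTES) {
        /* Otherwise carve a never-used block off the end. */
        block = pool->storage + pool->nextoffset;
        pool->nextoffset += pool->block_size;
    } else {
        return NULL;                 /* pool is full */
    }
    pool->count++;
    return block;
}

static void sk_pool_free(sk_pool *pool, void *ptr)
{
    unsigned char *block = ptr;
    /* Push the block onto the free list (LIFO). */
    memcpy(block, &pool->freeblock, sizeof pool->freeblock);
    pool->freeblock = block;
    pool->count--;
}
```

Because the free list is last-in, first-out, a block that was just freed is the first one handed out again, which tends to keep recently touched memory in use.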
Deallocation flow
When pymalloc_free is called, the block’s address is aligned down to the containing pool using POOL_ADDR(p). The block is pushed onto the pool’s freeblock list. If the pool becomes empty, it is moved from usedpools to the arena’s freepools. When all pools in an arena are empty, the arena’s memory is returned to the system and the arena object is recycled.
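The align-down step works because pools are 4 KB and start on 4 KB boundaries, so clearing the low 12 bits of any block address yields the address of its pool header. A minimal sketch, with hypothetical `sk_` names standing in for the POOL_ADDR macro:

```c
#include <assert.h>
#include <stdint.h>

#define SK_POOL_SIZE ((uintptr_t)4096)   /* must be a power of two */

/* Round an address down to the start of the pool that contains it. */
static uintptr_t sk_pool_addr(uintptr_t p)
{
    return p & ~(SK_POOL_SIZE - 1);
}

/* Byte offset of the address inside its pool. */
static uintptr_t sk_pool_offset(uintptr_t p)
{
    return p & (SK_POOL_SIZE - 1);
}
```

For example, the address 0x7f001234 belongs to the pool starting at 0x7f001000, at offset 0x234. No lookup table is needed to go from a block to its pool.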
Performance tricks
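Blocks are at least 8-byte aligned (16-byte on 64-bit builds since CPython 3.8), so the low three bits of every block address are always zero and are free to carry flags; glibc's malloc uses the same idea for the low bits of its chunk size field.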
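The usedpools array stores a pair of pointers per size class, laid out so that each pair doubles as the nextpool and prevpool fields of a dummy list header; finding a pool for a given size class is therefore a single O(1) array index.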
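Usable arenas are kept in a list sorted by ascending number of free pools, so new pools are carved from the fullest arenas first. This gives nearly empty arenas a chance to drain completely and be returned to the system.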
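The arena-ordering policy from the list above can be sketched as follows. The `sk_arena` type and both functions are hypothetical, and the linear scan is a stand-in: CPython maintains a sorted linked list so that the choice is simply the list head.

```c
#include <assert.h>
#include <stddef.h>

typedef struct sk_arena {
    unsigned nfreepools;    /* pools still available in this arena */
    unsigned ntotalpools;   /* total pools in this arena           */
} sk_arena;

/* Pick the arena to take the next pool from: the one with the fewest
 * free pools that still has at least one. Returns its index, or -1
 * when every arena is full. */
static int sk_pick_arena(const sk_arena *arenas, size_t n)
{
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (arenas[i].nfreepools == 0)
            continue;       /* full arena, nothing to hand out */
        if (best < 0 || arenas[i].nfreepools < arenas[best].nfreepools)
            best = (int)i;
    }
    return best;
}

/* An arena whose pools are all free can be returned to the system. */
static int sk_arena_is_releasable(const sk_arena *arena)
{
    return arena->nfreepools == arena->ntotalpools;
}
```

Preferring the fullest arena concentrates live objects in as few arenas as possible, so the lightly used ones are more likely to reach the releasable state.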
Overall, CPython’s three‑layer allocator reduces fragmentation, speeds up frequent small allocations, and gracefully falls back to the system allocator for larger requests.
MaGe Linux Operations
