Understanding Python List Memory Allocation and Append Behavior
This article explains how Python lists manage memory during initialization and appends, examines CPython's internal allocation algorithm, demonstrates memory growth with practical code examples, and recommends using list comprehensions for more efficient list creation.
The author, a senior operations engineer at NetEase Games, shares his interest in code performance and system principles by analyzing the behavior of Python's list data structure.
Python's list is a flexible dynamic array, and the append method is the most frequently used operation for extending a list.
Example usage: >> test = [] >>> test.append(1) >>> test.append({2}) >>> test.append([3]) >>> print(test) # Output: [1, set([2]), [3]]
Another example shows building a list with a loop: test = [] for i in range(4): test.append(i) print(test) # Output [0, 1, 2, 3]
Using sys.getsizeof the author observes that an empty list occupies 72 bytes, while a list with a single element adds only 8 bytes, suggesting that Python does not allocate a large fixed pool for future elements.
The CPython source for PyList_New shows that memory for the list items is allocated only when the size is greater than zero: PyObject * PyList_New(Py_ssize_t size) { PyListObject *op; size_t nbytes; ... if (size <= 0) op->ob_item = NULL; else { op->ob_item = (PyObject **) PyMem_MALLOC(nbytes); ... } Py_SIZE(op) = size; op->allocated = size; return (PyObject *) op; }
The list_resize function implements the over‑allocation strategy that provides amortized linear‑time performance for repeated appends: static int list_resize(PyListObject *self, Py_ssize_t newsize) { PyObject **items; size_t new_allocated; Py_ssize_t allocated = self->allocated; if (allocated >= newsize && newsize >= (allocated >> 1)) { Py_SIZE(self) = newsize; return 0; } new_allocated = (newsize >> 3) + (newsize < 9 ? 3 : 6); if (new_allocated > PY_SIZE_MAX - newsize) return -1; new_allocated += newsize; if (newsize == 0) new_allocated = 0; items = self->ob_item; if (new_allocated <= (PY_SIZE_MAX / sizeof(PyObject *))) PyMem_RESIZE(items, PyObject *, new_allocated); else items = NULL; if (items == NULL) return -1; self->ob_item = items; Py_SIZE(self) = newsize; self->allocated = new_allocated; return 0; }
In this context, newsize represents the desired number of elements after the operation, while new_allocated is the total number of slots allocated for the list.
A table in the article illustrates how the allocated value changes during successive append calls, showing that allocation only grows when the current size exceeds certain thresholds.
A demonstration script measures memory growth: #coding: utf8 import sys test = [] raw_size = sys.getsizeof(test) test.append(1) print "1 次 append 减去空列表的内存大小:%s " % (sys.getsizeof(test) - raw_size) test.append(1) print "2 次 append 减去空列表的内存大小:%s " % (sys.getsizeof(test) - raw_size) ... (continues up to 6 appends) ... The output shows the memory increase stays at 32 bytes for the first four appends and jumps to 64 bytes at the fifth, confirming the over‑allocation pattern.
In conclusion, Python lists do not maintain a fixed pre‑allocation pool; instead, they grow using an over‑allocation algorithm that balances memory usage and performance. For creating lists where the final size is known, using a list comprehension (e.g., test = [i for i in range(4)] ) is more Pythonic and efficient than repeatedly calling append .
NetEase Game Operations Platform
The NetEase Game Automated Operations Platform delivers stable services for thousands of NetEase titles, focusing on efficient ops workflows, intelligent monitoring, and virtualization.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.