Understanding PHP Variable Storage, Garbage Collection, and Memory Management
This article explains how PHP stores variables in the zval structure, manages their lifetime with reference counting and copy-on-write, reclaims cyclic garbage with a color-marking garbage collector, and allocates memory through a custom heap that handles small, large, and huge allocations.
Yang Tong, an R&D engineer in the New House R&D department, joined Lianjia (now Beike) in 2015 and has worked in big‑data and new‑house development.
In the previous chapter we analyzed PHP‑FPM process management; this chapter dives into the PHP kernel, starting with the simplest element – variable storage.
1. Variables
PHP variables consist of three parts: name, value, and type. The basic storage unit is the zval structure, whose definition (Zend/zend_types.h) is shown below.
struct _zval_struct {
    zend_value value;                   // actual stored value, 8 bytes
    union {
        struct {
            ZEND_ENDIAN_LOHI_4(
                zend_uchar type,        // variable type
                zend_uchar type_flags,  // type-specific flags
                zend_uchar const_flags,
                zend_uchar reserved)    // reserved field
        } v;
        uint32_t type_info;
    } u1;
    union {
        uint32_t next;                  // hash-collision handling for arrays
        uint32_t cache_slot;            // runtime cache
        uint32_t lineno;                // compilation line number
        uint32_t num_args;              // number of arguments
        uint32_t fe_pos;                // foreach position
        uint32_t fe_iter_idx;           // foreach iterator index
        uint32_t access_flags;          // class access attributes
        uint32_t property_guard;
        uint32_t extra;
    } u2;                               // auxiliary fields used in different contexts
};
Each zval occupies 16 bytes. The real data lives in zend_value (8 bytes) and u1 (4 bytes), 12 bytes in total; memory alignment would pad the structure to 16 bytes anyway, so the remaining 4 bytes are put to use as u2. The zend_value union is defined as:
typedef union _zend_value {
    zend_long         lval;     // 64-bit integer
    double            dval;     // double
    zend_refcounted  *counted;  // GC information pointer
    zend_string      *str;      // string pointer
    zend_array       *arr;      // array pointer
    zend_object      *obj;      // object pointer
    zend_resource    *res;      // resource pointer
    zend_reference   *ref;      // reference pointer
    zend_ast_ref     *ast;      // AST node pointer (compiler use)
    zval             *zv;       // pointer to another zval (kernel use)
    void             *ptr;      // generic pointer (kernel use)
    zend_class_entry *ce;       // class entry pointer (kernel use)
    zend_function    *func;     // function pointer (kernel use)
    struct { uint32_t w1; uint32_t w2; } ww;  // generic fields
} zend_value;
Only simple scalar types (long, double) are stored directly in zend_value; all other types store a pointer to a more complex structure. The article then compares PHP 7's zval with PHP 5's definition, highlighting three major differences: the value is now held in a zend_value union that covers every kind of structure the kernel uses, the reference-counting and GC information has moved out of the zval and into the structure that zend_value points to, and two new fields, u1 and u2, were added for extensibility.
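The size arithmetic and the inline-versus-pointer split can be checked with a small stand-alone sketch. It is not the engine's code: `demo_value`, `demo_zval`, the `DEMO_*` tags, and both helper functions are hypothetical stand-ins, and the two 32-bit fields only imitate the roles of u1 and u2.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for zend_value and zval; not the engine's definitions. */
typedef union demo_value {
    int64_t     lval;  /* integer payload, stored inline */
    double      dval;  /* floating-point payload, stored inline */
    const void *ptr;   /* any other type: pointer to a separate structure */
} demo_value;

typedef struct demo_zval {
    demo_value value;      /* 8 bytes */
    uint32_t   type_info;  /* stands in for u1, 4 bytes */
    uint32_t   extra;      /* stands in for u2, fills the 4 bytes alignment would pad */
} demo_zval;

/* Illustrative type tags; the numeric values are arbitrary in this sketch. */
enum demo_type { DEMO_LONG = 1, DEMO_DOUBLE, DEMO_STRING };

/* Compile-time checks of the layout (C11). */
static_assert(sizeof(demo_value) == 8, "value slot is 8 bytes");
static_assert(sizeof(demo_zval) == 16, "whole structure is 16 bytes");

/* An integer lives directly inside the value slot. */
demo_zval demo_make_long(int64_t n) {
    demo_zval z;
    z.value.lval = n;
    z.type_info = DEMO_LONG;
    z.extra = 0;
    return z;
}

/* A string is stored by pointer; the characters live elsewhere. */
demo_zval demo_make_string(const char *s) {
    demo_zval z;
    z.value.ptr = s;
    z.type_info = DEMO_STRING;
    z.extra = 0;
    return z;
}
```

Copying a `demo_zval` that holds an integer copies the integer itself, while copying one that holds a string copies only the pointer, which is why the pointed-to structures need reference counting.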
2. Garbage Collection
PHP employs reference counting combined with copy-on-write (COW). Assigning one variable to another does not copy the value: both variables point at the same value and its reference count is incremented. The value is duplicated only when one of the variables is written to while the value is still shared. This avoids the memory and CPU cost of making a deep copy on every assignment.
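A minimal sketch of the idea, assuming a hypothetical refcounted string type `demo_str` (it is not zend_string, and none of the `demo_str_*` functions exist in PHP): assignment only bumps the counter, and a write first separates the value if it is still shared.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical refcounted string used only for this illustration. */
typedef struct demo_str {
    uint32_t refcount;
    size_t   len;
    char     data[];  /* characters follow the header */
} demo_str;

demo_str *demo_str_new(const char *s) {
    size_t len = strlen(s);
    demo_str *p = malloc(sizeof(demo_str) + len + 1);
    if (p == NULL) return NULL;
    p->refcount = 1;
    p->len = len;
    memcpy(p->data, s, len + 1);
    return p;
}

/* Assignment: share the buffer and bump the count, no copy. */
demo_str *demo_str_assign(demo_str *s) {
    assert(s != NULL);
    s->refcount++;
    return s;
}

/* Drop one holder; free the buffer when the last holder goes away. */
void demo_str_release(demo_str *s) {
    assert(s != NULL && s->refcount > 0);
    if (--s->refcount == 0) free(s);
}

/* Separate before writing: duplicate only if the buffer is still shared. */
demo_str *demo_str_separate(demo_str *s) {
    assert(s != NULL);
    if (s->refcount == 1) return s;  /* sole owner, safe to write in place */
    demo_str *copy = demo_str_new(s->data);
    if (copy == NULL) return NULL;
    s->refcount--;                   /* the writer leaves the shared buffer */
    return copy;
}

/* Write one character through a handle, copying first if the value is shared. */
int demo_str_set_char(demo_str **handle, size_t i, char c) {
    if (i >= (*handle)->len) return -1;
    demo_str *s = demo_str_separate(*handle);
    if (s == NULL) return -1;
    s->data[i] = c;
    *handle = s;
    return 0;
}
```

After `b = demo_str_assign(a)` both handles share one buffer with a count of 2; `demo_str_set_char(&b, 0, 'j')` gives `b` its own copy and leaves `a` untouched with a count of 1.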
Reference‑counted structures are defined as:
typedef struct _zend_refcounted_h {
    uint32_t refcount;           // reference count
    union {
        struct {
            ZEND_ENDIAN_LOHI_3(
                zend_uchar type,
                zend_uchar flags,
                uint16_t   gc_info)
        } v;
        uint32_t type_info;
    } u;
} zend_refcounted_h;
Only compound types (strings, arrays, objects, resources, references) carry a zend_refcounted_h. Simple scalars (int, float, bool, null) are not reference-counted. Plain reference counting cannot reclaim values that refer to each other in a cycle, so PHP's GC adds a color-marking algorithm: black, grey, and white, plus purple for nodes that have been recorded as possible cycle roots. The relevant macros are:
#define GC_COLOR  0xc000  // mask for the color bits in gc_info
#define GC_BLACK  0x0000  // live object
#define GC_WHITE  0x8000  // garbage
#define GC_GREY   0x4000  // being traced as a possible member of a garbage cycle
#define GC_PURPLE 0xc000  // possible cycle root, already placed in the root buffer
The GC root buffer stores pointers to reference-counted objects:
typedef struct _gc_root_buffer {
    zend_refcounted        *ref;       // pointer to the object
    struct _gc_root_buffer *next;      // next in doubly linked list
    struct _gc_root_buffer *prev;      // previous in list
    uint32_t                refcount;  // temporary refcount during collection
} gc_root_buffer;
Collection proceeds in three phases: (1) starting from each root, traverse depth-first, marking every node grey and decrementing the refcount of each child reached; (2) scan again: a node whose refcount dropped to 0 was only kept alive by the cycle and is marked white, while a node whose refcount is still positive is marked black again and the counts beneath it are restored; (3) gather the white nodes into a free list and release them. The article notes that PHP's GC is simple compared with the collectors used in Java, which is acceptable because PHP typically runs short-lived web requests.
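The three phases can be sketched on a toy object graph. This is a simplified illustration of the color-marking idea, not the engine's implementation: `demo_node` and every `demo_*` function are hypothetical, each node has at most two children, and a node is flagged as freed instead of being passed to `free()` so the result can be inspected.

```c
#include <assert.h>
#include <stddef.h>

enum demo_color { DEMO_BLACK, DEMO_GREY, DEMO_WHITE, DEMO_PURPLE };

#define DEMO_MAX_CHILDREN 2

/* Hypothetical refcounted node. */
typedef struct demo_node {
    int               refcount;
    enum demo_color   color;
    int               freed;  /* set instead of calling free() */
    struct demo_node *child[DEMO_MAX_CHILDREN];
} demo_node;

/* Phase 1: grey the subgraph and subtract the references it holds internally. */
void demo_mark_grey(demo_node *n) {
    if (n->color == DEMO_GREY) return;
    n->color = DEMO_GREY;
    for (int i = 0; i < DEMO_MAX_CHILDREN; i++) {
        if (n->child[i] != NULL) {
            n->child[i]->refcount--;
            demo_mark_grey(n->child[i]);
        }
    }
}

/* Restore a subgraph that turned out to be referenced from outside. */
void demo_scan_black(demo_node *n) {
    n->color = DEMO_BLACK;
    for (int i = 0; i < DEMO_MAX_CHILDREN; i++) {
        if (n->child[i] != NULL) {
            n->child[i]->refcount++;
            if (n->child[i]->color != DEMO_BLACK) demo_scan_black(n->child[i]);
        }
    }
}

/* Phase 2: grey nodes with a zero count become white; the rest are restored. */
void demo_scan(demo_node *n) {
    if (n->color != DEMO_GREY) return;
    if (n->refcount > 0) {
        demo_scan_black(n);
        return;
    }
    n->color = DEMO_WHITE;
    for (int i = 0; i < DEMO_MAX_CHILDREN; i++)
        if (n->child[i] != NULL) demo_scan(n->child[i]);
}

/* Phase 3: release everything that is still white. */
void demo_collect_white(demo_node *n) {
    if (n->color != DEMO_WHITE) return;
    n->color = DEMO_BLACK;
    n->freed = 1;
    for (int i = 0; i < DEMO_MAX_CHILDREN; i++)
        if (n->child[i] != NULL) demo_collect_white(n->child[i]);
}

/* Run the three phases over a buffer of possible roots. */
void demo_collect_cycles(demo_node **roots, size_t count) {
    for (size_t i = 0; i < count; i++) demo_mark_grey(roots[i]);
    for (size_t i = 0; i < count; i++) demo_scan(roots[i]);
    for (size_t i = 0; i < count; i++) demo_collect_white(roots[i]);
}
```

With two nodes that reference only each other, both counts fall to 0 in phase 1 and both nodes are released. If one of them is also referenced from outside the cycle, its count stays positive after phase 1, so phase 2 turns the whole cycle black again and restores the original counts.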
3. Memory Management
PHP implements its own memory manager (similar to tcmalloc) to avoid frequent system calls. The central structure is zend_mm_heap, which contains lists for huge allocations, the main chunk, cached chunks, and free slots for small allocations.
struct _zend_mm_heap {
    // ... (omitted for brevity) ...
    zend_mm_huge_list *huge_list;                // linked list of huge blocks
    zend_mm_chunk     *main_chunk;               // first chunk, which also stores the heap itself
    zend_mm_chunk     *cached_chunks;            // reusable chunks
    zend_mm_free_slot *free_slot[ZEND_MM_BINS];  // free-list heads, one per small slot size
};
Huge allocations (requests too big to fit in a single 2 MiB chunk) are handled by zend_mm_alloc_huge, which maps the memory via mmap and records the block in a zend_mm_huge_list entry. Large allocations (bigger than 3 KiB but small enough to fit in a chunk) are satisfied by allocating whole pages (4 KiB each) from chunks; the allocator searches for a contiguous run of free pages using the bitmap structures (free_map and map) inside each zend_mm_chunk. Small allocations (up to 3 KiB, that is 3072 bytes) are served from pre-allocated slots; each slot size has its own free list (zend_mm_free_slot) and the allocator simply returns the head of the appropriate list.
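The page search for large allocations can be illustrated with a bitmap over the 512 pages of one chunk (2 MiB / 4 KiB). The sketch below is a simplification under stated assumptions: `demo_chunk` and the `demo_*` functions are hypothetical, a set bit is taken to mean "page in use", and the search is a plain first-fit scan, which is not necessarily the strategy the engine uses.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGES 512  /* 2 MiB chunk divided into 4 KiB pages */

/* Hypothetical chunk holding only the page bitmap (assumption: bit set = page in use). */
typedef struct demo_chunk {
    uint64_t free_map[DEMO_PAGES / 64];
} demo_chunk;

int demo_page_used(const demo_chunk *c, int page) {
    return (int)((c->free_map[page / 64] >> (page % 64)) & 1u);
}

/* Mark `count` pages starting at `start` as in use. */
void demo_mark_pages(demo_chunk *c, int start, int count) {
    for (int p = start; p < start + count; p++)
        c->free_map[p / 64] |= (uint64_t)1 << (p % 64);
}

/* First-fit scan: first page of a free run of `count` pages, or -1 if none. */
int demo_find_free_run(const demo_chunk *c, int count) {
    assert(count > 0 && count <= DEMO_PAGES);
    int run = 0;
    for (int p = 0; p < DEMO_PAGES; p++) {
        run = demo_page_used(c, p) ? 0 : run + 1;
        if (run == count) return p - count + 1;
    }
    return -1;
}

/* Reserve a run of pages and return its first page number, or -1 if the chunk is too full. */
int demo_alloc_pages(demo_chunk *c, int count) {
    int start = demo_find_free_run(c, count);
    if (start >= 0) demo_mark_pages(c, start, count);
    return start;
}
```

When `demo_alloc_pages` returns -1, a real allocator would move on to another chunk or map a new one; that step is left out here.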
The allocator distinguishes three size classes:
HUGE – requests too large for a single 2 MiB chunk, mapped separately and tracked in zend_mm_huge_list.
LARGE – one or more whole pages (4 KiB each) managed inside chunks.
SMALL – slots of predefined sizes (8 B … 3072 B) whose free lists are stored in the free_slot array.
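The dispatch between the three classes comes down to two comparisons. In the sketch below every name is hypothetical and the cutoffs are assumptions chosen for illustration: small requests go up to 3072 bytes, and the large/huge boundary is placed at one chunk minus one page, on the reasoning that a chunk needs some room for its own bookkeeping.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sizes for this sketch. */
#define DEMO_PAGE_SIZE  ((size_t)4 * 1024)
#define DEMO_CHUNK_SIZE ((size_t)2 * 1024 * 1024)
#define DEMO_MAX_SMALL  ((size_t)3072)
#define DEMO_MAX_LARGE  (DEMO_CHUNK_SIZE - DEMO_PAGE_SIZE)

typedef enum { DEMO_SMALL, DEMO_LARGE, DEMO_HUGE } demo_class;

/* Pick the size class for a request of `size` bytes. */
demo_class demo_classify(size_t size) {
    if (size <= DEMO_MAX_SMALL) return DEMO_SMALL;  /* served from a slot free list */
    if (size <= DEMO_MAX_LARGE) return DEMO_LARGE;  /* served as whole pages in a chunk */
    return DEMO_HUGE;                               /* mapped separately */
}

/* Number of whole pages a LARGE request occupies, rounding up. */
size_t demo_pages_for(size_t size) {
    return (size + DEMO_PAGE_SIZE - 1) / DEMO_PAGE_SIZE;
}
```

A request of 3073 bytes, for example, is just past the small limit and therefore costs a full 4 KiB page.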
When a request ends, zend_mm_shutdown releases all request-local allocations. Persistent allocations (created with pemalloc(size, 1)) are not part of the request heap, so they stay alive for the lifetime of the process.
4. Summary
This chapter covered three core topics of the PHP engine: variable storage, garbage collection, and memory management. Variable storage and the custom memory manager are the most performance‑critical parts, while the GC is relatively simple and rarely a bottleneck in typical web workloads. The next chapter will examine how PHP parses and executes user code.