
How Python’s Virtual Machine Executes Bytecode: Step‑by‑Step Process

This article explains how the Python interpreter, after runtime initialization, creates a stack frame, invokes a series of C functions such as PyEval_EvalCode, _PyEval_Vector, and _PyEval_EvalFrameDefault to traverse and execute bytecode, and details the underlying runtime stack structures and macro APIs that support instruction evaluation.

Satori Komeiji's Programming Classroom

Virtual Machine Execution Framework

When the interpreter starts, it first initializes the runtime environment, which is a global concept distinct from the per‑frame execution environment (the stack frame). Assuming the runtime is already initialized, the next step is to push the first domino that triggers the bytecode execution pipeline.

The interpreter consists of a compiler that produces a PyCodeObject and a virtual machine that executes it. The overall flow is illustrated by the following diagram:

Stack Frame Creation

Because Python is dynamic, variable bindings are determined at runtime, so the VM creates a stack frame from a PyCodeObject on the fly. Two primary C functions are used:

// Python/ceval.c
/* Create a stack frame based on PyCodeObject, globals and locals.
 * Suitable for simple module‑level code.
 */
PyObject *
PyEval_EvalCode(PyObject *co, PyObject *globals, PyObject *locals);

/* More parameters for functions, closures, defaults, etc.
 * This API is deprecated and no longer used internally.
 */
PyObject *
PyEval_EvalCodeEx(PyObject *_co, PyObject *globals, PyObject *locals,
                  PyObject *const *args, int argcount,
                  PyObject *const *kws, int kwcount,
                  PyObject *const *defs, int defcount,
                  PyObject *kwdefs, PyObject *closure);

Both functions eventually call _PyEval_Vector, which pushes a new interpreter frame and then hands it to _PyEval_EvalFrame, the evaluation entry point described next.
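From Python code, the closest counterpart to PyEval_EvalCode is the built-in exec(), which takes the same three ingredients: a code object, a globals mapping, and a locals mapping. The following is a minimal sketch of that correspondence; the variable names (namespace, result) are illustrative and do not come from the original article.

```python
# Compile module-level source into a code object, then execute it.
source = "a = 1\nb = 2\nc = a + b\n"
code_object = compile(source, "<file>", "exec")

# For module-level code, globals and locals are the same mapping.
namespace = {}
exec(code_object, namespace, namespace)

# exec() also inserts "__builtins__" into the globals, so pick out our names.
result = {name: namespace[name] for name in ("a", "b", "c")}
print(result)
```

Passing the same dictionary for both mappings mirrors how a module body runs: every STORE_NAME lands in the module namespace.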

Bytecode Evaluation Functions

The VM evaluates a frame using two public wrappers that finally delegate to _PyEval_EvalFrame:

// Python/ceval.c
PyObject *
PyEval_EvalFrame(PyFrameObject *f) {
    PyThreadState *tstate = _PyThreadState_GET();
    return _PyEval_EvalFrame(tstate, f->f_frame, 0);
}

PyObject *
PyEval_EvalFrameEx(PyFrameObject *f, int throwflag) {
    PyThreadState *tstate = _PyThreadState_GET();
    return _PyEval_EvalFrame(tstate, f->f_frame, throwflag);
}

_PyEval_EvalFrame itself is a small inline function defined in pycore_ceval.h. It chooses between the default evaluator and a custom one installed on the interpreter:

// Include/internal/pycore_ceval.h
static inline PyObject*
_PyEval_EvalFrame(PyThreadState *tstate,
                  struct _PyInterpreterFrame *frame,
                  int throwflag)
{
    EVAL_CALL_STAT_INC(EVAL_CALL_TOTAL);
    // If tstate->interp->eval_frame is NULL, fall back to the default implementation.
    if (tstate->interp->eval_frame == NULL) {
        return _PyEval_EvalFrameDefault(tstate, frame, throwflag);
    }
    // Custom eval_frame allows tools such as profilers or debuggers to replace the default.
    return tstate->interp->eval_frame(tstate, frame, throwflag);
}

Thus the core of bytecode execution resides in _PyEval_EvalFrameDefault, which will be examined in a subsequent article.
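The dispatch performed by _PyEval_EvalFrame can be modelled in a few lines of Python. This is only a toy model under stated assumptions: ToyInterpreter, default_eval, and tracing_eval are hypothetical names invented for illustration, and frames are represented by plain strings rather than real frame objects.

```python
def default_eval(frame):
    """Stand-in for _PyEval_EvalFrameDefault (hypothetical toy version)."""
    return f"default:{frame}"


class ToyInterpreter:
    """Toy stand-in for the interpreter state that owns the eval_frame hook."""

    def __init__(self):
        self.eval_frame = None  # no custom evaluator installed


def eval_frame(interp, frame):
    # Same shape as the C code: fall back to the default when no hook is set.
    if interp.eval_frame is None:
        return default_eval(frame)
    return interp.eval_frame(frame)


interp = ToyInterpreter()
first = eval_frame(interp, "f1")  # goes through the default evaluator

calls = []


def tracing_eval(frame):
    # A custom evaluator can observe the frame, then delegate to the default.
    calls.append(frame)
    return default_eval(frame)


interp.eval_frame = tracing_eval
second = eval_frame(interp, "f2")  # goes through the custom hook
print(first, second, calls)
```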

A diagram summarises the relationships among these functions:

Bytecode Traversal

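The f_code field of a frame points to a PyCodeObject whose co_code holds the instruction stream. Each instruction is a 2-byte code unit: a one-byte opcode (uint8) followed by a one-byte argument. In CPython 3.11 and later, some instructions such as BINARY_OP are also followed by inline cache entries, which is why the offsets in the disassembly below jump from 14 to 18. For example, the following Python snippet is compiled and disassembled: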

code_string = """
a = 1
b = 2
c = a + b
"""
code_object = compile(code_string, "<file>", "exec")
print(code_object.co_consts)   # (1, 2, None)
print(code_object.co_names)    # ('a', 'b', 'c')

Disassembly shows the concrete instruction sequence:

  0           0 RESUME                   0
  2           2 LOAD_CONST               0 (1)
              4 STORE_NAME               0 (a)
  3           6 LOAD_CONST               1 (2)
              8 STORE_NAME               1 (b)
  4          10 LOAD_NAME                0 (a)
             12 LOAD_NAME                1 (b)
             14 BINARY_OP                0 (+)
             18 STORE_NAME               2 (c)
             20 RETURN_CONST             2 (None)

The article later reconstructs the bytecode array and verifies that the reconstructed bytes match the original output.
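A reconstruction along those lines might look like the sketch below. It splits co_code into 2-byte code units, prints each opcode name with its argument, and then rebuilds the byte string to confirm nothing was lost. The exact opcode names, offsets, and cache entries depend on the CPython version, so no specific output is claimed here.

```python
import dis

code_string = """
a = 1
b = 2
c = a + b
"""
code_object = compile(code_string, "<file>", "exec")
raw = code_object.co_code

# Every code unit is two bytes: (opcode, oparg).
units = [(raw[i], raw[i + 1]) for i in range(0, len(raw), 2)]

for index, (opcode, oparg) in enumerate(units):
    offset = index * 2
    # On 3.11+ inline cache slots show up as CACHE entries with oparg 0.
    print(f"{offset:>3} {dis.opname[opcode]:<16} {oparg}")

# Re-encode the units and check that they match the original bytes.
rebuilt = bytes(byte for unit in units for byte in unit)
print(rebuilt == raw)
```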

Runtime Stack Structure

The runtime stack holds intermediate values because each bytecode instruction can have only one explicit argument. The stack is part of the localsplus array, which is divided into four regions: locals, cell variables, free variables, and the runtime stack.
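To see why a stack is needed when each instruction carries a single argument, here is a deliberately tiny interpreter for the instruction sequence from the disassembly listing. It is a sketch, not CPython's evaluation loop: instructions are hand-written (name, argument) tuples rather than real bytecode, and the run function and binary_ops table are hypothetical helpers.

```python
import operator

consts = (1, 2, None)
names = ("a", "b", "c")
program = [
    ("LOAD_CONST", 0), ("STORE_NAME", 0),
    ("LOAD_CONST", 1), ("STORE_NAME", 1),
    ("LOAD_NAME", 0), ("LOAD_NAME", 1),
    ("BINARY_OP", 0),  # argument 0 selects addition
    ("STORE_NAME", 2),
    ("RETURN_CONST", 2),
]
binary_ops = {0: operator.add}


def run(program, consts, names):
    stack, namespace, max_depth = [], {}, 0
    for opname, oparg in program:
        if opname == "LOAD_CONST":
            stack.append(consts[oparg])
        elif opname == "LOAD_NAME":
            stack.append(namespace[names[oparg]])
        elif opname == "STORE_NAME":
            namespace[names[oparg]] = stack.pop()
        elif opname == "BINARY_OP":
            # Both operands come from the stack, not from the instruction.
            right = stack.pop()
            left = stack.pop()
            stack.append(binary_ops[oparg](left, right))
        elif opname == "RETURN_CONST":
            return consts[oparg], namespace, max_depth
        else:
            raise ValueError(f"unsupported instruction: {opname}")
        max_depth = max(max_depth, len(stack))
    raise RuntimeError("program ended without a return instruction")


result, namespace, max_depth = run(program, consts, names)
print(result, namespace, max_depth)
```

BINARY_OP names only the operation; its two operands are whatever the preceding LOAD_NAME instructions left on the stack.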

Note: The stack frame object in C is _PyInterpreterFrame, while the Python‑level object is PyFrameObject. The article uses the former when discussing implementation details.

Key fields:

prev_instr: pointer to the most recently executed instruction (a _Py_CODEUNIT*, that is, a pointer to a 16‑bit code unit), used to locate the instruction and its argument.

localsplus: flexible array storing locals, cell vars, free vars, and the runtime stack.

stacktop: offset of the stack top relative to localsplus.
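The sizes of the four localsplus regions can be inspected from Python through code object attributes. The sketch below uses a small closure so that cell and free variables are non-empty; outer, inner, and describe are illustrative names, and mapping these attributes onto localsplus regions is an interpretation, since localsplus itself is not exposed to Python.

```python
def outer(x):
    y = x + 1  # x and y are captured by inner, so they are cell variables here

    def inner(z):
        return x + y + z  # x and y are free variables inside inner

    return inner


inner = outer(1)


def describe(code):
    """Collect the attributes that determine how much frame storage is needed."""
    return {
        "locals": code.co_varnames,
        "cells": code.co_cellvars,
        "frees": code.co_freevars,
        "stacksize": code.co_stacksize,
    }


outer_layout = describe(outer.__code__)
inner_layout = describe(inner.__code__)
print(outer_layout)
print(inner_layout)
```

The co_stacksize value is what fixes the size of the runtime stack region when the frame is allocated.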

Helper functions (from pycore_frame.h) retrieve the stack pointer and base:

// Get the localsplus array.
static inline PyObject** _PyFrame_GetLocalsArray(_PyInterpreterFrame *frame) {
    return frame->localsplus;
}

// Get the current stack pointer and reset stacktop.
static inline PyObject** _PyFrame_GetStackPointer(_PyInterpreterFrame *frame) {
    PyObject **sp = frame->localsplus + frame->stacktop;
    frame->stacktop = -1;
    return sp;
}

// Update stacktop after modifications.
static inline void _PyFrame_SetStackPointer(_PyInterpreterFrame *frame, PyObject **stack_pointer) {
    frame->stacktop = (int)(stack_pointer - frame->localsplus);
}

The stack base (bottom) is computed from the code object’s co_nlocalsplus field:

static inline PyObject** _PyFrame_Stackbase(_PyInterpreterFrame *f) {
    // co_nlocalsplus = co_nlocals + co_ncellvars + co_nfreevars
    return f->localsplus + f->f_code->co_nlocalsplus;
}
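The relationship between localsplus, stacktop, and the stack base can be modelled with list indices in place of C pointers. This is a toy model only: ToyFrame and the three helper functions are hypothetical Python stand-ins that copy the shape of the C helpers above.

```python
class ToyFrame:
    """Toy frame: one flat list holds the variables followed by the stack."""

    def __init__(self, nlocalsplus, stacksize):
        self.nlocalsplus = nlocalsplus
        self.localsplus = [None] * (nlocalsplus + stacksize)
        self.stacktop = nlocalsplus  # empty stack: top equals base


def stackbase(frame):
    # The stack begins right after the locals, cells, and free variables.
    return frame.nlocalsplus


def get_stack_pointer(frame):
    sp = frame.stacktop
    frame.stacktop = -1  # marks the saved offset as stale while in use
    return sp


def set_stack_pointer(frame, sp):
    frame.stacktop = sp


frame = ToyFrame(nlocalsplus=3, stacksize=2)
sp = get_stack_pointer(frame)
in_flight = frame.stacktop       # -1 while the evaluator owns the pointer
frame.localsplus[sp] = "value"   # push: write at the pointer, then advance
sp += 1
set_stack_pointer(frame, sp)
level = frame.stacktop - stackbase(frame)
print(in_flight, level)
```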

In older Python versions (e.g., 3.8) the frame contained a dedicated f_valuestack pointer.

Runtime Stack API Macros

The VM uses a collection of macros defined in ceval_macros.h to manipulate the stack efficiently. Important macros include:

STACK_LEVEL(): current number of elements on the stack.

STACK_SIZE(): maximum stack size, taken from co_stacksize.

STACK_GROW(n) / STACK_SHRINK(n): adjust the stack pointer by n.

PUSH(v) / POP(): push and pop values.

TOP(), SECOND(), THIRD(), FOURTH(): access elements relative to the top.

PEEK(n): view the n‑th element from the top without removing it.

SET_TOP(v), SET_SECOND(v), POKE(n, v): overwrite elements.

For example, two values can be pushed without PUSH by growing the stack and then writing directly with POKE:

STACK_GROW(2);
POKE(1, 3);  // stack_pointer[-1] = 3
POKE(2, 2);  // stack_pointer[-2] = 2

Similarly, STACK_SHRINK removes elements by moving the pointer backward.
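The macro semantics can be mirrored in Python, with a list index standing in for stack_pointer. ToyStack is a hypothetical class written for illustration. Its bounds checks are an addition for safety and should not be read as a description of what the C macros do.

```python
class ToyStack:
    """Toy model of the runtime stack; sp is the index of the next free slot."""

    def __init__(self, stacksize):
        self.slots = [None] * stacksize
        self.sp = 0

    def STACK_LEVEL(self):
        return self.sp

    def STACK_GROW(self, n):
        assert self.sp + n <= len(self.slots), "stack overflow"
        self.sp += n

    def STACK_SHRINK(self, n):
        assert n <= self.sp, "stack underflow"
        self.sp -= n

    def PEEK(self, n):
        return self.slots[self.sp - n]  # like stack_pointer[-n]

    def POKE(self, n, value):
        self.slots[self.sp - n] = value  # like stack_pointer[-n] = value

    def TOP(self):
        return self.PEEK(1)

    def SECOND(self):
        return self.PEEK(2)


stack = ToyStack(stacksize=4)
stack.STACK_GROW(2)   # reserve two slots
stack.POKE(1, 3)      # top of stack
stack.POKE(2, 2)      # one below the top
after_push = (stack.STACK_LEVEL(), stack.TOP(), stack.SECOND())

stack.STACK_SHRINK(1)  # drop the top element by moving the pointer back
after_shrink = (stack.STACK_LEVEL(), stack.TOP())
print(after_push, after_shrink)
```

Shrinking does not clear the old slot; it only moves the pointer, so the value stays in memory until it is overwritten.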

Conclusion

The article provides a macro‑level view of how Python’s virtual machine turns compiled bytecode into executable actions by creating a stack frame, iterating over co_code, and using a fixed‑size runtime stack manipulated through highly optimized C macros. Understanding these mechanisms lays the groundwork for deeper exploration of _PyEval_EvalFrameDefault and the implementation of individual bytecode instructions in future posts.
