
Understanding Python's Copy‑and‑Patch JIT Compiler in CPython 3.13

This article explains the concept of Just‑In‑Time compilation for Python, introduces the copy‑and‑patch JIT proposed for CPython 3.13, shows how it works with bytecode templates, compares it to traditional JIT approaches, and presents simple benchmark results and implementation details.


In late December 2023, CPython core developer Brandt Bucher submitted a pull request adding a JIT compiler to the Python 3.13 branch, one of the biggest changes to the interpreter since the specializing adaptive interpreter introduced in Python 3.11.

What is JIT?

JIT (Just‑In‑Time) compilation means compiling code at runtime, typically when it is first executed. Python already compiles source to bytecode, but a true JIT emits machine code that the CPU can run directly, in contrast to Ahead‑Of‑Time (AOT) compilers such as GCC or rustc, which do so before the program ever runs.

Python bytecode is platform‑independent and high‑level, and it requires a special interpreter loop to execute:

- It has no meaning to the CPU without the interpreter loop.
- Each bytecode instruction can expand to many machine instructions.
- It is type‑agnostic.
- It is cross‑platform.

For the simple function below, disassembly shows five bytecode instructions:

<code>def func():
    a = 1
    return a
</code>

<code>import dis
dis.dis(func)
# 34   0 RESUME                   0
# 35   2 LOAD_CONST               1 (1)
#       4 STORE_FAST               0 (a)
# 36   6 LOAD_FAST                0 (a)
#       8 RETURN_VALUE
</code>

A more elaborate interpreter written in Python can walk the bytecode and execute it with a stack and variable dictionary.

<code>def interpret(func):
    stack = []
    variables = {}
    for instruction in dis.get_instructions(func):
        # RESUME is interpreter bookkeeping with no stack effect, so it is skipped
        if instruction.opname == "LOAD_CONST":
            stack.append(instruction.argval)
        elif instruction.opname == "LOAD_FAST":
            stack.append(variables[instruction.argval])
        elif instruction.opname == "STORE_FAST":
            variables[instruction.argval] = stack.pop()
        elif instruction.opname == "RETURN_VALUE":
            return stack.pop()
</code>

Running print(interpret(func)) yields the same result as the original function, but the interpreter loop adds overhead. Repeating the function thousands of times makes this overhead noticeable, which is why a JIT can improve performance.
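The overhead can be made visible with a quick timing sketch (exact figures will vary by machine and Python version). The interpreter loop from above is restated so the snippet runs standalone:

```python
import dis
import timeit

def func():
    a = 1
    return a

def interpret(f):
    # Same interpreter loop as shown above, restated for self-containment.
    stack, variables = [], {}
    for instruction in dis.get_instructions(f):
        if instruction.opname == "LOAD_CONST":
            stack.append(instruction.argval)
        elif instruction.opname == "LOAD_FAST":
            stack.append(variables[instruction.argval])
        elif instruction.opname == "STORE_FAST":
            variables[instruction.argval] = stack.pop()
        elif instruction.opname == "RETURN_VALUE":
            return stack.pop()

# Calling the function directly vs. re-interpreting its bytecode each time.
direct = timeit.timeit(func, number=100_000)
interpreted = timeit.timeit(lambda: interpret(func), number=100_000)
print(f"direct: {direct:.3f}s, interpreted: {interpreted:.3f}s")
```

The interpreted version is dramatically slower here because it also re-parses the bytecode on every call; CPython's real loop avoids that, but the dispatch overhead per instruction remains.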

Copy‑and‑Patch JIT

The copy‑and‑patch JIT, proposed in 2021, works by copying the C implementation of each bytecode instruction and patching in runtime‑specific values (such as the opcode argument). The generated machine‑code snippets are then stitched together and executed directly.

<code>def copy_and_patch_interpret(func):
    code = 'def f():\n'
    code += '  stack = []\n'
    code += '  variables = {}\n'
    for instruction in dis.get_instructions(func):
        if instruction.opname == "LOAD_CONST":
            # !r so non-numeric constants are emitted as valid Python literals
            code += f'  stack.append({instruction.argval!r})\n'
        elif instruction.opname == "LOAD_FAST":
            code += f'  stack.append(variables["{instruction.argval}"])\n'
        elif instruction.opname == "STORE_FAST":
            code += f'  variables["{instruction.argval}"] = stack.pop()\n'
        elif instruction.opname == "RETURN_VALUE":
            code += '  return stack.pop()\n'
    code += 'f()'
    return code
</code>

The resulting Python source is a straight‑line function without the interpreter loop, which can be compiled once and executed repeatedly for better speed.

<code>def f():
  stack = []
  variables = {}
  stack.append(1)
  variables["a"] = stack.pop()
  stack.append(variables["a"])
  return stack.pop()
f()
</code>
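A minimal sketch of how the generated source can be reused: compile it once with the built-in compile(), then call the resulting f repeatedly without ever walking the bytecode again. The generator is restated here (with !r added so string constants survive) to keep the snippet self-contained:

```python
import dis

def func():
    a = 1
    return a

def copy_and_patch_interpret(f):
    # Emit straight-line Python source for f's bytecode (generator from above).
    lines = ["def f():", "  stack = []", "  variables = {}"]
    for ins in dis.get_instructions(f):
        if ins.opname == "LOAD_CONST":
            lines.append(f"  stack.append({ins.argval!r})")
        elif ins.opname == "LOAD_FAST":
            lines.append(f'  stack.append(variables["{ins.argval}"])')
        elif ins.opname == "STORE_FAST":
            lines.append(f'  variables["{ins.argval}"] = stack.pop()')
        elif ins.opname == "RETURN_VALUE":
            lines.append("  return stack.pop()")
    return "\n".join(lines)

source = copy_and_patch_interpret(func)
code_obj = compile(source, "<generated>", "exec")  # compile once
namespace = {}
exec(code_obj, namespace)                          # defines f in namespace
print(namespace["f"]())                            # prints 1; call repeatedly
```

This is the essence of copy‑and‑patch at the Python level: pay the translation cost once, then run the specialized code on every subsequent call.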

Benchmarking shows the copy‑and‑patch version runs roughly 2‑9% faster. The gain is modest because the underlying CPython interpreter is already compiled C code; what the generated code removes is only the dispatch overhead of the interpreter loop, not the work done inside each instruction.

Why Use Copy‑and‑Patch?

Full JITs must translate high‑level bytecode into low‑level machine instructions for many CPU architectures, which is complex and memory‑intensive. A copy‑and‑patch JIT instead relies on pre‑built machine‑code templates that are copied and patched at runtime; the templates are generated at build time with LLVM tooling, so CPython does not have to embed a full compiler.

When compiled with --enable-experimental-jit, CPython generates machine‑code stencils for each opcode. For example, the C implementation of LOAD_CONST, from which its stencil is derived, looks like this:

<code>frame->instr_ptr = next_instr;
next_instr += 1;
INSTRUCTION_STATS(LOAD_CONST);
PyObject *value;
value = GETITEM(FRAME_CO_CONSTS, oparg);
Py_INCREF(value);
stack_pointer[0] = value;
stack_pointer += 1;
DISPATCH();
</code>

These stencils contain holes (e.g., JIT_OPARG, JIT_CONTINUE) that are filled with actual values when the function is JIT‑compiled.

<code>static const Hole _LOAD_CONST_code_holes[3] = {
    {0xd, HoleKind_X86_64_RELOC_UNSIGNED, HoleValue_OPARG, NULL, 0x0},
    {0x46, HoleKind_X86_64_RELOC_UNSIGNED, HoleValue_CONTINUE, NULL, 0x0},
};
</code>

At runtime the JIT copies these machine‑code templates, patches the holes with the concrete bytecode arguments, stores the resulting code in memory, and executes it directly for each Python function call.
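The mechanism can be illustrated with a toy Python sketch. This is not CPython's actual code; it just mimics the idea: a byte template containing an 8‑byte zeroed "hole" is copied, and the hole is overwritten with a concrete value, much as the JIT patches an oparg into a stencil:

```python
import struct

# x86-64 template: mov rax, imm64; ret -- the 8 zero bytes at offset 2
# are the placeholder "hole" for the 64-bit immediate value.
STENCIL = bytes.fromhex("48b8") + b"\x00" * 8 + bytes.fromhex("c3")

def patch(stencil: bytes, hole_offset: int, value: int) -> bytes:
    buf = bytearray(stencil)  # copy the template
    # overwrite the hole with the concrete value, little-endian 64-bit
    buf[hole_offset:hole_offset + 8] = struct.pack("<Q", value)
    return bytes(buf)

patched = patch(STENCIL, 2, 42)
print(patched.hex())  # the immediate 42 (0x2a) now sits where the hole was
```

The real JIT does the same copy-then-overwrite dance, but on executable memory and with relocation kinds (such as X86_64_RELOC_UNSIGNED above) telling it how each hole must be encoded.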

Performance and Outlook

Initial benchmarks show modest speedups, but the JIT serves as a foundation for larger optimizations such as constant propagation and loop lifting. Future versions may broaden JIT coverage beyond functions containing JUMP_BACKWARD opcodes.

While the first JIT version may not dramatically change benchmark scores, it opens the door to substantial performance improvements for CPython.

Written by

Python Programming Learning Circle

A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.
