Understanding Python's Global Interpreter Lock (GIL) and Its Impact
The article explains Python’s Global Interpreter Lock—its historical origins, how CPython’s tick‑based and later time‑slice schedulers manage thread execution, why it limits multi‑core performance, common multiprocessing workarounds, and the difficulties of removing it despite recent proposals for a GIL‑free build.
On September 7, the new programming language Mojo was announced, claiming performance up to 68,000× that of Python. The article links to a detailed experience report and uses this news as a springboard to discuss why many languages claim to be faster than Python and, more importantly, why Python’s Global Interpreter Lock (GIL) has become a notorious bottleneck.
The GIL is a global lock inside the CPython virtual machine that ensures only one thread executes Python bytecode at a time. It is not required by the Python language specification; alternative implementations such as Jython or IronPython run without a GIL.
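The constraint is easy to observe from pure Python. The sketch below (a minimal experiment; the iteration count is arbitrary and timings are machine-dependent) runs the same CPU-bound workload sequentially and then in two threads. On CPython, the threaded version is no faster, because the GIL serializes bytecode execution:

```python
import threading
import time

def count_down(n):
    # Pure-Python CPU-bound loop: the executing thread holds the GIL
    # the entire time it is running this bytecode.
    while n > 0:
        n -= 1

N = 5_000_000  # arbitrary workload size, chosen for illustration

# Run the workload twice sequentially in one thread.
start = time.perf_counter()
count_down(N)
count_down(N)
sequential = time.perf_counter() - start

# Run the same two workloads in two threads: the GIL serializes them,
# so on CPython this is no faster (often slightly slower, due to
# thread-switching overhead).
start = time.perf_counter()
threads = [threading.Thread(target=count_down, args=(N,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

print(f"sequential: {sequential:.3f}s  threaded: {threaded:.3f}s")
```

For I/O-bound work the picture reverses: as the pysleep snippet below shows, blocking calls release the GIL, so threads waiting on I/O genuinely overlap.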
Historically, the GIL used a tick-based scheduler. Before Python 3.2, a thread would release the lock after executing a fixed number of bytecode instructions (100 by default). This caused frequent lock contention and CPU spin-waiting. The following CPython snippet shows how a blocking operation such as sleep() explicitly releases the GIL:
static int pysleep(_PyTime_t timeout) {
    ...
    int ret, err;
    Py_BEGIN_ALLOW_THREADS
    ret = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &timeout_abs, NULL);
    err = ret;
    Py_END_ALLOW_THREADS
    ...
}

Since Python 3.2, the GIL has used a time-slice based scheduler. Threads waiting for the lock set an atomic flag, gil_drop_request, prompting the running thread to release the GIL. The main evaluation loop checks this flag via eval_frame_handle_pending:
PyObject* _PyEval_EvalFrameDefault(...) {
    // main evaluation loop
main_loop:
    for (;;) {
        if (_Py_atomic_load_relaxed(eval_breaker)) {
            opcode = _Py_OPCODE(*next_instr);
            // ...
            if (eval_frame_handle_pending(tstate) != 0) {
                goto error;
            }
        }
    }
}

The helper that actually drops and reacquires the GIL looks like this:
static int eval_frame_handle_pending(PyThreadState *tstate) {
    _PyRuntimeState * const runtime = &_PyRuntime;
    struct _ceval_runtime_state *ceval = &runtime->ceval;
    struct _ceval_state *ceval2 = &tstate->interp->ceval;
    ...
    // check whether gil_drop_request has been set to 1
    if (_Py_atomic_load_relaxed(&ceval2->gil_drop_request)) {
        // release the GIL
        drop_gil(ceval, ceval2, tstate);
        // reacquire the GIL
        take_gil(tstate);
    }
    ...
}

The switch interval can be inspected and modified from Python code:
>>> import sys
>>> sys.getswitchinterval()
0.005
>>> sys.setswitchinterval(1)
>>> sys.getswitchinterval()
1.0

Because the GIL prevents true parallel execution of Python bytecode, a common workaround is to use multiple processes. The multiprocessing module spawns separate Python interpreters, each with its own GIL:
from multiprocessing import Process, Pipe

def f(conn):
    conn.send([42, None, 'hello'])
    conn.close()

if __name__ == '__main__':
    parent_conn, child_conn = Pipe()
    p = Process(target=f, args=(child_conn,))
    p.start()
    print(parent_conn.recv())   # prints "[42, None, 'hello']"
    p.join()

While the GIL simplifies extension-module development by allowing C extensions to manipulate Python objects without additional locking, it also hampers performance when extensions need to interact frequently with the interpreter. Removing the GIL would require fine-grained locking or atomic reference-count updates, both of which degrade single-threaded performance.
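What the GIL protects is visible from Python as well. Every object carries a reference count stored as a plain C integer, updated without atomic operations; the GIL is what keeps concurrent increments and decrements from racing. A small illustration using only the standard library:

```python
import sys

# A reference count is a plain C integer on every CPython object;
# the GIL guarantees two threads never update it concurrently.
# Note: getrefcount's own argument adds one temporary reference,
# so the absolute numbers are higher than the visible bindings.
obj = []
r1 = sys.getrefcount(obj)

alias = obj            # binding a second name adds exactly one reference
r2 = sys.getrefcount(obj)

print(r1, r2)          # r2 is exactly r1 + 1
```

Making these updates thread-safe without the GIL is precisely the "atomic reference-count updates" cost mentioned above.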
Historically, the GIL was introduced because early Python was designed for single‑core CPUs and to keep the interpreter simple and C‑extension friendly. As multi‑core CPUs became dominant, the GIL turned from a convenience into a limitation. Proposals to eliminate it face strict constraints: preserving single‑thread performance, maintaining API compatibility, and ensuring the interpreter remains maintainable.
Recent discussions suggest adding a compile‑time flag --disable-gil to allow users to opt‑in to a GIL‑free build, providing a gradual migration path.
In summary, the GIL is a historically motivated design decision that enabled Python’s rapid adoption but now restricts multi‑core scalability. Understanding its implementation, impact, and the challenges of removing it is essential for developers working on performance‑critical Python applications.
Tencent Cloud Developer
Official Tencent Cloud community account that brings together developers, shares practical tech insights, and fosters an influential tech exchange community.