Why Python’s GIL Slows Multithreading and How It Actually Works

This article explains the purpose and mechanics of Python’s Global Interpreter Lock (GIL), how it serializes bytecode execution on single- and multi‑core CPUs, its impact on I/O‑bound versus CPU‑bound workloads, and why certain operations remain thread‑safe despite the lock.

MaGe Linux Operations

Jan 13, 2023

Understanding Python’s GIL

The GIL (Global Interpreter Lock) is not a feature of the Python language itself; it is a concept introduced by the CPython implementation. Other implementations, such as Jython, do not have one. The official definition states:

In CPython, the global interpreter lock, or GIL, is a mutex that prevents multiple native threads from executing Python bytecodes at once. This lock is necessary mainly because CPython’s memory management is not thread-safe. (However, since the GIL exists, other features have grown to depend on the guarantees that it enforces.)

From the definition, the GIL is a mutex that prevents multiple threads from executing Python bytecode simultaneously, so threads within one process cannot run Python code in parallel. To see why the GIL is needed, we have to look at how threads are scheduled and at the thread safety of CPython’s memory management.

First, let’s see how multithreaded tasks are scheduled on a single‑core CPU.

The diagram shows three threads on a single-core CPU: because of the GIL, only one of them runs at any moment. When the running thread encounters an I/O operation or its timer tick expires, it releases the GIL, and the other two threads compete for the lock before either can run.

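There are two situations in which a thread releases the GIL: it encounters I/O, or its timer tick expires. I/O is easy to understand (e.g., sending an HTTP request and waiting for a response). The timer tick caps how long a thread may run before it has to offer up the GIL. In Python 2 it was counted in bytecode instructions (100 by default), while CPython 3.2 and later use a time-based switch interval (5 ms by default). Once the limit is reached, the thread releases the GIL if another thread is waiting for it.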
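In CPython 3.2 and later, the timer-based release described above is controlled by a value called the switch interval. The following is a minimal sketch, assuming a CPython 3 interpreter, that reads and temporarily adjusts it with the standard `sys` functions; the variable names are chosen here for illustration.

```python
import sys

# Read the current switch interval in seconds (0.005 by default on CPython 3.2+).
default_interval = sys.getswitchinterval()
print(f"switch interval: {default_interval} s")

# A longer interval means fewer forced GIL hand-offs between CPU-bound threads.
sys.setswitchinterval(0.01)
changed_interval = sys.getswitchinterval()
print(f"after change: {changed_interval} s")

# Restore the original value so the rest of the program is unaffected.
sys.setswitchinterval(default_interval)
```

Changing the interval only affects how often running threads are asked to give up the GIL; it does not allow two threads to execute bytecode at the same time.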

Although both cases release the GIL, they differ in what happens next. If Thread 1 releases the GIL because of I/O, Threads 2 and 3 compete for it, and Thread 1 does not rejoin the competition until its I/O completes. If Thread 1 releases the GIL because its timer tick expires, all three threads compete, and Thread 1 might win again. On a single-core CPU this is not a serious problem: only one thread can run at a time anyway, so CPU utilization remains high.

On multi-core CPUs, however, the GIL is global to the interpreter process, so threads cannot take advantage of the additional cores, and the efficiency of CPU-bound multithreaded code drops sharply.

Thread 1 runs on CPU 1, Thread 2 on CPU 2. Because the GIL is global, Thread 2 on CPU 2 must wait for Thread 1 on CPU 1 to release the lock before it can execute. If Thread 1 repeatedly wins the competition, Thread 2 remains idle, and the second CPU cannot contribute.
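The effect can be observed with a rough timing experiment. The sketch below is illustrative only: `countdown`, `timed`, and the iteration count are names and values chosen here, not taken from the original article, and absolute timings depend on the machine and interpreter build. On a standard GIL-enabled CPython, the two-thread run typically takes about as long as the sequential run rather than half the time.

```python
import threading
import time

def countdown(n):
    """Pure-Python CPU-bound loop; it needs the GIL for every iteration."""
    while n > 0:
        n -= 1

def timed(fn):
    """Return the wall-clock seconds that fn() takes."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start

def run_sequential(n):
    # Do the same work twice, one call after the other.
    countdown(n)
    countdown(n)

def run_threaded(n):
    # Do the same work in two threads that must share the GIL.
    threads = [threading.Thread(target=countdown, args=(n,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

N = 2_000_000
sequential_s = timed(lambda: run_sequential(N))
threaded_s = timed(lambda: run_threaded(N))
print(f"sequential: {sequential_s:.3f} s, two threads: {threaded_s:.3f} s")
```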

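To keep a single thread from monopolizing the GIL, CPython 3.2 introduced a reworked GIL. A waiting thread requests the lock once the switch interval has elapsed, and the thread that releases it waits until another thread has actually acquired it before competing again. This makes scheduling fairer, but it still does not let threads execute Python bytecode in parallel.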

Since the GIL reduces multi‑core efficiency, why is it retained? The answer lies in thread safety.

Strictly speaking, the GIL provides only coarse-grained thread safety; it does not make every operation safe. Consider the following example:

import threading

n = 0

def add():
    global n
    # 10**6 iterations keep the run short while leaving room for a race
    for i in range(10**6):
        n = n + 1

def sub():
    global n
    for i in range(10**6):
        n = n - 1

a = threading.Thread(target=add)
b = threading.Thread(target=sub)
a.start()
b.start()
# join blocks the main thread to avoid premature printing of n
a.join()
b.join()
print(n)

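The program adds and subtracts the same amount, so in theory n should be 0. However, running it can print a non-zero value, because Python statements are not atomic; how often this happens depends on the CPython version and on timing. For example, the operation n = n + 1 is broken into four bytecode steps (the exact opcode names vary between CPython versions):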

19 LOAD_GLOBAL          1 (n)
22 LOAD_CONST           3 (1)
25 BINARY_ADD
26 STORE_GLOBAL         1 (n)

Thus, n = n + 1 is not atomic. Under the GIL release rules, a thread may give up the GIL partway through these four steps. Another thread can then modify n, and the first thread later stores a result computed from the stale value. This race condition explains the non-zero result.
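The bytecode can be inspected on any interpreter with the standard `dis` module. This is a small sketch in which the `increment` function is a hypothetical stand-in for the loop body above; the opcode names printed will differ by CPython version, but the load and the store of n remain separate instructions.

```python
import dis

n = 0

def increment():
    global n
    n = n + 1

# Collect the opcode names the compiler emitted for increment().
opnames = [ins.opname for ins in dis.get_instructions(increment)]
print(opnames)

# The read and the write of n are distinct instructions, so a thread
# switch can happen between them.
load_index = opnames.index("LOAD_GLOBAL")
store_index = opnames.index("STORE_GLOBAL")
print(f"LOAD_GLOBAL at position {load_index}, STORE_GLOBAL at position {store_index}")
```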

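The GIL’s coarse-grained safety means it only guarantees a certain level of safety, but not every non-I/O operation needs an additional lock. Some operations are effectively atomic because the real work happens inside a single bytecode instruction. For example, in [1,4,2].sort() the sort itself runs inside one method-call instruction (CALL_METHOD in the CPython versions discussed here). That call is implemented in C and does not release the GIL while sorting built-in values such as integers, so another thread cannot interrupt it, and it is thread-safe.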
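The contrast between the two kinds of operation can be sketched as follows. This is an illustrative example rather than code from the original article: `add_many`, `counter_lock`, and `items` are names chosen here. The multi-step counter update is guarded with a `threading.Lock`, while `list.append`, a single C-level call, is used without one.

```python
import threading

counter = 0
counter_lock = threading.Lock()
items = []

def add_many(times):
    global counter
    for i in range(times):
        # The read-modify-write spans several bytecodes, so guard it explicitly.
        with counter_lock:
            counter = counter + 1
        # list.append is a single C-level call; no extra lock is used here.
        items.append(i)

threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter, len(items))  # 40000 40000
```

Holding the lock only around the counter update keeps the critical section short, which limits how much the threads block each other.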

Summary

For I/O-bound applications, multithreading and multiprocessing perform similarly. Even with the GIL, a thread releases the lock while it waits on I/O, allowing other threads to run. Because communication between threads is cheaper than communication between processes, multithreading is often preferred.
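A minimal sketch of the I/O-bound case, assuming that `time.sleep` is an acceptable stand-in for a real network wait (it also releases the GIL). `fake_request` and the 0.2-second delay are hypothetical values chosen for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_request(task_id, delay=0.2):
    """Stand-in for a network call; time.sleep releases the GIL while waiting."""
    time.sleep(delay)
    return task_id

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    # pool.map preserves input order in its results.
    results = list(pool.map(fake_request, range(4)))
elapsed = time.perf_counter() - start

print(results)
print(f"elapsed: {elapsed:.2f} s")
```

Because the four waits overlap, the elapsed time should be close to a single delay rather than the sum of all four.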

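For CPU-bound applications, multithreading is at a clear disadvantage. Multiprocessing is recommended instead, because each process has its own interpreter and its own GIL. Coroutines run on a single thread, so they suit I/O-bound concurrency rather than CPU-bound work.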
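One way to spread CPU-bound work over several processes is the standard `concurrent.futures.ProcessPoolExecutor`. The sketch below is an assumption-laden illustration: `count_primes` and `count_primes_parallel` are hypothetical helpers written for this example, and the trial-division prime count is simply a convenient CPU-heavy task.

```python
from concurrent.futures import ProcessPoolExecutor

def count_primes(limit):
    """CPU-bound work: count primes below limit by trial division."""
    count = 0
    for candidate in range(2, limit):
        if all(candidate % d for d in range(2, int(candidate ** 0.5) + 1)):
            count += 1
    return count

def count_primes_parallel(limits, workers=2):
    """Run count_primes for each limit in a separate worker process."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(count_primes, limits))
```

In a script, `count_primes_parallel([50_000, 50_000])` would be called from under an `if __name__ == "__main__":` guard, which the process start methods on some platforms require. Each worker process holds its own GIL, so the two counts can run on separate cores.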
