Understanding Python Threads, Processes, GIL, and the multiprocessing & concurrent.futures Modules
This article explains the fundamental differences between threads and processes, the role of Python's Global Interpreter Lock, and provides a comprehensive guide to using the multiprocessing and concurrent.futures modules—including their main classes, synchronization primitives, and practical code examples—for effective concurrent programming in Python.
When learning Python, many developers encounter concurrency concepts such as threads, processes, and the Global Interpreter Lock (GIL). This guide clarifies these concepts and shows how to use the standard library modules multiprocessing and concurrent.futures to write efficient parallel code.
Thread and Process Differences
A process is the operating system's smallest unit of resource allocation, while a thread is the smallest unit of CPU scheduling. Processes have independent virtual address spaces; threads within the same process share that address space, making thread context switches cheaper than process switches.
Key Differences
Address space and resources: processes are isolated, threads share memory.
Communication: IPC for processes, direct memory access for threads (requires synchronization).
Scheduling and switching: thread switches are much faster.
In modern operating systems, the thread is the basic unit of concurrent execution.
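The memory-sharing difference can be seen directly: a thread mutates the parent's objects in place, while a child process operates on its own copy of them. A minimal sketch (the `bump` helper is hypothetical, for illustration only):

```python
import threading
import multiprocessing

counter = {'value': 0}

def bump(d):
    d['value'] += 1  # increment whatever dict the worker can see

if __name__ == '__main__':
    # A thread shares the parent's address space: the change is visible.
    t = threading.Thread(target=bump, args=(counter,))
    t.start(); t.join()
    print(counter['value'])  # 1

    # A child process gets its own copy of the dict: the parent sees no change.
    p = multiprocessing.Process(target=bump, args=(counter,))
    p.start(); p.join()
    print(counter['value'])  # still 1
```

This is exactly why threads need synchronization (they really do share state) while processes need IPC (they really do not).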
Comparison Table
| Aspect | Multi‑process | Multi‑thread | Summary |
| --- | --- | --- | --- |
| Data sharing & synchronization | Complex sharing, simple synchronization | Simple sharing, complex synchronization | Each has pros and cons |
| Memory & CPU | High memory use, complex switching, low CPU utilization | Low memory use, simple switching, high CPU utilization | Threads win |
| Creation, destruction, switching | Complex, slow | Simple, fast | Threads win |
| Programming & debugging | Simple programming, simple debugging | Complex programming, complex debugging | Processes win |
| Reliability | Processes do not affect each other | One crashed thread can kill the whole process | Processes win |
| Distributed scaling | Easy to scale across machines | Limited to multi‑core on a single machine | Processes win |
Python Global Interpreter Lock (GIL)
The GIL is a mechanism in CPython that ensures only one thread executes Python bytecode at a time. It simplifies the interpreter implementation but prevents true parallelism on multi‑core CPUs for CPU‑bound code.
1. Acquire the GIL.
2. Switch to a thread.
3. Run until a bytecode count limit is reached or the thread voluntarily yields (e.g., time.sleep(0)).
4. Put the thread to sleep.
5. Release the GIL.
6. Repeat.
Before Python 3.2 the GIL was released after I/O or every 100 bytecode ticks; from 3.2 onward a timed release (≈5 ms) improves fairness on multi‑core systems.
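The timed-release interval can be inspected and tuned at runtime through the sys module; a quick sketch:

```python
import sys

# Current thread-switch interval in seconds (the default is 0.005, i.e. ~5 ms)
print(sys.getswitchinterval())

# The interval is tunable: a longer value means fewer forced switches, which
# can reduce overhead for CPU-bound threads at the cost of responsiveness.
sys.setswitchinterval(0.01)
print(sys.getswitchinterval())
```

Note that tuning the interval changes how often threads are preempted, not the fundamental one-thread-at-a-time constraint of the GIL.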
multiprocessing Module
Because the GIL limits multi‑threaded CPU usage, the multiprocessing package provides a process‑based parallelism API that mirrors the threading interface.
Process
Creates a new OS process. Constructor: Process(group=None, target=None, name=None, args=(), kwargs={}). Important methods include start(), join(), terminate(), and is_alive(), plus attributes such as pid and daemon.
from multiprocessing import Process
import os
def run_proc(name):
    print('Run child process %s (%s)...' % (name, os.getpid()))

if __name__ == '__main__':
    print('Parent process %s.' % os.getpid())
    p = Process(target=run_proc, args=('test',))
    print('Child process will start.')
    p.start()
    p.join()
    print('Child process end.')

Pool
Manages a fixed number of worker processes. Use apply, apply_async, map, map_async, imap, close, join, and terminate to control tasks.
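As a sketch of the asynchronous variants, apply_async schedules one call at a time and immediately returns an AsyncResult whose get() blocks until the value is ready (the `square` worker is hypothetical, for illustration):

```python
from multiprocessing import Pool

def square(x):
    return x * x

if __name__ == '__main__':
    with Pool(4) as pool:
        # Each apply_async returns at once; get() later collects each result.
        results = [pool.apply_async(square, (i,)) for i in range(5)]
        print([r.get() for r in results])  # [0, 1, 4, 9, 16]
```
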
from multiprocessing import Pool
def test(i):
    print(i)

if __name__ == '__main__':
    pool = Pool(8)
    pool.map(test, range(100))
    pool.close()
    pool.join()

Queue, JoinableQueue
Provides inter‑process communication. put() and get() are the core Queue methods; JoinableQueue additionally offers task_done() and join() for tracking when queued work is finished.
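JoinableQueue's task_done()/join() pair lets a producer block until every queued item has actually been processed; a minimal sketch (the `consumer` worker is hypothetical):

```python
from multiprocessing import Process, JoinableQueue

def consumer(q):
    while True:
        item = q.get()
        print('Processed', item)
        q.task_done()  # tell the queue this item is fully handled

if __name__ == '__main__':
    q = JoinableQueue()
    # daemon=True so the endless consumer dies with the parent process
    p = Process(target=consumer, args=(q,), daemon=True)
    p.start()
    for i in range(3):
        q.put(i)
    q.join()  # blocks until every put() item has been task_done()'d
```
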
from multiprocessing import Process, Queue
import os, time, random
def write(q):
    for v in ['A', 'B', 'C']:
        print('Put %s to queue...' % v)
        q.put(v)
        time.sleep(random.random())

def read(q):
    while True:
        v = q.get(True)
        print('Get %s from queue.' % v)

if __name__ == '__main__':
    q = Queue()
    pw = Process(target=write, args=(q,))
    pr = Process(target=read, args=(q,))
    pw.start(); pr.start()
    pw.join()
    pr.terminate()

Value and Array
Shared-memory objects based on ctypes. They allow simple numeric or array sharing between processes.
import multiprocessing
def f(n, a):
    n.value = 3.14
    a[0] = 5

if __name__ == '__main__':
    num = multiprocessing.Value('d', 0.0)
    arr = multiprocessing.Array('i', range(10))
    p = multiprocessing.Process(target=f, args=(num, arr))
    p.start(); p.join()
    print(num.value)
    print(arr[:])

Pipe
Creates a two‑way communication channel, returning a pair of connection objects (conn1, conn2). Use send() and recv() to exchange picklable objects.
from multiprocessing import Process, Pipe
import time
def child(conn):
    time.sleep(1)
    conn.send('Hello from child')
    print('Parent says:', conn.recv())
    conn.close()

if __name__ == '__main__':
    parent_conn, child_conn = Pipe()
    p = Process(target=child, args=(child_conn,))
    p.start()
    print('Child says:', parent_conn.recv())
    parent_conn.send('Hi child')
    p.join()

Manager
Provides a server process that holds shared objects (list, dict, Namespace, Lock, etc.) which can be accessed via proxies from other processes.
import multiprocessing
def f(x, arr, lst, dct, ns):
    x.value = 3.14
    arr[0] = 5
    lst.append('Hello')
    dct[1] = 2
    ns.a = 10

if __name__ == '__main__':
    mgr = multiprocessing.Manager()
    num = mgr.Value('d', 0.0)
    arr = mgr.Array('i', range(10))
    lst = mgr.list()
    dct = mgr.dict()
    ns = mgr.Namespace()
    p = multiprocessing.Process(target=f, args=(num, arr, lst, dct, ns))
    p.start(); p.join()
    print(num.value, arr[:], lst, dct, ns)

Synchronization Primitives
Lock, RLock, Semaphore, Condition, and Event are available in multiprocessing to coordinate access to shared resources.
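Beyond mutual exclusion, an Event lets one process signal another that a condition has occurred; a minimal sketch (the `waiter` worker and the 0.5 s delay are hypothetical):

```python
from multiprocessing import Process, Event
import time

def waiter(ev):
    print('Waiting for the event...')
    ev.wait()  # block until the event is set
    print('Event received, continuing')

if __name__ == '__main__':
    ev = Event()
    p = Process(target=waiter, args=(ev,))
    p.start()
    time.sleep(0.5)  # simulate some setup work in the parent
    ev.set()         # wake the waiting process
    p.join()
```
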
from multiprocessing import Process, Lock
def worker(lock, i):
    with lock:
        print('Hello from worker %s' % i)

if __name__ == '__main__':
    lock = Lock()
    for i in range(5):
        Process(target=worker, args=(lock, i)).start()

concurrent.futures Module
Provides a high‑level interface for asynchronous execution using thread or process pools.
ThreadPoolExecutor and ProcessPoolExecutor
Both inherit from Executor. Use submit(fn, *args, **kwargs) to schedule a callable and obtain a Future, or map(func, iterable) for results in input order.
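map can be sketched as follows; results come back in input order even when workers finish out of order (the `double` worker is hypothetical):

```python
from concurrent.futures import ThreadPoolExecutor

def double(x):
    return x * 2

with ThreadPoolExecutor(max_workers=3) as exe:
    # map yields results lazily, ordered by the input iterable
    print(list(exe.map(double, range(5))))  # [0, 2, 4, 6, 8]
```
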
from concurrent import futures
import time
def test(num):
    return time.ctime(), num

with futures.ThreadPoolExecutor(max_workers=2) as exe:
    future = exe.submit(test, 1)
    print(future.result())

Future API
Methods include result(), exception(), cancel(), and done(), along with the module-level utilities as_completed() and wait() for handling multiple futures.
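A small sketch of wait(), which blocks until a completion condition holds and returns the futures split into (done, not_done) sets (the `slow` worker and its delays are hypothetical):

```python
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED
import time

def slow(n):
    time.sleep(n * 0.1)
    return n

with ThreadPoolExecutor(4) as exe:
    futs = [exe.submit(slow, i) for i in range(4)]
    # Return as soon as any one future finishes
    done, not_done = wait(futs, return_when=FIRST_COMPLETED)
    print(len(done) >= 1)  # True: at least one future has finished
```

The default return_when is ALL_COMPLETED; FIRST_EXCEPTION is also available for failing fast.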
from concurrent.futures import ThreadPoolExecutor, as_completed
from time import sleep
from random import randint
def work(n):
    sleep(randint(1, 5))
    return f'Result {n}'

pool = ThreadPoolExecutor(5)
futs = [pool.submit(work, i) for i in range(5)]
for f in as_completed(futs):
    print(f.result())

This guide equips Python developers with the knowledge to choose between threading, multiprocessing, and the higher‑level concurrent.futures APIs for building scalable, concurrent applications.