Fundamentals 13 min read

Why Traditional Python Threading Tutorials Fail and How a Simple map Boosts Speed

This article critiques heavyweight Python threading tutorials, explains why the built‑in map function combined with multiprocessing or multiprocessing.dummy offers a concise, efficient way to parallelize I/O‑ and CPU‑bound tasks, and demonstrates dramatic speed‑ups with real code examples.

MaGe Linux Operations

Jun 10, 2022

Why Traditional Python Threading Tutorials Fail and How a Simple map Boosts Speed

Python's reputation for parallelism is tarnished, not only due to technical issues like the GIL but also because many tutorials teach heavyweight thread‑ and process‑based patterns that are hard to apply to everyday scripts.

Traditional examples

A quick search for “Python multithreading tutorial” shows most examples using classes, queues and producer/consumer models, which look more like Java code than Python.

import os
import PIL
from multiprocessing import Pool
from PIL import Image
SIZE = (75, 75)
SAVE_DIRECTORY = 'thumbs'

def get_image_paths(folder):
    return (os.path.join(folder, f)
            for f in os.listdir(folder)
            if 'jpeg' in f)

def create_thumbnail(filename):
    im = Image.open(filename)
    im.thumbnail(SIZE, Image.ANTIALIAS)
    base, fname = os.path.split(filename)
    save_path = os.path.join(base, SAVE_DIRECTORY, fname)
    im.save(save_path)

if __name__ == '__main__':
    folder = os.path.abspath('11_18_2013_R000_IQM_Big_Sur_Mon__e10d1958e7b766c3e840')
    os.mkdir(os.path.join(folder, SAVE_DIRECTORY))
    images = get_image_paths(folder)
    pool = Pool()
    pool.map(create_thumbnail, images)
    pool.close()
    pool.join()

These snippets require a template class, a queue, and explicit join calls, making them verbose and error‑prone.

The problem…

To use a worker pool you need a class, a queue, and methods on both ends of the channel, which quickly becomes boilerplate.

More workers, more problems

IBM’s classic tutorial shows a thread‑pool for web crawling, but the code is lengthy and involves manual thread management.

# Example2.py
'''A more realistic thread pool example'''
import time
import threading
import Queue
import urllib2

class Consumer(threading.Thread):
    def __init__(self, queue):
        threading.Thread.__init__(self)
        self._queue = queue
    def run(self):
        while True:
            content = self._queue.get()
            if isinstance(content, str) and content == 'quit':
                break
            response = urllib2.urlopen(content)
            print 'Bye byes!'

def Producer():
    urls = ['http://www.python.org', 'http://www.yahoo.com',
            'http://www.scala.org', 'http://www.google.com']
    queue = Queue.Queue()
    worker_threads = build_worker_pool(queue, 4)
    start_time = time.time()
    for url in urls:
        queue.put(url)
    for worker in worker_threads:
        queue.put('quit')
    for worker in worker_threads:
        worker.join()
    print 'Done! Time taken: {}'.format(time.time() - start_time)

def build_worker_pool(queue, size):
    workers = []
    for _ in range(size):
        worker = Consumer(queue)
        worker.start()
        workers.append(worker)
    return workers

if __name__ == '__main__':
    Producer()

Running this code requires constructing workers, handling queues, and performing joins, which is overkill for simple I/O tasks.

Why not just use map?

The built‑in map function (originating from Lisp) can apply a function to each element of a sequence, handling iteration and result collection automatically.

urls = ['http://www.yahoo.com', 'http://www.reddit.com']
results = map(urllib2.urlopen, urls)

Internally this is equivalent to a for‑loop that appends each result, but map does it in one line.

Python provides two libraries that expose a parallel map: multiprocessing (process‑based) and its lightweight sibling multiprocessing.dummy (thread‑based). multiprocessing.dummy is a drop‑in clone of multiprocessing that uses threads, making it suitable for I/O‑bound work while keeping the same API.

Hands‑on trial

Import the pools:

from multiprocessing import Pool
from multiprocessing.dummy import Pool as ThreadPool

Create a thread pool (default size equals CPU cores):

pool = ThreadPool()

Adjust the size when needed:

pool = ThreadPool(4)  # four worker threads

Rewrite the earlier example with just four lines of active code:

import urllib2
from multiprocessing.dummy import Pool as ThreadPool

urls = ['http://www.python.org', 'http://www.python.org/about/', ...]
pool = ThreadPool(4)
results = pool.map(urllib2.urlopen, urls)
pool.close()
pool.join()

Timing experiments on the author’s machine show:

# Single thread: 14.4 s
# 4‑thread pool: 3.1 s
# 8‑thread pool: 1.4 s
# 13‑thread pool: 1.3 s

Beyond nine threads the gains diminish.

Another real‑world example

Generating thumbnails for thousands of images is CPU‑intensive and benefits from parallelism.

Single‑process version

import os, PIL
from multiprocessing import Pool
from PIL import Image
SIZE = (75, 75)
SAVE_DIRECTORY = 'thumbs'

def get_image_paths(folder):
    return (os.path.join(folder, f) for f in os.listdir(folder) if 'jpeg' in f)

def create_thumbnail(filename):
    im = Image.open(filename)
    im.thumbnail(SIZE, Image.ANTIALIAS)
    base, fname = os.path.split(filename)
    save_path = os.path.join(base, SAVE_DIRECTORY, fname)
    im.save(save_path)

if __name__ == '__main__':
    folder = os.path.abspath('...')
    os.mkdir(os.path.join(folder, SAVE_DIRECTORY))
    images = get_image_paths(folder)
    for image in images:
        create_thumbnail(image)

Processing 6 000 images takes about 27.9 seconds.

Map‑based parallel version

import os, PIL
from multiprocessing import Pool
from PIL import Image
SIZE = (75, 75)
SAVE_DIRECTORY = 'thumbs'

def get_image_paths(folder):
    return (os.path.join(folder, f) for f in os.listdir(folder) if 'jpeg' in f)

def create_thumbnail(filename):
    im = Image.open(filename)
    im.thumbnail(SIZE, Image.ANTIALIAS)
    base, fname = os.path.split(filename)
    save_path = os.path.join(base, SAVE_DIRECTORY, fname)
    im.save(save_path)

if __name__ == '__main__':
    folder = os.path.abspath('...')
    os.mkdir(os.path.join(folder, SAVE_DIRECTORY))
    images = get_image_paths(folder)
    pool = Pool()
    pool.map(create_thumbnail, images)
    pool.close()
    pool.join()

The parallel run finishes in 5.6 seconds, a dramatic speed‑up.

By choosing the appropriate library—process‑based for CPU‑heavy work and thread‑based for I/O‑heavy work—and using map for parallel execution, developers can simplify code, avoid deadlocks, and achieve substantial performance gains.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Python thread pool parallelism Multiprocessing map function

Written by

MaGe Linux Operations

Founded in 2009, MaGe Education is a top Chinese high‑end IT training brand. Its graduates earn 12K+ RMB salaries, and the school has trained tens of thousands of students. It offers high‑pay courses in Linux cloud operations, Python full‑stack, automation, data analysis, AI, and Go high‑concurrency architecture. Thanks to quality courses and a solid reputation, it has talent partnerships with numerous internet firms.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.