
Using Python multiprocessing to Accelerate CPU‑bound Tasks

This article explains how to use Python's multiprocessing module (Process, Pool, map, apply_async, and shared-data primitives) to parallelize CPU-intensive work and improve performance, and walks through a real-world example of parallel image downloading.

Python Programming Learning Circle

When a Python program performs CPU-intensive tasks, a single process can use only one core at a time, so execution is slow; the multiprocessing module improves performance by spreading the work across several processes running in parallel.

1. What is multiprocessing?

Multiprocessing runs several processes simultaneously, each with its own memory space, which allows true parallelism across multiple CPU cores. Multithreading, by contrast, is limited for CPU-bound work by the Global Interpreter Lock (GIL), which lets only one thread execute Python bytecode at a time.
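The "own memory space" point can be seen directly: a child process that increments a module-level variable only changes its own copy, leaving the parent's copy untouched. A minimal sketch (the counter variable and bump function are illustrative, not from the article):

```python
import multiprocessing

counter = 0

def bump():
    global counter
    counter += 1  # mutates the child process's own copy of counter

if __name__ == "__main__":
    p = multiprocessing.Process(target=bump)
    p.start()
    p.join()
    # The parent's counter is unchanged: each process has its own memory
    print(counter)  # 0
```

This is why sharing results between processes requires explicit mechanisms such as Value, Array, or queues, covered below.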

2. Install and import the module

The multiprocessing module is part of the Python standard library, so no extra installation is required. Import it with:

<code>import multiprocessing</code>

3. Simple multiprocessing example

3.1 Define a function

<code>def square(number):
    """Calculate the square of a number"""
    return number ** 2</code>

3.2 Use the Process class

Create a new process to compute the square of 10:

<code>if __name__ == "__main__":
    # Create a process
    process = multiprocessing.Process(target=square, args=(10,))
    # Start the process
    process.start()
    # Wait for it to finish
    process.join()
    # Note: square's return value is discarded here; a Process does not
    # return results directly. Use a Pool (below) or a Queue to collect them.
    print("Main program continues")
</code>

4. Use the Pool class for batch processing

The multiprocessing.Pool class manages a pool of worker processes, simplifying task distribution.

4.1 Using the map method

Apply a function to each element of an iterable:

<code>if __name__ == "__main__":
    with multiprocessing.Pool(processes=4) as pool:
        numbers = [1, 2, 3, 4, 5]
        results = pool.map(square, numbers)
        print(results)  # Output: [1, 4, 9, 16, 25]
</code>

5. Handling asynchronous tasks

Use apply_async to submit tasks without blocking:

<code>if __name__ == "__main__":
    with multiprocessing.Pool(processes=4) as pool:
        numbers = [1, 2, 3, 4, 5]
        results = []
        for number in numbers:
            result = pool.apply_async(square, (number,))
            results.append(result)
        for result in results:
            print(result.get())  # Output: 1, 4, 9, 16, 25
</code>

6. Sharing data between processes

Processes have separate memory, but multiprocessing provides Value and Array for shared data.

6.1 Using Value and Array

<code>from multiprocessing import Value, Array

def update_value(v):
    # += is a read-modify-write, not atomic; take the Value's lock so the
    # update stays safe even if several processes write concurrently
    with v.get_lock():
        v.value += 1

def update_array(a):
    for i in range(len(a)):
        a[i] += 1

if __name__ == "__main__":
    shared_value = Value('i', 0)
    p1 = multiprocessing.Process(target=update_value, args=(shared_value,))
    p1.start(); p1.join()
    print(shared_value.value)  # 1

    shared_array = Array('i', [0, 0, 0])
    p2 = multiprocessing.Process(target=update_array, args=(shared_array,))
    p2.start(); p2.join()
    print(list(shared_array))  # [1, 1, 1]
</code>

7. Real‑world case: Parallel image download

7.1 Define the download function

<code>import requests

def download_image(url, filename):
    """Download an image and save it locally"""
    response = requests.get(url, timeout=30)
    response.raise_for_status()  # fail loudly on HTTP error status codes
    with open(filename, 'wb') as file:
        file.write(response.content)
    print(f"Download complete: {filename}")
</code>

7.2 Use Pool to download concurrently

<code>if __name__ == "__main__":
    urls = [
        "https://example.com/image1.jpg",
        "https://example.com/image2.jpg",
        "https://example.com/image3.jpg",
        "https://example.com/image4.jpg",
    ]
    filenames = ["image1.jpg", "image2.jpg", "image3.jpg", "image4.jpg"]
    with multiprocessing.Pool(processes=4) as pool:
        pool.starmap(download_image, zip(urls, filenames))
</code>

Each image download runs in a separate process, so the transfers overlap and total wall-clock time drops. Note, however, that downloading is I/O-bound rather than CPU-bound, so a thread pool would work just as well here without the overhead of spawning processes.

Conclusion

This article demonstrated how to leverage Python's multiprocessing module, from basic process creation through pools, mapping, asynchronous execution, and shared data, to boost program performance, and closed with a practical example of parallel image downloading.

performance · Python · concurrency · parallelism · multiprocessing
Python Programming Learning Circle
Written by

Python Programming Learning Circle

A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.
