Parallel Computing and Python Multiprocessing: Concepts, Models, and Practical Examples

This article explains the fundamentals of parallel computing in the big‑data era, compares parallelism and concurrency, outlines GPU and distributed‑computing solutions, and provides a detailed guide to Python’s multiprocessing module with code examples, performance tests, and practical tips.

Python Programming Learning Circle

Feb 25, 2021

The author introduces the motivation for exploring parallel computing after a Python program took a whole day to run, and describes how using the multiprocessing library can dramatically improve performance.

In the current big‑data era, massive amounts of data are generated every minute (e.g., over 500 hours of video per minute on YouTube), creating challenges for efficient storage and processing.

CPU clock‑frequency growth has slowed (the article dates this to 2013), so manufacturers have turned to multi‑core designs; typical desktop CPUs now have 4 to 8 cores, which makes parallelism a practical way to use the available hardware.

Parallelism means multiple work units run at the same instant on different CPU cores. Concurrency means multiple tasks are in progress over the same period but may simply be interleaved on a single core, as with multithreading that has no true parallel execution. Two main parallel programming models are described: Data Parallel (the same operation applied to different pieces of data) and Message Passing (independent processes cooperate by exchanging messages).
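
As a rough illustration of the two models, the sketch below uses Python's multiprocessing module, which the article covers later. The names square and summing_worker, and the use of None as an end-of-stream sentinel, are assumptions made for this example only.

```python
from multiprocessing import Pipe, Pool, Process


def square(x):
    """Data parallel: the same operation is applied to every element."""
    return x * x


def summing_worker(conn):
    """Message passing: receive numbers until None arrives, then reply with their sum."""
    total = 0
    while True:
        msg = conn.recv()
        if msg is None:  # sentinel chosen for this sketch
            break
        total += msg
    conn.send(total)
    conn.close()


if __name__ == "__main__":
    # Data parallel: the pool splits the input across worker processes.
    with Pool(processes=2) as pool:
        print(pool.map(square, [1, 2, 3, 4]))

    # Message passing: parent and child share no data and only exchange messages.
    parent_conn, child_conn = Pipe()
    worker = Process(target=summing_worker, args=(child_conn,))
    worker.start()
    for n in (1, 2, 3):
        parent_conn.send(n)
    parent_conn.send(None)
    print(parent_conn.recv())
    worker.join()
```

In the first half the programmer only describes the per-element operation; in the second half the programmer spells out every message, which is more work but allows arbitrary protocols between processes.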

The article lists when parallel computing pays off: compute‑intensive tasks on multi‑core CPUs benefit most; compute‑intensive tasks on a single core gain nothing, so parallelism should be avoided there; and I/O‑intensive workloads see only limited benefit.

GPU advantages are highlighted: many cores, high floating‑point throughput, and massive data bandwidth, making GPUs ideal for deep‑learning and other heavy computations.

Distributed computing concepts are introduced via three seminal Google papers (GFS, MapReduce, BigTable) and the emergence of Hadoop, which implements these ideas for large‑scale data processing.
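
The MapReduce idea can be sketched on a single machine with a process pool standing in for a cluster. This is only an illustration of the map and reduce steps, not how Hadoop is implemented; the function names map_count, merge_counts, and word_count are hypothetical.

```python
from collections import Counter
from functools import reduce
from multiprocessing import Pool


def map_count(line):
    """Map step: count the words in one line of text."""
    return Counter(line.lower().split())


def merge_counts(left, right):
    """Reduce step: merge two partial word counts."""
    return left + right


def word_count(lines, processes=2):
    """Run the map step in a process pool, then reduce in the parent."""
    with Pool(processes=processes) as pool:
        partial_counts = pool.map(map_count, lines)
    return reduce(merge_counts, partial_counts, Counter())


if __name__ == "__main__":
    text = ["big data needs big tools", "data moves to tools"]
    print(word_count(text))
```

A real framework additionally distributes the input across machines, shuffles intermediate results by key, and recovers from worker failures, none of which this sketch attempts.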

Before diving into Python multiprocessing, the article explains basic concepts of processes, threads, and the Global Interpreter Lock (GIL), noting that the GIL allows only one thread at a time to execute Python bytecode within a single process, so threads alone cannot run Python code in parallel.
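
One way to observe the effect of the GIL is to run the same CPU‑bound function in threads and in processes and compare the wall‑clock time. The sketch below assumes a standard GIL‑enabled CPython build and a multi‑core machine; count_down, run_in_threads, and run_in_processes are names invented for this example, and actual timings will vary by machine.

```python
import time
from multiprocessing import Pool
from threading import Thread


def count_down(n):
    """CPU-bound busy loop; returns 0 when done."""
    while n > 0:
        n -= 1
    return n


def run_in_threads(n, workers):
    """Run count_down in `workers` threads and return the elapsed seconds."""
    threads = [Thread(target=count_down, args=(n,)) for _ in range(workers)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


def run_in_processes(n, workers):
    """Run count_down in `workers` processes and return the elapsed seconds.

    The measurement includes the cost of starting the pool.
    """
    start = time.perf_counter()
    with Pool(processes=workers) as pool:
        pool.map(count_down, [n] * workers)
    return time.perf_counter() - start


if __name__ == "__main__":
    n, workers = 5_000_000, 4
    print(f"threads:   {run_in_threads(n, workers):.2f}s")
    print(f"processes: {run_in_processes(n, workers):.2f}s")
```

With the GIL in place the threaded run typically takes about as long as doing the work serially, while the process‑based run can use several cores at once.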

The multiprocessing module provides process‑based parallelism; because each process runs its own interpreter with its own GIL, the processes can execute Python code in parallel. Its main interfaces are described below.

multiprocessing.Process(target=None, args=()) – create and control a process (methods: start(), join(), terminate()).

Example 1:

from multiprocessing import Process

def f(name):
    print('hello', name)

if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()
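
Example 1 covers start() and join(). A minimal sketch of terminate() is shown below; the sleeper function is a hypothetical stand‑in for a long‑running task.

```python
import time
from multiprocessing import Process


def sleeper(seconds):
    """Stand-in for a long-running task."""
    time.sleep(seconds)


if __name__ == "__main__":
    p = Process(target=sleeper, args=(60,))
    p.start()
    print(p.is_alive())   # the child is running
    p.terminate()         # forcibly stop it; finally blocks and exit handlers in the child do not run
    p.join()              # wait for the process to actually exit
    print(p.is_alive(), p.exitcode)
```

Because terminate() gives the child no chance to clean up, it is best kept as a last resort.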

multiprocessing.Pool([processes]) – a pool of worker processes with methods such as apply(), apply_async(), map(), map_async(), imap(), imap_unordered(), close(), terminate(), join().

Example 2:

from multiprocessing import Pool

def f(x):
    return x*x

if __name__ == '__main__':
    # The with-block shuts the pool down once the work is done.
    with Pool(5) as p:
        print(p.map(f, [1, 2, 3]))

Other primitives include multiprocessing.Pipe and multiprocessing.Queue for inter‑process communication, multiprocessing.Lock for synchronization, and shared memory objects multiprocessing.Value and multiprocessing.Array.
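
A small sketch of two of these primitives, a Queue for messages and an Array for shared memory, might look as follows. The helper names scale_in_place and report_length are assumptions for illustration.

```python
from multiprocessing import Array, Process, Queue


def scale_in_place(shared, factor):
    """Multiply every element of a shared array by `factor`."""
    for i in range(len(shared)):
        shared[i] *= factor


def report_length(shared, queue):
    """Send a message back to the parent through a queue."""
    queue.put(("length", len(shared)))


if __name__ == "__main__":
    numbers = Array("d", [1.0, 2.0, 3.0])   # 'd' = C double, stored in shared memory
    results = Queue()

    workers = [
        Process(target=scale_in_place, args=(numbers, 2.0)),
        Process(target=report_length, args=(numbers, results)),
    ]
    for w in workers:
        w.start()
    print(results.get())      # read the message before joining
    for w in workers:
        w.join()
    print(numbers[:])         # the parent sees the child's changes
```

The Array is modified in place by the child, whereas the Queue carries a pickled copy of the message from child to parent.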

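Module‑level helper functions are also listed: multiprocessing.active_children() returns the child processes that are still alive, multiprocessing.cpu_count() returns the number of CPUs in the system, and multiprocessing.current_process() returns the Process object for the calling process.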
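
These helpers can be tried out with a short script such as the one below. The functions describe_self and nap, and the process names napper-0 and napper-1, are made up for this sketch.

```python
import time
from multiprocessing import Process, active_children, cpu_count, current_process


def describe_self():
    """Return the name and PID of the process that calls this function."""
    proc = current_process()
    return proc.name, proc.pid


def nap(seconds):
    """Identify the child, then keep it alive briefly."""
    print("child:", describe_self())
    time.sleep(seconds)


if __name__ == "__main__":
    print("CPUs:", cpu_count())
    print("parent:", describe_self())

    children = [Process(target=nap, args=(0.5,), name=f"napper-{i}") for i in range(2)]
    for c in children:
        c.start()
    print("alive:", sorted(c.name for c in active_children()))
    for c in children:
        c.join()
    print("alive after join:", active_children())
```

cpu_count() is a common starting point for choosing a pool size, although the best value depends on the workload.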

Important usage notes: avoid shared state where possible, make sure arguments passed to child processes are picklable, prefer a graceful shutdown over terminate(), drain queues before joining the processes that write to them, and pass resources to child processes explicitly rather than relying on globals.
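
Several of these notes can be combined in one pattern: queues are passed explicitly, the worker is asked to stop with a sentinel value rather than terminate(), and results are read before join(). The sketch below is one possible arrangement; the STOP sentinel and the worker function are assumptions of this example.

```python
from multiprocessing import Process, Queue

STOP = None  # sentinel for this sketch: tells the worker to finish


def worker(tasks, results):
    """Square numbers from `tasks` until the sentinel arrives."""
    while True:
        item = tasks.get()
        if item is STOP:
            break
        results.put(item * item)


if __name__ == "__main__":
    tasks, results = Queue(), Queue()   # passed explicitly, not via globals
    p = Process(target=worker, args=(tasks, results))
    p.start()

    jobs = [1, 2, 3]
    for n in jobs:
        tasks.put(n)
    tasks.put(STOP)                     # ask for a graceful exit instead of terminate()

    # Drain the results before join(): a child that still has queued
    # data waiting to be flushed may not exit, and join() would then block.
    squares = [results.get() for _ in jobs]
    p.join()
    print(squares)
```

With several workers, one sentinel per worker is needed so that each of them sees the stop request.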

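Windows‑specific considerations are mentioned: a child process may not see the parent's current values of global variables, so they should not be relied on; the entry point must be guarded with if __name__ == '__main__':; and freeze_support() should be called when the program is frozen into an executable.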
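
The difference between relying on a global and passing a value explicitly can be seen with a sketch like the following. The settings dictionary and the show_settings function are hypothetical. The first value printed depends on the start method: where the child re‑imports the module (as on Windows) it sees the module‑level default, while a forked child sees the parent's modified value.

```python
from multiprocessing import Process, Queue, freeze_support

settings = {"verbose": False}   # module-level default


def show_settings(explicit, out):
    """Report the child's view of the global and of the explicit argument."""
    out.put((settings["verbose"], explicit["verbose"]))


if __name__ == "__main__":
    freeze_support()             # has no effect unless running as a frozen Windows executable
    settings["verbose"] = True   # changed only in the parent, inside the guard

    out = Queue()
    # Pass a copy of the settings explicitly so the child does not depend on globals.
    p = Process(target=show_settings, args=(dict(settings), out))
    p.start()
    print(out.get())
    p.join()
```

The second value is True on every platform, which is why passing state through args is the portable choice.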

Practical experiments are presented, showing code that uses a lock, shared Value, and two processes adding different amounts to the shared counter. Three test cases compare execution time and output with/without locks and with different join() ordering, illustrating how locks and join() affect determinism and performance.
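
The author's test code is not reproduced in this summary. A minimal sketch of the locked variant of such an experiment, assuming amounts of 1 and 3 and 1000 repetitions per process, could look like this:

```python
from multiprocessing import Lock, Process, Value


def add(counter, lock, amount, repeats):
    """Add `amount` to the shared counter `repeats` times, one locked step at a time."""
    for _ in range(repeats):
        with lock:                  # make the read-modify-write atomic
            counter.value += amount


if __name__ == "__main__":
    counter = Value("i", 0)         # 'i' = C int in shared memory
    lock = Lock()

    p1 = Process(target=add, args=(counter, lock, 1, 1000))
    p2 = Process(target=add, args=(counter, lock, 3, 1000))
    p1.start()
    p2.start()
    p1.join()                       # wait for both before reading the result
    p2.join()
    print(counter.value)            # 4000 with the lock in place
```

Removing the with lock: line lets the two processes interleave their read‑modify‑write steps, so updates can be lost. Calling p1.join() before p2.start() would make the two processes run one after the other, which is deterministic but no longer parallel.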

Further examples demonstrate the convenience of Pool for parallel mapping and asynchronous execution, emphasizing its suitability for data‑parallel workloads.
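
The author's Pool examples are not included in this summary. As a sketch of the same ideas, the snippet below shows a blocking map, an asynchronous call, and unordered iteration; the cube function is a placeholder.

```python
from multiprocessing import Pool


def cube(x):
    """Placeholder for a per-item computation."""
    return x ** 3


if __name__ == "__main__":
    with Pool(processes=4) as pool:
        # map() blocks and returns results in input order.
        print(pool.map(cube, range(5)))

        # apply_async() returns immediately; get() waits for the result.
        pending = pool.apply_async(cube, (10,))
        print(pending.get(timeout=10))

        # imap_unordered() yields results as workers finish, in any order.
        print(sorted(pool.imap_unordered(cube, range(5))))
```

imap_unordered() is useful when per‑item run times differ a lot and the order of results does not matter.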

In conclusion, the author reflects on having organized the concepts of parallel computing, clarified the relationships among processes, threads, and CPUs, and explored Python’s multiprocessing library, noting many remaining learning opportunities.
