Master Python Multiprocessing: From Basics to Advanced Process Management
This article explains Python's multiprocessing module, covering process concepts, creation of single and multiple processes, process pools, locks, inter‑process communication methods such as Event, Pipe, Queue, semaphores, and data sharing techniques, with code examples and visual illustrations.
Preface
A process is a running instance of a program, managed by the operating system. A system usually runs many processes at once, and the scheduler distributes them across the available CPU cores.
Below are screenshots of the Windows Task Manager showing many processes created by the 360 browser.
The Resource Monitor also displays detailed process and thread usage.
1. Basic Usage
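A process executes a program and can contain multiple threads. Each process has its own memory and startup cost, so creating far more processes than the machine has CPU cores tends to waste resources; that overhead is rarely worthwhile outside large systems.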
1.1 Create Process
1. Import the module
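import multiprocessing as m
This import exposes everything in the module through the m prefix (m.Process, m.Pipe, and so on). To use Process by its bare name, import it directly:
from multiprocessing import Process
Process(group=None, target=None, name=None, args=(), kwargs={}, daemon=None)
Parameters:
group: reserved for compatibility with threading.Thread; always leave it as None
target: function to run
args: tuple of positional arguments passed to target
kwargs: dict of keyword arguments passed to target
name: child process name
Common utility methods:
# List child processes that are still alive (as a side effect, joins any that have already finished)
multiprocessing.active_children()
# Number of CPU cores
multiprocessing.cpu_count()
2. Create a single process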
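Key methods:
# Start the process; run() is then invoked in the child
start()
# Code executed in the child; override it when subclassing Process
run()
# Force termination (no cleanup: exit handlers and finally clauses do not run)
terminate()
# Check if alive
is_alive()
# Wait for termination, at most timeout seconds if given (join)
join([timeout])
# Daemon flag (must be set before start()); daemon children are terminated when the parent exits
daemon
# Process name
name
# Process ID (None before start())
pid
# Exit code (None if not yet terminated)
exitcode
# Authentication key
authkey
# Sentinel handle (becomes ready when the process ends)
sentinel
# Kill process (like terminate(), but sends SIGKILL on Unix)
kill()
# Close the Process object and release its resources (the process must already have finished)
close()
Always guard process creation with:
if __name__ == '__main__':
Without the guard, start methods that re-import the main module in the child (spawn, the default on Windows and macOS) would execute the process-creating code again. Processes can also be created by subclassing Process and overriding run().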
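As a minimal sketch of both approaches described above, the following creates one process from a target function and one from a Process subclass. The names square, SquareProcess, and run_both are hypothetical and chosen only for this illustration; a Queue is used here simply to get results back to the parent.

```python
from multiprocessing import Process, Queue


def square(n, results):
    """Worker function: put n squared on the result queue."""
    results.put(n * n)


class SquareProcess(Process):
    """Same work, expressed by subclassing Process and overriding run()."""

    def __init__(self, n, results):
        super().__init__(name=f"square-{n}")
        self.n = n
        self.results = results

    def run(self):
        self.results.put(self.n * self.n)


def run_both(n):
    """Run the function-based and the subclass-based variant once each."""
    results = Queue()
    workers = [
        Process(target=square, args=(n, results), name="square-func"),
        SquareProcess(n, results),
    ]
    for w in workers:
        w.start()
    # Read the results before joining so the children can flush their data.
    values = [results.get() for _ in workers]
    for w in workers:
        w.join()
    return values, [w.exitcode for w in workers]


if __name__ == "__main__":
    values, exit_codes = run_both(7)
    print(values, exit_codes)
```

Both variants do the same job; subclassing is mostly useful when the process needs to carry its own state or helper methods.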
3. Create multiple processes
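Use a loop to start several processes. Start all of them before joining any, so that they run concurrently; for CPU-bound work this can shorten the total run time on a multi-core machine.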
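A loop-based version might look like the sketch below. The helper names report and run_workers are assumptions made for this example; each child simply sends its index and PID back through a Queue.

```python
import os
from multiprocessing import Process, Queue


def report(index, results):
    """Send this worker's index and PID back to the parent."""
    results.put((index, os.getpid()))


def run_workers(count):
    results = Queue()
    workers = [Process(target=report, args=(i, results)) for i in range(count)]
    for w in workers:   # start every process first so they can overlap
        w.start()
    collected = [results.get() for _ in workers]
    for w in workers:   # only then wait for each one to finish
        w.join()
    return sorted(collected)


if __name__ == "__main__":
    for index, pid in run_workers(4):
        print(f"worker {index} ran in process {pid}")
```

Calling join() inside the same loop as start() would make each process finish before the next one begins, which removes any parallelism.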
4. Process Pool
Pool simplifies resource management by reusing a fixed number of worker processes.
from multiprocessing import Pool
import multiprocessing as m
num = m.cpu_count()  # one worker per CPU core
pool = Pool(num)
Common pool methods:
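apply(func, args=(), kwds={}) # Synchronous (blocks until the result is ready)
apply_async(func, args=(), kwds={}, callback=None, error_callback=None) # Asynchronous (returns an AsyncResult immediately)
terminate() # Stop the workers immediately, discarding pending tasks
join() # Wait for workers to exit (call only after close() or terminate())
close() # Stop accepting new tasks; workers exit once the queued tasks are done
map(func, iterable, chunksize=None) # Parallel map, blocks until all results are ready
map_async(func, iterable, chunksize=None, callback=None, error_callback=None) # Non-blocking map
imap(func, iterable, chunksize=1) # Lazy iterator version of map, results in input order
imap_unordered(func, iterable, chunksize=1) # Like imap, but results arrive in completion order
starmap(func, iterable, chunksize=None) # Like map, but each item is unpacked into arguments
For web crawlers, small jobs can run their tasks synchronously (one after another), while large crawls benefit from asynchronous (parallel) execution.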
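To make a few of the pool methods listed above concrete, here is a small sketch that uses apply_async, starmap, and imap_unordered on one pool. The functions add, cube, and pool_demo are hypothetical helpers invented for this illustration.

```python
from multiprocessing import Pool


def add(a, b):
    return a + b


def cube(n):
    return n ** 3


def pool_demo():
    with Pool(processes=2) as pool:
        # apply_async returns an AsyncResult at once; get() blocks for the value.
        pending = pool.apply_async(add, args=(2, 3))
        # starmap unpacks each tuple into positional arguments.
        sums = pool.starmap(add, [(1, 2), (3, 4)])
        # imap_unordered yields results in completion order, so sort them here.
        cubes = sorted(pool.imap_unordered(cube, range(5)))
        return pending.get(timeout=10), sums, cubes


if __name__ == "__main__":
    print(pool_demo())
```

The with block terminates the pool on exit, which is safe here because every result has been collected before the block ends.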
Serial example
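A serial baseline could look like the following sketch. The fetch function is a hypothetical stand-in for downloading a page; it only sleeps briefly and returns a label.

```python
import time


def fetch(page):
    """Stand-in for downloading one page (hypothetical task)."""
    time.sleep(0.1)
    return f"page-{page}"


def crawl_serial(pages):
    # One call after another: total time grows linearly with len(pages).
    return [fetch(p) for p in pages]


if __name__ == "__main__":
    started = time.perf_counter()
    print(crawl_serial(range(6)))
    print(f"serial: {time.perf_counter() - started:.2f}s")
```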
Parallel example
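A parallel counterpart, sketched under the same assumption of a hypothetical fetch function that sleeps instead of doing real network work, hands the pages to a Pool with map:

```python
import time
from multiprocessing import Pool


def fetch(page):
    """Stand-in for downloading one page (hypothetical task)."""
    time.sleep(0.1)
    return f"page-{page}"


def crawl_parallel(pages, workers=3):
    # map() splits the pages across the workers and keeps the input order.
    with Pool(processes=workers) as pool:
        return pool.map(fetch, pages)


if __name__ == "__main__":
    started = time.perf_counter()
    print(crawl_parallel(range(6)))
    print(f"parallel: {time.perf_counter() - started:.2f}s")
```

With several workers the waits overlap, so the elapsed time should drop compared with calling fetch in a plain loop, although starting the pool itself adds some overhead.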
5. Locks
Locks synchronize access to shared resources.
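As an illustration, the sketch below uses a Lock to protect a shared counter that several processes increment. The names add_many and count_with_lock are hypothetical, and the counter is created with lock=False on purpose so that the explicit Lock is the only protection.

```python
from multiprocessing import Lock, Process, Value


def add_many(counter, lock, times):
    for _ in range(times):
        with lock:               # acquire() on entry, release() on exit
            counter.value += 1   # read-modify-write is not atomic by itself


def count_with_lock(workers=4, times=1000):
    counter = Value("i", 0, lock=False)  # raw shared int, no built-in lock
    lock = Lock()
    procs = [
        Process(target=add_many, args=(counter, lock, times))
        for _ in range(workers)
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return counter.value


if __name__ == "__main__":
    print(count_with_lock())  # workers * times when the lock is held correctly
```

Without the lock, two processes could read the same old value and both write back the same result, losing increments.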
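from multiprocessing import Lock
Re-entrant locks (RLock) allow the same process to acquire the lock multiple times; each acquire() must be matched by a release().
import time
import multiprocessing as m
from multiprocessing import Process, RLock

def jc(num, lock1, lock2):
    # Hold both locks so that only one process prints at a time
    lock1.acquire()
    lock2.acquire()
    print('start')
    print(m.current_process().pid, 'run----', str(num))
    # Release in the reverse order of acquisition
    lock2.release()
    lock1.release()
    print('end')

if __name__ == '__main__':
    # Create the locks once in the parent and pass them to every child,
    # so that all processes share the same two locks
    lock1 = RLock()
    lock2 = RLock()
    s = time.time()
    aa = []
    for y in range(12):
        pp = Process(target=jc, args=(y, lock1, lock2))
        pp.start()
        aa.append(pp)
    for x in aa:
        x.join()
    e = time.time()
    print(e - s)
6. Inter-process Communication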
Event
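An Event is a shared flag: one process calls set(), and any process blocked in wait() continues. The sketch below shows that handshake in its simplest form; waiter and signal_worker are hypothetical names used only here.

```python
from multiprocessing import Event, Process, Queue


def waiter(ready, results):
    # wait() returns True once the flag is set, or False after the timeout.
    signalled = ready.wait(timeout=5)
    results.put(signalled)


def signal_worker():
    ready = Event()
    results = Queue()
    proc = Process(target=waiter, args=(ready, results))
    proc.start()
    ready.set()                      # release the waiting child
    outcome = results.get(timeout=10)
    proc.join()
    return outcome


if __name__ == "__main__":
    print(signal_worker())
```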
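import time
from multiprocessing import Event, Process

def main(num, e):
    while True:
        if num < 5:
            e.clear()  # clear signal
            print('clear')
        if num >= 5:
            e.wait(timeout=1)  # wait up to 1 second for the signal
            e.set()  # set signal
            print('set')
        if num == 10:
            e.wait(timeout=3)
            e.clear()
            print('exit')
            break
        num += 1
        time.sleep(2)

if __name__ == '__main__':
    e = Event()  # created once in the parent and shared with every child
    procs = []
    for y in range(10):
        pp = Process(target=main, args=(y, e))
        pp.start()
        procs.append(pp)
    for pp in procs:
        pp.join()
Pipe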
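A pipe connects exactly two endpoints. As a rough sketch, the example below has a child process reply in upper case to every message the parent sends; echo_upper and talk are hypothetical names, and None is a sentinel chosen for this example to tell the child to stop.

```python
from multiprocessing import Pipe, Process


def echo_upper(conn):
    """Child: reply to each message in upper case until None arrives."""
    while True:
        msg = conn.recv()
        if msg is None:   # sentinel chosen for this sketch
            break
        conn.send(msg.upper())
    conn.close()


def talk(words):
    parent_end, child_end = Pipe()   # duplex=True by default
    child = Process(target=echo_upper, args=(child_end,))
    child.start()
    child_end.close()                # the parent only uses its own end
    replies = []
    for word in words:
        parent_end.send(word)
        replies.append(parent_end.recv())
    parent_end.send(None)
    child.join()
    parent_end.close()
    return replies


if __name__ == "__main__":
    print(talk(["ping", "pong"]))
```

The connection methods used here (send, recv, close) are among those in the reference list for this section.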
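p1, p2 = m.Pipe(duplex=True) # duplex=True (default): both ends send and receive; duplex=False: p1 only receives, p2 only sends
p1.send(obj) # send a picklable object
p2.recv() # receive an object (blocks until one arrives)
p1.close() # close connection
p1.fileno() # file descriptor or handle
p1.poll([timeout]) # check if data is available to read
p1.send_bytes(buffer[, offset[, size]]) # send raw bytes
p2.recv_bytes([maxlength]) # receive raw bytes
p2.recv_bytes_into(buffer[, offset]) # receive raw bytes into a writable buffer
Queue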
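from multiprocessing import Process, Queue

def fd(a):
    for y in range(10):
        a.put(y)  # insert
        print('insert:', str(y))
    a.put(None)  # sentinel: tells the consumer there is nothing more to read

def df(b):
    while True:
        aa = b.get(True)  # remove (blocks until an item is available)
        if aa is None:
            break
        print('release:', str(aa))

if __name__ == '__main__':
    q = Queue()
    ff = Process(target=fd, args=(q,))
    dd = Process(target=df, args=(q,))
    ff.start()
    dd.start()
    ff.join()
    dd.join()  # the consumer exits by itself after reading the sentinel
7. Semaphore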
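A semaphore holds a counter that limits how many processes may enter a section at once. The sketch below, which uses the hypothetical helpers limited_task and peak_concurrency, starts more processes than there are slots and records the highest number that were ever inside the guarded section together.

```python
import time
from multiprocessing import Process, Semaphore, Value


def limited_task(slots, active, peak):
    with slots:                       # blocks while every slot is taken
        with active.get_lock():
            active.value += 1
            peak.value = max(peak.value, active.value)
        time.sleep(0.05)              # pretend to do some work
        with active.get_lock():
            active.value -= 1


def peak_concurrency(tasks=6, limit=2):
    slots = Semaphore(limit)
    active = Value("i", 0)            # how many are inside right now
    peak = Value("i", 0)              # highest value 'active' ever reached
    procs = [
        Process(target=limited_task, args=(slots, active, peak))
        for _ in range(tasks)
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return peak.value


if __name__ == "__main__":
    print(peak_concurrency())  # never greater than the limit
```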
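from multiprocessing import Semaphore

s = Semaphore(3)  # the counter starts at 3
s.acquire()  # counter: 2
print(s.get_value())  # note: get_value() is not implemented on macOS
s.release()  # counter: 3
print(s.get_value())
print(s.get_value())
s.release()  # counter: 4 (a plain Semaphore may exceed its initial value)
print(s.get_value())
s.release()
8. Data Sharing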
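The sketch below combines the sharing options of this section in one place: a Value and an Array live in shared memory, while a dict and a list are served by a Manager process. The names fill and share_data are hypothetical and exist only for this example.

```python
from multiprocessing import Array, Manager, Process, Value


def fill(index, total, squares, names, log):
    with total.get_lock():            # += on a Value is not atomic
        total.value += index
    squares[index] = index * index    # each worker writes its own slot
    names[index] = f"worker-{index}"
    log.append(index)


def share_data(count=4):
    total = Value("i", 0)             # one shared C int
    squares = Array("i", count)       # shared C int array, zero-initialised
    with Manager() as manager:
        names = manager.dict()
        log = manager.list()
        procs = [
            Process(target=fill, args=(i, total, squares, names, log))
            for i in range(count)
        ]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        # Copy the proxies into plain objects before the manager shuts down.
        return total.value, squares[:], names.copy(), sorted(log)


if __name__ == "__main__":
    print(share_data())
```

Value and Array are fast but limited to ctypes data; Manager objects accept arbitrary picklable values at the cost of a round trip to the manager process for every access.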
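# Value type: a single shared ctypes value, e.g. m.Value('i', 0)
m.Value(typecode, value)
# Array type: a shared fixed-size ctypes array, e.g. m.Array('d', [1.0, 2.0])
m.Array(typecode, size_or_initializer)
# Dict and list types: the multiprocessing module itself has no dict() or list();
# shared dicts and lists are created through a Manager
Manager().dict()
Manager().list()
Conclusion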
The article provides a comprehensive overview of Python processes, demonstrating creation, pooling, synchronization, communication, and shared data techniques, enabling readers to apply multiprocessing effectively in their projects.
