
Master Python Threading: From Basics to Advanced Concurrency Techniques

This article provides a comprehensive guide to Python threading, covering core concepts such as threads, locks, RLock, conditions, semaphores, events, local storage, and timers, along with practical code examples for creating, managing, and synchronizing threads to improve concurrent program performance.

Python Crawling & Data Mining

Preface

Before we get into today's topic: are you comfortable with threads, processes, and coroutines? Here is a brief review of threads, the subject of this article.

Threads

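A thread is the smallest unit of execution that the operating system schedules onto a CPU. It is the worker that actually carries out a program's instructions, and every process has at least one.

Python threads are often dismissed as being of little value because of the Global Interpreter Lock (GIL), which in standard CPython lets only one thread execute Python bytecode at a time. They are less effective for CPU-bound tasks, but they remain useful for I/O-bound work, where a thread that is waiting on the network or disk releases the GIL so that other threads can run. Below is how to use threads.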
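To make the I/O-bound case concrete, here is a minimal sketch. It assumes that time.sleep is a fair stand-in for a blocking network or disk call (it releases the GIL while waiting); the helper names fake_io, run_sequential, and run_threaded are made up for this illustration.

```python
import threading
import time

def fake_io(delay):
    # Stand-in for a blocking I/O call; sleeping releases the GIL
    time.sleep(delay)

def run_sequential(count, delay):
    # Run the calls one after another and return the elapsed time
    start = time.perf_counter()
    for _ in range(count):
        fake_io(delay)
    return time.perf_counter() - start

def run_threaded(count, delay):
    # Run each call in its own thread and return the elapsed time
    start = time.perf_counter()
    workers = [threading.Thread(target=fake_io, args=(delay,)) for _ in range(count)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return time.perf_counter() - start

sequential = run_sequential(5, 0.2)
threaded = run_threaded(5, 0.2)
print(f'sequential: {sequential:.2f}s, threaded: {threaded:.2f}s')
```

The sequential run waits for the five delays one after another, while the threaded run overlaps them. Replacing fake_io with a pure-Python computation would not show the same gain, because the GIL serializes CPU-bound threads.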

1. Import threading module

import threading as t

2. Thread usage

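tt = t.Thread(group=None, target=None, name=None, args=(), kwargs={}, daemon=None)
# Thread methods and attributes:
tt.start()          # Start the thread; run() is invoked in the new thread
tt.name             # Get or set the thread name
tt.getName()        # Get thread name (deprecated since Python 3.10; use tt.name)
tt.setName('name')  # Set thread name (deprecated since Python 3.10; use tt.name)
tt.is_alive()       # Check whether the thread is alive
tt.isAlive()        # Old alias of is_alive(); removed in Python 3.9
tt.daemon           # Get or set the daemon flag (set it before start(); inherited from the creating thread by default)
tt.setDaemon(True)  # Set daemon flag (deprecated since Python 3.10; use tt.daemon)
tt.isDaemon()       # Check daemon flag (deprecated since Python 3.10; use tt.daemon)
tt.ident            # Thread identifier (None until the thread has started)
tt.join()           # Wait for the thread to finish (accepts an optional timeout)
tt.run()            # The thread's activity; start() calls it, and calling it directly runs it in the current thread
# threading module functions and constants:
t.active_count()    # Number of Thread objects currently alive
t.enumerate()       # List of all Thread objects currently alive
t.current_thread().name  # Name of the current thread
t.TIMEOUT_MAX       # Maximum value allowed for the timeout argument of blocking calls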
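The attributes listed above can be observed on a live thread. The following is a small sketch using only the non-deprecated spellings; the function describe_worker and the thread name 'demo-worker' are invented for the example, and two Event objects are used only to hold the worker alive while it is inspected.

```python
import threading

def describe_worker():
    started = threading.Event()
    release = threading.Event()

    def work():
        started.set()            # tell the caller the thread is running
        release.wait(timeout=5)  # stay alive until the caller is done looking

    worker = threading.Thread(target=work, name='demo-worker', daemon=True)
    info = {
        'alive_before_start': worker.is_alive(),
        'ident_before_start': worker.ident,  # None until start()
    }
    worker.start()
    started.wait(timeout=5)
    info['name'] = worker.name
    info['daemon'] = worker.daemon
    info['alive_while_running'] = worker.is_alive()
    info['listed'] = worker in threading.enumerate()
    release.set()
    worker.join(timeout=5)
    info['alive_after_join'] = worker.is_alive()
    return info

for key, value in describe_worker().items():
    print(key, '=', value)
```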

3. Create threads

A thread can be created either by passing a target function to the Thread class or by subclassing Thread and overriding its run method. Either approach works for a single thread or for many.

1) Using Thread class

Single thread

def xc():
    for y in range(100):
        print('running ' + str(y))
tt = t.Thread(target=xc, args=())
tt.start()  # run xc in a new thread
tt.join()   # wait for it to finish

Multiple threads

def xc(num):
    print('running: ' + str(num))
c = []
for y in range(100):
    tt = t.Thread(target=xc, args=(y,))
    tt.start()
    c.append(tt)
# Wait for every thread to finish
for x in c:
    x.join()

2) Overriding Thread class

Single thread

class Xc(t.Thread):
    def __init__(self):
        super().__init__()
    def run(self):
        for y in range(100):
            print('running ' + str(y))
x = Xc()
x.start()
x.join()
# Note: Xc().run() prints the same output but runs in the current thread; only start() creates a new thread

Multiple threads

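class Xc(t.Thread):
    def __init__(self, num):
        super().__init__()
        self.num = num  # run() takes no arguments, so pass data through the constructor
    def run(self):
        print('running: ' + str(self.num))
c = []
for y in range(10):
    x = Xc(y)
    x.start()  # start() executes run() in a new thread; calling run() directly would not
    c.append(x)
for x in c:
    x.join()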

4. Thread locks

Locks prevent race conditions when multiple threads access the same object.

1) Lock

Usage:

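# Acquire the lock; blocks by default, and timeout=-1 (the default) means wait indefinitely
# Returns True if the lock was acquired, False if the timeout expired
Lock.acquire(blocking=True, timeout=-1)
# Release the lock
Lock.release()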

Example:

n = 10
lock = t.Lock()
def xc(num):
    lock.acquire()
    # Only one thread at a time runs these two prints, so each pair stays together
    print('running +: ' + str(num + n))
    print('running -: ' + str(num - n))
    lock.release()
c = []
for y in range(10):
    tt = t.Thread(target=xc, args=(y,))
    tt.start()
    c.append(tt)
for x in c:
    x.join()
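To see why a lock matters for shared data and not only for output, consider a counter that several threads increment. The sketch below is illustrative; count_with_lock is a made-up helper. The statement counter += 1 is a read-modify-write sequence, so without the lock two threads could read the same old value and lose an update.

```python
import threading

def count_with_lock(thread_count, increments):
    counter = 0
    lock = threading.Lock()

    def add():
        nonlocal counter
        for _ in range(increments):
            # Hold the lock for the whole read-modify-write step
            with lock:
                counter += 1

    workers = [threading.Thread(target=add) for _ in range(thread_count)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return counter

print(count_with_lock(4, 10000))  # 40000
```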

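Deadlock example (a plain Lock is not reentrant, so a thread that requests a lock it already holds would block forever):

n = 10
lock = t.Lock()
def xc(num):
    lock.acquire()
    print('running +: ' + str(num + n))
    # This thread already holds the lock, so a second blocking acquire() would never return.
    # The timeout is here only so that the demo reports the deadlock instead of hanging.
    if lock.acquire(timeout=0.5):
        print('running -: ' + str(num - n))
        lock.release()
    else:
        print('deadlock: could not re-acquire the lock')
    lock.release()
c = []
for y in range(10):
    tt = t.Thread(target=xc, args=(y,))
    tt.start()
    c.append(tt)
for x in c:
    x.join()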
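Deadlock can also occur with two locks when two threads take them in opposite order, each holding one and waiting for the other. A common remedy is to make every thread acquire the locks in the same fixed order. The sketch below assumes a hypothetical Account class and transfer function, and uses id() as an arbitrary but consistent ordering key.

```python
import threading

class Account:
    def __init__(self, balance):
        self.balance = balance
        self.lock = threading.Lock()

def transfer(src, dst, amount):
    # Lock the two accounts in one global order, regardless of transfer direction
    first, second = sorted((src, dst), key=id)
    with first.lock:
        with second.lock:
            src.balance -= amount
            dst.balance += amount

def run_transfers(rounds):
    a, b = Account(1000), Account(1000)

    def a_to_b():
        for _ in range(rounds):
            transfer(a, b, 1)

    def b_to_a():
        for _ in range(rounds):
            transfer(b, a, 1)

    workers = [threading.Thread(target=a_to_b), threading.Thread(target=b_to_a)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return a.balance, b.balance

print(run_transfers(1000))  # (1000, 1000)
```

If transfer locked src first and dst second instead, the two threads could each grab one lock and wait on the other indefinitely.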

2) RLock

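A recursive (reentrant) lock allows the thread that holds it to acquire it again without blocking. Every acquire() must be matched by a release() before another thread can take the lock.

n = 10
lock = t.RLock()
def xc(num):
    lock.acquire()
    print('running +: ' + str(num + n))
    lock.acquire()  # same thread, same lock: succeeds immediately with an RLock
    print('running -: ' + str(num - n))
    lock.release()
    lock.release()  # the lock becomes free only after the second release
c = []
for y in range(10):
    tt = t.Thread(target=xc, args=(y,))
    tt.start()
    c.append(tt)
for x in c:
    x.join()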
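Reentrancy is most useful when one locked method calls another locked method on the same object. The following sketch assumes a hypothetical Counter class; with a plain Lock in place of the RLock, increment_twice would block on its first inner call.

```python
import threading

class Counter:
    def __init__(self):
        self._lock = threading.RLock()
        self.value = 0

    def increment(self):
        with self._lock:
            self.value += 1

    def increment_twice(self):
        with self._lock:      # first acquisition by this thread
            self.increment()  # re-acquires the same RLock without blocking
            self.increment()

def run_counter(thread_count, calls):
    counter = Counter()

    def work():
        for _ in range(calls):
            counter.increment_twice()

    workers = [threading.Thread(target=work) for _ in range(thread_count)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return counter.value

print(run_counter(4, 500))  # 4000
```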

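Using the with statement (works with both Lock and RLock):

with lock:
    for i in range(10):
        print(i)
# Equivalent:
lock.acquire()
try:
    for i in range(10):
        print(i)
finally:
    lock.release()  # always released, even if the body raises an exception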

3) Condition lock

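A Condition pairs a lock with a wait queue: a thread holding the lock can wait until another thread signals that some state has changed. wait(), notify(), and notify_all() must be called while the lock is held; otherwise they raise RuntimeError.

Usage:

Condition.acquire(*args)      # Acquire the underlying lock
Condition.wait(timeout=None)  # Release the lock, block until notified or timed out, then re-acquire it
Condition.notify(n=1)         # Wake up at most n waiting threads
Condition.notify_all()        # Wake all waiting threads

Example:

def ww(c):
    with c:
        print('init')
        c.wait(timeout=5)  # releases c while waiting; gives up after 5 seconds
        print('end')
def xx(c):
    with c:
        print('nono')
        c.notify_all()  # notifyAll() is the deprecated spelling
        print('start')
        c.notify(1)     # no waiters are left after notify_all(), so this does nothing
c = t.Condition()
t.Thread(target=ww, args=(c,)).start()
t.Thread(target=xx, args=(c,)).start()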
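In practice a waiting thread should check the state it is waiting for, since a notification may arrive before it starts waiting. Condition.wait_for(predicate) does this check in a loop. Below is a minimal producer/consumer sketch under that approach; produce_and_consume is a made-up helper, and None is assumed to be reserved as the end-of-stream marker.

```python
import threading
from collections import deque

def produce_and_consume(items):
    queue = deque()
    cond = threading.Condition()
    received = []

    def consumer():
        while True:
            with cond:
                # Returns immediately if the queue already has data, otherwise waits
                cond.wait_for(lambda: len(queue) > 0)
                item = queue.popleft()
            if item is None:  # end-of-stream marker
                return
            received.append(item)

    def producer():
        for item in list(items) + [None]:
            with cond:
                queue.append(item)
                cond.notify()  # wake the consumer if it is waiting

    workers = [threading.Thread(target=consumer), threading.Thread(target=producer)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return received

print(produce_and_consume([1, 2, 3]))  # [1, 2, 3]
```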

5. Semaphores

Semaphores control access to a limited number of resources.

1) BoundedSemaphore

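# Create a bounded semaphore; value is the initial (and maximum) counter
b = t.BoundedSemaphore(value=1)
# Acquire: decrement the counter, blocking while it is zero
b.acquire(blocking=True, timeout=None)
# Release: increment the counter; raises ValueError if it would exceed the initial value
b.release()
# Current counter (private attribute, not part of the public API; for debugging only)
b._value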

2) Semaphore (unbounded)

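A plain Semaphore has the same constructor and methods but does not enforce an upper limit: calling release() more often than acquire() simply raises the counter above its initial value, whereas BoundedSemaphore raises ValueError in that case.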
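As an illustration of limiting access to a resource, the sketch below lets at most limit threads into a section at once and records the highest number seen inside. The helpers run_limited and over_release_is_rejected are hypothetical, and time.sleep stands in for real work.

```python
import threading
import time

def run_limited(worker_count, limit):
    slots = threading.BoundedSemaphore(limit)
    state_lock = threading.Lock()
    active = 0
    peak = 0

    def worker():
        nonlocal active, peak
        with slots:  # acquire on entry, release on exit
            with state_lock:
                active += 1
                peak = max(peak, active)
            time.sleep(0.05)  # stand-in for work on the limited resource
            with state_lock:
                active -= 1

    workers = [threading.Thread(target=worker) for _ in range(worker_count)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return peak

def over_release_is_rejected():
    sem = threading.BoundedSemaphore(1)
    try:
        sem.release()  # released without a matching acquire
    except ValueError:
        return True
    return False

print(run_limited(8, 3) <= 3)      # True
print(over_release_is_rejected())  # True
```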

6. Event

Events provide a simple flag for thread communication.

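Usage:

event.set()      # Set the flag to True and wake all waiting threads
event.clear()    # Reset the flag to False
event.is_set()   # Return True if the flag is set
event.wait(timeout=None)  # Block until the flag is True or the timeout expires; returns the flag

Example:

import time
e = t.Event()
def ff(num):
    while True:
        if num < 5:
            e.clear()
            print('cleared')
        if num >= 5:
            e.wait(timeout=1)  # the flag is False here, so this returns after the 1-second timeout
            e.set()
            print('started')
            if e.is_set():  # isSet() is the deprecated spelling
                e.clear()
                print('stopped')
        if num == 10:
            e.wait(timeout=3)
            e.clear()
            print('exiting')
            break
        num += 1
        time.sleep(2)
ff(1)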
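An Event is more commonly shared between threads, for example as a stop signal. The sketch below is one such arrangement; run_until_stopped is a made-up helper and the intervals are arbitrary. It relies on wait() returning False on timeout and True once the flag is set.

```python
import threading
import time

def run_until_stopped():
    stop = threading.Event()
    ticks = []

    def worker():
        # Loop until the main thread sets the flag
        while not stop.wait(timeout=0.01):
            ticks.append(len(ticks))

    th = threading.Thread(target=worker)
    th.start()
    time.sleep(0.1)  # let the worker run for a moment
    stop.set()       # ask the worker to finish
    th.join(timeout=2)
    return th.is_alive(), len(ticks)

still_alive, tick_count = run_until_stopped()
print('worker alive:', still_alive, 'ticks:', tick_count)
```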

7. Local

Thread-local storage provides variables unique to each thread.

l = t.local()
def ff(num):
    l.x = 100
    for y in range(num):
        l.x += 3
    print(str(l.x))
for y in range(10):
    t.Thread(target=ff, args=(y,)).start()
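To confirm that each thread really sees its own copy, the following sketch collects the values written by several threads and checks that the main thread never sees the attribute. The names collect_local_values and store are invented for the example.

```python
import threading

def collect_local_values(thread_count):
    store = threading.local()
    results = {}
    results_lock = threading.Lock()

    def worker(num):
        store.x = num * 10  # visible only inside this thread
        with results_lock:
            results[num] = store.x

    workers = [threading.Thread(target=worker, args=(n,)) for n in range(thread_count)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    # The main thread never assigned store.x, so the attribute does not exist here
    main_has_x = hasattr(store, 'x')
    return results, main_has_x

print(collect_local_values(3))  # ({0: 0, 1: 10, 2: 20}, False)
```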

8. Timer

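Timer runs a function once after a specified interval. To repeat the call, the function has to schedule a new Timer itself, as in the example below.

# Signature: t.Timer(interval, function, args=None, kwargs=None)
# start() begins the countdown; cancel() stops the timer if it has not fired yet
def f():
    print('start')
    tt = t.Timer(3, f)  # schedule the next run in 3 seconds
    tt.start()
f()  # prints 'start' every 3 seconds until the program is interrupted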
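A Timer can also receive arguments and be cancelled before it fires. The sketch below starts two timers and cancels one; timer_demo and record are hypothetical names, and the 0.2-second interval is arbitrary.

```python
import threading

def timer_demo():
    fired = []
    done = threading.Event()

    def record(label):
        fired.append(label)
        done.set()

    kept = threading.Timer(0.2, record, args=('kept',))
    cancelled = threading.Timer(0.2, record, args=('cancelled',))
    kept.start()
    cancelled.start()
    cancelled.cancel()    # has an effect only while the timer is still waiting
    done.wait(timeout=2)  # wait for the surviving timer to fire
    kept.join()
    cancelled.join()
    return fired

print(timer_demo())  # ['kept']
```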

Conclusion

Threads let a program keep working while it waits, which makes them a good fit for web crawling and other I/O-bound tasks. This article covered the main building blocks of the threading module: creating threads, Lock, RLock, Condition, semaphores, Event, thread-local storage, and Timer.
