Boost Python I/O Performance: Mastering Coroutines and Async/Await
This article explains Python coroutine fundamentals and the async/await syntax, then shows how to use aiohttp with an event loop and a semaphore to speed up I/O‑bound network requests, with practical code and measured results.
Concept Overview
Python is widely used for tasks such as web crawling and network requests, but code that issues blocking calls one at a time leaves its thread idle while each request is in flight. Coroutines (also called micro‑threads) allow cooperative multitasking within a single thread, which improves I/O‑bound throughput.
Fundamental OS Concepts
Understanding processes, threads, synchronization, asynchrony, blocking and non‑blocking I/O provides the foundation for using coroutines effectively.
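To make the blocking versus non‑blocking distinction concrete, here is a minimal sketch that uses sleep calls as stand‑ins for network waits. The function names and the 0.1 s delay are illustrative assumptions, not part of the original article.

```python
import asyncio
import time

DELAY = 0.1  # hypothetical stand-in for one network round trip


def blocking_batch(n):
    # Blocking: each time.sleep() holds the thread, so the waits add up.
    start = time.perf_counter()
    for _ in range(n):
        time.sleep(DELAY)
    return time.perf_counter() - start


async def _non_blocking_batch(n):
    # Non-blocking: asyncio.sleep() yields to the event loop, so the waits overlap.
    await asyncio.gather(*(asyncio.sleep(DELAY) for _ in range(n)))


def non_blocking_batch(n):
    start = time.perf_counter()
    asyncio.run(_non_blocking_batch(n))
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"blocking:     {blocking_batch(5):.2f} s")
    print(f"non-blocking: {non_blocking_batch(5):.2f} s")
```

The blocking version takes roughly n times the delay, while the non‑blocking version takes roughly one delay plus a little event‑loop overhead.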
Coroutine Basics
A coroutine is a lightweight thread‑like construct that can be paused and resumed by the programmer, enabling explicit context switches without the overhead of OS threads or locks.
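The pause‑and‑resume behaviour can be seen without any event loop by driving a coroutine by hand. The sketch below is only an illustration: pause(), worker(), and drive() are hypothetical names, and in real programs the event loop does what drive() does here.

```python
import types


@types.coroutine
def pause():
    # A bare yield inside a generator-based awaitable hands control
    # back to whoever is driving the coroutine.
    yield


async def worker(log):
    log.append("step 1")
    await pause()  # suspension point
    log.append("step 2")
    await pause()  # suspension point
    log.append("step 3")
    return "done"


def drive(coro, log):
    # Resume the coroutine by hand, playing the role an event loop normally plays.
    while True:
        try:
            coro.send(None)
            log.append("(paused)")
        except StopIteration as stop:
            # The coroutine's return value travels on the StopIteration.
            return stop.value


log = []
result = drive(worker(log), log)
print(log, result)
```

Each send() runs the coroutine up to its next suspension point, and no OS thread or lock is involved in the switch.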
Principles
Coroutines run in a single thread, sharing its resources. The event loop schedules coroutine execution; only one coroutine runs at a time while others are suspended.
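A small sketch can show both points: several coroutines take turns, and all of them run on the same thread. The worker names and step counts are arbitrary assumptions chosen for illustration.

```python
import asyncio
import threading


async def worker(name, steps, log):
    for i in range(steps):
        # Record which coroutine ran and on which thread.
        log.append((name, i, threading.get_ident()))
        # Suspend here so the event loop can schedule another coroutine.
        await asyncio.sleep(0)


async def main():
    log = []
    await asyncio.gather(worker("A", 3, log), worker("B", 3, log))
    return log


log = asyncio.run(main())
for name, step, thread_id in log:
    print(name, step, thread_id)
```

The log should show A and B alternating while the thread identifier stays the same on every line, because a coroutine gives up control only at an await.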
Async/Await
Python 3.5 introduced the async and await keywords (PEP 492) as dedicated syntax for defining coroutines and marking their suspension points. They replace the generator‑based coroutines written with yield from that asyncio relied on in Python 3.4.
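As a minimal sketch of the syntax, the example below shows that calling an async def function only creates a coroutine object, and that the body runs once it is awaited. The add() function is a made‑up example.

```python
import asyncio
import inspect


async def add(a, b):
    # `await` marks a point where this coroutine may be suspended.
    await asyncio.sleep(0)
    return a + b


async def main():
    coro = add(1, 2)  # nothing has run yet; this is just a coroutine object
    is_coro = inspect.iscoroutine(coro)
    total = await coro  # the body of add() executes here
    return is_coro, total


is_coro, total = asyncio.run(main())
print(is_coro, total)
```

asyncio.run() starts an event loop, runs main() to completion, and closes the loop afterwards.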
Future and Task
A Future represents a result that is not yet available. A Task is a Future subclass that wraps a coroutine, schedules it on the event loop, and tracks its state until it finishes.
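The difference between the two can be sketched as follows, assuming a throwaway compute() coroutine and arbitrary short delays: a bare Future is filled in by other code, while a Task produces its own result by running the coroutine it wraps.

```python
import asyncio


async def compute():
    await asyncio.sleep(0.01)
    return 42


async def main():
    loop = asyncio.get_running_loop()

    # A bare Future: a placeholder that some other code fills in later.
    fut = loop.create_future()
    loop.call_later(0.01, fut.set_result, "filled in")

    # A Task: wraps the coroutine and schedules it on the loop.
    task = asyncio.create_task(compute())
    states = [task.done()]  # still pending at this point
    task.add_done_callback(lambda t: states.append(t.done()))

    fut_value = await fut
    task_value = await task
    await asyncio.sleep(0)  # give the done-callback a chance to run
    return fut_value, task_value, states, isinstance(task, asyncio.Future)


fut_value, task_value, states, task_is_future = asyncio.run(main())
print(fut_value, task_value, states, task_is_future)
```

Both objects can be awaited in the same way, which is why library code often accepts either one.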
Event Loop
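An event loop belongs to the thread that runs it. A thread other than the main one has no loop until it creates one with asyncio.new_event_loop() and installs it with asyncio.set_event_loop(). The loop is not thread‑safe, so code in other threads should hand work to it through loop.call_soon_threadsafe() or asyncio.run_coroutine_threadsafe() rather than calling its methods directly, which complicates multi‑threaded coroutine execution.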
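One common way to run coroutines from several threads is to keep a single loop in a background thread and submit work to it. The sketch below is one possible arrangement, not the article's own code; start_background_loop() and double() are hypothetical helpers.

```python
import asyncio
import threading


async def double(x):
    await asyncio.sleep(0.01)
    return x * 2


def start_background_loop():
    # Create a loop for the worker thread and install it as that
    # thread's current loop before running it.
    loop = asyncio.new_event_loop()

    def run():
        asyncio.set_event_loop(loop)
        loop.run_forever()

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return loop, thread


loop, thread = start_background_loop()

# run_coroutine_threadsafe() hands a coroutine to a loop running in another
# thread and returns a concurrent.futures.Future for the result.
future = asyncio.run_coroutine_threadsafe(double(21), loop)
result = future.result(timeout=5)

# Stop the loop from outside its thread, then clean up.
loop.call_soon_threadsafe(loop.stop)
thread.join(timeout=5)
loop.close()
print(result)
```

The calling thread blocks on future.result() while the coroutine itself runs entirely inside the background loop.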
Practical Example
Sending requests with aiohttp lets many HTTP calls run concurrently instead of one after another. The sample code uses a semaphore to cap the number of in‑flight requests, attempts each request up to four times, and passes each successful response body to a callback.
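import asyncio
import logging

import aiohttp

# The original imported a project-specific logger and wrapped request() in
# @logging_utils.exception(logger); neither module is shown, so the standard
# logging module is used here instead.
logger = logging.getLogger(__name__)


def request(pool, data_list):
    # Entry point: run every request with at most `pool` in flight at once.
    asyncio.run(run_all(pool, data_list))


async def run_all(pool, data_list):
    # Renamed from exec() so that it no longer shadows the built-in.
    sem = asyncio.Semaphore(pool)
    tasks = [
        control_sem(sem,
                    item.get("method", "GET"),
                    item.get("url"),
                    item.get("data"),
                    item.get("headers"),
                    item.get("callback"))
        for item in data_list
    ]
    # gather() accepts coroutines directly (asyncio.wait() no longer does as
    # of Python 3.11) and re-raises the first exception from any of them.
    await asyncio.gather(*tasks)


async def control_sem(sem, method, url, data, headers, callback):
    # Hold one semaphore slot for the whole retry sequence of this request.
    async with sem:
        for attempt in range(1, 5):
            ok = await fetch(method, url, data, headers, callback)
            logger.info("flag:%s,count:%s", ok, attempt)
            if ok:
                return
        raise Exception('EAS service not responding after 4 times of retry.')


async def fetch(method, url, data, headers, callback):
    # Return True on HTTP 200, False on any other status or on an error,
    # so that the caller can decide whether to retry.
    try:
        async with aiohttp.request(method, url=url, data=data, headers=headers) as resp:
            body = await resp.read()  # raw bytes, not parsed JSON
            logger.debug(body)
            if resp.status != 200:
                return False
            if callable(callback):
                callback(body)
            return True
    except Exception as e:
        # The try block now also covers connection errors raised while
        # opening the request, which the original left unhandled.
        logger.warning(e)
        return False
Result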
Processing 1,000 items dropped from 816 s to 424 s, roughly a 2× speed‑up; larger pools give even better performance, limited by the third‑party service’s connection caps.
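The effect of pool size can be explored without a real service by replacing each request with a fixed sleep. The sketch below is a simulation under assumed numbers (20 items, 0.02 s per fake request), so its timings say nothing about the measurements reported above. It only illustrates how the semaphore bounds concurrency.

```python
import asyncio
import time


async def fake_request(sem, delay, stats):
    async with sem:
        # Track how many fake requests hold a semaphore slot at once.
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(delay)  # stand-in for the network wait
        stats["active"] -= 1


async def run_batch(pool, n_items, delay):
    sem = asyncio.Semaphore(pool)
    stats = {"active": 0, "peak": 0}
    start = time.perf_counter()
    await asyncio.gather(*(fake_request(sem, delay, stats) for _ in range(n_items)))
    return time.perf_counter() - start, stats["peak"]


def measure(pool, n_items=20, delay=0.02):
    # Returns (elapsed seconds, peak number of concurrent fake requests).
    return asyncio.run(run_batch(pool, n_items, delay))


if __name__ == "__main__":
    for pool in (1, 5, 20):
        elapsed, peak = measure(pool)
        print(f"pool={pool:2d} peak={peak:2d} elapsed={elapsed:.2f} s")
```

In this simulation the elapsed time shrinks as the pool grows, and the peak never exceeds the pool size. With a real service, the gain levels off once the server's connection cap is reached.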
Conclusion
Coroutines dramatically improve I/O‑bound Python programs; consider them for similar workloads.
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.
