Understanding Synchronous vs Asynchronous Programming in Python with asyncio and aiohttp
This article explains the limitations of Python's GIL, compares synchronous and asynchronous execution models, introduces asyncio and aiohttp, provides concrete code examples for single and multiple concurrent HTTP requests, and offers solutions for common concurrency errors such as too many file descriptors.
Python's Global Interpreter Lock (GIL) prevents CPython threads from executing Python bytecode on multiple cores in parallel. For CPU-bound work this is a hard limit, but for I/O-bound network programming, asynchronous processing lets a single thread service many connections while each waits on the network, which in high-concurrency scenarios can yield speedups of several orders of magnitude over synchronous code.
Python 3.4 added asyncio to the standard library, and Python 3.5 introduced the async / await syntax, making asynchronous programming more accessible.
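As a minimal sketch of the async / await syntax, using the modern entry point asyncio.run (added in Python 3.7; later examples in this article use the older event-loop API):

```python
import asyncio

async def greet(name):
    # A coroutine: awaiting sleep yields control to the event loop.
    await asyncio.sleep(0.1)
    return f'hello, {name}'

# asyncio.run creates an event loop, runs the coroutine to
# completion, and closes the loop.
result = asyncio.run(greet('asyncio'))
print(result)
```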
Concept of Synchronous vs Asynchronous
Synchronous execution runs tasks sequentially, waiting for each to finish before starting the next. Asynchronous execution starts a task and immediately proceeds to the next, using callbacks, notifications, or futures to handle results later.
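The "handle results later" part can be sketched with an asyncio task and a done-callback; the names compute and on_done below are illustrative:

```python
import asyncio

results = []

async def compute():
    await asyncio.sleep(0.1)
    return 42

def on_done(task):
    # Invoked by the event loop once the task has finished.
    results.append(task.result())

async def main():
    # Start compute() and keep going; the callback fires later,
    # when the task completes.
    task = asyncio.ensure_future(compute())
    task.add_done_callback(on_done)
    print('doing other work while compute() runs...')
    await task

asyncio.run(main())
print('callback received:', results)
```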
Asyncio Example
Below is a simple synchronous implementation that sleeps for one second in each iteration:
<code>import time

def hello():
    time.sleep(1)

def run():
    for i in range(5):
        hello()
        print(time.time())

if __name__ == '__main__':
    run()
</code>And the equivalent asynchronous version using asyncio:
<code>import asyncio
import time

async def hello():
    await asyncio.sleep(1)
    print(time.time())

def run():
    for i in range(5):
        loop.run_until_complete(hello())

loop = asyncio.get_event_loop()

if __name__ == '__main__':
    run()
</code>Using aiohttp for Concurrent HTTP Requests
To perform asynchronous HTTP requests, replace the synchronous requests library with aiohttp. Create a ClientSession and use it to send GET/POST/etc. requests; enter the session with async with so its connections are released properly. Note that this fragment must run inside an async function:
<code>async with ClientSession() as session:
    async with session.get(url='http://www.baidu.com/') as response:
        data = await response.read()  # response body as bytes
</code>Multiple URL Access with asyncio
When many URLs need to be fetched, wrap each request in an async function, schedule them as Future objects, and let the event loop run them concurrently.
<code>#!/usr/bin/env python
import asyncio
from aiohttp import ClientSession

tasks = []

async def request_run(i):
    async with ClientSession() as session:
        async with session.get(url='http://www.baidu.com/') as response:
            data = await response.read()
            print(i)
            return data

def run():
    for i in range(5):
        task = asyncio.ensure_future(request_run(i))
        tasks.append(task)

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    run()
    loop.run_until_complete(asyncio.wait(tasks))
</code>Collecting All Responses
Use asyncio.gather(*tasks) to aggregate results from multiple coroutines into a list.
<code>#!/usr/bin/env python
import asyncio
from aiohttp import ClientSession

tasks = []

async def request_run(i):
    async with ClientSession() as session:
        async with session.get(url='http://www.baidu.com/') as response:
            data = await response.read()
            print(i)
            return data

def run():
    for i in range(5):
        task = asyncio.ensure_future(request_run(i))
        tasks.append(task)
    results = loop.run_until_complete(asyncio.gather(*tasks))
    print(results)

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    run()
</code>Handling "ValueError: too many file descriptors in select()"
When concurrency reaches thousands of tasks, the limit on how many file descriptors select() can watch is exceeded (512 on Windows, commonly 1024 elsewhere). Three remedies are suggested:
Limit the number of concurrent tasks (e.g., cap at 500).
Use a callback‑based approach.
Increase the OS file-descriptor limit.
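As a sketch of the third remedy, on Unix-like systems the soft file-descriptor limit can be inspected and raised from Python via the standard resource module (the 4096 cap below is an arbitrary choice; the hard limit can only be raised with elevated privileges):

```python
import resource

# Read the current soft and hard limits on open file descriptors.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print('soft limit:', soft, 'hard limit:', hard)

# Raise the soft limit toward the hard limit; never lower it.
target = 4096 if hard == resource.RLIM_INFINITY else min(4096, hard)
resource.setrlimit(resource.RLIMIT_NOFILE, (max(soft, target), hard))
print('new soft limit:', resource.getrlimit(resource.RLIMIT_NOFILE)[0])
```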
A practical example limits concurrency with an asyncio.Semaphore set to 500:
<code>#!/usr/bin/env python
import asyncio
import time
from aiohttp import ClientSession

tasks = []
a = time.time()

async def request_run(i, semaphore):
    async with semaphore:
        async with ClientSession() as session:
            async with session.get(url='https://segmentfault.com/q/1010000011211509') as response:
                data = await response.read()
                tasks.append(i)

async def run():
    semaphore = asyncio.Semaphore(500)
    to_get = [request_run(i, semaphore) for i in range(1000)]
    await asyncio.wait(to_get)

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(run())
    loop.close()
    print(tasks)
    b = time.time()
    print(b - a)
</code>The article concludes with a reminder that the material is for learning purposes only.