Master Python File Downloads: Requests, Wget, urllib, Async & More

This tutorial walks through multiple Python approaches for downloading files—including simple requests and wget calls, handling redirects, large and multi‑file downloads, proxy usage, urllib/urllib3 methods, and asynchronous techniques—providing complete code snippets and practical tips for each scenario.

21CTO

Apr 21, 2020

Using Requests

Download a file by calling requests.get(url) and writing myfile.content to a local file.

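import requests
url = 'https://www.python.org/static/img/python-logo@2x.png'
myfile = requests.get(url)
myfile.raise_for_status()  # fail early on 4xx/5xx responses
# Use a context manager so the file is flushed and closed
with open('c:/users/21cto/downloads/PythonImage.png', 'wb') as f:
    f.write(myfile.content)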
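
The snippet above hard-codes the destination path. As a minimal sketch, the file name could instead be derived from the URL itself. The helper name filename_from_url and the fallback name download.bin are assumptions made for illustration, not part of the original example.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def filename_from_url(url, default="download.bin"):
    """Return the last path segment of a URL, or a fallback when there is none."""
    # urlsplit separates the path from the query string and fragment
    path = urlsplit(url).path
    # URL paths always use forward slashes, so posixpath is the right tool
    name = posixpath.basename(unquote(path))
    return name or default


# filename_from_url("https://example.com/files/report.pdf?version=2") -> "report.pdf"
# filename_from_url("https://www.python.org/") -> "download.bin"
```

The result can be joined onto a download directory with os.path.join before opening the file for writing.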

Using wget

Install the wget module with pip install wget and download a file via wget.download(url, path).

import wget
url = "https://www.python.org/static/img/python-logo@2x.png"
wget.download(url, 'c:/users/LikeGeeks/downloads/pythonLogo.png')

Downloading Redirected Files

requests.get already follows redirects by default, so passing allow_redirects=True only makes that behavior explicit. The content of the final response is then written to a file.

import requests
url = 'https://readthedocs.org/projects/python-guide/downloads/pdf/latest/'
myfile = requests.get(url, allow_redirects=True)
with open('c:/users/21cto/documents/PythonBook.pdf', 'wb') as f:
    f.write(myfile.content)

Downloading Large Files

Stream the response and write it in chunks to avoid loading the entire file into memory.

import requests
url = 'https://www.python.org/static/img/python-logo@2x.png'
myfile = requests.get(url, stream=True)
with open('c:/users/21cto/downloads/PythonImage.png', 'wb') as f:
    for chunk in myfile.iter_content(chunk_size=1024):
        if chunk:
            f.write(chunk)
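
A chunked download that fails halfway leaves a truncated file under the final name. One way to guard against that, sketched here under assumptions: write to a temporary name and rename only once every chunk has arrived. The function name save_stream and the .part suffix are hypothetical. The chunks argument can be any iterable of bytes, for example the iter_content() iterator of a streamed requests response.

```python
import os


def save_stream(chunks, path):
    """Write an iterable of byte chunks to path and return the number of bytes written."""
    part_path = path + ".part"  # temporary name while the download is incomplete
    written = 0
    with open(part_path, "wb") as f:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                f.write(chunk)
                written += len(chunk)
    # Rename only after every chunk arrived, so a partial file never gets the final name
    os.replace(part_path, path)
    return written


# Usage with requests (not executed here):
# response = requests.get(url, stream=True)
# save_stream(response.iter_content(chunk_size=8192), 'PythonImage.png')
```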

Parallel / Batch Downloads

Use ThreadPool from multiprocessing.pool to download multiple URLs concurrently.

import requests, time
from multiprocessing.pool import ThreadPool

def url_response(item):
    path, url = item
    r = requests.get(url, stream=True)
    with open(path, 'wb') as f:
        for chunk in r.iter_content(chunk_size=8192):
            f.write(chunk)
    return path

urls = [
    ("c:/users/21cto/file1.pdf", "https://example.com/file1.pdf"),
    ("c:/users/21cto/file2.pdf", "https://example.com/file2.pdf"),
    # ... more URLs ...
]
start = time.time()
with ThreadPool(9) as pool:
    # imap_unordered is lazy: iterate over it so the script waits for every download
    for path in pool.imap_unordered(url_response, urls):
        print(f"Finished {path}")
print(f"Time to download: {time.time() - start}")
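
When many files are fetched at once, it helps to know which ones failed. The following is a minimal sketch using the standard library's concurrent.futures, a different tool from the ThreadPool used above. The names download_all and fetch are hypothetical; fetch stands for any function that downloads one URL to one path.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def download_all(items, fetch, max_workers=4):
    """Run fetch(path, url) for every (path, url) pair; return (succeeded, failed)."""
    succeeded, failed = [], {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its destination path for reporting
        futures = {pool.submit(fetch, path, url): path for path, url in items}
        for future in as_completed(futures):
            path = futures[future]
            try:
                future.result()  # re-raises any exception from the worker
                succeeded.append(path)
            except Exception as exc:
                failed[path] = exc
    return sorted(succeeded), failed
```

The failed dictionary maps each destination path to the exception that was raised, so the caller can retry or log those entries.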

Using urllib

The standard library urllib.request.urlretrieve can download a URL directly to a local path.

import urllib.request
urllib.request.urlretrieve('https://www.python.org/', 'c:/users/21cto/documents/PythonOrganization.html')
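
The Python documentation lists urlretrieve under its legacy interface. A possible alternative, sketched here with urlopen and shutil.copyfileobj, streams the response to disk and allows a timeout. The function name download and the 30-second default are assumptions.

```python
import shutil
import urllib.request


def download(url, path, timeout=30):
    """Stream url to path with urlopen and return the path."""
    # The timeout (in seconds) applies to blocking network operations
    with urllib.request.urlopen(url, timeout=timeout) as response, open(path, "wb") as out_file:
        shutil.copyfileobj(response, out_file)  # copies in fixed-size blocks
    return path


# Usage (not executed here):
# download('https://www.python.org/', 'PythonOrganization.html')
```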

Downloading via Proxy

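Create a ProxyHandler, build an opener from it, and install that opener so that urlretrieve routes its requests through the proxy server. With requests, pass a proxies dictionary instead.

import urllib.request
import requests

# Map each URL scheme to the proxy; the target below is https, so 'https' must be listed
myProxy = urllib.request.ProxyHandler({'http': 'http://127.0.0.2:3001', 'https': 'http://127.0.0.2:3001'})
openProxy = urllib.request.build_opener(myProxy)
# urlretrieve uses the globally installed opener, so install it first
urllib.request.install_opener(openProxy)
urllib.request.urlretrieve('https://www.python.org/', 'c:/users/21cto/documents/PythonOrg.html')
# Using requests with proxies
myProxy = {'http': 'http://127.0.0.2:3001', 'https': 'http://127.0.0.2:3001'}
requests.get('https://www.python.org/', proxies=myProxy)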
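
An opener can also be used directly through its open() method, which keeps the proxy setting local instead of changing urllib's global state. A minimal sketch follows; build_proxy_opener is a hypothetical helper name, and the proxy address is the same placeholder used above.

```python
import urllib.request


def build_proxy_opener(proxy_url):
    """Return an opener that sends http and https requests through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


# Usage (assumes a proxy is actually listening at this placeholder address):
# opener = build_proxy_opener("http://127.0.0.2:3001")
# with opener.open("https://www.python.org/", timeout=30) as response:
#     html = response.read()
```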

Using urllib3

Install urllib3 and use its PoolManager to fetch content, then write it with shutil.copyfileobj.

pip install urllib3

import urllib3, shutil
c = urllib3.PoolManager()
url = 'https://www.python.org/'
filename = 'mytest.txt'
with c.request('GET', url, preload_content=False) as res, open(filename, 'wb') as out_file:
    shutil.copyfileobj(res, out_file)

Asynchronous Downloads

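Leverage asyncio to run multiple downloads concurrently. Because urllib.request.urlopen is a blocking call, each download is handed to a worker thread so the event loop can overlap them, and each URL is saved under its own file name.

import asyncio, shutil, urllib.request

def fetch(url):
    # Blocking download; the file name uses the last path segment of the URL
    filename = f"coroutine_download_{url.rstrip('/').rsplit('/', 1)[-1]}.txt"
    with urllib.request.urlopen(url) as r, open(filename, 'wb') as f:
        shutil.copyfileobj(r, f)
    return f'Download succeeded: {filename}'

async def coroutine(url):
    # asyncio.to_thread (Python 3.9+) keeps the blocking call off the event loop
    return await asyncio.to_thread(fetch, url)

async def main_func(urls):
    # gather accepts coroutines directly and returns results in input order
    results = await asyncio.gather(*(coroutine(u) for u in urls))
    for r in results:
        print(r)

urls_to_download = [
    "https://www.python.org/events/python-events/801/",
    "https://www.python.org/events/python-events/790/",
    "https://www.python.org/events/python-user-group/816/",
    "https://www.python.org/events/python-events/757/"
]
asyncio.run(main_func(urls_to_download))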
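
When the URL list is long, it can help to cap how many downloads run at once. The following is a minimal sketch, assuming a blocking fetch(url) function such as one built on urllib.request.urlopen. The names download_many, fetch, and limit are hypothetical.

```python
import asyncio


async def download_many(urls, fetch, limit=3):
    """Run the blocking fetch(url) for each URL, at most `limit` at a time."""
    semaphore = asyncio.Semaphore(limit)
    loop = asyncio.get_running_loop()

    async def worker(url):
        async with semaphore:
            # run_in_executor moves the blocking call to the default thread pool
            return await loop.run_in_executor(None, fetch, url)

    # gather returns results in the same order as the input URLs
    return await asyncio.gather(*(worker(u) for u in urls))


# Usage (not executed here):
# results = asyncio.run(download_many(urls_to_download, fetch, limit=2))
```

The semaphore bounds the number of in-flight downloads, which avoids opening a connection for every URL at the same moment.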

These examples demonstrate a wide range of Python techniques for downloading files, from simple synchronous calls to advanced asynchronous and parallel solutions.
