Build a Simple Python Image Scraper on macOS – Step‑by‑Step Guide
This tutorial walks you through setting up a macOS environment, inspecting a web page, and writing a Python script with the requests library to locate and download all images from a target site, complete with code explanations and execution tips.
Introduction
The article provides a beginner‑friendly guide for creating a Python web‑scraper that downloads images from a website, aimed at someone with little programming experience. It covers environment preparation on a Mac, using Chrome’s developer tools to locate image URLs, and writing a complete script with detailed explanations.
Environment Setup
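Since the Mac already includes Python, the only additional requirement is the requests library. Open the Terminal application and install it with pip, e.g., pip install requests. On recent macOS versions, where only Python 3 may be present, the equivalent command is typically pip3 install requests.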
Libraries are pre‑written code modules that let you reuse functionality without reinventing the wheel.
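Before moving on, it can help to confirm that the library is visible to the interpreter that will run the script. The following is a minimal sketch using only the standard library; the helper name is_installed is made up for this example and is not part of the original script.

```python
import importlib.util


def is_installed(module_name):
    """Return True if this interpreter can find the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("requests"):
    print("requests is available")
else:
    print("requests is missing; install it with pip first")
```

If the message says the library is missing even after installation, pip and python are probably pointing at different Python installations.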
Preparing the Target
Open the target website (e.g., the “Qdaily” news site) in Chrome, right‑click on the page and choose Inspect (Inspect Element in older Chrome versions) to view the page's HTML. Use the magnifying‑glass tool to highlight images and observe that the image tags follow a pattern such as <img class="pic" src="…". The alt attribute contains the article title.
Understanding this pattern allows the scraper to extract the image URLs reliably.
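To see how that pattern translates into extraction, here is a small sketch run against invented markup. The sample HTML, its paths, and the attribute order are assumptions made for illustration; the real page may differ.

```python
import re

# Hypothetical markup modelled on the pattern described above; the real
# page's attribute order and paths may differ.
sample_html = '''
<img class="pic" src="/images/first.jpg" alt="First article title">
<img class="logo" src="/images/logo.png" alt="Site logo">
<img class="pic" src="/images/second.jpg" alt="Second article title">
'''

# Capture the src and alt values of every tag whose class is "pic".
pattern = re.compile(r'<img class="pic" src="(.*?)" alt="(.*?)"')
pairs = pattern.findall(sample_html)

for src, alt in pairs:
    print(alt, "->", src)
```

Only the two tags with class "pic" are matched; the logo image is skipped because its class differs.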
Writing the Scraper
Use Visual Studio Code (or any editor) to create a file named picdownloader.py. The script includes the following key parts:
#-*-coding:utf8-*- – declares UTF‑8 encoding for the source file.
import re and import requests – import the regular‑expression and HTTP‑request libraries.
Fetch the page source and store it in a variable, e.g., html = requests.get('http://www.qdaily.com/...').text.
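A network request can fail or hang, so a slightly more defensive version of this step may be useful. The sketch below is one possible shape, not the original author's code: the function name fetch_html and its client parameter are hypothetical, and the 10-second timeout is an arbitrary choice.

```python
def fetch_html(url, client=None, timeout=10):
    """Return the page source as text, or raise if the request fails.

    client is any object with a requests-style get(); it defaults to the
    requests module and exists mainly so the function can be tried offline.
    """
    if client is None:
        import requests  # imported here so a stub client needs no install
        client = requests
    response = client.get(url, timeout=timeout)
    response.raise_for_status()  # turn 4xx/5xx responses into exceptions
    return response.text
```

Called with the page address used above, this returns the same text as the one-line version, but stops with an error instead of silently saving an error page.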
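Extract image URLs with a regular expression, "pic" src="(.*?)", storing the matches in pic_url (for example with re.findall). The non‑greedy group (.*?) captures only the text between the quotes that follow src=, which is the image path.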
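The question mark in (.*?) matters more than it looks. The sketch below compares the non-greedy pattern with a greedy variant on an invented one-line fragment; the HTML and its paths are assumptions for illustration only.

```python
import re

# Hypothetical one-line fragment with two images; the paths are invented.
html = '<img class="pic" src="/a.jpg" alt="A"><img class="pic" src="/b.jpg" alt="B">'

# Non-greedy: each match stops at the first closing quote after src=".
pic_url = re.findall(r'"pic" src="(.*?)"', html)

# Greedy, for comparison: the group runs on to the last quote on the line.
greedy = re.findall(r'"pic" src="(.*)"', html)

print(pic_url)
print(greedy)
```

The non-greedy pattern yields one entry per image, while the greedy one swallows both tags into a single unusable match.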
Iterate over each URL:
for each in pic_url:
    url = 'http://www.qdaily.com' + each
    print('now downloading:' + url)
    pic = requests.get(url)
    with open('pic/' + each.split('/')[-1], 'wb') as f:
        f.write(pic.content)
The loop concatenates the base site URL with the relative path, prints a progress message, downloads the image, and saves it into a local pic folder.
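Plain string concatenation assumes every extracted value is a site-relative path. As a hedged alternative, the sketch below builds the URL with urljoin, which also handles values that are already absolute, and derives the file name from the URL path. The function names plan_downloads and download_all are hypothetical, and the timeout and status check are additions not present in the original loop.

```python
import os
from urllib.parse import urljoin, urlsplit

BASE_URL = 'http://www.qdaily.com'


def plan_downloads(paths, base_url=BASE_URL, folder='pic'):
    """Pair each extracted src value with its full URL and local file path."""
    plan = []
    for path in paths:
        # urljoin also copes with src values that are already absolute.
        url = urljoin(base_url, path)
        # Take the name from the URL path so any ?query part is dropped.
        filename = os.path.basename(urlsplit(url).path)
        plan.append((url, os.path.join(folder, filename)))
    return plan


def download_all(paths):
    """Download every image in paths into the pic folder (which must exist)."""
    import requests  # third-party; imported here so plan_downloads works without it

    for url, target in plan_downloads(paths):
        print('now downloading:' + url)
        pic = requests.get(url, timeout=10)
        pic.raise_for_status()  # stop on 4xx/5xx instead of saving an error page
        with open(target, 'wb') as f:
            f.write(pic.content)
```

Separating the planning step from the download makes it easy to print the plan and inspect it before any file is fetched.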
Running the Program
Before execution, create a pic directory in the folder you will run the script from (normally the folder that contains picdownloader.py); without it, the open call fails because the target path does not exist. Then run the script from the terminal with python picdownloader.py. The terminal will display a progress message for each image.
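The manual step can also be handled by the script itself. This is a small sketch using the standard library's pathlib; the helper name ensure_folder is an assumption for illustration and is not part of the original script.

```python
from pathlib import Path


def ensure_folder(name='pic'):
    """Create the folder if it is missing and return its path."""
    folder = Path(name)
    folder.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    return folder
```

Calling ensure_folder() once near the top of the script, before the download loop, would make the run repeatable. A relative name such as pic is resolved against the directory the script is started from.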
Following these steps yields a functional image‑scraping tool that can be adapted to other sites by adjusting the URL and regular‑expression pattern.
ITPUB