
Using Scrapeasy: A One‑Line Python Library for Web Scraping and Media Download

This article introduces Scrapeasy, a Python library that enables one‑line web scraping, extraction of images, videos, PDFs and other files, and demonstrates installation, basic usage, subpage link retrieval, media downloading, and advanced link handling with clear code examples.

Python Programming Learning Circle

If you are looking for a powerful Python web‑scraping tool, Scrapeasy lets you start crawling and extracting data with just a single line of code.

Scrapeasy is a Python library that can easily scrape web pages and extract data, supporting single‑page or multi‑page crawling as well as extracting data from PDF and HTML tables.

Key features include whole‑site crawling (not limited to a single page), built‑in handling of common scraping tasks (links, images, videos), and retrieval of special file types such as .php or .pdf.

Installation

<code>$ pip install scrapeasy</code>

Basic usage

<code>from scrapeasy import Website, Page</code>

Initialize a website

<code>web = Website("https://tikocash.com/solange/index.php/2022/04/13/how-do-you-control-irrational-fear-and-overthinking/")</code>

Retrieve all sub‑page links:

<code>links = web.getSubpagesLinks()</code>

Note that the returned URLs may lack the http://www. prefix; add it before making actual requests.
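The prefix fix described in the note can be sketched with a small helper. This is an illustration only: `ensure_prefix` is a hypothetical name, not part of Scrapeasy, and the example links are made up. It assumes each link is either a full URL or a bare host plus path.

```python
def ensure_prefix(link: str, scheme: str = "http") -> str:
    """Return `link` with a scheme (and "www.") prepended when missing.

    Hypothetical helper, not part of Scrapeasy.
    """
    link = link.strip()
    # Already a full URL: leave it untouched.
    if link.startswith(("http://", "https://")):
        return link
    # Host already starts with "www.": only the scheme is missing.
    if link.startswith("www."):
        return f"{scheme}://{link}"
    # Bare host/path: add the full "http://www." prefix from the note.
    return f"{scheme}://www.{link}"


# Made-up sample links in the shape the note describes.
sample_links = ["tikocash.com/solange", "https://tikocash.com/about"]
request_ready = [ensure_prefix(link) for link in sample_links]
```

Whether a site actually needs the `www.` part depends on the site, so it is worth checking one normalized URL by hand before requesting all of them.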

Find images

<code>images = web.getImages()</code>

The call returns links to all images found on the site.
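Once you have a list of image links, a quick tally by file extension shows what kind of files were found. The sketch below uses only the standard library and assumes the links are plain strings; `extension_of` and `count_by_extension` are hypothetical helpers, not Scrapeasy functions.

```python
from collections import Counter
from pathlib import PurePosixPath
from urllib.parse import urlparse


def extension_of(link: str) -> str:
    """Return the lowercase file extension of a link, or "" if it has none."""
    # Links without a scheme would be parsed as a path only, so add one.
    url = link if "://" in link else "http://" + link
    return PurePosixPath(urlparse(url).path).suffix.lower()


def count_by_extension(links):
    """Count links per file extension (hypothetical helper)."""
    return Counter(extension_of(link) for link in links)
```

For example, `count_by_extension(images)` could be used on the list returned by `web.getImages()` to see how many `.png` versus `.jpg` files the site links to.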

Download media

<code>web.download("img", "fahrschule/images")</code>

This command downloads every image (identified by the keyword img) to the specified local folder.

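Retrieve links

With domain=True and both intern and extern disabled, the call returns only the linked domains:

<code>domains = web.getLinks(intern=False, extern=False, domain=True)</code>

To get the full external links instead of just their domains: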

<code>domains = web.getLinks(intern=False, extern=True, domain=False)</code>
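To make the distinction between internal links, external links, and domains concrete, here is a minimal standard‑library sketch of the same idea. It is not how Scrapeasy implements `getLinks`; `host_of` and `split_links` are hypothetical helpers, and the sketch assumes every link is either a full URL or a bare host plus path.

```python
from urllib.parse import urlparse


def host_of(link: str) -> str:
    """Return the host of a link, lowercased and without a leading "www."."""
    url = link if "://" in link else "http://" + link
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def split_links(links, site):
    """Split links into internal links, external links, and external domains."""
    site_host = host_of(site)
    internal = [link for link in links if host_of(link) == site_host]
    external = [link for link in links if host_of(link) != site_host]
    # A set removes duplicate domains; sorting keeps the result stable.
    domains = sorted({host_of(link) for link in external})
    return internal, external, domains
```

Treating `www.example.com` and `example.com` as the same host is a simplification chosen for this sketch; other subdomains are treated as different hosts.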

Working with a single page

<code>w3 = Page("https://www.w3schools.com/html/html5_video.asp")</code>

Download all videos from the page, then list their links:

<code>w3.download("video", "w3/videos")</code>
<code>video_links = w3.getVideos()</code>

Retrieve or download other file types, such as PHP or PDF files:

<code>calendar_links = Page("https://tikocash.com").get("php")</code>
<code>Page("http://mathcourses.ch/mat182.html").download("pdf", "mathcourses/pdf-files")</code>
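When several file types are needed from one page, the `download(type, folder)` call shown above can be wrapped in a loop. The sketch below is one possible way to do this, not part of Scrapeasy: `download_all` is a hypothetical helper that accepts any object with a `download(kind, folder)` method, and it records failures instead of stopping at the first one.

```python
def download_all(page, targets):
    """Call page.download(kind, folder) for every entry in `targets`.

    `page` is any object exposing download(kind, folder), such as the
    Page objects used above. `targets` maps a file type ("pdf", "img",
    "video", ...) to a local folder. Returns a status string per type.
    """
    results = {}
    for kind, folder in targets.items():
        try:
            page.download(kind, folder)
            results[kind] = "ok"
        except Exception as exc:  # keep going if one file type fails
            results[kind] = f"failed: {exc}"
    return results
```

Assuming Scrapeasy is installed, a call might look like `download_all(Page("http://mathcourses.ch/mat182.html"), {"pdf": "mathcourses/pdf-files"})`. Catching every exception is a deliberate simplification for a sketch; narrower handling is preferable in real code.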

In summary, Scrapeasy wraps common scraping tasks (collecting links, listing images and videos, and downloading files by type) in single method calls, which makes it a convenient starting point for web scraping and data mining in Python.

Written by

Python Programming Learning Circle

A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.
