Tagged articles
61 articles
Python Crawling & Data Mining
Aug 26, 2025 · Fundamentals

Mastering Python Web Scraping: Clean Price Extraction with XPath Tricks

This article walks through a Python web‑scraping problem, demonstrates why the original XPath extraction returns noisy or missing price data, and provides multiple refined code solutions—including filtering empty entries, correcting XPath selectors, and using map‑filter techniques—to produce clean, formatted price lists.

Python · data cleaning · lxml
0 likes · 6 min read
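As a rough illustration of the filtering step this summary mentions, the sketch below cleans a list of raw strings of the kind an XPath `text()` query can return. The sample values and the helper name `clean_prices` are hypothetical and are not taken from the article.

```python
def clean_prices(raw_values):
    """Strip whitespace from each entry and drop the ones left empty."""
    stripped = map(str.strip, raw_values)
    # filter(None, ...) keeps only truthy items, i.e. non-empty strings
    return list(filter(None, stripped))


# Hypothetical raw output of an XPath text() query: prices mixed with
# whitespace-only text nodes.
raw = ["\n    ", "¥12.50", "  ", "\n¥8.00\n", ""]
prices = clean_prices(raw)
print(prices)
```

Combining `map` and `filter` keeps the cleanup to a single pass over the extracted values. A list comprehension with a condition would do the same job.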
Code Mala Tang
Apr 19, 2025 · Fundamentals

Master HTML Parsing in Python: BeautifulSoup, lxml, and html.parser Compared

Learn why HTML parsing is essential for web scraping, explore three popular Python libraries—BeautifulSoup, lxml, and the built‑in html.parser—covering installation, core usage, advanced techniques, and a comparative analysis to help you choose the right tool for your project.

Python · beautifulsoup · html-parsing
0 likes · 11 min read
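For a sense of what the built-in option looks like, here is a minimal sketch that collects link targets with the standard library's `html.parser`. The class name `LinkCollector` and the sample markup are hypothetical and are not drawn from the article.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


# Hypothetical sample markup
html = '<p><a href="/first">One</a> and <a href="/second">Two</a></p>'
collector = LinkCollector()
collector.feed(html)
print(collector.links)
```

The event-driven style needs no third-party install, at the cost of more boilerplate than a tree-based API such as BeautifulSoup or lxml.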
Python Crawling & Data Mining
Apr 18, 2024 · Backend Development

How to Scrape GDP Data with Python and Save to CSV in Minutes

This article demonstrates how to use Python's requests, lxml, and pandas libraries to crawl GDP data from a website, parse the HTML tables, and efficiently write the extracted rankings, regions, GDP values, and years into a CSV file, providing a complete, runnable example for web scraping beginners.

CSV · Python · Web Scraping
0 likes · 8 min read
Python Crawling & Data Mining
Feb 13, 2023 · Backend Development

Master Python Web Scraping & Data Extraction with Requests, lxml, pandas

This article walks through a Python web‑scraping solution that fetches GDP data from a website using the requests library, parses HTML with lxml, and demonstrates two approaches—manual XPath extraction and a streamlined pandas.read_html method—while providing complete code snippets and tips for handling pagination and data storage.

Data Extraction · Web Scraping · lxml
0 likes · 6 min read
Python Programming Learning Circle
Sep 30, 2021 · Backend Development

Python Web Scraper for VIP Anime Collection

This article demonstrates how to build a Python web scraper using requests, lxml, regular expressions, and tqdm to locate, extract, and download video files from a VIP anime website, covering header configuration, XPath parsing, URL reconstruction, and file saving.

anime · lxml · tqdm
0 likes · 6 min read
Python Crawling & Data Mining
Apr 5, 2021 · Fundamentals

Master XPath and lxml: A Complete Guide to XML Parsing in Python

This article provides a comprehensive tutorial on XPath concepts, node types, syntax, axes, predicates, and operators, followed by a detailed introduction to the fast Python lxml library, its installation, usage for both offline and online HTML parsing, and practical code examples for extracting elements and attributes.

Python · XML parsing · lxml
0 likes · 11 min read
Python Crawling & Data Mining
Dec 31, 2020 · Backend Development

How to Scrape Thousands of New‑House Listings in Python: A Step‑by‑Step Guide

This article demonstrates how to use Python's requests, fake_useragent, and lxml libraries to batch‑scrape nearly a thousand new‑house listings from the 惠民之家 website, extracting 41 fields such as name, price, layout, opening date, plot ratio and green ratio, while handling pagination and anti‑scraping measures.

CSV · Python · Real Estate Data
0 likes · 9 min read
Python Programming Learning Circle
Dec 19, 2020 · Fundamentals

XPath Basics and lxml Usage in Python

This article introduces the fundamentals of XPath syntax, common rules, and example expressions, then explains how to use the lxml library in Python for HTML/XML parsing, including practical tips and a complete code example for extracting links and text from a sample document.

Python · Web Scraping · XML
0 likes · 6 min read
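The kind of link-and-text extraction this summary describes can be sketched as follows. Note the substitution: this uses the standard library's `xml.etree.ElementTree`, which supports only a limited subset of XPath, as a stand-in for lxml. The sample document is hypothetical and is not the article's own.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document (well-formed, so ElementTree can parse it)
doc = """<div>
  <ul>
    <li><a href="link1.html">first item</a></li>
    <li><a href="link2.html">second item</a></li>
  </ul>
</div>"""

root = ET.fromstring(doc)

# ".//li/a" selects every <a> that is a direct child of any <li>
anchors = root.findall(".//li/a")
links = [a.get("href") for a in anchors]
texts = [a.text for a in anchors]

print(links)
print(texts)
```

With lxml, the equivalent queries would typically go through its `xpath()` method, which accepts full XPath 1.0 expressions such as `//li/a/@href`.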
Python Crawling & Data Mining
Jun 5, 2020 · Backend Development

Build a Python Image Scraper for 51miz.com in Minutes

This tutorial walks you through creating a Python web scraper that fetches image URLs from 51miz.com using requests and lxml, filters them with regular expressions, downloads the images, and demonstrates the complete workflow with code snippets and screenshots.

Python · Web Scraping · XPath
0 likes · 5 min read
Python Crawling & Data Mining
Jun 2, 2020 · Backend Development

How to Build a Python Scraper for Youdao Mobile Translation API

This tutorial walks you through using Python's requests and lxml libraries to reverse‑engineer the Youdao mobile translation interface, construct the required form parameters, send POST requests, parse the returned HTML with XPath, and display translated results for multiple languages.

Translation API · lxml · web-scraping
0 likes · 6 min read
Python Crawling & Data Mining
Nov 21, 2019 · Backend Development

Essential Python Web Scraping Libraries Every Developer Should Know

This guide introduces the most important Python libraries for web scraping—including requests, urllib3, Selenium, aiohttp, BeautifulSoup, lxml, pyquery, PyMySQL, PyMongo, and redisdump—explaining their core features, typical use cases, and providing concise code examples to help beginners get started quickly.

aiohttp · beautifulsoup · lxml
0 likes · 7 min read
MaGe Linux Operations
Jul 2, 2019 · Backend Development

Master Web Scraping with BeautifulSoup: A Complete Python Guide

This tutorial introduces BeautifulSoup, a powerful Python library for parsing HTML and XML, covering installation, basic usage, tag selection, attribute extraction, navigation of parent and sibling nodes, method and CSS selectors, and best‑practice recommendations for efficient web data extraction.

Data Extraction · Python · Web Scraping
0 likes · 30 min read