How to Fix Common XPath Errors in Python Web Scraping – A Step-by-Step Guide

An experienced Python developer walks through a real-world web‑scraping issue, showing how a faulty XPath selector caused empty results, then provides corrected code, execution screenshots, and best practices like adding request headers, helping readers quickly resolve similar problems.

Python Crawling & Data Mining

Sep 2, 2022

Introduction

A member of a Python community asked for help with a web‑scraping selector: their XPath expression did not return the expected results.

Problem

The original code evaluated the XPath below and printed the result, but the output was not the expected joke text.

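from lxml import etree
import requests

url = "http://www.xiaohua.com/duanzi/"

# Fetch the page and parse the response body into an element tree.
resp = requests.get(url)
html = etree.HTML(resp.text)

print('*---*' * 20)

# Original (faulty) selector, kept as posted: an absolute path that names every ancestor.
result = html.xpath("/html/body/div[@class='main']/div[@class='content']/div[@class='grid clearfix']/div[@class='content-left']/div[@class='one-cont'][*]/p[@class='fonts']")
print(type(result))
print(result)
print('*-*' * 20)

# Print at most the first two matched nodes, together with the total match count.
for b, node in enumerate(result, start=1):
    print(node, len(result))
    print(b, etree.tostring(node).decode('utf-8'))
    if b > 1:
        break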

The issue was identified as an incorrect XPath expression.
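
One way to find which part of a long absolute XPath is wrong is to evaluate it one step at a time and stop at the first step that matches nothing. The sketch below illustrates that idea; it is not the method used in the original discussion. The sample markup is a hypothetical stand-in that borrows class names from the selector above (the real page may be laid out differently), and `first_failing_step` is an illustrative helper, not part of lxml. Note also that in XPath a `[*]` predicate keeps only elements that have at least one child element; it does not act as a wildcard index.

```python
from lxml import etree

# Hypothetical markup: class names are taken from the original selector,
# but the nesting is an assumption made for this example.
SAMPLE_HTML = """<html><body><div class="main"><div class="content">
<div class="one-cont"><p class="fonts">text only</p></div>
</div></div></body></html>"""

# The original absolute path, split into its individual location steps.
STEPS = [
    "html",
    "body",
    "div[@class='main']",
    "div[@class='content']",
    "div[@class='grid clearfix']",
    "div[@class='content-left']",
    "div[@class='one-cont'][*]",
    "p[@class='fonts']",
]


def first_failing_step(tree, steps):
    """Return the index of the first step whose cumulative path matches nothing, or None."""
    path = ""
    for index, step in enumerate(steps):
        path += "/" + step
        if not tree.xpath(path):
            return index
    return None


tree = etree.HTML(SAMPLE_HTML)
failing = first_failing_step(tree, STEPS)
if failing is None:
    print("every step matched")
else:
    print("no match from step", failing, "->", STEPS[failing])
```

Against the real page, the same loop can be run on the tree parsed from `resp.text` to see where the posted selector stops matching.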

Solution

A community member provided a corrected XPath and updated code. After running the revised script, the desired joke text was extracted correctly.
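
The corrected selector itself is not reproduced in the text, so the following is only a minimal sketch of one common way to make such a selector less brittle: a short relative path anchored on the innermost class names instead of the full ancestor chain. The sample markup and the `extract_jokes` helper are assumptions for illustration; only the class names `one-cont` and `fonts` come from the original code.

```python
from lxml import etree

# Hypothetical markup: only the class names come from the original selector.
SAMPLE_HTML = """
<html><body>
  <div class="main"><div class="content">
    <div class="one-cont"><p class="fonts"><a href="#">First joke</a></p></div>
    <div class="one-cont"><p class="fonts"><a href="#">Second joke</a></p></div>
  </div></div>
</body></html>
"""


def extract_jokes(html_text):
    """Return the stripped text of each <p> with class 'fonts' inside a 'one-cont' div."""
    tree = etree.HTML(html_text)
    # '//' matches at any depth, so intermediate wrappers can change freely.
    # contains() is a substring test, so it also tolerates extra class names
    # (and would match a class such as 'one-cont-wide' as well).
    nodes = tree.xpath(
        "//div[contains(@class, 'one-cont')]//p[contains(@class, 'fonts')]"
    )
    # string(.) concatenates all descendant text of the matched node.
    return [node.xpath("string(.)").strip() for node in nodes]


jokes = extract_jokes(SAMPLE_HTML)
print(len(jokes), jokes)
```

Whether a relative path like this matches the live page depends on its actual markup, which should be checked in the browser's developer tools first.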

A key point is to send appropriate request headers, such as a browser-like User-Agent, so that the site does not block the request.
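
As a rough sketch of what sending headers looks like with `requests`: the header values below are placeholders chosen for illustration, not the ones used in the original fix, and `build_request` and `fetch_html` are hypothetical helper names. `build_request` only prepares the request so the headers can be inspected without touching the network.

```python
import requests

# Placeholder header values (assumptions); adjust them to what the target site expects.
HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/104.0 Safari/537.36",
    "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8",
}


def build_request(url):
    """Prepare a GET request with the custom headers, without sending it."""
    return requests.Request("GET", url, headers=HEADERS).prepare()


def fetch_html(url, timeout=10):
    """Send the GET request with the custom headers and return the response body."""
    resp = requests.get(url, headers=HEADERS, timeout=timeout)
    resp.raise_for_status()  # raise on 4xx/5xx instead of parsing an error page
    return resp.text


prepared = build_request("http://www.xiaohua.com/duanzi/")
print(prepared.headers["User-Agent"])
```

Calling `fetch_html(url)` in place of the bare `requests.get(url)` in the original script would pass the same headers on the real request; the timeout and `raise_for_status()` call are additions in this sketch.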

Conclusion

The article demonstrates how to diagnose and fix XPath problems in Python web scraping, offering a concrete example and best‑practice tips for reliable data extraction.
