Fetching JS-Rendered Pages in Python with DrissionPage, Playwright, Selenium

This article explains how to scrape JavaScript-rendered web pages with Python. It covers browser-automation tools such as Selenium, Playwright, and the DrissionPage library, provides a concrete code example that fetches and prints the rendered page HTML, and offers practical tips for working with JSON data sources.

Python Crawling & Data Mining

Apr 10, 2024

1. Introduction

Hello, I am a Python enthusiast. Recently a fan asked how to scrape a website that uses anti‑scraping measures and renders its content with JavaScript, so the page source returned by a plain HTTP request appears empty. The question drew a lot of discussion, so this article looks at how to obtain the HTML that JavaScript generates.

The fan specifically wants the HTML markup that JavaScript produces in the browser, not just the data values shown on the rendered page. Below is the solution.

2. Implementation Process

Experts pointed out that on pages that load their content asynchronously, the tags and data are not present in the static source; the data is delivered as JSON and assembled into HTML by JavaScript. To capture the rendered source you can use a browser-automation tool such as Selenium (not recommended), Playwright, or DrissionPage. The example below uses DrissionPage:

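from DrissionPage import WebPage

# WebPage starts in browser mode by default, so the page's JavaScript runs
page = WebPage()
# Load the page in the browser; timeout is in seconds
page.get('https://fx.cmbchina.com/hq', timeout=300)
# page.html holds the HTML of the page after rendering
html = page.html
print(html)
# Close the browser
page.quit()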
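
For comparison, here is a minimal sketch of the same idea using Playwright, one of the other tools mentioned above. It is not code from the original discussion: the function name fetch_rendered_html and the timeout_ms parameter are made up for illustration, and it assumes Playwright and its Chromium build are installed (pip install playwright, then playwright install chromium).

```python
def fetch_rendered_html(url: str, timeout_ms: int = 30_000) -> str:
    """Return the page's HTML after its JavaScript has run (headless Chromium)."""
    # Imported lazily so this module can be loaded without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            # "networkidle" waits until network traffic has gone quiet, which
            # usually means the JSON requests that fill the page have finished.
            page.goto(url, wait_until="networkidle", timeout=timeout_ms)
            # content() serializes the current DOM, i.e. the rendered HTML.
            return page.content()
        finally:
            browser.close()


# Example call (starts a real browser and needs network access):
# html = fetch_rendered_html("https://fx.cmbchina.com/hq")
# print(html)
```

Waiting for "networkidle" is a simple heuristic; on pages that poll continuously, waiting for a specific element is likely to be more reliable.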

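Opening the browser's network panel shows that the initial HTML document arrives without the content; all of the data is filled in afterwards by JavaScript. Requesting the URL directly with a plain HTTP client, which does not execute JavaScript, therefore returns an essentially empty page.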
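
One way to confirm this without opening the browser is to download the static source and count the table rows in it. The sketch below uses only the standard library; the helper names (TagCounter, count_tags, fetch_static_html) are illustrative, and it assumes the data you are after sits in table rows (tr elements), as it does on the rendered page described here.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class TagCounter(HTMLParser):
    """Count how many times a given tag opens in an HTML document."""

    def __init__(self, tag: str) -> None:
        super().__init__()
        self.tag = tag
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        if tag == self.tag:
            self.count += 1


def count_tags(html: str, tag: str = "tr") -> int:
    """Return the number of opening <tag> elements found in html."""
    parser = TagCounter(tag)
    parser.feed(html)
    parser.close()
    return parser.count


def fetch_static_html(url: str, timeout: float = 10.0) -> str:
    """Download the page as a plain HTTP client sees it (no JavaScript runs)."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Example call (needs network access):
# print(count_tags(fetch_static_html("https://fx.cmbchina.com/hq")))
```

If the count is zero for the static source but the rendered HTML contains rows, the table is being built by JavaScript.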

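After rendering, the page displays a table whose data comes from a JSON endpoint. When that endpoint can be requested on its own, fetching the JSON directly is far more efficient than scraping the rendered HTML, because no browser has to start and no markup has to be parsed. Modern sites commonly work this way: the server returns data as JSON and JavaScript inserts it into the DOM.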
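
A minimal sketch of fetching the JSON directly, using only the standard library. The article does not give the endpoint URL or the response layout, so API_URL, the "data" key, and the field names in the sample are all placeholders: copy the real request URL and inspect the real payload in the browser's network panel. Sites with anti‑scraping measures may also require specific headers or cookies that this sketch does not send.

```python
import json
from urllib.request import Request, urlopen

# Hypothetical endpoint; replace with the URL shown in the network panel.
API_URL = "https://example.com/api/rates"


def fetch_json(url: str, timeout: float = 10.0):
    """Request a JSON endpoint and return the decoded Python object."""
    request = Request(
        url,
        headers={"User-Agent": "Mozilla/5.0", "Accept": "application/json"},
    )
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def extract_rows(payload, list_key: str = "data") -> list:
    """Pull the record list out of a payload shaped like {"data": [{...}, ...]}.

    The shape is an assumption; adjust list_key to match the real response.
    """
    if not isinstance(payload, dict):
        return []
    rows = payload.get(list_key, [])
    if not isinstance(rows, list):
        return []
    # Keep only dict records and skip anything unexpected.
    return [row for row in rows if isinstance(row, dict)]


# Example call (needs network access and a real endpoint):
# for row in extract_rows(fetch_json(API_URL)):
#     print(row)
```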

Another contributor noted that DrissionPage's built‑in network listener, or an intercepting proxy such as mitmproxy, can capture the JSON responses while the page loads, which solves many similar problems.
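
The listener approach might look like the sketch below. It follows the listener interface as documented for DrissionPage 4.x (page.listen.start, page.listen.wait, and the packet's response.body), so check the names against the version you have installed. The functions capture_json_response and as_json, and the url_fragment argument, are hypothetical helpers introduced here for illustration.

```python
import json


def capture_json_response(page_url: str, url_fragment: str, timeout: float = 10.0):
    """Open page_url in a browser and return the first JSON response whose
    request URL contains url_fragment, or None if nothing matches in time."""
    # Imported lazily so this module can be loaded without DrissionPage installed.
    from DrissionPage import ChromiumPage

    page = ChromiumPage()
    try:
        # Start listening before navigating so the request is not missed.
        page.listen.start(url_fragment)
        page.get(page_url)
        packet = page.listen.wait(timeout=timeout)
        if not packet:
            return None
        return as_json(packet.response.body)
    finally:
        page.quit()


def as_json(body):
    """Return body as parsed JSON; None if it is not valid JSON."""
    if isinstance(body, (dict, list)):
        # Already decoded for us.
        return body
    if isinstance(body, (str, bytes)):
        try:
            return json.loads(body)
        except ValueError:
            return None
    return None


# Example call (starts a real browser; the fragment is a placeholder):
# data = capture_json_response("https://fx.cmbchina.com/hq", "api/")
```

Capturing the response inside the browser means any cookies or tokens the page sets are already in place, which can help when the endpoint rejects requests made outside the browser.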

The fan's issue was resolved successfully.

3. Conclusion

This article examined a Python web‑scraping challenge involving JavaScript‑rendered pages, presented a concrete code solution using DrissionPage, and offered practical advice for extracting JSON data directly, helping the community overcome similar obstacles.


