Fetching JS-Rendered Pages in Python with DrissionPage, Playwright, Selenium
This article explains how to handle JavaScript-rendered web pages when scraping with Python. It compares Selenium, Playwright, and the DrissionPage library, gives a concrete code example that fetches and prints the rendered page HTML, and offers practical tips for working with JSON data sources.
1. Introduction
Hello, I am a Python enthusiast. Recently a fan asked how to scrape a website that uses anti-scraping measures and renders its content with JavaScript, so the page source looks empty when fetched directly. The question drew a lot of discussion, and this article walks through how to obtain the HTML that the JavaScript generates.
The fan specifically needs the HTML markup as generated by JavaScript, not just the data values shown once the page is rendered. Below is the solution.
2. Implementation Process
Experts pointed out that on asynchronously loaded pages the tags and data are not present in the static source; the data is delivered as JSON and assembled into the page by JavaScript. To capture the rendered source you can use a browser automation tool such as Selenium (not recommended), Playwright, or DrissionPage.
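from DrissionPage import WebPage

# Start a browser-backed page object
page = WebPage()
try:
    # Load the page in the browser so its JavaScript runs (timeout is in seconds)
    page.get('https://fx.cmbchina.com/hq', timeout=300)
    # page.html holds the HTML of the page after rendering
    html = page.html
    print(html)
finally:
    # Close the browser even if loading fails
    page.quit()

Opening the network panel shows that the initial document arrives without the table content; all of the data is rendered by JavaScript. A plain HTTP request to the URL therefore returns a page with no data in it.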
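Playwright is named above as an alternative but no code is shown for it. As a rough sketch, assuming Playwright and its Chromium build are installed (`pip install playwright`, then `playwright install chromium`), the same fetch could look like the following. The function name `fetch_rendered_html` and the choice to wait for the network to go idle are illustrative assumptions, not part of the original discussion.

```python
def fetch_rendered_html(url, timeout_ms=30_000):
    """Load `url` in headless Chromium and return the HTML after JavaScript has run."""
    # Imported lazily so this sketch can be loaded even where Playwright is not installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            # "networkidle" waits until network activity has quieted down, which is
            # a common (not guaranteed) signal that the data requests have finished.
            page.goto(url, wait_until="networkidle", timeout=timeout_ms)
            # page.content() returns the serialized DOM, i.e. the rendered HTML.
            return page.content()
        finally:
            browser.close()
```

Calling `fetch_rendered_html('https://fx.cmbchina.com/hq')` should return the rendered markup as a string, provided the site does not block headless browsers.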
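After rendering, the page displays a table whose data originates from a JSON endpoint. When that endpoint can be called directly, fetching the JSON is far more efficient than launching a browser and scraping the rendered HTML. Many modern sites work this way: the server returns JSON and client-side JavaScript inserts it into the DOM, an approach the contributors described as both fast and secure.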
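To illustrate the direct-JSON approach, here is a minimal sketch using only the standard library. The real endpoint URL and response layout are not given in the discussion, so the `body` key and the sample payload below are made up for demonstration; you would find the actual URL and field names in the browser's network panel.

```python
import json
from urllib.request import Request, urlopen


def fetch_json(url, timeout=10.0):
    """GET `url` and decode the response body as JSON."""
    request = Request(
        url,
        headers={"User-Agent": "Mozilla/5.0", "Accept": "application/json"},
    )
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return json.loads(response.read().decode(charset))


def rows_from_payload(payload, key="body"):
    """Return the record dicts stored under `key`, or [] if the layout differs."""
    if not isinstance(payload, dict):
        return []
    rows = payload.get(key, [])
    if not isinstance(rows, list):
        return []
    return [row for row in rows if isinstance(row, dict)]


# Hypothetical payload standing in for a real response; the field names are invented.
sample = json.loads('{"body": [{"name": "AAA", "rate": "1.00"}, {"name": "BBB", "rate": "2.00"}]}')
for row in rows_from_payload(sample):
    print(row["name"], row["rate"])
```

In real use you would call `rows_from_payload(fetch_json(endpoint_url))` with the endpoint you observed, and some sites will also require cookies or extra headers before they answer.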
Another contributor noted that combining DrissionPage with its built‑in listener or with mitmproxy can solve many similar problems.
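The listener idea can be sketched roughly as below. This follows the DrissionPage 4.x listener interface as I understand it (`page.listen.start`, `page.listen.wait`, and the packet's `response.body`); check it against the version you have installed. The function name `capture_json_response` and the `keyword` argument are illustrative, and the URL fragment that identifies the data request is something you would read off the network panel.

```python
def capture_json_response(url, keyword, timeout=30):
    """Open `url` in a browser and return the body of the first response
    whose request URL contains `keyword`, or None if nothing matched in time."""
    # Imported lazily so this sketch can be loaded without DrissionPage installed.
    from DrissionPage import ChromiumPage

    page = ChromiumPage()
    try:
        # Start listening before navigating so the first data request is not missed.
        page.listen.start(keyword)
        page.get(url)
        packet = page.listen.wait(timeout=timeout)
        if not packet:
            return None
        # For JSON responses the body is expected to arrive already parsed.
        return packet.response.body
    finally:
        page.listen.stop()
        page.quit()
```

This keeps the browser in charge of cookies and any anti-scraping checks while still giving you the JSON rather than the rendered table.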
The fan's issue was resolved successfully.
3. Conclusion
This article examined a Python web‑scraping challenge involving JavaScript‑rendered pages, presented a concrete code solution using DrissionPage, and offered practical advice for extracting JSON data directly, helping the community overcome similar obstacles.
Python Crawling & Data Mining