Master Python‑JS Integration with PyExecJS, Js2Py & PyV8 for Web Scraping
This article introduces three Python libraries—PyExecJS, Js2Py, and PyV8—that enable execution of JavaScript within Python, showing how to set up runtimes, run code snippets, read JS files, and scrape encrypted web pages for more effective data extraction.
Introduction
We know Python can easily perform many tasks and even build simple GUIs, but executing JavaScript—especially scripts that encrypt web pages—requires a bridge. This article introduces three Python modules that can run JavaScript code and help decode protected content.
1. PyExecJS
PyExecJS is a Python library that can execute JavaScript scripts and interact with the JavaScript on web pages, making it possible to decrypt encrypted content that plain HTTP requests cannot handle. It relies on an external JavaScript runtime such as Node.js.
1.1 Basic usage
import execjs
aa = execjs.eval("'one|two|three'.split('|')")
print(aa)
e = execjs.compile('''
function add(x,y){
return x+y;
}
''')
print(e.call('add',10,20))You can also run code directly via the engine:
print(execjs.get().eval('1+1'))1.2 Inspecting the runtime
print(execjs.get().name)The default engine may be JScript; you can switch to Node.js by setting the environment variable:
import os
os.environ["EXECJS_RUNTIME"] = "Node"
print(execjs.get().name)2. Js2Py
Js2Py is a pure‑Python solution that does not require an external JavaScript runtime. It can execute JS files, though it may be slower.
2.1 Basic usage
import js2py
aa = js2py.eval_js('''
var i=0;
for(var c=1;c<6;c++){
console.log(c);
}
''')
print(aa)2.2 Reading a JS file
function f(aa){
if(aa>11){
console.log('OK');
}else{
console.log('Fail');
}
} import js2py
with open('1.js','r') as f:
aa = js2py.eval_js(f.read())
print(aa(11))2.3 Scraping a website
Example that fetches a Taobao page, extracts embedded JavaScript, modifies it, and executes it with PyExecJS to obtain the decrypted data:
import execjs, requests, re
url = 'https://ai.taobao.com/?pid=mm_26632323_6762370_25910879'
res = requests.get(url).text
js = re.findall(r'<script>(.*?)</script>', res)
js1 = re.sub(r'eval\(', 'return(', js[0])
html = "function getLego2WPK(){" + js1 + "};"
ctx = execjs.compile(html)
temp = ctx.call('getLego2WPK')
print(temp)3. PyV8
PyV8 wraps Google’s V8 engine but only works with Python 2 and is no longer maintained, so it is generally not recommended.
Conclusion
The article reviews three Python libraries—PyExecJS, Js2Py, and PyV8—that enable execution of JavaScript from Python, allowing more effective web scraping and decryption of protected content.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Python Crawling & Data Mining
Life's short, I code in Python. This channel shares Python web crawling, data mining, analysis, processing, visualization, automated testing, DevOps, big data, AI, cloud computing, machine learning tools, resources, news, technical articles, tutorial videos and learning materials. Join us!
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
