Master Python‑JS Integration with PyExecJS, Js2Py & PyV8 for Web Scraping

This article introduces three Python libraries—PyExecJS, Js2Py, and PyV8—that enable execution of JavaScript within Python, showing how to set up runtimes, run code snippets, read JS files, and scrape encrypted web pages for more effective data extraction.

Python Crawling & Data Mining
Python Crawling & Data Mining
Python Crawling & Data Mining
Master Python‑JS Integration with PyExecJS, Js2Py & PyV8 for Web Scraping

Introduction

We know Python can easily perform many tasks and even build simple GUIs, but executing JavaScript—especially scripts that encrypt web pages—requires a bridge. This article introduces three Python modules that can run JavaScript code and help decode protected content.

1. PyExecJS

PyExecJS is a Python library that can execute JavaScript scripts and interact with the JavaScript on web pages, making it possible to decrypt encrypted content that plain HTTP requests cannot handle. It relies on an external JavaScript runtime such as Node.js.

1.1 Basic usage

import execjs
aa = execjs.eval("'one|two|three'.split('|')")
print(aa)

e = execjs.compile('''
function add(x,y){
  return x+y;
}
''')
print(e.call('add',10,20))

You can also run code directly via the engine:

print(execjs.get().eval('1+1'))

1.2 Inspecting the runtime

print(execjs.get().name)

The default engine may be JScript; you can switch to Node.js by setting the environment variable:

import os
os.environ["EXECJS_RUNTIME"] = "Node"
print(execjs.get().name)

2. Js2Py

Js2Py is a pure‑Python solution that does not require an external JavaScript runtime. It can execute JS files, though it may be slower.

2.1 Basic usage

import js2py
aa = js2py.eval_js('''
var i=0;
for(var c=1;c<6;c++){
  console.log(c);
}
''')
print(aa)

2.2 Reading a JS file

function f(aa){
    if(aa>11){
        console.log('OK');
    }else{
        console.log('Fail');
    }
}
import js2py
with open('1.js','r') as f:
    aa = js2py.eval_js(f.read())
    print(aa(11))

2.3 Scraping a website

Example that fetches a Taobao page, extracts embedded JavaScript, modifies it, and executes it with PyExecJS to obtain the decrypted data:

import execjs, requests, re
url = 'https://ai.taobao.com/?pid=mm_26632323_6762370_25910879'
res = requests.get(url).text
js = re.findall(r'<script>(.*?)</script>', res)
js1 = re.sub(r'eval\(', 'return(', js[0])
html = "function getLego2WPK(){" + js1 + "};"
ctx = execjs.compile(html)
temp = ctx.call('getLego2WPK')
print(temp)

3. PyV8

PyV8 wraps Google’s V8 engine but only works with Python 2 and is no longer maintained, so it is generally not recommended.

Conclusion

The article reviews three Python libraries—PyExecJS, Js2Py, and PyV8—that enable execution of JavaScript from Python, allowing more effective web scraping and decryption of protected content.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

JavaScriptPythonWeb Scrapingjs2pyPyexecjsPyV8
Python Crawling & Data Mining
Written by

Python Crawling & Data Mining

Life's short, I code in Python. This channel shares Python web crawling, data mining, analysis, processing, visualization, automated testing, DevOps, big data, AI, cloud computing, machine learning tools, resources, news, technical articles, tutorial videos and learning materials. Join us!

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.