Simulating Zhihu Login with Python Using urllib and Fiddler
This article demonstrates how to automate Zhihu login on Windows: analyze network traffic with Fiddler, extract the required parameters, and implement a Python script that builds HTTP requests with urllib2, handles cookies, retrieves the captcha, and logs the results. Sample code and execution screenshots are included.
The tutorial begins by outlining the development environment (Windows 7, Python 2.7.5) and tools (Chrome, Fiddler) used to capture the HTTP requests made during a Zhihu login.
It then describes the three‑step simulation process: (1) monitor client‑server communication with Chrome’s Network panel or Fiddler, (2) extract the login URL and required parameters (_xsrf, email/phone_num, password, captcha, remember_me), and (3) programmatically send a POST request using Python.
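Step (2) can be sketched as follows. This is a minimal illustration, not the article's code: it assumes the homepage embeds the token as a hidden input of the form `<input type="hidden" name="_xsrf" value="...">`, and the helper names `extract_xsrf` and `build_login_form` are hypothetical. Only the field names (_xsrf, email, phone_num, password, captcha, remember_me) come from the captured request described above.

```python
import re

# Assumed markup: <input type="hidden" name="_xsrf" value="..."/>
# The real page may order the attributes differently.
XSRF_PATTERN = re.compile(r'name="_xsrf"\s+value="([^"]+)"')


def extract_xsrf(html):
    """Return the _xsrf token found in the page HTML, or None if absent."""
    match = XSRF_PATTERN.search(html)
    return match.group(1) if match else None


def build_login_form(account, password, xsrf, captcha, remember_me=True):
    """Assemble the POST fields listed above.

    Hypothetical rule: an account containing '@' is sent as 'email',
    anything else as 'phone_num'.
    """
    account_field = "email" if "@" in account else "phone_num"
    return {
        "_xsrf": xsrf,
        account_field: account,
        "password": password,
        "captcha": captcha,
        "remember_me": "true" if remember_me else "false",
    }
```

For example, `build_login_form("user@example.com", "secret", token, "abcd")` yields a dictionary keyed by `email`, while a phone number produces the `phone_num` key instead.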
Key request details such as the POST URL (https://www.zhihu.com/login/email or /login/phone_num), hidden fields, and headers (Accept, Content‑Type, X‑Requested‑With, Referer, User‑Agent, etc.) are identified from the captured traffic.
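A request with these details might be assembled as shown below. The article targets Python 2.7 and urllib2; this sketch uses the Python 3 equivalents (`urllib.request`, `urllib.parse`) as an assumption. The header values are typical of an AJAX form post and are placeholders, not values copied from the captured traffic, and `build_login_request` is a hypothetical helper name.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://www.zhihu.com"


def build_login_request(form, by_email=True):
    """Build (but do not send) the login POST request."""
    url = BASE_URL + ("/login/email" if by_email else "/login/phone_num")
    headers = {
        # Placeholder values; copy the real ones from the Fiddler capture.
        "Accept": "*/*",
        "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
        "X-Requested-With": "XMLHttpRequest",
        "Referer": BASE_URL + "/",
        "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64)",
    }
    # Supplying a body makes urllib issue a POST instead of a GET.
    body = urlencode(form).encode("utf-8")
    return Request(url, data=body, headers=headers)
```

Keeping request construction separate from sending makes it easy to compare the generated URL, headers, and body against the Fiddler capture before any traffic is sent.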
The provided Python source defines a WSpider class that initializes a cookie jar, builds an opener, and offers methods such as setRequestData, getHtmlText, saveCaptcha, and output. An example script creates a logger, fetches the homepage to obtain the _xsrf token, prompts the user for credentials and the captcha, constructs the POST data, sends the request, parses the JSON response, and saves the resulting page HTML.
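The overall shape of such a class could look like the sketch below. Only the class name and the four method names come from the article; the signatures, default arguments, and bodies are assumptions, and the code uses Python 3's `http.cookiejar` and `urllib.request` in place of the Python 2.7 `cookielib` and `urllib2` modules the original relies on.

```python
import http.cookiejar
import urllib.parse
import urllib.request


class WSpider(object):
    """Cookie-aware HTTP helper (illustrative signatures, not the original)."""

    def __init__(self):
        # The cookie jar keeps session cookies across requests, so the
        # login cookie is reused when fetching pages afterwards.
        self.cookie_jar = http.cookiejar.CookieJar()
        self.opener = urllib.request.build_opener(
            urllib.request.HTTPCookieProcessor(self.cookie_jar))
        # Placeholder User-Agent; substitute the value from the capture.
        self.headers = {"User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64)"}

    def setRequestData(self, url, data=None, headers=None):
        """Return a Request; passing form data turns it into a POST."""
        merged = dict(self.headers)
        merged.update(headers or {})
        body = urllib.parse.urlencode(data).encode("utf-8") if data else None
        return urllib.request.Request(url, data=body, headers=merged)

    def getHtmlText(self, request, timeout=10):
        """Send the request through the cookie-aware opener; return bytes."""
        response = self.opener.open(request, timeout=timeout)
        try:
            return response.read()
        finally:
            response.close()

    def saveCaptcha(self, image_bytes, path="captcha.gif"):
        """Write the captcha image to disk so the user can read it."""
        with open(path, "wb") as handle:
            handle.write(image_bytes)

    def output(self, content, path="out.html"):
        """Save fetched page bytes for later inspection or crawling."""
        with open(path, "wb") as handle:
            handle.write(content)
```

A login flow would then call `setRequestData` and `getHtmlText` once for the homepage (to read _xsrf), once for the captcha image, and once for the login POST, all through the same instance so the cookies carry over.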
Running the script (e.g., python zhiHuLogin.py) yields one of three outcomes, each illustrated with a screenshot: password error, captcha error, or successful login. After a successful login, the authenticated homepage HTML is saved for further crawling.
Python Programming Learning Circle
A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.