How to Scrape All Your Toutiao Friends’ Nicknames with Python
This step‑by‑step guide shows how to log into Toutiao, locate the fan list via browser dev tools, capture the AJAX request URLs, decode Unicode nicknames, paginate through all pages, and save the results as JSON using Python.
Preface
Hello, I’m Huang Wei. Curious about how to scrape all my Toutiao friends, I decided to explore the process.
Project Goal
Obtain the nicknames of all Toutiao friends.
Project Setup
Editor: Sublime Text 3
Browser: 360 Browser with a Toutiao account
Experiment Steps
1. Log into your Toutiao account
Open the fan list page and inspect the element to locate the fan container.
2. Find the fan‑list API
Open the Network panel and look for requests named get_info_list ("get info list").
3. Load all requests
Scroll the page to trigger AJAX loading until the useful responses appear.
4. Analyze the API and decode Unicode
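The nicknames are returned as Unicode escape sequences, that is, literal \uXXXX text in the response body. Two ways to decode them in Python 3:
ss = r'\u4e00\u8def\u5411\u897f8635'  # raw string, so the backslashes stay literal as in the response
print(ss.encode('utf8').decode('unicode_escape'))  # 一路向西8635
print(eval('u"%s"' % ss))  # same result; only use eval on input you trust
5. Retrieve all pages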
Send requests for each page using the discovered parameters and save the JSON responses.
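As a rough illustration of this step, the sketch below builds one page URL and fetches it with the standard library. The endpoint address and the parameter names `cursor` and `pagesize` are placeholders, not Toutiao's confirmed API; copy the real URL, parameters, and cookie from the request you captured in the Network panel.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint: replace with the real get_info_list URL you captured.
BASE_URL = "https://example.com/get_info_list"


def build_url(cursor, pagesize, base_url=BASE_URL):
    """Return the request URL for one page (parameter names are assumed)."""
    query = urllib.parse.urlencode({"cursor": cursor, "pagesize": pagesize})
    return f"{base_url}?{query}"


def fetch_page(cursor, pagesize, cookie):
    """Fetch one page with the logged-in cookie and return the parsed JSON."""
    request = urllib.request.Request(
        build_url(cursor, pagesize),
        headers={"Cookie": cookie, "User-Agent": "Mozilla/5.0"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


# Only the URL is built here; fetch_page needs a real endpoint and cookie.
print(build_url(0, 200))
```

Keeping URL construction separate from the network call makes it easy to compare the generated URL with the one shown in the browser before sending anything.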
6. Guess pagination parameters
In the captured traffic, the pagesize parameter appears to equal the total number of fans displayed (e.g., 2599). However, the server accepts at most 200 per request, so the list has to be fetched in several smaller requests.
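The arithmetic behind the split can be sketched as follows. The helper name `page_sizes` is made up for illustration; the 200 limit and the 2599 total are the figures quoted above.

```python
def page_sizes(total, limit=200):
    """Split `total` items into request sizes of at most `limit` each."""
    full, rest = divmod(total, limit)
    return [limit] * full + ([rest] if rest else [])


sizes = page_sizes(2599)
print(len(sizes), sizes[-1])  # 13 199
```

With 2599 fans this gives twelve full requests of 200 and a final request of 199.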
7. Print nicknames and valid URLs
Collect the smallest and largest cursor values (e.g., 1570591241~1589072863) and iterate over that range to fetch all pages.
Store successful responses in a text file, read each line, parse the JSON, and output the nicknames.
Python's built‑in json module is enough to load the data.
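A minimal sketch of the parsing step, assuming each saved line is one JSON object with a `data` list whose items carry a `name` field. That layout is a guess for illustration, so adjust the keys to match the responses you actually saved. Note that `json.loads` decodes \uXXXX escapes on its own, so no separate decoding pass is needed here.

```python
import json


def nicknames_from_lines(lines):
    """Collect nicknames from an iterable of JSON lines, skipping bad lines."""
    names = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            payload = json.loads(line)
        except json.JSONDecodeError:
            continue  # not a valid response, ignore it
        if not isinstance(payload, dict):
            continue
        # "data" and "name" are assumed keys, not confirmed field names.
        for user in payload.get("data", []):
            name = user.get("name")
            if name:
                names.append(name)
    return names


sample = [
    '{"data": [{"name": "\\u4e00\\u8def\\u5411\\u897f8635"}]}\n',
    'not json\n',
    '{"data": []}\n',
]
print(nicknames_from_lines(sample))
```

To run it on a saved file, pass the open file object directly, for example `nicknames_from_lines(open("responses.txt", encoding="utf-8"))`, where the file name is again a placeholder.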
Project Summary
Working with Toutiao’s AJAX responses and encrypted data shows that web scraping often requires JavaScript reverse‑engineering; without it, many sites remain inaccessible. Keep learning and experimenting.
