How to Scrape All Your Toutiao Friends’ Nicknames with Python

This step‑by‑step guide shows how to log into Toutiao, locate the fan list via browser dev tools, capture the AJAX request URLs, decode Unicode nicknames, paginate through all pages, and save the results as JSON using Python.

Python Crawling & Data Mining
Python Crawling & Data Mining
Python Crawling & Data Mining
How to Scrape All Your Toutiao Friends’ Nicknames with Python

Preface

Hello, I’m Huang Wei. Curious about how to scrape all my Toutiao friends, I decided to explore the process.

Project Goal

Obtain the nicknames of all Toutiao friends.

Project Setup

Editor: Sublime Text 3

Browser: 360 Browser with a Toutiao account

Experiment Steps

1. Log into your Toutiao account

Open the fan list page and inspect the element to locate the fan container.

2. Find the fan‑list API

Open the Network panel; look for requests named get_info_list ("获取信息列表").

3. Load all requests

Scroll the page to trigger AJAX loading until the useful responses appear.

4. Analyze the API and decode Unicode

The nicknames are returned as Unicode escape sequences. Two ways to decode them in Python:

ss='\u4e00\u8def\u5411\u897f8635'
print((ss.encode('utf8')).decode())
print(eval('u"%s"' % ss))

5. Retrieve all pages

Send requests for each page using the discovered parameters and save the JSON responses.

6. Guess pagination parameters

From the captured traffic, the pagesize equals the number of fans displayed (e.g., 2599). However, the server only accepts a maximum of 200 per request, so we must split the requests accordingly.

7. Print nicknames and valid URLs

Collect the smallest and largest cursor values (e.g., 1570591241~1589072863) and iterate over that range to fetch all pages.

Store successful responses in a text file, read each line, parse the JSON, and output the nicknames.

Import the built‑in json module to load the data:

Project Summary

Working with Toutiao’s AJAX responses and encrypted data shows that web scraping often requires JavaScript reverse‑engineering; without it, many sites remain inaccessible. Keep learning and experimenting.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

PythonJSONUnicodeWeb ScrapingajaxToutiao
Python Crawling & Data Mining
Written by

Python Crawling & Data Mining

Life's short, I code in Python. This channel shares Python web crawling, data mining, analysis, processing, visualization, automated testing, DevOps, big data, AI, cloud computing, machine learning tools, resources, news, technical articles, tutorial videos and learning materials. Join us!

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.