How to Scrape Douyin Comments and Create a Dancing Word Cloud with Python

This guide walks you through analyzing Douyin's web pages, bypassing its anti‑scraping checks, constructing the proper API URLs, fetching comment data via Python, storing the results, and visualizing the comments as a dynamic dancing word cloud.

Python Crawling & Data Mining

1. Page Analysis

Open the Douyin web interface, search for the target creator, and open the creator's video list page. Select a video, then scroll down the video page until the comment section is visible.

Open the browser developer tools (F12) and inspect the network requests to discover how comments are loaded.

The comments are fetched via an XHR request that returns JSON data.
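To make the shape of that response concrete, here is a minimal sketch that pulls the comment text out of such a payload. The sample payload is a made-up, trimmed-down stand-in for the real response; only the `comments` list and each entry's `text` field come from the code later in this article, and `extract_texts` is a hypothetical helper name.

```python
import json

# Hypothetical sample payload; a real response carries many more fields.
SAMPLE = '{"comments": [{"text": "first comment"}, {"text": "second comment"}]}'


def extract_texts(payload):
    """Return the text of every comment in a decoded response body."""
    # Treat a missing or null "comments" key as an empty page.
    comments = payload.get("comments") or []
    return [c["text"] for c in comments if c.get("text")]


texts = extract_texts(json.loads(SAMPLE))
print(texts)
```

Guarding against a missing or null list keeps the helper usable on a page that returns no comments.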

2. Bypass Anti‑Scraping

Douyin validates request information as an anti‑scraping check. Log in through the browser, copy the Cookie value from the comment request's headers in the developer tools, and send it with your own requests so that they are authenticated and return data.
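As a rough sketch of what supplying the cookies can look like, the helper below assembles a headers dictionary for `requests`. The function name `build_headers`, the default user-agent string, and the `referer` header are assumptions for illustration; the article only states that login cookies are required, so check your own browser's request for the headers it actually sends.

```python
# Assumed default: a Chrome 96 user-agent matching the browser_version
# query parameter used elsewhere in this article.
DEFAULT_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36"
)


def build_headers(cookie, user_agent=DEFAULT_UA):
    """Build request headers from a cookie string copied out of the browser."""
    if not cookie:
        raise ValueError("a cookie from a logged-in session is required")
    return {
        "cookie": cookie,
        "user-agent": user_agent,
        # Assumption: a referer pointing at the site; not confirmed by the article.
        "referer": "https://www.douyin.com/",
    }


headers = build_headers("YOUR_COOKIE_HERE")  # placeholder, not a real cookie
print(sorted(headers))
```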

3. Build the Request URL

Comments are paginated: each request returns up to 20 comments (count), and the cursor parameter advances by 20 from one page to the next. Construct the URL with the required query parameters, including device_platform, aid, aweme_id (the video ID), and the changing cursor.
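The cursor arithmetic can be sketched as follows. This is an illustration only: `comment_url` is a hypothetical helper, it assumes the first page uses cursor 0, and it includes just a handful of the query parameters. A real request also needs the remaining parameters shown in the request code in this article.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.douyin.com/aweme/v1/web/comment/list/"
PAGE_SIZE = 20  # comments per page, matching the count parameter


def comment_url(aweme_id, page):
    """Build the comment-list URL for a zero-based page number."""
    params = {
        "device_platform": "webapp",
        "aid": "6383",
        "aweme_id": aweme_id,
        "cursor": str(page * PAGE_SIZE),  # 0, 20, 40, ...
        "count": str(PAGE_SIZE),
    }
    return BASE_URL + "?" + urlencode(params)


url = comment_url("7034396984236657923", 2)
print(url)
```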

Send the Request

import requests

# Placeholder headers: paste the cookie and user-agent from your own
# logged-in browser session (developer tools, Network tab).
headers = {
    'cookie': 'YOUR_COOKIE_HERE',
    'user-agent': 'YOUR_USER_AGENT_HERE',
}

# Fetch 10 pages of 20 comments each.
for page in range(10):
    params = (
        ('device_platform', 'webapp'),
        ('aid', '6383'),
        ('channel', 'channel_pc_web'),
        ('aweme_id', '7034396984236657923'),  # ID of the target video
        ('cursor', str(page * 20)),  # 0, 20, 40, ...; starting at 0 includes the first page
        ('count', '20'),
        ('version_code', '170400'),
        ('version_name', '17.4.0'),
        ('cookie_enabled', 'true'),
        ('screen_width', '1920'),
        ('screen_height', '1080'),
        ('browser_language', 'zh-CN'),
        ('browser_name', 'Chrome'),
        ('browser_version', '96.0.4664.45'),
        ('browser_online', 'true'),
        ('engine_name', 'Blink'),
        ('engine_version', '96.0.4664.45'),
        ('os_name', 'Windows'),
        ('os_version', '10'),
        ('cpu_core_num', '4'),
        ('device_memory', '8'),
        ('platform', 'PC'),
        ('downlink', '3.4'),
        ('effective_type', '4g'),
        ('round_trip_time', '50'),
        # The three values below were captured from the browser. They are likely
        # tied to that session, so replace them with values from your own requests.
        ('msToken', 'xO8ykiImW4_y1P17rjjV82tkToK8sdVUSXsck7dqlo5egXnsLielL_-gNoh0eTlNzohikTmdqccSsY3Es0-we3HmgJYX-jaWe7rO1uKCGLQSCz4tUKiWsZwpNQ=='),
        ('X-Bogus', 'DFSzswVuuEtANasbSiKxme9WX7j6'),
        ('_signature', '_02B4Z6wo00001cGgFBAAAIDBQaLuUEhgJOnBoBCAABHQa2zGW56-brVbd8zPJMMr5zV9wMRK.Fw-baUMHl14.I7n6EC4lETZbOyGYyoi08uVzPer1kHjbwJPWfXZBARPia3I0l-u0HyASZI012')
    )
    response = requests.get('https://www.douyin.com/aweme/v1/web/comment/list/', headers=headers, params=params, timeout=10)
    response.raise_for_status()  # stop early on an HTTP error

Store the Data

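# Run this inside the request loop, right after each response arrives,
# so that every page is saved rather than only the last one.
comments = response.json().get('comments') or []  # tolerate a missing or null list
# utf-8 keeps Chinese text intact regardless of the OS default encoding.
with open('comment.txt', 'a', encoding='utf-8') as f:
    for comment in comments:
        f.write(comment['text'] + '\n')  # one comment per line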

4. Dancing Word Cloud

After collecting the comment texts, generate a word cloud from them and animate it so that the words take the shape of a dancing figure. The original article shows the result as a GIF.
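The article does not show how the dancing word cloud is built, so the following is only a sketch of one plausible approach: count word frequencies in the saved comments, then render one word cloud per frame, using the silhouette of a dancer from each video frame as the mask, and join the frames into an animation. The helper names `word_frequencies` and `render_frame`, the mask source, and the font path are all assumptions. For Chinese comments, a segmenter such as `jieba.lcut` can be passed as `tokenize`.

```python
import re
from collections import Counter


def word_frequencies(lines, tokenize=None):
    """Count how often each token appears across the comment lines."""
    if tokenize is None:
        # Demo fallback: runs of word characters. Pass jieba.lcut for Chinese.
        tokenize = lambda s: re.findall(r"\w+", s)
    counts = Counter()
    for line in lines:
        # Skip single-character tokens, which are mostly noise.
        counts.update(t for t in tokenize(line) if len(t) > 1)
    return dict(counts)


def render_frame(freqs, mask, out_path, font_path):
    """Draw one word-cloud frame shaped by `mask` (a NumPy image array)."""
    from wordcloud import WordCloud  # third-party: pip install wordcloud

    # White pixels in the mask stay empty; words fill the remaining area.
    cloud = WordCloud(font_path=font_path, mask=mask, background_color="white")
    cloud.generate_from_frequencies(freqs)
    cloud.to_file(out_path)


freqs = word_frequencies(["great dance", "great song great moves"])
print(freqs)
```

Calling `render_frame` once per silhouette mask yields a sequence of images that can then be assembled into a GIF or video with any image tool. A font that contains Chinese glyphs must be supplied through `font_path` for Chinese comments to render.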

5. Summary

This article covered extracting Douyin video comments with Python step by step: passing the anti‑scraping check by supplying login cookies, sending the requests, parsing the JSON responses, saving the comments, and visualizing them as a dancing word cloud.

To scrape another video, change the aweme_id parameter.
