How Python Data Mining Uncovers Why '30 Only' Became a Summer Hit

This article uses Python to scrape and analyze Douban ratings, user comments, and Tencent video danmu for the TV drama “30 Only”, revealing the show’s explosive popularity, the most discussed characters, and audience sentiment through statistical charts and word‑cloud visualizations.

Python Crawling & Data Mining

Aug 8, 2020

Introduction

The Chinese drama “30 Only” dominated social media and search trends during the summer, prompting a data‑driven investigation into why it resonated so strongly with viewers.

Data Sources

Two main sources were used:

Douban – rating scores, short‑review counts and comment data.

Tencent Video – over 271,000 danmu (real‑time comments) collected from the 15 episodes.

Data Analysis

Douban Rating

On Weibo, the series accumulated more than 42.2 billion reads and 148.8k discussion posts. Its average Douban score is 8.0, which is high for a domestic production.

Comment Word Cloud

Word‑cloud analysis of Douban short reviews highlighted the keywords “female”, “plot” and “like”, along with frequent mentions of actors Jiang Shuying, Tong Yao and Mao Xiaotong.
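The counting step behind a word cloud like this can be sketched as follows. This is a minimal illustration, not the article's own code: the helper `word_frequencies`, the sample reviews and the stop-word set are all made up, and the reviews are assumed to be tokenized already (for Chinese text, a segmenter such as `jieba.lcut` would typically produce the token lists).

```python
from collections import Counter


def word_frequencies(tokenized_reviews, stopwords=frozenset(), min_len=2, top_n=50):
    """Count tokens across reviews and return the top_n (word, count) pairs."""
    counts = Counter()
    for tokens in tokenized_reviews:
        for token in tokens:
            token = token.strip()
            # Skip stop words and very short tokens, which add noise to a word cloud
            if len(token) >= min_len and token not in stopwords:
                counts[token] += 1
    return counts.most_common(top_n)


# Made-up, already-tokenized sample reviews (NOT data from the article)
sample_reviews = [
    ["plot", "is", "tight", "female", "leads", "shine"],
    ["female", "characters", "feel", "real", "like", "it"],
    ["like", "the", "plot", "female", "perspective"],
]
sample_stopwords = {"is", "the", "it"}

top_words = word_frequencies(sample_reviews, sample_stopwords, top_n=3)
print(top_words)
```

The resulting list of (word, count) pairs is the shape a word-cloud chart generally expects as input.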

Danmu Analysis

From Tencent Video we extracted 271,049 danmu (average 18,069 per episode, roughly 401 per minute). The following steps were performed in Python:

Data acquisition and loading with pandas.

Pre‑processing to extract character tags and classify users.

Visualization of results.
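The per-episode and per-minute averages quoted above can be checked with a few lines of arithmetic. The total and the episode count come from the article; the 45-minute episode length is an assumption chosen because it reproduces the quoted per-minute figure, and the article does not state it.

```python
total_danmu = 271_049    # total danmu count reported in the article
episodes = 15            # number of episodes reported in the article
assumed_minutes = 45     # ASSUMPTION: episode length, not stated in the article

# Integer division truncates, which matches the article's rounded-down figures
per_episode = total_danmu // episodes
per_minute = per_episode // assumed_minutes

print(per_episode, per_minute)  # prints: 18069 401
```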

Key findings:

Most mentioned characters: Wang Manni, Gu Jia, Zhong Xiaoqin.

VIP users were identified by the presence of these character tags.
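A ranking like the one above could be produced roughly as follows. This sketch uses only the standard library; the character names are those in the article's extraction pattern, while `extract_role` and the sample danmu lines are hypothetical and invented for illustration.

```python
import re
from collections import Counter

# Character names taken from the article's extraction pattern
ROLE_NAMES = ["王漫妮", "钟晓芹", "顾佳", "陈屿", "许幻山", "飒飒", "浪浪"]
# A character name, optional whitespace, then an ASCII colon, as in the article
ROLE_PATTERN = re.compile("(" + "|".join(ROLE_NAMES) + r")\s*:")


def extract_role(content):
    """Return the character tag found in a danmu line, or None if there is none."""
    match = ROLE_PATTERN.search(content)
    return match.group(1) if match else None


# Made-up sample danmu (NOT data from the article)
sample_danmu = [
    "王漫妮:I would have quit too",
    "顾佳:never underestimate a mother",
    "王漫妮 :that boss is unbearable",
    "钟晓芹:the cat scene made me cry",
    "this episode is so good",
    "顾佳:the fireworks factory again",
    "王漫妮:Shanghai rent is brutal",
]

roles = [extract_role(text) for text in sample_danmu]
# Count only the lines that actually carry a character tag
ranking = Counter(role for role in roles if role is not None).most_common()
print(ranking)
```

Untagged lines yield `None` and are left out of the ranking, so the counts reflect tagged danmu only.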

Visualization

Three main charts were generated using pyecharts:

Pie chart showing the distribution of user levels (VIP, ordinary, unknown).

Bar chart ranking the popularity of danmu characters.

Word clouds for each major character (Wang Manni, Gu Jia, Zhong Xiaoqin, Chen Yu, Xu Huanshan).

# Import libraries
import os
import jieba
import numpy as np
import pandas as pd
from pyecharts.charts import Bar, Pie, WordCloud
from pyecharts import options as opts

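# Read every CSV file in the data directory and combine them into one DataFrame
data_list = os.listdir('../data/')
frames = []
for i in data_list:
    if i.split('.')[-1] == 'csv':
        df_one = pd.read_csv(f'../data/{i}', engine='python', encoding='utf-8', index_col=0)
        frames.append(df_one)
# DataFrame.append was removed in pandas 2.0; pd.concat is the supported replacement
df_all = pd.concat(frames, ignore_index=False)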

# Extract character tags
pattern = r'(王漫妮\s*|钟晓芹\s*|顾佳\s*|陈屿\s*|许幻山\s*|飒飒\s*|浪浪\s*):.*'
df_all['danmu_role'] = df_all['content'].str.extract(pattern)[0].str.strip()

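def transform_name(x):
    # A character tag marks the sender as a VIP user ('VIP用户')
    if x in ['王漫妮', '顾佳', '钟晓芹', '陈屿', '许幻山', '飒飒', '浪浪']:
        return 'VIP用户'
    # str.extract yields a real missing value (NaN), not the string 'NaN',
    # when no tag matches, so test with pd.isna; label: unknown user ('未知用户')
    elif pd.isna(x):
        return '未知用户'
    # Any other non-missing value: ordinary user ('普通用户')
    else:
        return '普通用户'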

df_all['danmu_level'] = df_all['danmu_role'].apply(transform_name)
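From here, the `danmu_level` and `danmu_role` columns could feed the pie and bar charts. The following is a hedged sketch, not the article's plotting code: `to_pairs`, `build_charts`, the chart titles, the output file names and the sample labels are all hypothetical. The pyecharts import sits inside `build_charts` so that the counting step runs even where pyecharts is not installed.

```python
from collections import Counter


def to_pairs(labels):
    """Turn a list of labels into [(label, count), ...] sorted by count, descending."""
    return Counter(labels).most_common()


def build_charts(level_pairs, role_pairs):
    """Build a pie chart of user levels and a bar chart of character mentions."""
    # Imported here so that to_pairs() works without pyecharts installed
    from pyecharts import options as opts
    from pyecharts.charts import Bar, Pie

    pie = (
        Pie()
        .add("", [list(pair) for pair in level_pairs])
        .set_global_opts(title_opts=opts.TitleOpts(title="Danmu user levels"))
    )
    bar = (
        Bar()
        .add_xaxis([name for name, _ in role_pairs])
        .add_yaxis("Danmu count", [count for _, count in role_pairs])
        .set_global_opts(title_opts=opts.TitleOpts(title="Character popularity"))
    )
    return pie, bar


# Made-up labels standing in for df_all['danmu_level'].tolist() and
# df_all['danmu_role'].dropna().tolist() (NOT data from the article)
sample_levels = ["VIP用户", "普通用户", "普通用户", "VIP用户", "普通用户"]
sample_roles = ["王漫妮", "顾佳", "王漫妮"]

level_pairs = to_pairs(sample_levels)
role_pairs = to_pairs(sample_roles)
print(level_pairs)
print(role_pairs)

# With pyecharts installed, the charts could then be written to HTML files
# (file names are placeholders):
# pie, bar = build_charts(level_pairs, role_pairs)
# pie.render("user_levels.html")
# bar.render("character_popularity.html")
```

Counting with `Counter` yields plain Python integers, which avoids serialization issues that NumPy integer types can cause when passed to charting libraries.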

Conclusion

The analysis shows that “30 Only” struck a chord by portraying three distinct 30‑year‑old women and their dilemmas, which resonated with a large female audience. Data‑driven insights such as character popularity, sentiment keywords, and user‑level distribution help explain the show’s viral success and illustrate how Python can turn entertainment data into actionable knowledge.

