
Build a Million‑Follower Bilibili Nickname Generator with Python Scraping

This article demonstrates how to crawl Bilibili creator data, analyze fan counts, categories, gender and video statistics with Python and pandas, and then create a nickname generator for aspiring million‑follower up‑hosts using both Python and JavaScript.

Python Crawling & Data Mining

The author shares a method to crawl Bilibili up data, analyze millions of creators, and build a nickname generator for aspiring million‑follower creators.

Source: CSDN – Author: 小小明 – Original article

Crawling Bilibili creator (up) information

Scraping the Bilibili homepage directly is inconvenient, so two third‑party data sites are used instead: 火烧云数据 and 小小数据. After logging in, copy the API requests from the browser as curl commands.

Install the conversion tool:

pip install filestools -U

Then run the converter to obtain ready‑to‑use Python crawling code:

curl2py

Copy the generated code into your editor. If you prefer not to use the command line, replace xxx in the script below with the copied curl command.

from curl2py.curlParseTool import curlCmdGenPyScript

curl_cmd = """xxx"""
output = curlCmdGenPyScript(curl_cmd)
print(output)
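
To give a rough idea of what such a conversion involves, here is a minimal sketch that pulls the URL and headers out of a curl command using only the standard library. It is not the curl2py implementation; `parse_curl` is a hypothetical helper, the example.com URL is a placeholder, and only the `-H`/`--header` flag is handled.

```python
import shlex


def parse_curl(curl_cmd):
    """Split a curl command into a URL and a headers dict (hypothetical helper)."""
    tokens = shlex.split(curl_cmd)
    url = None
    headers = {}
    i = 1  # skip the leading "curl" token
    while i < len(tokens):
        token = tokens[i]
        if token in ("-H", "--header"):
            # The next token looks like "Name: value".
            name, _, value = tokens[i + 1].partition(":")
            headers[name.strip()] = value.strip()
            i += 2
        elif token.startswith("-"):
            # Other flags are skipped; flags that take a value are not handled here.
            i += 1
        else:
            url = token
            i += 1
    return url, headers


# Placeholder command; example.com is not one of the data sites.
sample = "curl 'https://example.com/api/list?page=1' -H 'Accept: application/json' --compressed"
url, headers = parse_curl(sample)
print(url)
print(headers)
```

The extracted URL and headers are the pieces a generated crawling script would pass to an HTTP client.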

Data analysis

Data reading and preprocessing

import pandas as pd

names = ["名称","性别","签名","视频数量","粉丝数","播放数","点赞数","总充电人数","月充电人数","生日","category1","category2","tags"]

df = pd.read_csv("b站up主粉丝量top10万.csv", usecols=[2,3,5]+list(range(9,16))+[22,23,24], header=0, names=names, low_memory=False)
df.drop_duplicates(inplace=True)
df.sort_values("粉丝数", ascending=False, inplace=True)
df

The dataset contains 100 000 rows; official Bilibili accounts are removed, leaving 99 955 entries.

Category distribution

Counting the category1 field shows that Life and Game are the most common categories among creators with over 10 000 fans.
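
The tally itself is simple to reproduce. Below is a dependency‑free sketch using `collections.Counter` on a handful of made‑up category values; with the DataFrame loaded earlier, `df["category1"].value_counts()` would be the usual pandas equivalent. The sample values are assumptions, not rows from the dataset.

```python
from collections import Counter

# Hypothetical sample of category1 values; the real column has about 100 000 entries.
category1 = ["生活", "游戏", "生活", "知识", "游戏", "生活", "音乐"]

category_counts = Counter(category1)

# Print the three most frequent categories with their counts.
for category, count in category_counts.most_common(3):
    print(category, count)
```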

Gender differences

Overall gender counts: 65 900 undisclosed, 20 452 male, 13 648 female. Among creators who disclose their gender, there are about 50 % more men than women, and the male‑to‑female ratio increases in higher fan tiers.
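
The "about 50 % more" figure can be checked directly from the three counts quoted above. The short calculation below uses only those numbers; the label `undisclosed` stands for creators who keep their gender private.

```python
# Gender counts as reported in the text above.
counts = {"undisclosed": 65900, "male": 20452, "female": 13648}

total = sum(counts.values())
shares = {label: n / total for label, n in counts.items()}
male_to_female = counts["male"] / counts["female"]

for label, share in shares.items():
    print(f"{label}: {share:.1%}")
print(f"male-to-female ratio: {male_to_female:.2f}")
```

The ratio comes out at roughly 1.50, which matches the statement that male creators outnumber female creators by about half.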

Video count distribution

The average number of videos per creator is 258, with a maximum of 180 033; many creators have zero videos yet still have fans.

df.视频数量.describe()
# count    59167.0
# mean       258.36
# std      1379.66
# min          0.0
# 25%         38.0
# 50%         89.0
# 75%        213.0
# max    180033.0
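
In the summary above the mean (258.36) sits well above the median (89.0), which is what a right‑skewed distribution looks like: a few extremely prolific accounts pull the average up. The sketch below illustrates the effect with made‑up numbers; only the general shape, not the values, is meant to mirror the dataset.

```python
import statistics

# Made-up video counts: most creators are modest, one account is extremely prolific.
video_counts = [0, 12, 38, 60, 89, 120, 213, 400, 180033]

mean = statistics.mean(video_counts)
median = statistics.median(video_counts)

print(f"mean:   {mean:.1f}")
print(f"median: {median}")
```

For a column like this, the median is usually the more representative figure for a "typical" creator.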

Birthday distribution

Birth month analysis reveals a large spike in January birthdays, possibly due to the default selection in Bilibili’s profile settings.
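
A month tally like this can be sketched with the standard library. The birthday strings below are hypothetical and assume an "MM-DD" format; the real column's format may differ. Counting full dates as well is one way to probe the default‑value explanation, since a default setting would tend to pile up on a single day rather than spread across the month.

```python
from collections import Counter

# Hypothetical birthday strings in "MM-DD" form; the real column's format may differ.
birthdays = ["01-01", "01-01", "07-15", "01-01", "12-25", None, "03-08"]

# Skip missing values, then keep only the month part.
valid = [b for b in birthdays if b]
month_counts = Counter(int(b.split("-")[0]) for b in valid)

# Counting full dates shows whether a spike concentrates on one day.
date_counts = Counter(valid)

for month in sorted(month_counts):
    print(f"{month:02d}: {month_counts[month]}")
print("most common date:", date_counts.most_common(1)[0])
```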

Million‑follower nickname generator

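Filter creators at the million‑follower level (the source states the threshold as a fan count ≥ 100 000, described as million‑level after conversion), which leaves 658 entries. The name‑length distribution shows that 4‑character names are the most common.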

name_size = df.名称.apply(len)
name_size.value_counts().head(10)
# 4     158
# 5     131
# 3      88
# 7      70
# 6      70

Extract words longer than one character using jieba, yielding 1 068 unique tokens.

import jieba
names = df.名称.apply(jieba.lcut).explode()
names = names[names.apply(len) > 1].unique()
print(names.shape[0], names)
# 1068 ['罗翔' '刑法' '番茄' ...]

Generate a short nickname by randomly picking two tokens (≈4 characters).

import numpy as np

"".join(np.random.choice(names, 2))
# Example output: 张逗麦克
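
One detail worth knowing: `np.random.choice` samples with replacement by default, so the same token can occasionally be picked twice. The variant below is a standard‑library sketch that always picks distinct tokens and accepts a seeded generator for repeatable output. `make_nickname` is a hypothetical helper, and the token list is just the first three tokens from the output shown earlier.

```python
import random

# The first few tokens from the jieba output above; the full list has 1 068 entries.
tokens = ["罗翔", "刑法", "番茄"]


def make_nickname(tokens, parts=2, rng=random):
    """Join `parts` distinct tokens chosen at random (hypothetical helper)."""
    return "".join(rng.sample(tokens, parts))


rng = random.Random(42)  # fixed seed so repeated runs give the same names
for _ in range(3):
    print(make_nickname(tokens, rng=rng))
```

Passing a `random.Random` instance keeps the helper easy to test; omit it to fall back on the module‑level generator.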

JavaScript version for users without Python:

var items = ['罗翔','刑法','番茄','敬汉卿',/* many more */];
items[Math.floor(Math.random()*items.length)] + items[Math.floor(Math.random()*items.length)];

Running the script in a browser console produces random, catchy nicknames such as “张逗麦克”.

Conclusion

The analysis summarizes Bilibili creator demographics (categories, gender, video counts and birthdays) and walks through a practical pipeline, from crawling the data to generating nicknames, for anyone aiming to become a high‑fan‑count up‑host.
