Data Analysis and Visualization of Bilibili Documentary Metadata
This article demonstrates how to collect, process, and visualize Bilibili documentary metadata using Bilibili's web APIs, Python, pandas, and several plotting libraries. The analysis covers regional distribution, genre trends, episode lengths, popularity metrics, and comment dynamics across Chinese, British, and American documentary collections.
We first open the Bilibili documentary index page, inspect its network requests, and identify the search API endpoint
https://bangumi.bilibili.com/media/web_api/search/result?...&page={3}&pagesize=20
and the season detail API
https://bangumi.bilibili.com/view/web_api/season?season_id={0}
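The `{0}` in the season detail URL reads like a Python `str.format` placeholder. A minimal sketch of how that template might be filled in and requested is shown here. The helper names `season_url` and `fetch_json` are hypothetical, the response is assumed to be JSON, and the endpoint may no longer respond as it did when the article was written. The search URL is left out because part of its query string is elided in the source.

```python
import json
import urllib.request

# Template copied from the article; {0} is filled with a season_id.
SEASON_API = "https://bangumi.bilibili.com/view/web_api/season?season_id={0}"


def season_url(season_id):
    """Build the season detail URL for one season_id."""
    return SEASON_API.format(season_id)


def fetch_json(url, timeout=10):
    """Request a URL and decode the body as JSON (assumed response format)."""
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example: build URLs for a few placeholder season ids (not real ids).
urls = [season_url(sid) for sid in (1, 2, 3)]
```

Calling `fetch_json(season_url(some_id))` would then return the decoded payload for one season, subject to the assumptions above.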
Using these APIs we crawl basic information (title, year, season_id, media_id) into a list, convert it to a pandas DataFrame, and then request detailed fields such as danmakus, favorites, views, coins, area, episodes_duration, style, and episodes_aid for each season.
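The list-to-DataFrame step and the later join with per-season details could look roughly like the sketch below. The records are made-up placeholders rather than crawled values, only a subset of the detail fields is shown, and joining on `season_id` is an assumption about how the two result sets line up.

```python
import pandas as pd

# Placeholder basic records, shaped like the fields named in the article.
basic_records = [
    {"title": "Example A", "year": 2016, "season_id": 1, "media_id": 101},
    {"title": "Example B", "year": 2018, "season_id": 2, "media_id": 102},
]
basic = pd.DataFrame(basic_records)

# Placeholder detail records, one per season (illustrative values only).
detail_records = [
    {"season_id": 1, "views": 1000, "danmakus": 50, "coins": 20,
     "favorites": 30, "area": "China"},
    {"season_id": 2, "views": 2500, "danmakus": 120, "coins": 60,
     "favorites": 90, "area": "UK"},
]
details = pd.DataFrame(detail_records)

# A left join keeps every crawled title even if its detail request failed.
all_info = basic.merge(details, on="season_id", how="left")
```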
The collected data are saved to documentary_data_allinfo.csv and later re‑loaded for analysis.
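A minimal sketch of the save and reload round trip follows. It writes into a temporary directory so the example has no side effects, and the `utf-8-sig` encoding is an assumption chosen so that Chinese titles open cleanly in spreadsheet tools. The rows are placeholders.

```python
import os
import tempfile

import pandas as pd

# Placeholder rows standing in for the crawled metadata.
df = pd.DataFrame({
    "title": ["Example A", "Example B"],
    "season_id": [1, 2],
    "views": [1000, 2500],
})

# Write to a temporary directory, using the file name from the article.
out_dir = tempfile.mkdtemp()
csv_path = os.path.join(out_dir, "documentary_data_allinfo.csv")
df.to_csv(csv_path, index=False, encoding="utf-8-sig")

# Reload the file for analysis.
reloaded = pd.read_csv(csv_path, encoding="utf-8-sig")
```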
Visualization steps include:
Region distribution via a pie chart of documentary counts per area.
Genre comparison for China, the UK, and the US using stacked bar charts and a co‑occurrence network graph.
Yearly documentary count trends across the three regions.
Episode length and season total duration distributions using histograms, violin plots, and 2‑D kernel density estimates.
Popularity analysis with a bubble chart that combines views, danmakus, coins, and favorites into a weighted score.
Line plots of comment counts over time for selected series, showing weekly comment dynamics.
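Several of the steps above reduce to small pandas aggregations before any plotting happens. The sketch below shows three of them: per-area counts for the pie chart, a weighted popularity score, and weekly comment counts. The data are placeholders, and the equal weights and max-normalization are assumptions, since the source does not state how the score is weighted.

```python
import pandas as pd

# Placeholder metadata (illustrative values, not crawled data).
df = pd.DataFrame({
    "area": ["China", "China", "UK", "US"],
    "views": [1000, 4000, 2500, 500],
    "danmakus": [50, 300, 120, 10],
    "coins": [20, 100, 60, 5],
    "favorites": [30, 200, 90, 8],
})

# Documentary count per area, the input for a pie chart.
area_counts = df["area"].value_counts()

# Hypothetical equal weights; the article's actual weights are not given.
weights = pd.Series(
    {"views": 0.25, "danmakus": 0.25, "coins": 0.25, "favorites": 0.25}
)

# Scale each metric by its maximum so the four are comparable, then
# combine them into a single score per documentary.
normalized = df[weights.index] / df[weights.index].max()
df["score"] = (normalized * weights).sum(axis=1)

# Weekly comment counts from placeholder comment timestamps.
comment_times = pd.to_datetime(["2019-01-01", "2019-01-02", "2019-01-09"])
weekly_comments = pd.Series(1, index=comment_times).resample("W").sum()
```

From here, `area_counts.plot.pie()` or `weekly_comments.plot()` would draw the corresponding charts if matplotlib is installed, and `score` could drive the bubble size.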
Overall, the analysis shows that Chinese documentaries are dominated by history, society, and humanities topics and have shorter episodes, while British and American productions feature more technology and nature content and have longer episodes. Popularity correlates strongly with coin and danmaku counts.
Python Programming Learning Circle
A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.
