Create a Dancing Word Cloud from Bilibili Videos with Python – Full Step‑by‑Step Guide
This tutorial walks you through building a Python project that downloads a Bilibili video, extracts its frames, applies Baidu AI human segmentation, scrapes danmu comments, generates a stylized word‑cloud animation, and finally composes a video with background music, showcasing video processing, AI, and data visualization techniques.
Import Required Modules
Install necessary Python packages (lxml, requests, pandas, numpy, you-get, opencv-python, jieba, fake_useragent, matplotlib, moviepy, etc.) using a script that runs pip install for each library.
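As a possible variant of this install step (a sketch, not the tutorial's own script), pip can be invoked through the running interpreter with `sys.executable -m pip`, so that packages land in the same environment that will run the rest of the code. The helper names `build_pip_command` and `install_all` are hypothetical and introduced only for illustration.

```python
import subprocess
import sys


def build_pip_command(package, index_url=None):
    """Return the argument list for installing one package with pip."""
    cmd = [sys.executable, "-m", "pip", "install"]
    if index_url:
        cmd += ["-i", index_url]  # optional package index mirror
    cmd.append(package)
    return cmd


def install_all(packages, index_url=None):
    """Install each package in turn and return the names that failed."""
    failed = []
    for package in packages:
        result = subprocess.run(build_pip_command(package, index_url))
        if result.returncode != 0:  # non-zero exit status means pip failed
            failed.append(package)
    return failed
```

Calling `install_all(["jieba", "wordcloud"])` would then return an empty list when every install succeeds, which makes failures easier to spot than a success message printed unconditionally.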
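import os
# Packages used throughout this tutorial (baidu-aip provides the `aip` module)
libs = ["lxml", "requests", "pandas", "numpy", "you-get", "opencv-python", "jieba", "fake_useragent", "matplotlib", "moviepy", "wordcloud", "baidu-aip"]
for lib in libs:
    # os.system returns the command's exit status; 0 means pip succeeded
    status = os.system(f"pip3 install -i https://pypi.doubanio.com/simple/ {lib}")
    print(lib + (" installed successfully" if status == 0 else " failed to install"))
Download Bilibili Video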
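Use you-get to download the target video from Bilibili. The -i option only lists the available formats; run the command without it to download the video.
pip install you-get
you-get -i https://www.bilibili.com/video/BV11C4y1h7nX
you-get https://www.bilibili.com/video/BV11C4y1h7nX
Split Video into Frames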
Read the video with OpenCV and save each frame as an image.
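Because each frame is saved under a numbered name such as img_1.jpg, later steps that list the folder need to put the files back in frame order: a directory listing has no guaranteed order, and plain alphabetical sorting places img_10.jpg before img_2.jpg. A minimal sketch of numeric sorting follows; the helper names `frame_number` and `sort_frames` are hypothetical.

```python
import re


def frame_number(filename):
    """Extract the integer index from a name such as 'img_12.jpg'."""
    match = re.search(r"_(\d+)\.", filename)
    if match is None:
        raise ValueError(f"no frame number in {filename!r}")
    return int(match.group(1))


def sort_frames(filenames):
    """Return the filenames ordered by frame number, not alphabetically."""
    return sorted(filenames, key=frame_number)


names = ["img_10.jpg", "img_2.jpg", "img_1.jpg"]
print(sort_frames(names))  # ['img_1.jpg', 'img_2.jpg', 'img_10.jpg']
```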
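import os
import cv2
os.makedirs("./pictures", exist_ok=True)  # cv2.imwrite fails silently if the folder is missing
cap = cv2.VideoCapture(r"video.flv")
num = 1
while True:
    ret, frame = cap.read()  # ret is False once no more frames can be read
    if ret:
        cv2.imwrite(f"./pictures/img_{num}.jpg", frame)
        num += 1
    else:
        break
cap.release()
Human Segmentation with Baidu AI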
Call the Baidu Body Analysis API to obtain a person mask for each frame and save the mask as an image.
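Once a mask is available, the person can be cut out of the frame by keeping only the pixels the mask marks as foreground. The sketch below illustrates this with NumPy on a tiny synthetic array. It assumes a single-channel mask with values 0 or 255 (for example, one loaded with `cv2.imread(path, cv2.IMREAD_GRAYSCALE)`) that has the same height and width as the frame; `apply_mask` is a hypothetical helper, not part of the Baidu SDK.

```python
import numpy as np


def apply_mask(frame, mask):
    """Keep frame pixels where mask is 255 and black out the rest.

    frame: H x W x 3 uint8 image; mask: H x W uint8 array of 0 or 255.
    """
    if frame.shape[:2] != mask.shape:
        raise ValueError("frame and mask must have the same height and width")
    keep = (mask == 255)[..., np.newaxis]  # add a channel axis for broadcasting
    return np.where(keep, frame, 0).astype(np.uint8)


# Synthetic 2x2 "frame" in which only the top-left pixel is marked as person
frame = np.full((2, 2, 3), 200, dtype=np.uint8)
mask = np.array([[255, 0], [0, 0]], dtype=np.uint8)
foreground = apply_mask(frame, mask)
```

Here `foreground` keeps the value 200 in the top-left pixel and is zero everywhere else.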
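from aip import AipBodyAnalysis
import base64, numpy as np, cv2, os
# APP_ID, API_KEY and SECRET_KEY are the credentials of your own Baidu AI application
client = AipBodyAnalysis(APP_ID, API_KEY, SECRET_KEY)
os.makedirs('./mask_img', exist_ok=True)
for img_path in os.listdir('./pictures'):
    with open(os.path.join('./pictures', img_path), 'rb') as fp:
        img_info = fp.read()
    seg_res = client.bodySeg(img_info)
    labelmap = base64.b64decode(seg_res['labelmap'])
    labelimg = cv2.imdecode(np.frombuffer(labelmap, np.uint8), 1)
    # Person pixels are labelled 1; map them to white (255) and everything else to black (0)
    mask = np.where(labelimg == 1, 255, 0).astype(np.uint8)
    name = os.path.splitext(img_path)[0]  # drop ".jpg" so the file is not named "mask_img_1.jpg.png"
    cv2.imwrite(f"./mask_img/mask_{name}.png", mask)
Scrape Bilibili Danmu (Comments)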
Fetch historical danmu data for a given date range, clean the text, and store it in an Excel file.
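Two pieces of this step can be tried without touching the network: building the list of dates to request, and pulling comment text out of the XML response. The sketch below assumes the API expects dates formatted as YYYY-MM-DD, which the source does not state; the helper names `date_strings` and `extract_comments` and the sample response are made up for illustration.

```python
import re
from datetime import date, timedelta


def date_strings(start, end):
    """Return every day from start to end (inclusive) as 'YYYY-MM-DD'."""
    days = (end - start).days
    return [(start + timedelta(days=i)).isoformat() for i in range(days + 1)]


def extract_comments(xml_text):
    """Pull the text of each <d p="...">...</d> element out of the response."""
    raw = re.findall(r'<d p=".*?">(.*?)</d>', xml_text)
    # Strip surrounding whitespace and drop empty comments
    return [c.strip() for c in raw if c.strip()]


# Made-up response fragment, only to exercise the pattern
sample = '<i><d p="1,1,25">hello</d><d p="2,1,25"> world </d><d p="3,1,25"></d></i>'
print(date_strings(date(2020, 5, 1), date(2020, 5, 3)))
print(extract_comments(sample))  # ['hello', 'world']
```

Each string returned by `date_strings` could then be passed to the scraping function, one request per day.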
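import re
import requests
from fake_useragent import UserAgent
ua = UserAgent()
# `url` (the danmu history endpoint) is not given here; define it, and fill in your own cookie, before calling
def Grab_barrage(date):
    headers = {
        "origin": "https://www.bilibili.com",
        "referer": "https://www.bilibili.com/video/BV1jZ4y1K78N",
        "cookie": "",
        "user-agent": ua.random  # `random` is a property of UserAgent, not a method
    }
    params = {"type": 1, "oid": "222413092", "date": date}
    r = requests.get(url, params=params, headers=headers)
    r.encoding = "utf-8"  # decode the XML response as UTF-8 so Chinese text is not garbled
    comments = re.findall(r'<d p=".*?">(.*?)</d>', r.text)
    # Append comments to DataFrame and save to Excel
Generate Word Cloud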
Load the danmu text, perform jieba segmentation, remove stop words, and create a word‑cloud image with a custom mask.
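The filtering and counting part of this step can be illustrated on its own, independent of jieba and the wordcloud package. In the sketch below, the token list stands in for the output of `jieba.cut`, and `word_frequencies` is a hypothetical helper that applies the same two rules: drop stop words and drop single-character tokens.

```python
from collections import Counter


def word_frequencies(tokens, stop_words):
    """Count tokens that are longer than one character and not stop words."""
    kept = [t for t in tokens if len(t) > 1 and t not in stop_words]
    return Counter(kept)


# Hypothetical tokens standing in for the output of jieba.cut
tokens = ["dance", "the", "dance", "a", "cute", "the", "dance"]
stop_words = {"the"}
freq = word_frequencies(tokens, stop_words)
print(freq.most_common(2))  # [('dance', 3), ('cute', 1)]
```

The resulting counter maps each word to its count, which is the kind of mapping `generate_from_frequencies` takes as input.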
from wordcloud import WordCloud
import jieba, collections, numpy as np, cv2
with open('barrages.txt', encoding='utf-8') as f:
    text = f.read()
words = jieba.cut(text, cut_all=True)  # full mode: emit every possible word
with open('stoplist.txt', encoding='utf-8') as f:
    stop_words = set(f.read().split('\n'))
filtered = [w for w in words if w not in stop_words and len(w) > 1]
freq = collections.Counter(filtered)
# WordCloud draws only on the non-white pixels of the mask; pure white areas stay empty
mask = np.array(cv2.imread('mask.png'))
wc = WordCloud(mask=mask, background_color='black', font_path='simhei.ttf').generate_from_frequencies(freq)
wc.to_file('wordcloud.png')
Compose Final Video
Combine the generated word‑cloud images into a video with OpenCV, then add background music using moviepy.
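The length of the silent video follows directly from the number of frames and the frame rate, and the audio excerpt is best cut to the same length. With the values used in this step, 88 frames at 30 fps last a little under 3 seconds, while the audio excerpt covers 25 seconds. A small sketch for computing a matching audio window follows; `video_duration` and `audio_window` are hypothetical helpers, not moviepy functions.

```python
def video_duration(n_frames, fps):
    """Length in seconds of a video made of n_frames shown at fps frames per second."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return n_frames / fps


def audio_window(n_frames, fps, start=0.0):
    """Return (start, end) in seconds for an audio excerpt as long as the video."""
    return (start, start + video_duration(n_frames, fps))


print(audio_window(88, 30))
```

The two values returned by `audio_window` could be used as the start and end of the audio excerpt in place of fixed numbers.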
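import cv2
import moviepy.editor as mpy  # moviepy 1.x API; moviepy 2.x removed moviepy.editor and renamed subclip/set_audio
# Write the word-cloud frames to a silent 30 fps, 1920x1080 video
videoWriter = cv2.VideoWriter('result.mp4', cv2.VideoWriter_fourcc(*'mp4v'), 30, (1920,1080))
for i in range(1, 89):  # frames wordcloud_1.png ... wordcloud_88.png
    frame = cv2.imread(f'./wordcloud/wordcloud_{i}.png')
    videoWriter.write(cv2.resize(frame, (1920,1080)))
videoWriter.release()
# Attach the first 25 seconds of the song as the soundtrack
clip = mpy.VideoFileClip('result.mp4')
audio = mpy.AudioFileClip('song.mp3').subclip(0,25)
final = clip.set_audio(audio)
final.write_videofile('final_video.mp4')
Result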
The completed video shows a dancing word cloud synchronized with background music, demonstrating the full pipeline from video download to AI‑based segmentation and visual storytelling.
Python Programming Learning Circle
A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.
