Create a Dancing Word Cloud from Bilibili Videos with Python – Full Step‑by‑Step Guide

This tutorial walks you through building a Python project that downloads a Bilibili video, splits it into frames, segments the dancer in each frame with Baidu AI human segmentation, scrapes the video's danmu comments, and renders them as a word cloud shaped like the dancer. The per-frame word clouds are then composed into a video with background music, combining video processing, AI, and data visualization techniques.

Python Programming Learning Circle

Import Required Modules

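Install the required Python packages with a short script that runs pip install once per library. Besides the scraping and video libraries, the list needs jieba, wordcloud, and baidu-aip (the package that provides the aip module), because later steps import them.

import subprocess, sys
libs = ["lxml", "requests", "pandas", "numpy", "you-get", "opencv-python", "jieba",
        "wordcloud", "baidu-aip", "fake_useragent", "matplotlib", "moviepy"]
for lib in libs:
    # Use the current interpreter's pip; the Douban mirror is optional
    result = subprocess.run([sys.executable, "-m", "pip", "install",
                             "-i", "https://pypi.doubanio.com/simple/", lib])
    print(lib, "installed successfully" if result.returncode == 0 else "failed to install")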
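Several of these packages are imported under a different name than the one passed to pip (for example, opencv-python is imported as cv2). As a minimal sketch, the following check reports which modules are still missing after the install script has run. The mapping and the helper name missing_modules are illustrative and not part of the original script.

```python
from importlib.util import find_spec

# pip package name -> module name used in import statements (illustrative mapping)
IMPORT_NAMES = {
    "opencv-python": "cv2",
    "you-get": "you_get",
    "baidu-aip": "aip",
    "fake_useragent": "fake_useragent",
    "jieba": "jieba",
    "wordcloud": "wordcloud",
    "moviepy": "moviepy",
}

def missing_modules(import_names):
    """Return the pip names whose module cannot be found by the import system."""
    return [pkg for pkg, module in import_names.items() if find_spec(module) is None]

if __name__ == "__main__":
    missing = missing_modules(IMPORT_NAMES)
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

Running the check before the long steps avoids discovering a missing import halfway through the pipeline.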

Download Bilibili Video

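Use you-get to download the target video from Bilibili. By default the file is named after the video title, so rename it or adjust the path in the next step, which assumes video.flv.

pip install you-get
# -i only lists the available formats and qualities; it does not download anything
you-get -i https://www.bilibili.com/video/BV11C4y1h7nX
# Download into the current directory (-o sets the output directory)
you-get -o . https://www.bilibili.com/video/BV11C4y1h7nX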
[Figure: downloaded video]

Split Video into Frames

Read the video with OpenCV and save each frame as an image.

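import os
import cv2
os.makedirs("./pictures", exist_ok=True)  # cv2.imwrite returns False instead of raising if the folder is missing
cap = cv2.VideoCapture(r"video.flv")
num = 1
while True:
    ret, frame = cap.read()  # ret becomes False once the video is exhausted
    if not ret:
        break
    cv2.imwrite(f"./pictures/img_{num}.jpg", frame)
    num += 1
cap.release()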
[Figure: video frames]
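Later steps read these frames back from disk. os.listdir returns names in arbitrary order, and a plain alphabetical sort puts img_10.jpg before img_2.jpg. A minimal sketch of a number-aware sort key, assuming the img_N.jpg naming used above (the helper name natural_key is hypothetical):

```python
import re

def natural_key(name):
    """Split a filename into text and integer chunks so numbers compare numerically."""
    return [int(part) if part.isdigit() else part for part in re.split(r"(\d+)", name)]

frames = ["img_10.jpg", "img_2.jpg", "img_1.jpg"]
ordered = sorted(frames, key=natural_key)
print(ordered)  # ['img_1.jpg', 'img_2.jpg', 'img_10.jpg']
```

The same key can be passed to sorted(os.listdir("./pictures"), key=natural_key) wherever frame order matters.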

Human Segmentation with Baidu AI

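Call the Baidu Body Analysis API (bodySeg) on each frame and save the returned person mask as an image. APP_ID, API_KEY, and SECRET_KEY are the credentials of your own Baidu AI application.

import base64, os
import cv2
import numpy as np
from aip import AipBodyAnalysis

# Placeholders: replace with the credentials of your Baidu AI application
APP_ID, API_KEY, SECRET_KEY = "your-app-id", "your-api-key", "your-secret-key"
client = AipBodyAnalysis(APP_ID, API_KEY, SECRET_KEY)
os.makedirs('./mask_img', exist_ok=True)
for img_path in os.listdir('./pictures'):
    with open(os.path.join('./pictures', img_path), 'rb') as fp:
        img_info = fp.read()
    seg_res = client.bodySeg(img_info)
    # labelmap is a base64-encoded image: 1 marks a person, 0 the background
    labelmap = base64.b64decode(seg_res['labelmap'])
    labelimg = cv2.imdecode(np.frombuffer(labelmap, np.uint8), cv2.IMREAD_COLOR)
    # Scale the 0/1 labels to 0/255 and cast to uint8 so cv2.imwrite accepts the array
    mask = np.where(labelimg == 1, 255, 0).astype(np.uint8)
    stem = os.path.splitext(img_path)[0]  # avoid names like mask_img_1.jpg.png
    cv2.imwrite(f"./mask_img/mask_{stem}.png", mask)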
[Figure: segmentation result]
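The API call can fail for individual frames (for example, when a rate limit is hit), in which case the response contains no labelmap key and the loop above stops with a KeyError. A hedged sketch of a defensive decoder follows. It assumes failures are reported through error_code and error_msg fields, and the helper name decode_labelmap is hypothetical.

```python
import base64

def decode_labelmap(seg_res):
    """Return the raw image bytes of the label map, or raise with the API's error details."""
    if "labelmap" not in seg_res:
        code = seg_res.get("error_code", "unknown")
        msg = seg_res.get("error_msg", "no labelmap in response")
        raise RuntimeError(f"bodySeg failed ({code}): {msg}")
    return base64.b64decode(seg_res["labelmap"])

# Stand-in response for illustration; real ones come from client.bodySeg(...)
ok = {"labelmap": base64.b64encode(b"fake-png-bytes").decode("ascii")}
print(decode_labelmap(ok))  # b'fake-png-bytes'
```

Catching the RuntimeError in the frame loop makes it possible to log the failing frame and retry it later instead of losing the whole run.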

Scrape Bilibili Danmu (Comments)

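Fetch the historical danmu for each date in a range, extract the comment text with a regular expression, and store it in an Excel file.

import re
import requests
from fake_useragent import UserAgent

ua = UserAgent()
# Assumed endpoint for historical danmu; confirm it in your browser's network panel
url = "https://api.bilibili.com/x/v2/dm/history"

def Grab_barrage(date):
    headers = {
        "origin": "https://www.bilibili.com",
        "referer": "https://www.bilibili.com/video/BV1jZ4y1K78N",
        "cookie": "",  # left blank here; fill in your own session cookie
        "user-agent": ua.random  # random is a property, not a method
    }
    params = {"type": 1, "oid": "222413092", "date": date}
    r = requests.get(url, params=params, headers=headers)
    r.encoding = "utf-8"
    comments = re.findall(r'<d p=".*?">(.*?)</d>', r.text)
    # Append comments to a DataFrame and save to Excel
    return comments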
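Grab_barrage handles a single date, so covering a range means calling it once per day. The sketch below generates the date strings and demonstrates the regular expression on a small hand-written XML sample. The YYYY-MM-DD format, the helper names, and the sample are assumptions for illustration.

```python
import re
from datetime import date, timedelta

def date_range(start, end):
    """Return every day from start to end (inclusive) as 'YYYY-MM-DD' strings."""
    first, last = date.fromisoformat(start), date.fromisoformat(end)
    days = (last - first).days
    return [(first + timedelta(days=i)).isoformat() for i in range(days + 1)]

def parse_danmu(xml_text):
    """Extract the comment text from each <d p="..."> element."""
    return re.findall(r'<d p=".*?">(.*?)</d>', xml_text)

sample = '<i><d p="1.5,1,25,16777215">first</d><d p="3.0,1,25,16777215">second</d></i>'
print(date_range("2020-08-01", "2020-08-03"))  # ['2020-08-01', '2020-08-02', '2020-08-03']
print(parse_danmu(sample))                     # ['first', 'second']
```

Each date string can then be passed to Grab_barrage in turn, ideally with a short pause between requests.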

Generate Word Cloud

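Load the danmu text, segment it with jieba, remove stop words, and render the word frequencies as a word-cloud image shaped by a mask. The final step expects one image per frame (./wordcloud/wordcloud_1.png, wordcloud_2.png, and so on), so repeat the rendering for each mask image.

from wordcloud import WordCloud
import jieba, collections, cv2
with open('barrages.txt', encoding='utf-8') as f:
    text = f.read()
words = jieba.cut(text, cut_all=True)  # full mode: emit every possible word
with open('stoplist.txt', encoding='utf-8') as f:
    stop_words = set(f.read().split('\n'))
# Drop stop words and single-character tokens
filtered = [w for w in words if w not in stop_words and len(w) > 1]
freq = collections.Counter(filtered)
# In the mask, pure-white pixels are blocked; words are drawn on everything else
mask = cv2.imread('mask.png')
# font_path must point to a font with Chinese glyphs, such as SimHei
wc = WordCloud(mask=mask, background_color='black', font_path='simhei.ttf').generate_from_frequencies(freq)
wc.to_file('wordcloud.png')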
[Figure: word cloud]
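The snippet above renders a single image, while the final step reads a numbered series from ./wordcloud. One way to produce that series is to loop over the mask images in frame order. This is a sketch under assumptions: the function names are hypothetical, the renderer is passed in as an argument so the loop can be exercised without OpenCV or wordcloud installed, and mask_paths must already be sorted by frame number.

```python
import os

def wordcloud_renderer(font_path="simhei.ttf"):
    """Return a function that draws one word cloud; needs opencv-python and wordcloud."""
    import cv2
    from wordcloud import WordCloud

    def render(freq, mask_path, out_path):
        mask = cv2.imread(mask_path)
        if mask is None:  # cv2.imread returns None for missing or unreadable files
            raise FileNotFoundError(mask_path)
        wc = WordCloud(mask=mask, background_color="black", font_path=font_path)
        wc.generate_from_frequencies(freq).to_file(out_path)

    return render

def render_all(freq, mask_paths, render, out_dir="./wordcloud"):
    """Call render once per mask and return the output paths wordcloud_1.png, wordcloud_2.png, ..."""
    os.makedirs(out_dir, exist_ok=True)
    outputs = []
    for i, mask_path in enumerate(mask_paths, start=1):
        out_path = os.path.join(out_dir, f"wordcloud_{i}.png")
        render(freq, mask_path, out_path)
        outputs.append(out_path)
    return outputs

# Usage (the mask file names are an assumption; adjust them to your own):
# masks = [f"./mask_img/mask_img_{i}.png" for i in range(1, 89)]
# render_all(freq, masks, wordcloud_renderer())
```

Because the same frequency table is reused for every frame, only the mask changes from image to image, which is what makes the cloud appear to dance.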

Compose Final Video

Combine the generated word‑cloud images into a video with OpenCV, then add background music using moviepy.

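import cv2
# 'mp4v' is the FourCC commonly used with OpenCV for .mp4 output
videoWriter = cv2.VideoWriter('result.mp4', cv2.VideoWriter_fourcc(*'mp4v'), 30, (1920, 1080))
for i in range(1, 89):  # wordcloud_1.png through wordcloud_88.png
    frame = cv2.imread(f'./wordcloud/wordcloud_{i}.png')
    if frame is None:  # skip missing or unreadable images
        continue
    videoWriter.write(cv2.resize(frame, (1920, 1080)))
videoWriter.release()

# moviepy 1.x API; moviepy 2.x removed moviepy.editor and renamed subclip/set_audio to subclipped/with_audio
import moviepy.editor as mpy
clip = mpy.VideoFileClip('result.mp4')
audio = mpy.AudioFileClip('song.mp3').subclip(0, 25)  # first 25 seconds of the song
final = clip.set_audio(audio)
final.write_videofile('final_video.mp4')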
[Figure: final video]
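The writer's frame rate determines how long the silent video lasts: duration equals frame count divided by fps. With the numbers in the snippet above (88 images at 30 fps) the video runs for just under 3 seconds, much shorter than the 25-second audio clip, so the range most likely needs adjusting to your own frame count. A small sketch for checking the durations, or for choosing an fps that spreads the frames over the whole track (helper names are hypothetical):

```python
def video_seconds(frame_count, fps):
    """Duration of a video with the given number of frames and frame rate."""
    return frame_count / fps

def fps_to_fit(frame_count, audio_seconds):
    """Frame rate at which frame_count frames last exactly audio_seconds."""
    return frame_count / audio_seconds

frames = len(range(1, 89))                  # 88 images, as in the loop above
print(round(video_seconds(frames, 30), 2))  # 2.93
print(fps_to_fit(frames, 25))               # 3.52
```

Keeping the original video's frame rate and trimming the audio to video_seconds(...) is usually the simpler choice when every frame of the source was processed.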

Result

The completed video shows a dancing word cloud synchronized with background music, demonstrating the full pipeline from video download to AI‑based segmentation and visual storytelling.
