
Create a Dancing Word Cloud from Bilibili Videos with Python – Full Step‑by‑Step Guide

This tutorial walks you through building a Python project that downloads a Bilibili video, extracts its frames, applies Baidu AI human segmentation, scrapes danmu comments, generates a stylized word‑cloud animation, and finally composes a video with background music, showcasing video processing, AI, and data visualization techniques.

Python Programming Learning Circle

Jul 8, 2025


Import Required Modules

Install necessary Python packages (lxml, requests, pandas, numpy, you-get, opencv-python, jieba, fake_useragent, matplotlib, moviepy, etc.) using a script that runs pip install for each library.

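import subprocess, sys
# jieba, wordcloud, and baidu-aip (imported as aip) are used in later steps, so install them too.
# moviepy is pinned below 2.0 because the final step imports moviepy.editor, which version 2 removed.
libs = ["lxml","requests","pandas","numpy","you-get","opencv-python","jieba","wordcloud","baidu-aip","fake_useragent","matplotlib","moviepy<2"]
for lib in libs:
    # Use the running interpreter's pip so the packages land in the same environment
    result = subprocess.run([sys.executable, "-m", "pip", "install", "-i", "https://pypi.doubanio.com/simple/", lib])
    print(lib, "installed" if result.returncode == 0 else "failed to install")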
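
Several of these packages are imported under a name that differs from the one pip uses (opencv-python is imported as cv2, baidu-aip as aip). The sketch below reports which imports are still missing after installation. The mapping and the helper name missing_modules are illustrative and not part of the original script.

```python
import importlib.util

# pip package name -> top-level import name (illustrative mapping)
IMPORT_NAMES = {
    "opencv-python": "cv2",
    "baidu-aip": "aip",
    "you-get": "you_get",
    "fake_useragent": "fake_useragent",
    "jieba": "jieba",
    "wordcloud": "wordcloud",
    "moviepy": "moviepy",
}


def missing_modules(import_names):
    """Return the pip names whose import name cannot be found."""
    return [
        pip_name
        for pip_name, module in import_names.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    for pip_name in missing_modules(IMPORT_NAMES):
        print(f"{pip_name} is not installed")
```

Running this before the later steps gives a clearer message than an ImportError halfway through the pipeline.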

Download Bilibili Video

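Use you-get to download the target video from Bilibili. The -i flag only lists the available formats and qualities; run the command without it to download. The -o and -O options set the output folder and file name, and you-get adds the container extension (the next step assumes the result is video.flv).

pip install you-get
you-get -i https://www.bilibili.com/video/BV11C4y1h7nX
you-get -o . -O video https://www.bilibili.com/video/BV11C4y1h7nX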
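
The download can also be started from Python, which keeps the whole pipeline in one script. This is a minimal sketch that assumes you-get is on the PATH; the helper names and the default file name video are illustrative.

```python
import shutil
import subprocess


def build_you_get_command(url, output_dir=".", filename="video"):
    """Build the you-get command line (-o: output folder, -O: file name)."""
    return ["you-get", "-o", output_dir, "-O", filename, url]


def download(url, output_dir=".", filename="video"):
    """Run you-get and fail loudly if it is missing or exits with an error."""
    if shutil.which("you-get") is None:
        raise RuntimeError("you-get is not installed or not on PATH")
    # check=True raises CalledProcessError on a non-zero exit status
    subprocess.run(build_you_get_command(url, output_dir, filename), check=True)


if __name__ == "__main__":
    print(build_you_get_command("https://www.bilibili.com/video/BV11C4y1h7nX"))
```

Passing the command as a list avoids shell quoting problems with URLs that contain characters such as & or ?.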

Split Video into Frames

Read the video with OpenCV and save each frame as an image.

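import os
import cv2
os.makedirs("./pictures", exist_ok=True)  # cv2.imwrite does not create missing folders
cap = cv2.VideoCapture(r"video.flv")
num = 1
while True:
    ret, frame = cap.read()  # ret becomes False once the video is exhausted
    if not ret:
        break
    cv2.imwrite(f"./pictures/img_{num}.jpg", frame)
    num += 1
cap.release()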
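
The frames are numbered without zero padding, so a plain string sort (and os.listdir, which guarantees no order at all) puts img_10.jpg before img_2.jpg. When a later step needs the frames in playback order, they can be sorted by their numeric index. A small sketch with illustrative helper names:

```python
import re


def frame_number(filename):
    """Extract the integer from names like 'img_12.jpg'."""
    match = re.search(r"(\d+)", filename)
    if match is None:
        raise ValueError(f"no frame number in {filename!r}")
    return int(match.group(1))


def sort_frames(filenames):
    """Sort frame files by numeric index rather than as strings."""
    return sorted(filenames, key=frame_number)


if __name__ == "__main__":
    names = ["img_10.jpg", "img_2.jpg", "img_1.jpg"]
    print(sorted(names))       # ['img_1.jpg', 'img_10.jpg', 'img_2.jpg']
    print(sort_frames(names))  # ['img_1.jpg', 'img_2.jpg', 'img_10.jpg']
```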

Human Segmentation with Baidu AI

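Call Baidu's Body Analysis API to obtain a person mask for each frame and save it as an image. APP_ID, API_KEY, and SECRET_KEY are the credentials of your own Baidu AI application.

from aip import AipBodyAnalysis
import base64, numpy as np, cv2, os
# Fill in the credentials of your Baidu AI application
APP_ID, API_KEY, SECRET_KEY = "", "", ""
client = AipBodyAnalysis(APP_ID, API_KEY, SECRET_KEY)
os.makedirs('./mask_img', exist_ok=True)
for img_path in os.listdir('./pictures'):
    with open(os.path.join('./pictures', img_path), 'rb') as fp:
        img_info = fp.read()
    seg_res = client.bodySeg(img_info)
    # labelmap is a base64-encoded image in which 1 marks the person and 0 the background
    labelmap = base64.b64decode(seg_res['labelmap'])
    labelimg = cv2.imdecode(np.frombuffer(labelmap, np.uint8), 1)
    # Person pixels become white (255); cast to uint8 so cv2.imwrite accepts the array
    mask = np.where(labelimg == 1, 255, 0).astype(np.uint8)
    name = os.path.splitext(img_path)[0]  # drop ".jpg" to avoid names like "mask_img_1.jpg.png"
    cv2.imwrite(f"./mask_img/mask_{name}.png", mask)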
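
A remote call per frame can fail partway through, for example when the service's request rate limit is hit. In that case the response is an error dictionary without a labelmap key, and the loop above would stop with a KeyError. One way to make the loop more tolerant is a small retry wrapper. This is a sketch under that assumption; segment_with_retry is a hypothetical helper, and segment stands for any callable shaped like client.bodySeg.

```python
import time


def segment_with_retry(segment, image_bytes, retries=3, delay=1.0):
    """Call segment(image_bytes) until the response contains 'labelmap'."""
    last = None
    for attempt in range(retries):
        last = segment(image_bytes)
        if "labelmap" in last:
            return last
        # Wait before the next attempt, but not after the final one
        if attempt < retries - 1:
            time.sleep(delay)
    raise RuntimeError(f"segmentation failed after {retries} attempts: {last}")
```

Inside the loop it would be used as seg_res = segment_with_retry(client.bodySeg, img_info).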

Scrape Bilibili Danmu (Comments)

Fetch historical danmu data for a given date range, clean the text, and store it in an Excel file.

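import re
import requests
import pandas as pd
from fake_useragent import UserAgent

ua = UserAgent()
url = ""  # history-danmu API endpoint (not given in this article; fill it in)

def Grab_barrage(date):
    headers = {
        "origin": "https://www.bilibili.com",
        "referer": "https://www.bilibili.com/video/BV1jZ4y1K78N",
        "cookie": "",  # fill in your own cookie
        "user-agent": ua.random  # .random is a property, not a method
    }
    params = {"type": 1, "oid": "222413092", "date": date}
    r = requests.get(url, params=params, headers=headers)
    r.encoding = "utf-8"  # the response is XML; decode it as UTF-8
    comments = re.findall(r'<d p=".*?">(.*?)</d>', r.text)
    # Return one date's comments; concatenate the frames for all dates and save them with to_excel
    return pd.DataFrame({"barrages": comments})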
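
The function above handles a single date, so the date range and the text cleaning still need code of their own. The sketch below uses only the standard library. It assumes the API takes dates as YYYY-MM-DD strings, and both helper names are illustrative.

```python
import html
import re
from datetime import date, timedelta


def date_strings(start, end):
    """Return every day from start to end (inclusive) as 'YYYY-MM-DD'."""
    first, last = date.fromisoformat(start), date.fromisoformat(end)
    days = (last - first).days
    return [(first + timedelta(days=i)).isoformat() for i in range(days + 1)]


def clean_comments(comments):
    """Unescape XML entities, collapse whitespace, and drop empty comments."""
    cleaned = []
    for text in comments:
        text = html.unescape(text).strip()
        text = re.sub(r"\s+", " ", text)
        if text:
            cleaned.append(text)
    return cleaned


if __name__ == "__main__":
    print(date_strings("2020-08-30", "2020-09-01"))
    print(clean_comments(["  a &amp; b ", "", "x\n y"]))
```

Each string from date_strings can be passed to Grab_barrage in turn, and the cleaned comments can be written to the text file that the word-cloud step reads.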

Generate Word Cloud

Load the danmu text, perform jieba segmentation, remove stop words, and create a word‑cloud image with a custom mask.

from wordcloud import WordCloud
import jieba, collections, cv2
with open('barrages.txt', encoding='utf-8') as f:
    text = f.read()
words = jieba.cut(text, cut_all=True)  # full mode: emit every word jieba can find
with open('stoplist.txt', encoding='utf-8') as f:
    stop_words = set(f.read().split('\n'))
# Drop stop words and single-character tokens
filtered = [w for w in words if w not in stop_words and len(w) > 1]
freq = collections.Counter(filtered)
# WordCloud leaves pure-white mask pixels empty and draws on everything else
mask = cv2.imread('mask.png')
wc = WordCloud(mask=mask, background_color='black', font_path='simhei.ttf').generate_from_frequencies(freq)
wc.to_file('wordcloud.png')
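
The final step reads one word-cloud image per frame from ./wordcloud/wordcloud_{i}.png, while the snippet above renders a single image. A possible way to bridge the two is to loop over the masks in ./mask_img. This is a sketch, not the original author's code: render_all and wordcloud_path_for are hypothetical helpers, and the inversion assumes masks in which the person is white (255), as written by the segmentation step.

```python
import os
import re


def wordcloud_path_for(mask_name, out_dir="./wordcloud"):
    """Map a mask file such as 'mask_img_12.png' to 'wordcloud_12.png' in out_dir."""
    match = re.search(r"(\d+)", mask_name)
    if match is None:
        raise ValueError(f"no frame number in {mask_name!r}")
    return os.path.join(out_dir, f"wordcloud_{int(match.group(1))}.png")


def render_all(freq, mask_dir="./mask_img", out_dir="./wordcloud", font_path="simhei.ttf"):
    """Render one word cloud per mask image, using the same frequencies each time."""
    # Third-party imports live here so the path helper works without them
    import cv2
    from wordcloud import WordCloud

    os.makedirs(out_dir, exist_ok=True)
    for name in os.listdir(mask_dir):
        mask = cv2.imread(os.path.join(mask_dir, name))
        if mask is None:  # not an image, or unreadable
            continue
        # WordCloud skips pure-white pixels, so invert the mask to make
        # the person the drawable (non-white) region
        mask = 255 - mask
        wc = WordCloud(mask=mask, background_color="black", font_path=font_path)
        wc.generate_from_frequencies(freq)
        wc.to_file(wordcloud_path_for(name, out_dir))
```

Calling render_all(freq) with the Counter built above would fill ./wordcloud with the numbered images that the composition step expects.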

Compose Final Video

Combine the generated word‑cloud images into a video with OpenCV, then add background music using moviepy.

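import cv2
# Lowercase 'mp4v' avoids OpenCV's unsupported-tag warning for .mp4 files
videoWriter = cv2.VideoWriter('result.mp4', cv2.VideoWriter_fourcc(*'mp4v'), 30, (1920,1080))
for i in range(1, 89):  # 88 word-cloud frames in this example
    frame = cv2.imread(f'./wordcloud/wordcloud_{i}.png')
    if frame is None:  # skip frames that are missing or unreadable
        continue
    videoWriter.write(cv2.resize(frame, (1920,1080)))
videoWriter.release()

# moviepy.editor exists only in moviepy 1.x; version 2 removed it
import moviepy.editor as mpy
clip = mpy.VideoFileClip('result.mp4')
# Trim the song to the video's length (at most 25 s) so the audio does not outlast the picture
audio = mpy.AudioFileClip('song.mp3').subclip(0, min(25, clip.duration))
final = clip.set_audio(audio)
final.write_videofile('final_video.mp4')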
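
The frame count, the frame rate, and the audio length are all hard-coded above, so it is worth checking that they agree: 88 frames at 30 fps last just under 3 seconds, far less than a 25-second audio clip. The arithmetic, as a small sketch with illustrative helper names:

```python
def video_duration(frame_count, fps):
    """Length in seconds of a video with frame_count frames at fps."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frame_count / fps


def audio_end(frame_count, fps, audio_length):
    """End time for the audio subclip so it never outlasts the video."""
    return min(audio_length, video_duration(frame_count, fps))


if __name__ == "__main__":
    print(round(video_duration(88, 30), 2))  # 2.93
```

For the video to fill the whole 25 seconds at 30 fps, about 750 word-cloud frames would be needed.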

Result

The completed video shows a dancing word cloud synchronized with background music, demonstrating the full pipeline from video download to AI‑based segmentation and visual storytelling.
