Create a Dancing Word‑Cloud Video with Python and AI
This tutorial walks through downloading a dance video, extracting frames, using Baidu AI for person segmentation, generating word‑cloud masks, and stitching the results into a dancing word‑cloud video with Python, OpenCV and the WordCloud library.
This guide builds a dancing word‑cloud video in Python in five steps: download a dance clip, split it into frames, extract the human figure with Baidu AI, render a word cloud inside each figure mask, and assemble the rendered images into a new video.
1. Download the source video
Use the you-get tool to fetch any online video (e.g., from Bilibili):
you-get url
2. Split the video into frames
OpenCV reads the video and saves every N‑th frame as an image:
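import os
import cv2

os.makedirs('img', exist_ok=True)  # frames are written to img/
vc = cv2.VideoCapture(r'美女跳舞视频.flv')  # the clip downloaded in step 1
n = 1        # position of the frame being read
timeF = 10   # save one frame every 10 frames
num = 0      # number of frames saved so far
while True:
    ret, frame = vc.read()
    if not ret:  # end of the video, or the file could not be read
        break
    if n % timeF == 0:
        num += 1
        cv2.imwrite(f'img/img_{num}.jpg', frame)
    n += 1
vc.release()
[Figure: sample extracted frame]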
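Because only every `timeF`-th frame is kept, it can help to estimate how many images the loop will write before running it. The sketch below mirrors the loop's counter logic in plain Python. `saved_frame_numbers` is a hypothetical helper, not part of OpenCV, and the 225-frame clip is a made-up example.

```python
def saved_frame_numbers(total_frames, step):
    """Return the 1-based frame positions that the extraction loop keeps.

    Mirrors the loop above: the counter starts at 1 and a frame is saved
    whenever the counter is divisible by `step`.
    """
    if step < 1:
        raise ValueError("step must be at least 1")
    return [n for n in range(1, total_frames + 1) if n % step == 0]


# Hypothetical clip with 225 frames, keeping one frame in every 10.
kept = saved_frame_numbers(225, 10)
print(len(kept))   # number of images that would be written
print(kept[:3])    # first few source frames that are kept
```

With these made-up numbers the loop would write 22 images. A smaller step gives smoother motion but more API calls in the next step.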
3. Person segmentation with Baidu AI
Create a Baidu AI application, obtain its AppID, API Key and Secret Key, then call the body‑segmentation API on each frame:
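import base64
import os

import cv2
import numpy as np
from aip import AipBodyAnalysis  # Baidu AI Python SDK (pip install baidu-aip)

APP_ID = 'YOUR_APP_ID'
API_KEY = 'YOUR_API_KEY'
SECRET_KEY = 'YOUR_SECRET_KEY'
client = AipBodyAnalysis(APP_ID, API_KEY, SECRET_KEY)

img_files = os.listdir('img')  # frames saved in step 2
for num in range(1, len(img_files) + 1):
    img = f'img/img_{num}.jpg'
    height, width = cv2.imread(img).shape[:2]  # size of the original frame
    with open(img, 'rb') as fp:
        img_info = fp.read()
    seg_res = client.bodySeg(img_info)
    # labelmap is a base64-encoded image: 1 = person, 0 = background
    labelmap = base64.b64decode(seg_res['labelmap'])
    nparr = np.frombuffer(labelmap, np.uint8)
    labelimg = cv2.imdecode(nparr, 1)
    # scale the label map to the frame size without blending label values
    labelimg = cv2.resize(labelimg, (width, height), interpolation=cv2.INTER_NEAREST)
    # turn person pixels white (255); the background stays black (0)
    mask = np.where(labelimg == 1, 255, labelimg).astype(np.uint8)
    cv2.imwrite(f'mask_{num}.png', mask)
Console screenshots of the Baidu AI console (app creation, API keys, and body‑analysis page) are omitted for brevity.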
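Remote API calls can fail intermittently (network errors, rate limits), and a single failed frame would leave a gap in the mask sequence. Here is a minimal retry sketch, assuming that a successful response is a dict containing the `labelmap` key used above. `segment_with_retry` and its parameters are hypothetical names, and the stub stands in for `client.bodySeg` so the example runs without credentials.

```python
import time


def segment_with_retry(segment, image_bytes, retries=3, delay=1.0):
    """Call `segment(image_bytes)` until the result contains 'labelmap'.

    `segment` is any callable shaped like client.bodySeg.
    Raises RuntimeError if no attempt returns a usable result.
    """
    last = None
    for attempt in range(1, retries + 1):
        last = segment(image_bytes)
        if isinstance(last, dict) and 'labelmap' in last:
            return last
        if attempt < retries:
            time.sleep(delay)  # wait before trying again
    raise RuntimeError(f'segmentation failed after {retries} attempts: {last!r}')


# Stub standing in for client.bodySeg: fails once, then succeeds.
calls = []


def fake_body_seg(image_bytes):
    calls.append(image_bytes)
    if len(calls) < 2:
        return {'error_msg': 'simulated failure'}  # made-up error payload
    return {'labelmap': 'ZmFrZQ=='}  # placeholder base64 string


result = segment_with_retry(fake_body_seg, b'jpeg bytes', retries=3, delay=0)
print(result['labelmap'], len(calls))
```

In the loop above, `client.bodySeg(img_info)` could be replaced by `segment_with_retry(client.bodySeg, img_info)`. The retry count and delay are arbitrary defaults and should be tuned to your account's quota.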
4. Generate word‑cloud masks
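Read the comments, segment them into words with jieba, and render a word cloud shaped by each mask image:
import jieba
import numpy as np
from PIL import Image
from wordcloud import STOPWORDS, WordCloud

# the text is the same for every frame, so read and segment it once
# (assumes comment.txt is UTF-8 encoded)
with open('comment.txt', 'r', encoding='utf-8') as f:
    text = f.read()
word_str = " ".join(jieba.cut(text))
stopwords = set(STOPWORDS) | {'一个'}  # also drop the filler word '一个' ("a"/"one")

for num in range(1, 23):  # 22 masks in this example
    # WordCloud leaves white (255) areas empty, so invert the mask:
    # the dancer becomes 0 (filled with words), the background 255 (blank)
    mask = 255 - np.array(Image.open(f'mask_{num}.png'))
    wc = WordCloud(stopwords=stopwords, collocations=False,
                   background_color='white', font_path=r"K:\msyh.ttc",
                   width=400, height=300, random_state=42, mask=mask)
    wc.generate(word_str)
    wc.to_file(f'ciyun_{num}.png')
[Figure: example word‑cloud result]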
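Word size in the cloud follows how often each token occurs. To inspect or adjust those counts before rendering, they can be computed by hand and passed to `WordCloud.generate_from_frequencies`, which accepts a word-to-count mapping. The sketch below uses only the standard library. `word_frequencies`, its `min_length` parameter, and the sample tokens are illustrative assumptions; real input would come from `jieba.cut`.

```python
from collections import Counter


def word_frequencies(tokens, stopwords=(), min_length=2):
    """Count tokens, skipping stopwords, blanks, and very short tokens."""
    stop = set(stopwords)
    cleaned = (t.strip() for t in tokens)
    return Counter(t for t in cleaned
                   if len(t) >= min_length and t not in stop)


# Made-up tokens standing in for the output of jieba.cut(text):
# '跳舞' = dance, '好看' = good-looking, '一个' = a/one, '的' = a particle.
tokens = ['跳舞', '好看', '一个', '跳舞', ' ', '的', '好看', '跳舞']
freqs = word_frequencies(tokens, stopwords={'一个'})
print(freqs.most_common(2))
```

The resulting mapping could be passed as `wc.generate_from_frequencies(freqs)` in place of `wc.generate(word_str)`. That path does not apply the `stopwords` argument, which is why the filtering happens here.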
5. Assemble the final video
Combine the generated word‑cloud images into a video with OpenCV:
import cv2

video_address = 'tiaowu.mp4'
fps = 20
img_size = (1080, 1920)  # (width, height) of the output video
fourcc = cv2.VideoWriter_fourcc(*'mp4v')  # MPEG-4 codec for .mp4 output
videoWriter = cv2.VideoWriter(video_address, fourcc, fps, img_size)
for num in range(1, 23):  # same 22 images rendered in step 4
    img_path = f'ciyun_{num}.png'
    # every frame must match img_size, so resize before writing
    frame = cv2.resize(cv2.imread(img_path), img_size)
    videoWriter.write(frame)
videoWriter.release()
[Figure: resulting dancing word‑cloud video (GIF preview)]
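The output frame rate determines how fast the dancer appears to move. Since step 2 kept only one frame in every `timeF`, the relation between source and output timing is simple arithmetic, worked out in the sketch below. `playback_stats` is a hypothetical helper, and the 30 fps source rate is an assumed value (the real one can be read with `vc.get(cv2.CAP_PROP_FPS)`).

```python
def playback_stats(num_images, out_fps, step, src_fps):
    """Return (duration_seconds, speed_factor) for the assembled video.

    duration: how long the output plays.
    speed_factor: output motion speed relative to the source clip
    (1.0 means real time, above 1.0 means sped up).
    """
    duration = num_images / out_fps
    # Each output frame advances the source by `step` frames.
    speed_factor = (out_fps * step) / src_fps
    return duration, speed_factor


# 22 images at 20 fps, sampled every 10 frames from an assumed 30 fps source.
duration, speed = playback_stats(22, 20, 10, 30.0)
print(f'{duration:.2f} s, {speed:.2f}x')
```

Under these assumptions, setting `fps` to `src_fps / step` (3 here) would play the dance back at roughly its original speed, at the cost of choppier motion.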
6. Summary
This article demonstrated a complete pipeline: download a dance video, extract frames, segment the dancer with Baidu AI, render a word cloud inside each mask from the comment text, and stitch the rendered images into a new video that produces a "dancing word cloud" effect.
Python Crawling & Data Mining