How AI‑Powered Follow‑Bullet Screens Transform Interactive Video
This article explains the AI‑driven follow‑bullet screen feature that attaches dynamic comment bubbles to detected faces in video, detailing its three‑layer architecture, implementation challenges, and future extensions for richer interactive experiences.
Introduction
Traditional bullet comments scroll across the screen. The new AI-based follow-bullet screen instead attaches comment bubbles to detected characters' faces, moving with them and offering richer interaction than conventional scrolling comments.
Follow‑Bullet Screen Architecture
The system consists of three layers: an algorithm side, a server side, and a client side.
Algorithm Layer
Video frame extraction at 25 fps (configurable) to obtain frames for processing.
Model training using multi‑angle character images to build a face library.
Face detection on each frame to obtain coordinates.
Face tracking to link the same face across consecutive frames.
Smoothing to reduce jitter in the face trajectory.
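The article does not include code, but the smoothing step above can be sketched as a centered moving average over the per-frame face positions. This is a minimal illustration, not the production algorithm; the function name and window size are assumptions.

```python
def smooth_trajectory(points, window=5):
    """Reduce jitter in a face trajectory with a centered moving average.

    points: list of (x, y) face-center coordinates, one per frame.
    window: number of frames averaged (assumed value, not from the source).
    """
    half = window // 2
    smoothed = []
    for i in range(len(points)):
        # Clamp the averaging window at the trajectory boundaries.
        lo, hi = max(0, i - half), min(len(points), i + half + 1)
        xs = [p[0] for p in points[lo:hi]]
        ys = [p[1] for p in points[lo:hi]]
        smoothed.append((sum(xs) / len(xs), sum(ys) / len(ys)))
    return smoothed
```

A real pipeline might use a Kalman filter or spline fitting instead; the moving average only conveys the idea of trading positional accuracy for visual stability.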
Server Layer
Noise reduction to filter out transient faces.
Anti‑shake processing to further smooth trajectories.
Merging frame‑level metadata into continuous face tracks.
Generating bubble‑style bullet data linked to each face.
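The server-side merging and noise-reduction steps can be sketched as grouping per-frame detections by track ID and discarding tracks that are visible for only a few frames. The data shape, function name, and threshold are assumptions for illustration.

```python
from collections import defaultdict

def merge_tracks(detections, min_frames=12):
    """Merge frame-level face detections into continuous tracks and
    drop transient tracks (likely detector noise).

    detections: iterable of (frame_index, track_id, x, y).
    min_frames: minimum visible frames to keep a track; 12 frames is
                roughly half a second at 25 fps (assumed threshold).
    """
    tracks = defaultdict(list)
    for frame, tid, x, y in sorted(detections):
        tracks[tid].append((frame, x, y))
    # Noise reduction: filter out faces that appear only briefly.
    return {tid: pts for tid, pts in tracks.items() if len(pts) >= min_frames}
```

The surviving tracks would then be paired with bubble-style bullet data before delivery to the client.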
Client Layer
The interactive SDK loads scripts that define interactions such as rating and tipping.
Face scripts contain trajectory coordinates and the associated bubble data; a timer polls the playback time and renders each bubble next to its face.
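The client-side lookup described above can be sketched as interpolating the face trajectory at the current playback time. The client is presumably a mobile or web SDK rather than Python; this sketch, with assumed names and data shapes, only shows the lookup logic.

```python
import bisect

def bubble_position(track, t_ms):
    """Return the bubble anchor (x, y) at playback time t_ms.

    track: list of (timestamp_ms, x, y) trajectory samples, sorted by time.
    Positions between samples are linearly interpolated; times outside
    the track are clamped to its endpoints.
    """
    times = [s[0] for s in track]
    i = bisect.bisect_right(times, t_ms)
    if i == 0:
        return track[0][1], track[0][2]
    if i == len(track):
        return track[-1][1], track[-1][2]
    (t0, x0, y0), (t1, x1, y1) = track[i - 1], track[i]
    a = (t_ms - t0) / (t1 - t0)
    return x0 + a * (x1 - x0), y0 + a * (y1 - y0)
```

A timer callback would call this each tick with the player's current time and move the bubble view accordingly.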
Why Not Perform Face Detection on the Client?
Real‑time constraints: full pipeline (tracking, smoothing, merging) would be too slow for live playback.
Insufficient accuracy on mobile SDKs leads to missed detections and requires costly frame interpolation.
Higher CPU usage increases power consumption and can cause player stutter.
Tricky Issues
Bullet persistence across scene cuts: When a user sends a bullet just before a cut, the tracked face disappears. The solution is to fade the bullet out after the cut, so it remains briefly visible rather than lingering over the unrelated next scene.
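The fade-out behavior at a scene cut can be sketched as a simple opacity ramp: full opacity up to the cut, then a linear fade to zero. The fade duration is an assumed parameter, not specified in the source.

```python
def bubble_opacity(t_ms, cut_ms, fade_ms=500):
    """Opacity of a bullet bubble around a scene cut.

    Fully visible before the cut, then fades linearly to 0 over
    fade_ms milliseconds (assumed duration).
    """
    if t_ms <= cut_ms:
        return 1.0
    return max(0.0, 1.0 - (t_ms - cut_ms) / fade_ms)
```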
Data misalignment after video edits: Frequent cuts in variety shows shift face metadata relative to the timeline. By processing a short segment around the edit point, the exact millisecond offset can be calculated and applied to all subsequent data.
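Once the millisecond offset at an edit point is known, realignment amounts to shifting the timestamps of all metadata at or after that point. A minimal sketch, with assumed names and data shapes:

```python
def realign_metadata(entries, edit_ms, offset_ms):
    """Apply a measured offset to face metadata after a video edit.

    entries:   list of (timestamp_ms, payload) face-metadata records.
    edit_ms:   playback time of the detected edit point.
    offset_ms: offset computed by matching a short segment around the
               edit point (may be negative if content was removed).
    Entries before the edit point are left untouched.
    """
    return [(t + offset_ms if t >= edit_ms else t, p) for t, p in entries]
```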
Future Outlook
Beyond follow‑bullet screens, the face and body data will become basic scripts for other interactions such as bullet‑through‑people, personalized overlays, or automated content moderation like face‑based mosaicking.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.
