Bilibili Tech
Aug 27, 2024 · Artificial Intelligence
Multimodal Video Scene Classification for Adaptive Video Processing
The paper presents a multimodal video scene classification system that leverages CLIP‑generated pseudo‑labels and a fine‑tuned image encoder to automatically identify nature, animation/game, and document scenes, enabling more effective adaptive transcoding, intelligent restoration, and quality assessment for user‑generated content on platforms such as Bilibili.
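The pseudo-labeling idea in the summary can be illustrated with a minimal sketch. It assumes image and text-prompt embeddings have already been produced by a CLIP-style encoder; the toy 3-dimensional vectors, the function names, the label strings, and the `min_confidence` and `temperature` values are all hypothetical and not taken from the article. A frame receives the label of the most similar prompt, and low-confidence frames are left unlabeled so they do not pollute the fine-tuning set.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def softmax(scores, temperature):
    """Numerically stable softmax over temperature-scaled scores."""
    peak = max(scores)
    exps = [math.exp((s - peak) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pseudo_label(image_emb, prompt_embs, min_confidence=0.5, temperature=0.05):
    """Return (label, confidence); label is None when confidence is too low.

    prompt_embs maps a scene label to the embedding of its text prompt.
    Both thresholds are illustrative assumptions, not values from the article.
    """
    labels = list(prompt_embs)
    sims = [cosine(image_emb, prompt_embs[name]) for name in labels]
    probs = softmax(sims, temperature)
    best = max(range(len(labels)), key=probs.__getitem__)
    if probs[best] < min_confidence:
        return None, probs[best]
    return labels[best], probs[best]


# Toy stand-ins for text-prompt embeddings of the three scene classes.
PROMPTS = {
    "nature": [1.0, 0.0, 0.0],
    "animation_game": [0.0, 1.0, 0.0],
    "document": [0.0, 0.0, 1.0],
}

if __name__ == "__main__":
    print(pseudo_label([0.9, 0.1, 0.0], PROMPTS))  # clearly closest to "nature"
    print(pseudo_label([1.0, 1.0, 1.0], PROMPTS))  # ambiguous, left unlabeled
```

In a real pipeline the retained (frame, label) pairs would then serve as training data for fine-tuning the image encoder; that step is omitted here.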
Bilibili multimedia · CLIP · ResNet