Short Video Content Tagging: Multimodal AI Model Framework and Applications
The framework tags short videos by fusing text, image, and audio-video features. Specialized extraction, classification, generative, and retrieval modules produce candidate tags, and a multimodal BERT model then ranks those candidates. The result is accurate, business-specific tags that improve recommendation, search, and advertising.
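The generate-then-rank flow described above can be sketched in a few lines of Python. This is only an illustrative outline, not the framework's actual implementation: the `Video` fields, the two candidate modules, and the `modality_overlap_score` function are all hypothetical. In particular, the overlap score is a trivial stand-in for the multimodal BERT ranker, which the source names but does not specify.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Video:
    # Hypothetical, simplified stand-ins for the three modalities.
    title: str                # text modality
    frame_labels: List[str]   # labels assumed to come from an image model
    audio_transcript: str     # text assumed to come from speech recognition


# A candidate module maps a video to a list of proposed tags.
CandidateModule = Callable[[Video], List[str]]


def extraction_module(video: Video) -> List[str]:
    """Assumed extraction step: keep longer words from the title."""
    return [w.lower() for w in video.title.split() if len(w) > 3]


def classification_module(video: Video) -> List[str]:
    """Assumed classification step: reuse the per-frame labels."""
    return list(video.frame_labels)


def generate_candidates(video: Video, modules: List[CandidateModule]) -> List[str]:
    """Merge candidates from all modules, dropping duplicates in first-seen order."""
    seen = set()
    merged = []
    for module in modules:
        for tag in module(video):
            if tag not in seen:
                seen.add(tag)
                merged.append(tag)
    return merged


def modality_overlap_score(video: Video, tag: str) -> int:
    """Placeholder ranker: count the modalities that mention the tag.

    A real system would score (video, tag) pairs with a multimodal model.
    """
    return (
        int(tag in video.title.lower())
        + int(tag in video.frame_labels)
        + int(tag in video.audio_transcript.lower())
    )


def tag_video(video: Video, top_k: int = 3) -> List[str]:
    """Generate candidates, then keep the top_k highest-scoring tags."""
    candidates = generate_candidates(
        video, [extraction_module, classification_module]
    )
    # sorted() is stable, so ties keep their candidate order.
    ranked = sorted(candidates, key=lambda t: -modality_overlap_score(video, t))
    return ranked[:top_k]
```

Calling `tag_video` on a `Video` returns at most `top_k` tags, ordered by how many modalities support each one. Separating candidate generation from ranking means generative or retrieval modules could be added to the module list without changing the ranking step.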