Short Video Tagging Using Neural Networks
The paper presents a gated-attention neural network that fuses audio, visual, and title-text features to automatically generate high-quality tags for short videos. The model achieves state-of-the-art performance on the YouTube-8M challenge and supports scalable tagging and recommendation services. Planned work includes broader tag coverage and temporal segment tagging.
This technical article discusses the development of neural network models for short video tagging, addressing the challenge of automatically generating relevant tags for short video content. The research focuses on leveraging audio-visual features and text data to improve tagging accuracy and efficiency.
The authors propose a gated attention neural network architecture that combines aggregated audio and video features with text features. This model achieved state-of-the-art performance in the YouTube-8M video understanding challenge, outperforming existing single-model solutions by 0.3 percentage points in Global Average Precision (GAP). The system has been deployed in practical applications, where it covers thousands of high-quality content tags and dozens of category tags.
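For readers unfamiliar with the metric, the following is a minimal sketch of GAP as it is commonly defined for YouTube-8M: each video's top-k predictions are pooled across the whole dataset, sorted by confidence, and scored with a single average precision. The function name, the argument layout, and the default of k = 20 are assumptions made for illustration and are not taken from the article.

```python
import numpy as np

def global_average_precision(scores, labels, top_k=20):
    """Sketch of GAP: pool each video's top_k predictions, then compute one AP.

    scores: (num_videos, num_classes) predicted confidences.
    labels: (num_videos, num_classes) binary ground truth.
    top_k:  predictions kept per video (20 is assumed here).
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    num_videos, num_classes = scores.shape
    k = min(top_k, num_classes)

    # Keep the k highest-scoring classes for every video.
    top_idx = np.argsort(-scores, axis=1)[:, :k]
    rows = np.arange(num_videos)[:, None]
    pooled_scores = scores[rows, top_idx].ravel()
    pooled_hits = labels[rows, top_idx].ravel()

    # Sort the pooled predictions by confidence, highest first.
    order = np.argsort(-pooled_scores, kind="stable")
    hits = pooled_hits[order]

    total_positives = labels.sum()
    if total_positives == 0:
        return 0.0

    # Precision at each rank, counted only at ranks that are correct.
    precision_at_rank = np.cumsum(hits) / np.arange(1, len(hits) + 1)
    return float((precision_at_rank * hits).sum() / total_positives)
```

Because all predictions are ranked in one global list, a model is rewarded for producing confidences that are comparable across videos, not only for ordering tags correctly within each video.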
Key innovations include a gated attention mechanism for feature aggregation that learns the importance of different feature components through a bottleneck structure. The model also incorporates text features extracted from video titles by a neural network. The current implementation is deployed in internal business applications such as short video tagging and recommendation systems, where it provides a stable tagging service.
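The idea of gating through a bottleneck can be illustrated with a small NumPy sketch. It is not the authors' implementation: the attention pooling, the squeeze-and-expand gate, the reduction ratio, the random initialization, and every function and parameter name below are assumptions chosen to show the general pattern of weighting feature components by a learned importance score.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_pool(frames, w_att):
    """Aggregate frame features (T, D) into one vector (D,) with attention weights."""
    weights = softmax(frames @ w_att)   # one weight per frame, sums to 1
    return weights @ frames

def bottleneck_gate(x, w_down, w_up):
    """Rescale each component of x by a gate in (0, 1) computed via a bottleneck."""
    hidden = np.maximum(0.0, x @ w_down)  # compress to a smaller dimension
    gate = sigmoid(hidden @ w_up)         # expand back to one gate per component
    return gate * x

def init_params(d_audio, d_visual, d_title, reduction=4, seed=0):
    """Hypothetical random parameters; a real model would learn these."""
    rng = np.random.default_rng(seed)
    d = d_audio + d_visual + d_title
    hidden = max(1, d // reduction)
    return {
        "att_audio": rng.normal(scale=0.1, size=d_audio),
        "att_visual": rng.normal(scale=0.1, size=d_visual),
        "w_down": rng.normal(scale=0.1, size=(d, hidden)),
        "w_up": rng.normal(scale=0.1, size=(hidden, d)),
    }

def fuse(audio_frames, visual_frames, title_vec, params):
    """Pool audio and visual frames, append the title vector, then gate the result."""
    a = attention_pool(audio_frames, params["att_audio"])
    v = attention_pool(visual_frames, params["att_visual"])
    fused = np.concatenate([a, v, title_vec])
    return bottleneck_gate(fused, params["w_down"], params["w_up"])
```

In a full model, the gated vector would feed a classifier that outputs one score per tag. The bottleneck keeps the gate cheap to compute while still letting it depend on all three modalities at once.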
Future directions include expanding tag coverage, improving feature extraction models, and developing specialized models for underrepresented tag types. The research also explores extending capabilities from video-level tagging to precise time-segment tagging for longer videos.
iQIYI Technical Product Team