Tagged articles
5 articles
DataFunSummit
Jan 20, 2024 · Artificial Intelligence

Cross‑Modal Video Open‑Tag Mining: Techniques, Methods, and Applications

The article presents a comprehensive overview of cross‑modal video open‑tag mining, detailing its technical background, related multimodal research methods, a four‑stage open‑tag solution from 360 AI Research Institute, and future application prospects such as unsupervised tag coverage, semantic retrieval, and content moderation.

Cross-modal · Multimodal AI · label extraction
Meituan Technology Team
Apr 14, 2022 · Artificial Intelligence

Short Video Content Understanding and Generation Practices at Meituan

The article describes how Meituan uses computer‑vision techniques to tag, analyze, and automatically generate short videos across consumer and merchant scenarios. It covers hierarchical tag design, self‑supervised representation learning, fine‑grained food recognition, intelligent cover creation, and pixel‑level editing, all aimed at improving content discovery and presentation.

AI content generation · Computer Vision · fine-grained recognition
NetEase Media Technology Team
Apr 11, 2022 · Artificial Intelligence

Multimodal Video Tagging: Challenges and a Two‑Stage Recall‑Ranking Solution

To tackle the massive, multimodal tagging challenge of short‑video platforms—characterized by a huge long‑tail tag set, sparse annotations, and uneven modality contributions—the authors propose a two‑stage recall‑ranking system that first retrieves candidates via text, visual, audio and classification cues, then refines them with contrastive learning and extensive hard‑negative sampling, achieving 0.884 tag accuracy in a real‑world news video recommender.

Embedding · Multimodal Learning · Recommendation Systems
Youku Technology
May 18, 2020 · Artificial Intelligence

How Feature-Induced Manifold Disambiguation Improves Video Tagging in Multi-View Learning

The paper "Feature‑Induced Manifold Disambiguation for Multi‑view Partial Multi‑label Learning", accepted at KDD 2020, introduces the MVPML framework and the FIMAN method, which use heterogeneous multimodal features to correct and supplement video tags and thereby improve distribution efficiency on Alibaba Entertainment’s platforms.

Alibaba Entertainment · KDD 2020 · manifold disambiguation
DataFunTalk
Sep 29, 2019 · Artificial Intelligence

UC Information Flow Video Tag Recognition: System Architecture and Multi‑Modal Algorithms

This article presents a comprehensive overview of UC's information‑flow video tag recognition technology, detailing tag usage scenarios, the end‑to‑end system architecture, multi‑modal feature extraction, advanced deep‑learning models such as NeXtVLAD, behavior and person tagging methods, and future research directions.

Computer Vision · Deep Learning · Multimodal Learning