How Feature-Induced Manifold Disambiguation Improves Video Tagging in Multi-View Learning

The paper "Feature‑Induced Manifold Disambiguation for Multi‑view Partial Multi‑label Learning," accepted at KDD 2020, introduces the MVPML framework and the FIMAN method. Both use heterogeneous multimodal features to correct and supplement video tags, which improves distribution efficiency on Alibaba Entertainment’s platforms.

Youku Technology

Conference Overview

The ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) is a premier international venue for data‑mining research. In its 2020 edition, which was planned for San Diego, California, and held virtually, the Research Track received 1,279 valid submissions and accepted 216 papers, an acceptance rate of approximately 16.9%.

Paper Acceptance and Authors

The paper, a collaboration between Alibaba Entertainment’s MoCo Lab and Southeast University’s PALM Lab titled "Feature‑Induced Manifold Disambiguation for Multi‑view Partial Multi‑label Learning," was selected for the KDD 2020 Research Track. The authors are Jing‑Han Wu, Xuan Wu, Qing‑Guo Chen, Yao Hu, and Min‑Ling Zhang.

Problem Statement

In short‑video distribution, the accuracy and completeness of user‑generated tags are critical for effective recommendation and search. However, because many uploaders are non‑professional, their tags are often wrong or incomplete, which degrades overall distribution efficiency.

Proposed Framework and Method

The paper abstracts this task into a Multi‑View Partial Multi‑label (MVPML) learning framework that captures the rich multimodal information inherent in video content (e.g., visual, audio, and textual cues). Building on this framework, the authors propose the FIMAN (Feature‑Induced Manifold Disambiguation) method, which exploits the manifold structure induced by heterogeneous features to disambiguate and refine partial multi‑label data.

Key steps of FIMAN include:

Extracting modality‑specific features from video data.

Constructing a feature‑induced manifold that reflects the intrinsic relationships among samples.

Applying manifold‑based disambiguation to correct erroneous tags and supplement missing ones.
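
The three steps above can be illustrated with a minimal Python sketch. It is not the FIMAN algorithm from the paper: it substitutes a generic technique (k‑nearest‑neighbour graphs averaged across views, followed by standard label propagation), and the function names knn_affinity and disambiguate, the parameters k, alpha, and n_iter, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def knn_affinity(X, k=3):
    """Row-normalized Gaussian affinity over each sample's k nearest neighbours.

    Assumes k is smaller than the number of samples.
    """
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    np.fill_diagonal(d2, np.inf)             # a sample is not its own neighbour
    idx = np.argsort(d2, axis=1)[:, :k]      # indices of the k nearest neighbours
    rows = np.arange(X.shape[0])[:, None]
    nn_d2 = d2[rows, idx]
    sigma2 = nn_d2.mean() + 1e-12            # bandwidth from neighbour distances
    W = np.zeros_like(d2)
    W[rows, idx] = np.exp(-nn_d2 / sigma2)
    return W / W.sum(axis=1, keepdims=True)


def disambiguate(views, Y, k=3, alpha=0.5, n_iter=50):
    """Propagate noisy tags over a graph fused from several feature views.

    views: list of (n_samples, d_v) feature matrices, one per modality.
    Y:     (n_samples, n_tags) binary matrix of user-provided tags.
    Returns an (n_samples, n_tags) matrix of tag confidence scores.
    """
    # Fuse the per-view graphs by simple averaging (an illustrative choice).
    S = np.mean([knn_affinity(X, k) for X in views], axis=0)
    F = Y.astype(float)
    for _ in range(n_iter):
        # Blend each sample's neighbours' scores with its original tags.
        F = alpha * S @ F + (1 - alpha) * Y
    return F


# Toy data: two clusters of four videos, two views, two tags.
visual = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
                   [10, 10], [10, 11], [11, 10], [11, 11]], dtype=float)
audio = np.array([[0.0], [0.2], [0.1], [0.3],
                  [5.0], [5.2], [5.1], [5.3]])
Y = np.array([[1, 1],   # video 0 carries a spurious second tag
              [1, 0], [1, 0], [1, 0],
              [0, 0],   # video 4 is missing its tag entirely
              [0, 1], [0, 1], [0, 1]])

F = disambiguate([visual, audio], Y)
print(np.round(F, 2))
```

In this toy setup, the spurious tag on video 0 ends up with a lower score than its correct tag, and video 4 receives a positive score for the tag its neighbours share. Thresholding the scores would then yield the corrected and supplemented tag set.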

Results and Impact

Experimental evaluation demonstrates that FIMAN effectively improves tag correctness and completeness, leading to higher video distribution efficiency. The method has already been deployed across various scenarios within Alibaba Entertainment, enhancing recommendation quality and user engagement.
