Self‑Supervised Video Copy Localization with Regional Token Representation
The article presents a self‑supervised framework that adds a regional token structure to a Vision Transformer in order to locate plagiarized video segments. The approach sharply reduces annotation costs, reaches state‑of‑the‑art performance without manual labeling, and has been deployed in production for copyright protection.
With the rapid growth of video content, plagiarism has become a serious issue, prompting the need for accurate and efficient detection methods to protect creators' rights.
The paper "Self‑Supervised Video Copy Localization with Regional Token Representation", by Ant Group's algorithm team, was recently accepted at ECCV 2024. It addresses the task of locating copied video segments and determining their start and end times.
Existing localization algorithms struggle with partial‑region copies such as picture‑in‑picture and rely heavily on costly manual annotations; annotating 90,000 video pairs can take half a year for a small team.
The proposed solution modifies the Vision Transformer architecture by introducing a novel "regional token" structure that captures local visual information, enabling the model to detect fine‑grained copying tactics like screen‑recording and picture‑in‑picture with negligible extra computation.
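The article does not spell out how the regional tokens are wired into the transformer. As a minimal sketch, assume each regional token is tied to one rectangular block of the patch grid and may attend only to the patches inside it. The function names and the even grid split below are illustrative assumptions, not the paper's implementation.

```python
def region_of_patch(patch_idx, grid_w, region_h, region_w):
    """Return the (row, col) of the region that contains a patch."""
    row, col = divmod(patch_idx, grid_w)
    return row // region_h, col // region_w


def build_region_mask(grid_h, grid_w, regions_per_side):
    """Build mask[r][p], True when regional token r may attend to patch p.

    Assumption: the grid_h x grid_w patch grid splits evenly into
    regions_per_side x regions_per_side rectangular regions.
    """
    if grid_h % regions_per_side or grid_w % regions_per_side:
        raise ValueError("patch grid must divide evenly into regions")
    region_h = grid_h // regions_per_side
    region_w = grid_w // regions_per_side
    num_regions = regions_per_side * regions_per_side
    mask = [[False] * (grid_h * grid_w) for _ in range(num_regions)]
    for p in range(grid_h * grid_w):
        r_row, r_col = region_of_patch(p, grid_w, region_h, region_w)
        mask[r_row * regions_per_side + r_col][p] = True
    return mask


# A 4x4 patch grid with 2x2 regions: each regional token sees 4 patches.
mask = build_region_mask(4, 4, 2)
top_left = [p for p, allowed in enumerate(mask[0]) if allowed]
```

Under this reading, a 14x14 patch grid with four regions would add only 4 tokens to 196, which fits the claim of negligible extra computation. Attention APIs differ on whether `True` in a boolean mask means "allowed" or "blocked", so the convention should be checked before passing such a mask to a real model.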
To eliminate the dependence on manual labeling, the team automatically generates synthetic video pairs that simulate various copying strategies, effectively creating a large, diverse training set.
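One way to picture this data generation step is a picture‑in‑picture transform that pastes a shrunken segment of one video into another and records where the copy sits, so the label comes for free. The sketch below is a toy illustration under stated assumptions: videos are lists of frames, frames are lists of pixel rows at the same resolution, each video has at least `min_len` frames, and `make_pip_pair` is a hypothetical helper, not the team's pipeline.

```python
import random


def make_pip_pair(source, background, rng, min_len=2):
    """Paste a random segment of `source` into `background` as picture-in-picture.

    Returns (synthetic_video, label). The label records the copied segment
    as half-open frame ranges in the source and in the synthetic video.
    """
    max_len = min(len(source), len(background))
    length = rng.randint(min_len, max_len)
    src_start = rng.randint(0, len(source) - length)
    dst_start = rng.randint(0, len(background) - length)

    # Copy the background so the input video is left untouched.
    synthetic = [[row[:] for row in frame] for frame in background]
    for offset in range(length):
        # Downscale by keeping every second pixel, then overlay at top-left.
        inset = [row[::2] for row in source[src_start + offset][::2]]
        target = synthetic[dst_start + offset]
        for y, inset_row in enumerate(inset):
            target[y][:len(inset_row)] = inset_row

    label = {
        "source": (src_start, src_start + length),
        "synthetic": (dst_start, dst_start + length),
    }
    return synthetic, label


# Toy data: 4 source frames filled with 1..4, 6 black background frames.
source = [[[i + 1] * 4 for _ in range(4)] for i in range(4)]
background = [[[0] * 4 for _ in range(4)] for _ in range(6)]
video, label = make_pip_pair(source, background, random.Random(0))
```

A fuller generator would presumably mix in other transforms the article mentions, such as simulated screen recording, but the principle is the same: the transform parameters double as the ground‑truth segment boundaries.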
Experimental results show that, without any human‑annotated data, the method surpasses current state‑of‑the‑art approaches in both accuracy and completeness, and can be further improved with as little as 1% of labeled data.
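If "accuracy and completeness" is read as segment‑level precision and recall (an interpretation, since the article does not define the terms), the two quantities can be illustrated on a single time axis. The sketch below is a simplified measure for non‑overlapping segments given in seconds. It is not the official benchmark metric, and the function names are illustrative.

```python
def overlap(a, b):
    """Length of the intersection of two (start, end) intervals in seconds."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def segment_precision_recall(predicted, ground_truth):
    """Temporal precision and recall for non-overlapping (start, end) segments."""
    matched = sum(overlap(p, g) for p in predicted for g in ground_truth)
    pred_len = sum(end - start for start, end in predicted)
    gt_len = sum(end - start for start, end in ground_truth)
    precision = matched / pred_len if pred_len else 0.0
    recall = matched / gt_len if gt_len else 0.0
    return precision, recall


# Prediction covers 10s-20s, the true copy spans 15s-35s: 5s are shared.
precision, recall = segment_precision_recall([(10.0, 20.0)], [(15.0, 35.0)])
```

Here half of the predicted segment is correct (precision 0.5), while only a quarter of the true copy is found (recall 0.25), which shows why both numbers are needed to judge a localization result.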
The researchers also released the VCSL (Video Copy Segment Localization) dataset in 2022, which remains the largest publicly available video plagiarism localization dataset and was featured at CVPR 2022.
Beyond research, the technology has been deployed in Ant Group's digital copyright service platform "Quetz", offering end‑to‑end protection for audio‑video and image creators, including registration, search, and full‑network infringement detection.
AntTech
Technology is the core driver of Ant's future creation.