TransVCL: Attention‑Enhanced Video Copy Localization Network with Flexible Supervision
TransVCL is an end‑to‑end, attention‑enhanced video copy localization network. It combines a custom Transformer, a correlation‑Softmax similarity matrix, and a temporal alignment module with a semi‑supervised learning framework, and achieves state‑of‑the‑art performance on the VCSL and VCDB benchmarks.
User‑generated content (UGC) platforms produce massive volumes of video every day. This brings significant economic value but also raises video copyright infringement problems, because copied footage is often disguised by complex editing operations such as picture‑in‑picture, cropping, rotation, reversal, and splicing.
To address this, Ant Group’s Digital Technology AIoT team proposes TransVCL, an end‑to‑end video copy localization network trained with a semi‑supervised learning strategy. The paper has been accepted by AAAI 2023.
The task requires not only video‑level copy detection but also segment‑level localization of copied fragments. TransVCL directly optimizes from raw frame‑level features using three core components: a custom Transformer‑based feature‑enhancement module, a correlation‑Softmax similarity‑matrix generator, and a temporal‑alignment module for precise segment detection.
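The correlation‑Softmax step can be illustrated with a small sketch. It assumes a dual‑softmax form: the frame‑to‑frame correlation matrix is normalized once along rows and once along columns, and the two results are multiplied. That specific form, the `temperature` value, and the function names are assumptions for illustration, not details confirmed by the text above.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def correlation_softmax(feats_a, feats_b, temperature=0.1):
    """Return a len(feats_a) x len(feats_b) similarity matrix in [0, 1].

    Assumed dual-softmax form (hypothetical reading of "correlation-Softmax"):
    sim[i][j] = softmax over row i at j  *  softmax over column j at i.
    """
    # Scaled dot-product correlation between every pair of frames.
    corr = [
        [sum(x * y for x, y in zip(fa, fb)) / temperature for fb in feats_b]
        for fa in feats_a
    ]
    row_sm = [softmax(row) for row in corr]
    # zip(*corr) iterates over columns, so col_sm is indexed as col_sm[j][i].
    col_sm = [softmax(list(col)) for col in zip(*corr)]
    return [
        [row_sm[i][j] * col_sm[j][i] for j in range(len(feats_b))]
        for i in range(len(feats_a))
    ]


# Toy frame features: video_b repeats both frames of video_a, plus one extra frame.
video_a = [[1.0, 0.0], [0.0, 1.0]]
video_b = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
sim = correlation_softmax(video_a, video_b)
```

In this toy case the entries for matching frame pairs stay close to 1 while mismatched pairs are pushed toward 0, which is the kind of sharpened pattern a downstream alignment module would consume.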
Unlike previous methods that manually construct similarity matrices, TransVCL learns them jointly, allowing self‑attention and cross‑attention layers to fuse long‑range temporal information, resulting in cleaner similarity patterns and higher discrimination.
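A minimal sketch of how self‑attention and cross‑attention can mix temporal information across frames is given here. It uses plain scaled dot‑product attention with no learned projections, multiple heads, or positional encoding, so it is a simplification and not the custom Transformer described above; `attend` and the toy features are hypothetical names.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Scaled dot-product attention over lists of frame feature vectors."""
    dim = len(queries[0])
    outputs = []
    for q in queries:
        # One score per key frame, scaled by sqrt(feature dimension).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # Each output is a weighted average of the value vectors.
        outputs.append([
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ])
    return outputs


video_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
video_b = [[0.0, 2.0], [2.0, 0.0]]

# Self-attention: each frame of video_a looks at all frames of video_a.
a_self = attend(video_a, video_a, video_a)
# Cross-attention: each frame of video_a looks at all frames of video_b.
a_cross = attend(a_self, video_b, video_b)
```

Because every output frame is a weighted average over all frames of the attended sequence, information from distant time steps can influence each frame's representation, which is the long‑range fusion the paragraph refers to.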
A semi‑supervised framework is introduced to exploit abundant unlabeled or weakly labeled video data. A teacher model trained on limited fully‑labeled data generates pseudo‑labels for the remaining data, which are filtered by confidence thresholds and combined with supervised loss during joint training.
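The pseudo‑label filtering and joint loss can be sketched as follows. The `CopySegment` record, the 0.9 threshold, and the `pseudo_weight` factor are hypothetical choices for illustration; the text above does not specify the actual thresholds or how the two loss terms are weighted.

```python
from dataclasses import dataclass


@dataclass
class CopySegment:
    """A copied segment predicted by the teacher model (hypothetical record)."""
    pair_id: str
    start_a: float   # segment start in video A, seconds
    end_a: float     # segment end in video A, seconds
    start_b: float   # segment start in video B, seconds
    end_b: float     # segment end in video B, seconds
    confidence: float


def filter_pseudo_labels(predictions, threshold=0.9):
    """Keep only teacher predictions whose confidence reaches the threshold."""
    return [p for p in predictions if p.confidence >= threshold]


def joint_loss(supervised_loss, pseudo_loss, pseudo_weight=1.0):
    """Combine the loss on labeled data with the loss on pseudo-labeled data."""
    return supervised_loss + pseudo_weight * pseudo_loss


teacher_output = [
    CopySegment("pair-1", 0.0, 12.0, 30.0, 42.0, confidence=0.97),
    CopySegment("pair-2", 5.0, 9.0, 1.0, 5.0, confidence=0.41),
    CopySegment("pair-3", 20.0, 35.0, 0.0, 15.0, confidence=0.90),
]
pseudo_labels = filter_pseudo_labels(teacher_output)
total = joint_loss(supervised_loss=1.0, pseudo_loss=0.5, pseudo_weight=0.5)
```

Here only `pair-1` and `pair-3` survive filtering, so the low‑confidence prediction for `pair-2` never contributes to the pseudo‑label term during joint training.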
Extensive experiments on public datasets VCSL and VCDB demonstrate that TransVCL achieves the current state‑of‑the‑art F‑score, outperforming prior methods by a large margin. Tables and figures illustrate the attention‑enhanced similarity maps, overall architecture, and quantitative gains from the semi‑supervised strategy.
The work is a step forward for AI‑driven copyright protection, and the paper appears at AAAI 2023.
AntTech
Technology is the core driver of Ant's future creation.