
Video Copyright Detection Solution Using SE-ResNeXt and Faiss in the 2019 CCF Big Data & Computational Intelligence Competition

The iQIYI team "都挺好" tackled the video copyright detection track of the 2019 CCF Big Data & Computational Intelligence Competition. They extracted frame-level features with SE-ResNeXt, indexed them with Faiss, aligned matches temporally with a critical-path method, and refined copy boundaries with SIFT re-matching and a sliding-window approach, ultimately achieving an F1 score of 0.9678 over three iterative stages of model selection, cascade detection, and feature fusion.

iQIYI Technical Product Team

The team "都挺好" from iQIYI participated in the video copyright detection track of the 2019 CCF Big Data & Computational Intelligence Competition. They employed a convolutional neural network (SE-ResNeXt) to extract feature vectors from video frames and used Faiss for nearest-neighbor search to retrieve the top-k similar frames. After temporal alignment with the critical-path method, they refined copy boundaries using traditional feature re-matching and a sliding-window matching approach, achieving an F1 score of 0.9678 on the B leaderboard of the final round.
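The critical-path alignment step can be pictured as finding the highest-scoring chain of frame-level matches in which both the query and reference timestamps increase monotonically. The sketch below is a hypothetical minimal dynamic program illustrating that idea; the team's exact formulation (a network-flow variant) is not spelled out in the article.

```python
def align_critical_path(matches):
    """Find the highest-scoring temporally consistent chain of frame matches.

    matches: list of (query_t, ref_t, score) tuples. A valid copied segment
    requires both timestamps to increase along the chain.
    """
    matches = sorted(matches)
    n = len(matches)
    best = [m[2] for m in matches]   # best chain score ending at match i
    prev = [-1] * n                  # back-pointer for chain recovery
    for i in range(n):
        qi, ri, si = matches[i]
        for j in range(i):
            qj, rj, _ = matches[j]
            if qj < qi and rj < ri and best[j] + si > best[i]:
                best[i] = best[j] + si
                prev[i] = j
    # Backtrack from the highest-scoring chain end.
    i = max(range(n), key=best.__getitem__)
    chain = []
    while i != -1:
        chain.append(matches[i][:2])
        i = prev[i]
    return chain[::-1]
```

The outlier match (a frame that jumps backwards in the reference timeline) is naturally excluded, which is what makes this style of alignment robust to spurious nearest-neighbor hits.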

Task Interpretation

The team first investigated the competition task, dividing the work into three stages: (1) system construction and module experimentation, (2) improving feature representation, and (3) optimizing temporal alignment and feature extraction for finer time boundaries.

Solution Sharing

Stage 1: Various CNN architectures (VGG, DenseNet, SE-ResNet, SE-ResNeXt) and traditional features (SIFT, hashing) were tested; SE-ResNeXt yielded the best feature performance. For indexing, the open-source Faiss library was adopted, and a network-flow-based method was used for temporal alignment, forming an initial system framework.
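The retrieval step amounts to inner-product search over L2-normalized frame features, which is what a Faiss flat index (e.g. `IndexFlatIP`) performs at scale. The following is a brute-force NumPy stand-in sketching the same computation; the function name and shapes are illustrative, not from the original system.

```python
import numpy as np

def topk_similar_frames(query_feats, ref_feats, k=5):
    """Top-k nearest reference frames by cosine similarity.

    A brute-force equivalent of inner-product search over L2-normalized
    vectors (what Faiss's IndexFlatIP computes efficiently at scale).
    query_feats: (n_query, d) array; ref_feats: (n_ref, d) array.
    """
    q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    r = ref_feats / np.linalg.norm(ref_feats, axis=1, keepdims=True)
    sims = q @ r.T                              # (n_query, n_ref) cosine matrix
    idx = np.argsort(-sims, axis=1)[:, :k]      # top-k ref indices per query
    scores = np.take_along_axis(sims, idx, axis=1)
    return idx, scores
```

Swapping the brute-force matrix product for a Faiss index changes nothing downstream: the temporal-alignment stage only consumes the (query frame, reference frame, score) triples.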

Stage 2: Analysis of Stage 1 results identified difficult query–reference video pairs. A cascade detector, resembling a Siamese network, was trained on these pairs to obtain higher-quality features. During testing on sets A and B, the cascade detector filtered high-error queries before applying the trained model, further boosting the F1 score.
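The cascade idea, in its simplest form: a cheap coarse score rejects most query–reference pairs early, and only survivors are passed to the expensive verifier (here standing in for the Siamese-style model). This is a hypothetical minimal sketch; the thresholds and scoring functions are placeholders, not the team's actual components.

```python
def cascade_detect(pairs, coarse_score, fine_score,
                   coarse_thresh=0.3, fine_thresh=0.7):
    """Two-stage cascade: cheap filter first, expensive verifier second.

    pairs: iterable of (query_id, ref_id).
    coarse_score / fine_score: callables returning a similarity in [0, 1];
    fine_score plays the role of the Siamese-style model described above.
    """
    accepted = []
    for q, r in pairs:
        if coarse_score(q, r) < coarse_thresh:
            continue                  # rejected cheaply; the model never runs
        if fine_score(q, r) >= fine_thresh:
            accepted.append((q, r))
    return accepted
```

The payoff of a cascade is twofold: total compute drops because the fine model sees few pairs, and precision rises because two independent checks must both agree.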

Stage 3: Three strategies were applied: (1) feature fusion from different CNNs to enhance detection accuracy, (2) SIFT-based boundary refinement to meet the strict 3-second limit in the final round, and (3) a sliding-window time-boundary relocation method to address missing match segments, leading to additional accuracy gains.
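One plausible reading of the sliding-window relocation is: slide a short window over per-frame match scores around a candidate segment, call a window "inside" the copy when its mean score clears a threshold, and snap the segment boundaries to the first and last such window. The sketch below illustrates that assumption; the window size, threshold, and function name are invented for illustration.

```python
def refine_boundaries(frame_scores, window=5, thresh=0.5):
    """Relocate copy-segment boundaries with a sliding window.

    frame_scores: per-frame match scores for a candidate segment plus
    surrounding context. Returns (start_frame, end_frame) of the refined
    segment, or None if no window clears the threshold.
    """
    n = len(frame_scores)
    inside = []
    for start in range(0, n - window + 1):
        mean = sum(frame_scores[start:start + window]) / window
        if mean >= thresh:
            inside.append(start)
    if not inside:
        return None
    return inside[0], inside[-1] + window - 1   # (start frame, end frame)
```

Averaging over a window rather than thresholding single frames tolerates isolated mismatched frames inside a copied segment, which matters when the scoring target is a tight (e.g. 3-second) boundary tolerance.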

Gains and Reflections

The competition was a memorable experience for the team, highlighting the importance of self-motivation, mutual encouragement, rigorous code review, and effective communication. A critical bug discovered during the final hours—two members each applying the frame-to-second conversion, so it was performed twice—caused a sudden drop in true positives; once fixed, the B leaderboard F1 score immediately improved.

Team Introduction and Remarks

The "都挺好" team consists of two members: Bu Qi (team leader, iQIYI) and Wang Hongyu (member, Beihang University). They emphasize that overcoming ranking pressure and algorithmic stagnation requires persistent teamwork, brainstorming, and continuous coding. The team plans to participate in future related contests and hopes to apply their solutions to real-world problems using online datasets.

Tags: deep learning, Faiss, CCF competition, SE-ResNeXt, video copyright detection, video similarity