Video Deduplication on Xianyu Using High‑Dimensional Vector Retrieval
The Xianyu platform combats video plagiarism by extracting key frames, converting them into 1024‑dimensional vectors, and using product quantization‑based high‑dimensional vector retrieval to achieve over 95% recall with ~100 ms latency and more than 1000 QPS, enabling scalable video, image, and product deduplication.
Background: the Xianyu platform faces widespread video plagiarism; the solution converts videos into vectors and uses vector similarity for deduplication.
Challenges: billions of video frames, 1024‑dimensional per‑frame vectors, need >95% recall, latency ~100 ms, QPS >1000.
Implementation includes:
Video vectorization: extract key frames, compute local and global features via custom operators on TensorFlow Lite.
Similarity metrics: Hamming distance, cosine similarity, Euclidean distance, inner product.
Vector retrieval methods: tree‑based (KD‑tree), hashing (LSH), and vector quantization (PQ, hierarchical clustering). PQ was selected for large‑scale performance.
System architecture: client performs on‑device feature extraction; backend provides a unified vector access layer, log synchronization, offline data center, and a vector search engine (Alibaba BE integrated with FAISS).
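The candidate similarity metrics listed above can be sketched in a few lines; this is a minimal illustration in plain NumPy (function names are ours, not from the Xianyu system):

```python
import numpy as np

def cosine_similarity(a, b):
    # Angle-based similarity, invariant to vector magnitude.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def euclidean_distance(a, b):
    # L2 distance between feature vectors.
    return float(np.linalg.norm(a - b))

def inner_product(a, b):
    # Unnormalized dot product; equals cosine similarity for unit vectors.
    return float(np.dot(a, b))

def hamming_distance(a, b):
    # Count of differing positions, typically applied to binarized codes.
    return int(np.count_nonzero(a != b))
```

For 1024-dimensional float features, cosine similarity and inner product coincide once vectors are L2-normalized, which is why retrieval engines often normalize at index time and use the cheaper inner product at query time.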
Results: after deployment, the system handles >1000 QPS, latency ~100 ms per frame, and achieves >95% recall.
Conclusion: the approach demonstrates effective large‑scale video deduplication and can be extended to image and product deduplication.
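To make the product-quantization choice concrete, here is a minimal PQ sketch in plain NumPy: split each vector into sub-vectors, k-means each sub-space into a small codebook, store only codebook indices per vector, and answer queries with asymmetric distance computation (ADC) via precomputed lookup tables. This is an illustrative toy, not the production system, which runs Alibaba BE integrated with FAISS; all names and parameters below are ours.

```python
import numpy as np

def train_pq(vectors, m=4, k=16, iters=10):
    """Train m independent k-means codebooks, one per sub-space."""
    sub_d = vectors.shape[1] // m
    rng = np.random.default_rng(0)
    codebooks = []
    for i in range(m):
        sub = vectors[:, i * sub_d:(i + 1) * sub_d]
        centroids = sub[rng.choice(len(sub), k, replace=False)]
        for _ in range(iters):
            # Assign each sub-vector to its nearest centroid, then update.
            dists = ((sub[:, None, :] - centroids[None]) ** 2).sum(-1)
            assign = dists.argmin(1)
            for c in range(k):
                pts = sub[assign == c]
                if len(pts):
                    centroids[c] = pts.mean(0)
        codebooks.append(centroids)
    return codebooks

def encode(vectors, codebooks):
    """Compress each vector to m one-byte codebook indices."""
    m = len(codebooks)
    sub_d = vectors.shape[1] // m
    codes = np.empty((len(vectors), m), dtype=np.uint8)
    for i, cb in enumerate(codebooks):
        sub = vectors[:, i * sub_d:(i + 1) * sub_d]
        codes[:, i] = ((sub[:, None, :] - cb[None]) ** 2).sum(-1).argmin(1)
    return codes

def search(query, codes, codebooks, topn=5):
    """ADC: precompute query-to-centroid tables, score codes by lookup."""
    m = len(codebooks)
    sub_d = len(query) // m
    tables = [((query[i * sub_d:(i + 1) * sub_d] - cb) ** 2).sum(-1)
              for i, cb in enumerate(codebooks)]
    dists = sum(tables[i][codes[:, i]] for i in range(m))
    return np.argsort(dists)[:topn]
```

The compression is what makes billion-scale retrieval feasible: with m=4 and k=16 here, each vector is stored in 4 bytes instead of 128, and a query touches only m small distance tables plus one table lookup per stored code.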
Xianyu Technology
Official account of the Xianyu technology team