Video Deduplication on Xianyu Using High‑Dimensional Vector Retrieval
The Xianyu platform combats video plagiarism by extracting key frames, converting them into 1024‑dimensional vectors, and using product quantization‑based high‑dimensional vector retrieval to achieve over 95% recall with ~100 ms latency and more than 1000 QPS, enabling scalable video, image, and product deduplication.
Background: the Xianyu platform faces widespread video plagiarism; the solution converts videos into vectors and uses vector similarity for deduplication.
Challenges: billions of video frames, 1024‑dimensional per‑frame vectors, need >95% recall, latency ~100 ms, QPS >1000.
Implementation includes:
Video vectorization: extract key frames, compute local and global features via custom operators on TensorFlow Lite.
Similarity metrics: Hamming distance, cosine similarity, Euclidean distance, inner product.
Vector retrieval methods: tree‑based (KD‑tree), hashing (LSH), and vector quantization (PQ, hierarchical clustering). PQ was selected for large‑scale performance.
System architecture: client performs on‑device feature extraction; backend provides a unified vector access layer, log synchronization, offline data center, and a vector search engine (Alibaba BE integrated with FAISS).
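The candidate similarity metrics listed above can be sketched in a few lines; this is a minimal illustration in plain NumPy (function names are ours, not from the Xianyu system):

```python
import numpy as np

def cosine_similarity(a, b):
    # Angle-based similarity, invariant to vector magnitude.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def euclidean_distance(a, b):
    # L2 distance between feature vectors.
    return float(np.linalg.norm(a - b))

def inner_product(a, b):
    # Unnormalized dot product; equals cosine similarity for unit vectors.
    return float(np.dot(a, b))

def hamming_distance(a, b):
    # Count of differing positions, typically applied to binarized codes.
    return int(np.count_nonzero(a != b))
```

For 1024-dimensional float features, cosine similarity and inner product coincide once vectors are L2-normalized, which is why retrieval engines often normalize at index time and use the cheaper inner product at query time.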
Results: after deployment, the system handles >1000 QPS, latency ~100 ms per frame, and achieves >95% recall.
Conclusion: the approach demonstrates effective large‑scale video deduplication and can be extended to image and product deduplication.
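To make the product-quantization choice concrete, here is a minimal PQ sketch in plain NumPy: split each vector into sub-vectors, k-means each sub-space into a small codebook, store only codebook indices per vector, and answer queries with asymmetric distance computation (ADC) via precomputed lookup tables. This is an illustrative toy, not the production system, which runs Alibaba BE integrated with FAISS; all names and parameters below are ours.

```python
import numpy as np

def train_pq(vectors, m=4, k=16, iters=10):
    """Train m independent k-means codebooks, one per sub-space."""
    sub_d = vectors.shape[1] // m
    rng = np.random.default_rng(0)
    codebooks = []
    for i in range(m):
        sub = vectors[:, i * sub_d:(i + 1) * sub_d]
        centroids = sub[rng.choice(len(sub), k, replace=False)]
        for _ in range(iters):
            # Assign each sub-vector to its nearest centroid, then update.
            dists = ((sub[:, None, :] - centroids[None]) ** 2).sum(-1)
            assign = dists.argmin(1)
            for c in range(k):
                pts = sub[assign == c]
                if len(pts):
                    centroids[c] = pts.mean(0)
        codebooks.append(centroids)
    return codebooks

def encode(vectors, codebooks):
    """Compress each vector to m one-byte codebook indices."""
    m = len(codebooks)
    sub_d = vectors.shape[1] // m
    codes = np.empty((len(vectors), m), dtype=np.uint8)
    for i, cb in enumerate(codebooks):
        sub = vectors[:, i * sub_d:(i + 1) * sub_d]
        codes[:, i] = ((sub[:, None, :] - cb[None]) ** 2).sum(-1).argmin(1)
    return codes

def search(query, codes, codebooks, topn=5):
    """ADC: precompute query-to-centroid tables, score codes by lookup."""
    m = len(codebooks)
    sub_d = len(query) // m
    tables = [((query[i * sub_d:(i + 1) * sub_d] - cb) ** 2).sum(-1)
              for i, cb in enumerate(codebooks)]
    dists = sum(tables[i][codes[:, i]] for i in range(m))
    return np.argsort(dists)[:topn]
```

The compression is what makes billion-scale retrieval feasible: with m=4 and k=16 here, each vector is stored in 4 bytes instead of 128, and a query touches only m small distance tables plus one table lookup per stored code.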
Xianyu Technology
Official account of the Xianyu technology team