Advances in Video Multimodal Retrieval: Video‑Text Semantic Search and Video‑Video Same‑Source Search
This article presents Ant Group's multimodal research on video retrieval, covering two tasks: video‑text semantic search and video‑video same‑source search. It introduces a large Chinese pre‑training dataset; novel techniques for pre‑training, hard‑sample mining, and fine‑grained modeling; and an efficient end‑to‑end copyright detection framework.