Kuaishou Tech
Apr 17, 2024 · Artificial Intelligence

Towards Efficient and Effective Text-to-Video Retrieval with Coarse-to-Fine Visual Representation Learning

Presented at AAAI, the paper introduces EERCF, which pairs coarse-to-fine visual representation learning with a two-stage recall-then-rerank strategy. The approach sharply reduces the FLOPs spent on cross-modal matching while retaining state-of-the-art retrieval performance on multiple video benchmarks.
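The recall-then-rerank idea mentioned in the summary can be illustrated with a generic sketch. This is not the EERCF implementation: the summary does not describe the paper's features or scoring functions, so the function name `recall_then_rerank`, the dot-product scoring, the "best-matching frame" rerank score, and the toy vectors are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def recall_then_rerank(
    text_vec: Vector,
    coarse: Dict[str, Vector],
    fine: Dict[str, List[Vector]],
    top_k: int,
) -> List[str]:
    """Rank videos for a text query in two stages (illustrative only)."""
    # Stage 1 (recall): one cheap dot product per video on its coarse vector.
    recalled = sorted(
        coarse, key=lambda vid: dot(text_vec, coarse[vid]), reverse=True
    )[:top_k]

    # Stage 2 (rerank): score only the recalled videos with the costlier
    # frame-level vectors. Taking the best-matching frame is an assumption.
    def fine_score(vid: str) -> float:
        return max(dot(text_vec, frame) for frame in fine[vid])

    return sorted(recalled, key=fine_score, reverse=True)


# Toy data, made up for illustration.
query = [1.0, 0.0]
coarse_feats = {"a": [0.6, 0.4], "b": [0.5, 0.5], "c": [0.0, 1.0]}
fine_feats = {
    "a": [[0.6, 0.4], [0.7, 0.3]],
    "b": [[0.9, 0.1], [0.1, 0.9]],
    "c": [[0.0, 1.0], [0.1, 0.9]],
}
ranking = recall_then_rerank(query, coarse_feats, fine_feats, top_k=2)
print(ranking)
```

With `top_k=2`, only two of the three toy videos reach the frame-level stage. Skipping the costlier scoring for most candidates is where a pipeline of this shape saves computation.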

AI · Multimodal Learning · coarse-to-fine representation
8 min read