Tag

text-to-video retrieval


Kuaishou Tech
Apr 17, 2024 · Artificial Intelligence

Towards Efficient and Effective Text-to-Video Retrieval with Coarse-to-Fine Visual Representation Learning

This paper, presented at AAAI, introduces EERCF, a method that combines coarse‑to‑fine visual representation learning with a two‑stage recall‑then‑rerank strategy. It substantially reduces the FLOPs spent on cross‑modal matching while preserving state‑of‑the‑art retrieval performance on multiple video benchmarks.
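To make the recall‑then‑rerank idea concrete, here is a minimal Python sketch. It is not the paper's EERCF implementation: the function name `recall_then_rerank`, the dot‑product scoring, the max‑over‑frames fine score, and the toy vectors are all assumptions chosen for illustration. It only shows the general pattern of scoring every video cheaply with a coarse vector, then spending the costlier fine‑grained comparison on the top `k` candidates.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product, used here as a stand-in similarity score."""
    return sum(x * y for x, y in zip(a, b))


def recall_then_rerank(
    text_vec: Vector,
    coarse_vecs: List[Vector],        # one video-level vector per video (assumed)
    fine_vecs: List[List[Vector]],    # several frame-level vectors per video (assumed)
    k: int,
) -> List[int]:
    """Return the indices of the top-k videos, ordered by the fine score."""
    # Stage 1 (recall): one cheap dot product per video in the whole collection.
    coarse_scores = [dot(text_vec, v) for v in coarse_vecs]
    candidates = sorted(
        range(len(coarse_vecs)),
        key=lambda i: coarse_scores[i],
        reverse=True,
    )[:k]

    # Stage 2 (rerank): costlier frame-level scoring, but only for k candidates.
    # Hypothetical fine score: best-matching frame of the video.
    def fine_score(i: int) -> float:
        return max(dot(text_vec, frame) for frame in fine_vecs[i])

    return sorted(candidates, key=fine_score, reverse=True)


# Toy data: 3 videos, 2-dimensional embeddings.
text = [1.0, 0.0]
coarse = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
fine = [
    [[0.7, 0.3], [0.6, 0.4]],
    [[1.0, 0.0], [0.2, 0.8]],
    [[0.0, 1.0]],
]
ranking = recall_then_rerank(text, coarse, fine, k=2)
print(ranking)  # [1, 0]
```

In this toy run the coarse stage keeps videos 0 and 1, and the fine stage moves video 1 to the top because one of its frames matches the query exactly. The fine score is computed for only `k` videos rather than the whole collection, which is where a two‑stage design saves matching cost.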

AI · coarse-to-fine representation · efficiency
0 likes · 8 min read