Kuaishou Tech
Jan 5, 2022 · Artificial Intelligence
How a New Bilingual Video Text Dataset and Transformer Spotter Advance Video OCR
This article reviews the NeurIPS 2021 paper that introduces BOVText, a large-scale bilingual video-text dataset with over 2,000 videos and 1.75 million frames. It also describes the paper's transformer-based end-to-end video text spotter, which integrates EAST encoding into DETR, and covers dataset collection, annotation, architecture, and experimental results.
BOVText · DETR · bilingual dataset
