ICDAR 2023 BDVT-QA Competition: Born Digital Video Text Question Answering
The ICDAR 2023 BDVT-QA competition, organized by Alibaba DAMO Academy, introduces a novel dataset of 1,000 born‑digital video clips for end‑to‑end video text recognition and video text question answering, offering cash prizes, detailed dataset access, and a lineup of leading academic and industry experts.
The International Conference on Document Analysis and Recognition (ICDAR) is the premier conference in document analysis and recognition, gathering top researchers and industry experts every two years to discuss cutting‑edge technologies.
ICDAR 2023 BDVT‑QA (Competition for Born Digital Video Text Question Answering) is hosted by Alibaba DAMO Academy’s Text‑Vision Understanding team, focusing on the frontier of video text technology with both academic and industrial significance.
The competition features two tracks: (1) end‑to‑end video text recognition, which emphasizes fusing and deduplicating text that recurs across frames; (2) video text question answering, the first task of its kind in the industry, which requires understanding text across multiple frames.
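To make "fusion and deduplication" in the first track concrete, here is a minimal sketch in Python. It assumes per-frame recognition results arrive as (frame index, track ID, text) tuples and fuses them by majority vote per track. The function name `fuse_tracks`, the tuple layout, and the sample values are all hypothetical illustrations, not the competition's official format or evaluation method.

```python
from collections import Counter, defaultdict


def fuse_tracks(frame_results):
    """Collapse per-frame readings into one transcription per track.

    frame_results: iterable of (frame_index, track_id, text) tuples.
    This layout is an assumption made for illustration only.
    """
    votes = defaultdict(Counter)
    for _frame_index, track_id, text in frame_results:
        votes[track_id][text] += 1
    # Keep the most frequent reading for each track; this removes duplicates
    # across frames and smooths over occasional misreads.
    return {tid: counts.most_common(1)[0][0] for tid, counts in votes.items()}


# Hypothetical observations: track "t1" is misread once, in frame 2.
observations = [
    (0, "t1", "SALE"),
    (1, "t1", "SALE"),
    (2, "t1", "SA1E"),
    (1, "t2", "50% off"),
]
fused = fuse_tracks(observations)
```

Majority voting is only one simple fusion strategy; a real system might instead weight readings by recognition confidence or text size.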
Prizes for each track are $2,000 for first place, $1,000 for second, and $500 for third.
The released dataset contains 1,000 video clips (8–60 seconds) sourced from the public internet, covering product demos, tool instructions, and film dialogues, primarily in English with some Chinese. Text annotations are polygonal, frame‑level, and include text IDs for the same object across frames; each video provides two QA pairs with reference answers.
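The annotation structure described above could be modeled roughly as follows. This is a sketch under stated assumptions: the class names, field names, and sample values are hypothetical, and the actual file format of the released dataset may differ.

```python
from dataclasses import dataclass, field


@dataclass
class TextInstance:
    """One polygonal text annotation in a single frame (hypothetical schema)."""
    frame_index: int
    text_id: str        # shared by the same text object across frames
    polygon: list       # list of (x, y) vertices
    transcription: str


@dataclass
class QAPair:
    question: str
    answer: str         # reference answer


@dataclass
class VideoSample:
    video_id: str
    instances: list = field(default_factory=list)
    qa_pairs: list = field(default_factory=list)   # two per video in this dataset

    def frames_for(self, text_id):
        """Return the sorted frame indices in which a text object appears."""
        return sorted(i.frame_index for i in self.instances if i.text_id == text_id)


# Illustrative values only; not taken from the real dataset.
box = [(10, 10), (90, 10), (90, 40), (10, 40)]
sample = VideoSample(
    video_id="demo_0001",
    instances=[
        TextInstance(1, "t1", box, "Step 1"),
        TextInstance(0, "t1", box, "Step 1"),
        TextInstance(1, "t2", box, "Open the lid"),
    ],
    qa_pairs=[
        QAPair("What is the first instruction?", "Open the lid"),
        QAPair("Which step is shown?", "Step 1"),
    ],
)
```

Keeping the text ID on every frame-level instance makes it easy to trace one text object through the clip, for example with `sample.frames_for("t1")`.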
The dataset can be downloaded from the ModelScope community (https://modelscope.cn/) by searching for keywords such as "video text QA".
Organizers include senior algorithm experts and professors from Alibaba, Nanjing University, Huazhong University of Science and Technology, and the Chinese Academy of Sciences.
Sponsorship is provided by ModelScope, a leading Chinese AI model open‑source community that aggregates state‑of‑the‑art models and datasets for AI research.
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.