Tagged articles
4 articles
Bilibili Tech
Oct 8, 2024 · Artificial Intelligence

ICDAR 2024 Historical Map Text Recognition Competition: DNTextSpotter Methodology and Results

Bilibili’s DNTextSpotter won the ICDAR 2024 Historical Map Text Recognition competition. The Transformer‑based model builds on DeepSolo and ViTAE‑v2 and uses deformable self‑attention, dual‑query decoding, and denoising training. Combined with mixed‑vocabulary fine‑tuning and advanced loss functions, and evaluated under the strict PDQ/PWQ/PCQ metrics, it achieves state‑of‑the‑art detection and recognition of dense, rotated, arbitrarily shaped text on historical maps and real‑world multimedia.

DNTextSpotter · Deep Learning · Evaluation Metrics
17 min read
DataFunTalk
Feb 10, 2023 · Artificial Intelligence

ICDAR 2023 BDVT-QA Competition: Born Digital Video Text Question Answering

The ICDAR 2023 BDVT-QA competition, organized by Alibaba DAMO Academy, introduces a novel dataset of 1,000 born‑digital video clips for end‑to‑end video text recognition and video text question answering, offering cash prizes, detailed dataset access, and a lineup of leading academic and industry experts.

AI · Dataset · ICDAR
5 min read
Kuaishou Tech
Dec 26, 2022 · Artificial Intelligence

ICDAR 2023-DSText Video Text Reading Competition Overview

The ICDAR 2023-DSText competition, launching on February 15, 2023, focuses on dense and small text detection and recognition in video, providing a YouTube‑sourced dataset of 100 videos, two challenge tasks, a detailed timeline, eligibility rules, and a list of international sponsoring institutions.

Computer Vision · Dataset · ICDAR
6 min read
Meituan Technology Team
Sep 26, 2019 · Artificial Intelligence

Efficient Scene Text Detection Framework with Feature Pyramid and Expanded High-Level Feature Maps

The paper presents an efficient scene‑text detector that expands high‑level SSD feature maps and integrates a feature‑pyramid network, using direction‑aware segment‑and‑link predictions to reconstruct arbitrarily long, rotated text, achieving higher recall and precision with real‑time speed and outperforming recent methods on ICDAR benchmarks and a menu‑recognition test.

Computer Vision · Deep Learning · ICDAR
12 min read