AIWalker
Mar 16, 2026 · Artificial Intelligence

DETR Drops Hungarian Matching: Double Training Speed, +4.2 AP on Large Objects

Beyond-Hungarian replaces the costly Hungarian assignment in DETR with a differentiable, query‑free matching scheme, introducing a GT‑Probe module and a dual‑loss framework. It halves training latency and raises large‑object AP by 4.2 points; the article also details trade‑offs, ablations, and future challenges.

DETR · GT-Probe · Hungarian matching
0 likes · 14 min read
AIWalker
Mar 11, 2026 · Artificial Intelligence

Why 90% of DETR Queries Stay Idle and How PaQ‑DETR Boosts mAP by 4.2%

The article dissects the query‑activation imbalance in DETR‑based detectors, explains PaQ‑DETR’s pattern‑sharing and quality‑aware assignment mechanisms, and shows how these jointly raise detection mAP by up to 4.2% on COCO with less than 5% extra FLOPs.

DETR · PaQ-DETR · object detection
0 likes · 15 min read
AIWalker
Mar 9, 2026 · Artificial Intelligence

How EFSI‑DETR Achieves 188 FPS and Boosts Small‑Object Detection Accuracy by 5.8%

The article dissects EFSI‑DETR, a UAV small‑object detector that combines simulated frequency processing with dynamic semantic enhancement to overcome pixel scarcity, static fusion, and ignored frequency cues, delivering 188 FPS and a 5.8% APₛ gain on VisDrone while remaining lightweight.

DETR · Real-time Inference · UAV vision
0 likes · 16 min read
Bilibili Tech
Nov 26, 2024 · Artificial Intelligence

DNTextSpotter: Arbitrary-Shaped Scene Text Spotting via Improved Denoising Training

DNTextSpotter is a DETR-based model for arbitrary-shaped scene text spotting. Its improved denoising training scheme adds noise to Bézier control points and employs mask‑sliding character queries, achieving significant benchmark gains without extra inference cost and enabling robust text recognition in challenging environments.

DETR · arbitrary-shaped text · denoising training
0 likes · 13 min read
Kuaishou Tech
Jan 5, 2022 · Artificial Intelligence

How a New Bilingual Video Text Dataset and Transformer Spotter Advance Video OCR

This article reviews the NeurIPS 2021 paper introducing BOVText, a large‑scale bilingual video‑text dataset with over 2,000 videos and 1.75 million frames, and describes its transformer‑based end‑to‑end video text spotter that integrates EAST encoding into DETR, covering dataset collection, annotation, architecture, and experimental results.

BOVText · DETR · Transformer
0 likes · 12 min read
TiPaiPai Technical Team
Jun 11, 2021 · Artificial Intelligence

How Transformers Revolutionize Vision: From DETR to GCNet

This article explores how Transformer architectures, originally designed for NLP, are adapted for visual tasks, detailing pioneering models such as DETR, CBAM, NLNet, SENet, and GCNet, and explains their structures, attention mechanisms, advantages, and experimental findings for image processing.

Attention Mechanisms · DETR · Self-attention
0 likes · 13 min read