AIWalker
Feb 11, 2025 · Artificial Intelligence
LLMDet: LLM‑Powered Open‑Vocabulary Detector Beats Grounding DINO
LLMDet introduces a training pipeline that uses large language models to generate detailed image‑level captions and region‑level phrases. It fine‑tunes an open‑vocabulary detector on the GroundingCap‑1M dataset and achieves state‑of‑the‑art zero‑shot performance, surpassing Grounding DINO across multiple benchmarks.
GroundingCap · LLMDet · large language models
