AntTech
Feb 5, 2026 · Artificial Intelligence
How Triple Alignment and Rationale Generation Supercharge Knowledge‑Based VQA
This paper presents Triple Alignment with Rationale Generation (TAG), a lightweight, efficient framework that recasts knowledge‑based visual question answering (KVQA) as a contrastive learning task, sharply reducing the number of trainable parameters while achieving state‑of‑the‑art performance on major KVQA benchmarks.
CLIP · Lightweight Model · VQA
