How Ant Group Dominated the 2025 DCASE Audio Question Answering Challenge

This article introduces the 2025 DCASE Audio Question Answering (AQA) track and its technical challenges, describes Ant Group's three-stage pipeline covering data, model, and training, reports the performance gains of the Qwen2-Audio-R1-8B and Kimi-Audio-SFT-12B models, and closes with future research directions.

AntTech

AQA Track Introduction

The 2025 DCASE Challenge added a fifth track, Audio Question Answering (AQA). It focuses on interactive audio understanding: models must answer questions about diverse audio inputs.
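To make the task shape concrete, here is a minimal sketch of what a single AQA example and a scoring function might look like. The article does not specify the data format or the metric, so the multiple-choice layout, the field names (`audio_path`, `question`, `choices`, `answer`), and the accuracy-based scoring are all assumptions made for illustration, not the official DCASE schema.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class AQAItem:
    """One hypothetical AQA example (assumed multiple-choice layout)."""
    audio_path: str            # illustrative path, not a real dataset file
    question: str              # question about the audio clip
    choices: Tuple[str, ...]   # candidate answers shown to the model
    answer: str                # reference answer; must be one of `choices`

    def __post_init__(self) -> None:
        if self.answer not in self.choices:
            raise ValueError("answer must be one of the choices")


def accuracy(items: Sequence[AQAItem], predictions: Sequence[str]) -> float:
    """Fraction of items whose predicted choice equals the reference answer."""
    if len(items) != len(predictions):
        raise ValueError("items and predictions must have the same length")
    if not items:
        return 0.0
    correct = sum(pred == item.answer for item, pred in zip(items, predictions))
    return correct / len(items)


# Hypothetical usage with made-up items and predictions.
items = [
    AQAItem("clip_a.wav", "Which animal is heard?", ("dog", "cat"), "dog"),
    AQAItem("clip_b.wav", "Is the speaker indoors?", ("yes", "no"), "no"),
]
score = accuracy(items, ["dog", "yes"])
```

Under these assumptions, a model's output for each clip is reduced to one of the listed choices before scoring, which keeps evaluation independent of how the answer was generated.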

Figure: AQA task illustration
Tags: model training, Audio Question Answering, DCASE