Tagged articles
1 article
Weekly Large Model Application
Mar 30, 2026 · Artificial Intelligence

Inside Kimi-Audio: A Unified Large Audio Model Covering ASR, AQA, TTS and More

Kimi-Audio, a general‑purpose audio foundation model from Moonshot AI, integrates ASR, audio QA, automatic audio captioning, emotion classification and end‑to‑end speech dialogue within a single framework. This article details its mixed‑audio input, MiMo‑Transformer core and efficient synthesis pipeline, then weighs its architectural strengths, limitations and suitable application scenarios.

ASR · Audio LLM · BigVGAN
0 likes · 9 min read