How the L3DAS22 Challenge Advances Deep Learning for 3D Audio Signal Processing

The L3DAS22 challenge, co‑hosted by Kuaishou Audio and Sapienza University of Rome, drew nearly 50 academic and industry teams to benchmark deep‑learning methods for 3D audio signal processing. Its two tasks cover multi‑channel speech enhancement and sound‑source detection and localization, with results presented at ICASSP 2022.

Kuaishou Audio & Video Technology

Overview

The Kuaishou Audio Technology Department and Italy's Sapienza University of Rome recently announced the results of L3DAS22 (Machine Learning for 3D Audio Signal Processing), a deep‑learning challenge on 3D audio signal processing held as part of the international audio conference ICASSP 2022. Nearly 50 academic and industry teams from around the world took part, and the outcomes will be presented in a special session at ICASSP 2022 in May.

Challenge Tasks

Task 1: Multi‑channel 3D speech enhancement, focusing on real‑time speech enhancement in office scenarios.

Task 2: Specific sound‑source detection and localization in real‑world scenes, targeting applications such as autonomous driving and surveillance.

Results

After nearly three months of intense competition, the results were released. In Task 1, the top three places went to Carnegie Mellon University, Baidu, and Tencent; in Task 2, they went to the Institute of Acoustics of the Chinese Academy of Sciences, Chongqing University of Posts and Telecommunications, and Singapore's ForteMedia. Kuaishou awarded substantial prizes to the top two teams in each task.

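The challenge dataset simulated more than 40,000 3D environments and provided two sets of Ambisonics‑format 3D recordings. Speech‑enhancement submissions were evaluated with two metrics: Short‑Time Objective Intelligibility (STOI), which estimates how intelligible the enhanced speech is, and Word Error Rate (WER), which measures how many words a speech recognizer gets wrong on the enhanced audio.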
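To make the two metrics concrete, here is a minimal Python sketch of how WER can be computed and combined with a STOI value into a single score. The function names `word_error_rate` and `task1_score` are hypothetical, and the combination rule (STOI + (1 − WER)) / 2 with WER clipped to [0, 1] is an assumption about how the challenge aggregates the two metrics, not something stated in this article. STOI itself is taken as a precomputed input rather than computed here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def task1_score(stoi: float, wer: float) -> float:
    """Assumed combined score: average of STOI and (1 - WER).

    WER can exceed 1 when the hypothesis has many insertions, so it is
    clipped to [0, 1] here (an assumption) to keep the score in [0, 1].
    """
    clipped_wer = min(max(wer, 0.0), 1.0)
    return (stoi + (1.0 - clipped_wer)) / 2.0


if __name__ == "__main__":
    # One substituted word out of six reference words.
    wer = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
    # 0.85 is a made-up STOI value used only for illustration.
    print(task1_score(stoi=0.85, wer=wer))
```

Under this assumed rule, a higher score is better: an enhanced signal must be both intelligible (high STOI) and easy for a recognizer to transcribe (low WER) to rank well.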

Task 1: Speech Enhancement Rankings

Task 2: Source Detection and Localization Rankings

Impact and Team Highlights

Kuaishou Audio's team of experts continuously explores deep‑learning‑based audio signal processing. Its work spans real‑time speech communication, audio effects and post‑processing, audio content understanding, coding, and hardware. The team has published papers in top venues such as IEEE ICASSP, Interspeech, and IEEE/ACM TASLP, and has repeatedly won audio‑related challenges.
