
2023 Chinese Continuous Visual Speech Recognition Challenge (CNVSRC) Overview

The 2023 Chinese Continuous Visual Speech Recognition Challenge (CNVSRC), organized by Tsinghua University and partners, introduces the large‑scale CN‑CVS dataset and defines single‑ and multi‑speaker lip‑reading tasks. This overview covers the baseline Conformer systems, registration, data access, the evaluation metric, and the competition schedule.

DataFunTalk

Visual speech recognition (lip reading) aims to infer spoken content from mouth movements and has applications in public safety, assistive technologies, and video verification. Recent progress has been made on isolated words, but continuous large‑vocabulary Chinese recognition remains challenging due to limited data.

The CN‑CVS dataset, released by Tsinghua University in 2023, is the first large‑scale open Chinese visual speech dataset. It contains over 300 hours of video from 2,557 speakers across reading and speech scenarios, and it serves as the training set for the CNVSRC fixed track.

CNVSRC 2023 defines two tasks: T1, single‑speaker lip reading on the CNVSRC‑Single subset, and T2, multi‑speaker lip reading on the CNVSRC‑Multi subset. Each task has a fixed track, restricted to the provided data and publicly available tools, and an open track, which allows any additional resources except the test set.

Participants must register at http://cnceleb.org/competition, download the data, and submit results as the video ID followed by the transcribed text. Systems are ranked by Character Error Rate (CER), and each team may submit up to five times per track.
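For readers unfamiliar with the metric, CER is commonly computed as the character-level edit distance between hypothesis and reference, divided by the number of reference characters. The sketch below illustrates that definition; it is not the organizers' scoring script. The function names, the whitespace separator between video ID and text, the sample video IDs, and the choice to pool errors over the whole set (rather than averaging per video) are all assumptions made for illustration.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two character sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def parse_transcripts(lines):
    """Map video ID -> text from lines of the assumed form '<video-id> <text>'."""
    result = {}
    for line in lines:
        parts = line.strip().split(maxsplit=1)
        if not parts:
            continue  # skip blank lines
        text = parts[1] if len(parts) > 1 else ""
        # Assumption: whitespace inside the transcript is not scored.
        result[parts[0]] = "".join(text.split())
    return result


def corpus_cer(refs, hyps):
    """Total character edits divided by total reference characters."""
    errors = 0
    total = 0
    for video_id, ref in refs.items():
        # A missing hypothesis is scored as an empty string (all deletions).
        hyp = hyps.get(video_id, "")
        errors += edit_distance(ref, hyp)
        total += len(ref)
    return errors / total if total else 0.0


# Hypothetical video IDs and transcripts, for illustration only.
refs = parse_transcripts(["vid001 今天天气很好", "vid002 我们去公园"])
hyps = parse_transcripts(["vid001 今天天汽很好", "vid002 我们公园"])
print(f"CER = {corpus_cer(refs, hyps):.2%}")
```

In this toy example the first hypothesis has one substituted character and the second has one deleted character, so two errors are counted against eleven reference characters.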

The organizers provide baseline systems based on a Conformer architecture. Reported CER on the development set is 48.57 % for single‑speaker and 58.77 % for multi‑speaker; on the evaluation set the CERs are 48.60 % and 58.37 % respectively. The baseline code is available at https://github.com/MKT-Dataoceanai/CNVSRC2023Baseline.

Key dates: registration and data release on 2023‑09‑20, test set release on 2023‑10‑10, submissions open on 2023‑11‑01, final submission deadline on 2023‑12‑01 at 23:59, and results announced at the NCMMSC 2023 Workshop on 2023‑12‑09.

The program committee includes members from Tsinghua University, Beijing University of Posts and Telecommunications, HaiTian RuiSheng, and Voice Home.
