How Tencent’s TVQA‑C Algorithm Won the ECCV 2024 Video Quality Challenge

Tencent’s TVQA‑C video quality assessment algorithm took first place in the compressed video quality assessment track of the ECCV 2024 AIM Workshop. The entry combines a dedicated model architecture, a group‑aware training strategy, and purpose‑built loss functions, and it will soon power Tencent Cloud’s media processing services.

Tencent Architect

ECCV 2024 AIM Workshop Results

The ECCV 2024 AIM Workshop recently announced that Tencent’s TVQA‑C video quality assessment algorithm took first place in the compressed video quality assessment track. The technology will be integrated into Tencent Cloud Media Processing Service (MPS) to improve end‑to‑end media quality monitoring and analysis for enterprise users.

Why Video Quality Assessment Matters

Video accounts for the largest share of global internet traffic, which drives strong demand for efficient compression codecs. Codec performance is judged with quality metrics, both objective (computed automatically) and subjective (rated by human viewers). The MSU World Video Codec Competition, for example, combines objective metrics with a subjective voting track to obtain user‑perceived quality scores, but collecting those votes requires substantial human effort.

Accurate video quality assessment algorithms therefore help codec developers iterate faster and provide direct feedback on how encoding settings affect viewer experience across various scenarios.

About the AIM Workshop Competition

The AIM (Advances in Image Manipulation) Workshop, held as part of the European Conference on Computer Vision (ECCV) 2024, focuses on image manipulation. The compressed video quality assessment competition was co‑organized by Lomonosov Moscow State University, Yandex Research, ISP RAS Research Center for Trusted AI, MSU Institute for AI, and Julius Maximilian University of Würzburg. The dataset consists of videos compressed for earlier editions of the MSU competition, each manually annotated with a subjective quality score.

Participants were evaluated on two aspects: monotonicity, measured with the Spearman rank‑order correlation coefficient (SROCC) and the Kendall rank‑order correlation coefficient (KROCC), and accuracy, measured with the Pearson linear correlation coefficient (PLCC). The final ranking score was the average of these three metrics. Teams trained on the provided training set, submitted validation results for preliminary assessment, and finally submitted test‑set predictions for the official ranking.
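As a rough illustration of how such a ranking score can be computed, the sketch below implements the three correlations in plain Python and averages them. It assumes the metrics are computed over one flat list of predictions and subjective scores, and it uses the simple tau‑a form of KROCC; the organizers' exact protocol (for example per‑group averaging or tie handling) is not described here, and the function names are illustrative.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson linear correlation coefficient (PLCC)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """SROCC: Pearson correlation of the two rank vectors."""
    return pearson(ranks(x), ranks(y))


def kendall(x, y):
    """KROCC (tau-a): (concordant - discordant) / number of pairs."""
    s = 0
    pairs = 0
    for i, j in combinations(range(len(x)), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        s += (prod > 0) - (prod < 0)  # tied pairs contribute 0
        pairs += 1
    return s / pairs


def final_score(pred, mos):
    """Average of SROCC, KROCC, and PLCC, as in the ranking rule above."""
    return (spearman(pred, mos) + kendall(pred, mos) + pearson(pred, mos)) / 3
```

A model whose predictions preserve the order of the subjective scores but not their spacing still gets full marks on SROCC and KROCC, while PLCC drops below 1, which is why the score mixes both kinds of metric.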

Tencent TVQA‑C Takes First Place

Tencent’s TVQA‑C achieved the highest overall score, narrowly surpassing the runner‑up on the SROCC metric (by 0.0002) and leading by 0.0092 on KROCC and 0.0063 on PLCC. The total score was 0.0051 points higher than the second‑place team, securing the championship.

Model Architecture

TVQA‑C builds on the HVS‑5M backbone, which captures the spatial and temporal artifacts introduced by compression. Features are also extracted with the large‑scale Q‑Align model. A dedicated feature‑fusion module combines the features, and a fully connected layer maps the fused representation to a single quality score.

Training Strategy

Because the subjective scores come from group‑wise voting, they are directly comparable only within a voting group. The training data were therefore split into 57 groups that match the voting groups, and each training batch draws its samples from a single group, so all scores within a batch are comparable. For data augmentation, eight videos are randomly selected for each batch and their order is shuffled.
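The batching rule can be sketched in a few lines of plain Python. This is a minimal illustration rather than the team's data loader: the `(video_id, group_id, score)` tuple layout, the function name, and the choice of one batch per group per pass are all assumptions.

```python
import random
from collections import defaultdict


def group_batches(samples, batch_size=8, seed=None):
    """Yield batches whose videos all come from one voting group.

    `samples` is a list of (video_id, group_id, score) tuples
    (a hypothetical layout chosen for this sketch).
    """
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for sample in samples:
        by_group[sample[1]].append(sample)

    group_ids = list(by_group)
    rng.shuffle(group_ids)  # visit the groups in a random order
    for gid in group_ids:
        members = by_group[gid]
        k = min(batch_size, len(members))
        # rng.sample draws without replacement and returns the picks in
        # random order, covering both the selection and the shuffling.
        yield rng.sample(members, k)
```

Keeping each batch inside one group matters for the correlation and ranking losses, which compare scores across the samples of a batch and would otherwise mix votes that were never compared by the same raters.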

The loss function combines an SROCC loss, a PLCC loss, and a pairwise‑ranking loss, the last of which is intended to improve KROCC. With weighting coefficients λ1, λ2, and λ3, the final loss is:

Loss = λ1·L_SROCC + λ2·L_PLCC + λ3·L_pairwise‑ranking
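The weighted sum above can be illustrated with a forward‑only sketch in plain Python. The exact form of each term is not given here, so every choice below is an assumption: the PLCC loss is taken as 1 minus the Pearson correlation, the SROCC loss uses a sigmoid‑based soft rank as a smooth stand‑in for true ranks, the pairwise term is a hinge loss, and the λ weights default to 1. A real training run would express the same arithmetic in an autodiff framework.

```python
from math import sqrt, tanh


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (sqrt(var_x * var_y) + 1e-8)  # epsilon avoids division by zero


def sigmoid(z):
    return 0.5 * (1.0 + tanh(0.5 * z))  # numerically stable form


def soft_ranks(x, tau=1.0):
    """Smooth rank surrogate: 1 + sum over j != i of sigmoid((x_i - x_j) / tau)."""
    n = len(x)
    return [1.0 + sum(sigmoid((x[i] - x[j]) / tau) for j in range(n) if j != i)
            for i in range(n)]


def plcc_loss(pred, target):
    return 1.0 - pearson(pred, target)


def srocc_loss(pred, target, tau=1.0):
    return 1.0 - pearson(soft_ranks(pred, tau), soft_ranks(target, tau))


def pairwise_ranking_loss(pred, target, margin=0.0):
    """Hinge loss averaged over every pair whose subjective scores differ."""
    total, count = 0.0, 0
    for i in range(len(pred)):
        for j in range(i + 1, len(pred)):
            if target[i] == target[j]:
                continue
            sign = 1.0 if target[i] > target[j] else -1.0
            total += max(0.0, margin - sign * (pred[i] - pred[j]))
            count += 1
    return total / count if count else 0.0


def total_loss(pred, target, lambdas=(1.0, 1.0, 1.0)):
    l1, l2, l3 = lambdas  # hypothetical weights; the actual values are not given
    return (l1 * srocc_loss(pred, target)
            + l2 * plcc_loss(pred, target)
            + l3 * pairwise_ranking_loss(pred, target))
```

The pairwise term penalizes each wrongly ordered pair directly, which mirrors how KROCC counts concordant and discordant pairs.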

Training was performed on a single A100 GPU using the AdamW optimizer with a cosine‑annealing learning rate schedule (5e‑4 → 5e‑6). To mitigate instability introduced by the pairwise‑ranking loss, an exponential moving average (EMA) of model parameters was applied.
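The schedule and the parameter averaging follow standard formulas, sketched below in plain Python. The learning‑rate endpoints come from the text; the step count, the EMA decay of 0.999, and the dict‑of‑scalars parameter layout are assumptions made for illustration.

```python
from math import cos, pi


def cosine_lr(step, total_steps, lr_max=5e-4, lr_min=5e-6):
    """Cosine annealing from lr_max at step 0 down to lr_min at total_steps."""
    progress = min(step, total_steps) / total_steps
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + cos(pi * progress))


class EMA:
    """Exponential moving average over a dict of scalar parameters."""

    def __init__(self, params, decay=0.999):  # decay value is an assumption
        self.decay = decay
        self.shadow = dict(params)  # averaged copy, used for evaluation

    def update(self, params):
        d = self.decay
        for name, value in params.items():
            self.shadow[name] = d * self.shadow[name] + (1.0 - d) * value
```

After each optimizer step, `update` would be called with the current parameters, and the averaged copy in `shadow` would be used for validation and the final submission. Averaging smooths out step‑to‑step swings in the weights, which is the role EMA plays against the instability of the pairwise‑ranking loss.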

Future Outlook

Tencent Cloud will embed the TVQA‑C algorithm into its Media Processing (MPS) service, giving developers a tool for media quality inspection in offline, live‑streaming, and other scenarios. Combined with MPS’s six core capabilities (format diagnosis, content inspection, no‑reference scoring, high orchestration, flexible deployment, and customizability), the solution aims to deliver better quality of service (QoS) and quality of experience (QoE) for end users.
