How Tencent’s TEG Shannon Lab Dominated the NTIRE 2025 UGC Video Enhancement Challenge
Tencent TEG Shannon Lab won the NTIRE 2025 UGC Video Enhancement competition with a progressive training framework that combines adaptive color enhancement, high‑speed denoising, and temporal stability under bitrate constraints. The solution achieved top subjective scores, significant inference speed‑ups, and successful INT8 quantization for real‑time deployment.
1. Competition Overview
NTIRE (New Trends in Image Restoration and Enhancement) is a leading international challenge series held at CVPR. The NTIRE 2025 UGC Video Enhancement track drew teams from Tencent, ByteDance, Alibaba, and other companies. Tencent TEG Shannon Lab took first place with a self‑developed video AI quality‑enhancement algorithm, which has since been deployed to improve video clarity across Tencent services.
2. Algorithm Solution
2.1 Overall Framework
The team proposed a progressive‑training video‑enhancement framework that decomposes the task into three sub‑problems: color enhancement (Stage 1), denoising (Stage 2), and temporal stability with bitrate constraint (Stage 3). Lightweight expert models are plug‑and‑play and can be used independently.
2.2 Adaptive Color Enhancement
Stage 1 uses a CLUT network based on MobileNetV3 to predict a 64×64×64 lookup table for content‑aware color boosting. The module is plug‑and‑play, intensity‑controllable, and does not depend on the rest of the pipeline.
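The post does not include the CLUT code, but applying a predicted 3D LUT to an image is standard trilinear interpolation over the color cube. A minimal NumPy sketch, using a 64³ identity LUT purely to illustrate the lookup (the real network predicts the LUT from content):

```python
import numpy as np

def apply_3d_lut(img, lut):
    """Apply a 3D color LUT (S x S x S x 3) to an RGB image in [0, 1]
    using trilinear interpolation over the 8 surrounding LUT entries."""
    S = lut.shape[0]
    idx = img * (S - 1)                         # continuous LUT coordinates
    lo = np.floor(idx).astype(int)
    hi = np.minimum(lo + 1, S - 1)
    f = idx - lo                                # fractional part per channel
    r0, g0, b0 = lo[..., 0], lo[..., 1], lo[..., 2]
    r1, g1, b1 = hi[..., 0], hi[..., 1], hi[..., 2]
    fr, fg, fb = f[..., 0:1], f[..., 1:2], f[..., 2:3]
    # Gather the 8 corners of the enclosing LUT cell and blend.
    c000 = lut[r0, g0, b0]; c100 = lut[r1, g0, b0]
    c010 = lut[r0, g1, b0]; c110 = lut[r1, g1, b0]
    c001 = lut[r0, g0, b1]; c101 = lut[r1, g0, b1]
    c011 = lut[r0, g1, b1]; c111 = lut[r1, g1, b1]
    c00 = c000 * (1 - fr) + c100 * fr
    c10 = c010 * (1 - fr) + c110 * fr
    c01 = c001 * (1 - fr) + c101 * fr
    c11 = c011 * (1 - fr) + c111 * fr
    c0 = c00 * (1 - fg) + c10 * fg
    c1 = c01 * (1 - fg) + c11 * fg
    return c0 * (1 - fb) + c1 * fb

# Identity LUT: each entry maps a color to itself, so output == input.
S = 64
axis = np.linspace(0.0, 1.0, S)
identity = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1)

img = np.random.rand(4, 4, 3)
out = apply_3d_lut(img, identity)
```

Intensity control then reduces to blending, e.g. `(1 - alpha) * img + alpha * apply_3d_lut(img, lut)`, which is one way the "intensity‑controllable" property can be realized.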
2.3 High‑Speed Denoising
Stage 2 replaces the standard U‑Net with a RepVGG‑based denoising network, achieving ~300 FPS on an NVIDIA TITAN RTX while removing sensor noise and compression artifacts.
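The speed of a RepVGG‑style backbone comes from structural re‑parameterization: the multi‑branch training graph (3×3 conv + 1×1 conv + identity) collapses into a single 3×3 convolution at inference time. A NumPy sketch of the fold (biases and BatchNorm omitted for brevity; this illustrates the technique, not the team's exact code):

```python
import numpy as np

def reparam_repvgg(k3, k1, has_identity, channels):
    """Fold RepVGG's training-time branches into one 3x3 kernel:
    the 1x1 kernel lands on the centre tap, and the identity branch
    becomes a centred delta kernel on the diagonal."""
    fused = k3.copy()                           # (C_out, C_in, 3, 3)
    fused[:, :, 1, 1] += k1[:, :, 0, 0]
    if has_identity:
        for c in range(channels):
            fused[c, c, 1, 1] += 1.0
    return fused

def conv2d(x, k):
    """Naive 'same' 2D cross-correlation, used only for verification."""
    C_out, C_in, kh, kw = k.shape
    H, W = x.shape[1], x.shape[2]
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((C_out, H, W))
    for co in range(C_out):
        for ci in range(C_in):
            for i in range(kh):
                for j in range(kw):
                    out[co] += k[co, ci, i, j] * xp[ci, i:i + H, j:j + W]
    return out

rng = np.random.default_rng(0)
C = 2
k3 = rng.standard_normal((C, C, 3, 3))
k1 = rng.standard_normal((C, C, 1, 1))
x = rng.standard_normal((C, 5, 5))

# Training-time graph: 3x3 branch + 1x1 branch + identity.
multi_branch = conv2d(x, k3) + np.einsum('oi,ihw->ohw', k1[:, :, 0, 0], x) + x
# Inference-time graph: one fused 3x3 convolution.
fused_out = conv2d(x, reparam_repvgg(k3, k1, True, C))
```

The fused single‑branch network computes exactly the same function as the multi‑branch one, which is why the ~300 FPS figure costs no accuracy relative to the trained model.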
2.4 Temporal Stability & Redundancy Removal
Stage 3 extracts frame features with RepVGG, aligns them using RAFT optical flow, and applies a texture‑enhancement network. A joint loss combines AI‑encoder bitrate constraint and temporal coherence loss, preserving quality at 3000 kbps.
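A minimal sketch of the Stage 3 idea, assuming pixel‑space backward warping and a hinge‑style bitrate penalty. The real pipeline warps RAFT‑aligned deep features, not pixels, and constrains an AI encoder's estimated bitrate; `w_temp`, `w_rate`, and the exact loss shapes here are illustrative assumptions:

```python
import numpy as np

def warp(frame, flow):
    """Backward-warp a (H, W) frame by a dense (H, W, 2) flow field
    with bilinear sampling and clamped borders."""
    H, W = frame.shape
    ys, xs = np.mgrid[0:H, 0:W].astype(float)
    sx = np.clip(xs + flow[..., 0], 0, W - 1)
    sy = np.clip(ys + flow[..., 1], 0, H - 1)
    x0 = np.floor(sx).astype(int); x1 = np.minimum(x0 + 1, W - 1)
    y0 = np.floor(sy).astype(int); y1 = np.minimum(y0 + 1, H - 1)
    wx = sx - x0; wy = sy - y0
    top = frame[y0, x0] * (1 - wx) + frame[y0, x1] * wx
    bot = frame[y1, x0] * (1 - wx) + frame[y1, x1] * wx
    return top * (1 - wy) + bot * wy

def joint_loss(cur, prev, flow, bits, target_bits, w_temp=1.0, w_rate=0.1):
    """Joint objective: temporal coherence between the current output
    and the flow-aligned previous output, plus a bitrate penalty that
    fires only when the estimated bitrate overshoots the target."""
    temporal = np.mean((cur - warp(prev, flow)) ** 2)
    rate = max(0.0, bits - target_bits)
    return w_temp * temporal + w_rate * rate

H, W = 6, 8
prev = np.arange(H * W, dtype=float).reshape(H, W)
zero_flow = np.zeros((H, W, 2))
```

With perfectly aligned outputs and a bitrate under the 3000 kbps budget, both terms vanish, so the loss only pushes the network when flicker or bitrate overshoot actually occurs.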
3. Competition Results
The proposed method achieved the highest subjective scores on both the public and private test sets, with an 81 % win rate over the raw input. It secured the overall championship as well as two additional titles in the real‑time track.
4. Hardware‑Aware Inference Optimization
To meet real‑time requirements, the team performed assembly‑level operator optimization that cut memory‑bandwidth consumption by 50 %. They introduced customized layer fusion for pooling/resize operators, a REG BN Expand technique that reuses registers, and a space‑fusion implicit GEMM. JIT compilation then selects optimal kernel parameters automatically for each hardware target.
These optimizations yielded a 2.51× speed‑up over CUTLASS and 3.32× over cuDNN, with overall inference time reduced from 5.370 ms to 4.471 ms (≈19.8 % improvement).
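The team's kernels are proprietary and hand‑tuned, but the textbook instance of the kind of layer fusion described above is folding BatchNorm into the preceding convolution, turning two memory‑bound passes into one. A sketch (demonstrated on a 1×1 convolution expressed as a matrix product; not the team's actual kernels):

```python
import numpy as np

def fold_bn(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold inference-time BatchNorm (gamma, beta, running mean/var)
    into the preceding conv's weights and bias, channel-wise."""
    scale = gamma / np.sqrt(var + eps)
    shape = (-1,) + (1,) * (w.ndim - 1)         # broadcast over kernel dims
    return w * scale.reshape(shape), (b - mean) * scale + beta

rng = np.random.default_rng(1)
w = rng.standard_normal((3, 4))                 # 1x1 conv as a matrix
b = rng.standard_normal(3)
gamma, beta = rng.standard_normal(3), rng.standard_normal(3)
mean, var = rng.standard_normal(3), rng.random(3) + 0.5
x = rng.standard_normal((4, 5))

# Unfused: conv then BatchNorm, two passes over the activations.
y = w @ x + b[:, None]
ref = gamma[:, None] * (y - mean[:, None]) / np.sqrt(var[:, None] + 1e-5) + beta[:, None]

# Fused: one conv with rescaled weights, identical output.
wf, bf = fold_bn(w, b, gamma, beta, mean, var)
fused = wf @ x + bf[:, None]
```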
5. Quantization & Model Deployment
INT8 quantization with knowledge‑distillation, local adaptive distillation, and hierarchical feature distillation preserved 99 % of the full‑precision accuracy while doubling throughput. Additional techniques such as INT8 data‑flow alignment and REG BN Expand further accelerated the model.
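Symmetric per‑tensor INT8 quantization, the usual starting point for this kind of deployment, can be sketched as follows. The `distill_loss` shows the shape of a simple feature‑matching term; the paper's local adaptive and hierarchical variants refine where and how such a term is applied, so treat this as an assumption‑laden illustration:

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor INT8 quantization: scale from the max
    absolute value, round into [-127, 127]."""
    amax = np.abs(x).max()
    scale = amax / 127.0 if amax > 0 else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map INT8 codes back to float for accuracy checks."""
    return q.astype(np.float32) * scale

def distill_loss(student_feat, teacher_feat):
    """Feature-distillation term: push the INT8 student's activations
    toward the FP32 teacher's (a hierarchical scheme sums this over
    several layers)."""
    return np.mean((student_feat - teacher_feat) ** 2)

x = np.linspace(-2.0, 2.0, 101)
q, s = quantize_int8(x)
```

With symmetric rounding, the per‑element reconstruction error is bounded by half the scale, which is why a distillation objective that nudges the student back toward the FP32 teacher can recover most of the remaining gap.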
6. Future Directions
The team plans to explore diffusion‑based generative enhancement, integrating Control‑Net, LoRA, and large‑scale data to improve stability and fidelity of video restoration.
Paper: NTIRE 2025 UGC Video Enhancement Challenge
Competition: Codabench NTIRE 2025
Tencent Architect