How Tencent’s TEG Shannon Lab Dominated the NTIRE 2025 UGC Video Enhancement Challenge
Tencent TEG Shannon Lab won the NTIRE 2025 UGC Video Enhancement competition with a progressive training framework that combines adaptive color enhancement, high‑speed denoising, and temporal stability under bitrate constraints. The solution achieved top subjective scores, significant inference speed‑ups, and successful INT8 quantization for real‑time deployment.
1. Competition Overview
NTIRE (New Trends in Image Restoration and Enhancement) is a leading international challenge series held at CVPR. The NTIRE 2025 UGC Video Enhancement track drew teams from Tencent, ByteDance, Alibaba, and other companies. Tencent TEG Shannon Lab took first place with a self‑developed video AI quality‑enhancement algorithm, which has since been deployed to improve video clarity across Tencent services.
2. Algorithm Solution
2.1 Overall Framework
The team proposed a progressive‑training video‑enhancement framework that decomposes the task into three sub‑problems: color enhancement (Stage 1), denoising (Stage 2), and temporal stability with bitrate constraint (Stage 3). Lightweight expert models are plug‑and‑play and can be used independently.
2.2 Adaptive Color Enhancement
Stage 1 uses a CLUT network based on MobileNetV3 to predict a 64×64×64 lookup table for content‑aware color boosting. The module is plug‑and‑play, intensity‑controllable, and does not depend on the rest of the pipeline.
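The post does not include the CLUT code, but applying a predicted 3D LUT to an image is standard trilinear interpolation over the color cube. A minimal NumPy sketch, using a 64³ identity LUT purely to illustrate the lookup (the real network predicts the LUT from content):

```python
import numpy as np

def apply_3d_lut(img, lut):
    """Apply a 3D color LUT (S x S x S x 3) to an RGB image in [0, 1]
    using trilinear interpolation over the 8 surrounding LUT entries."""
    S = lut.shape[0]
    idx = img * (S - 1)                         # continuous LUT coordinates
    lo = np.floor(idx).astype(int)
    hi = np.minimum(lo + 1, S - 1)
    f = idx - lo                                # fractional part per channel
    r0, g0, b0 = lo[..., 0], lo[..., 1], lo[..., 2]
    r1, g1, b1 = hi[..., 0], hi[..., 1], hi[..., 2]
    fr, fg, fb = f[..., 0:1], f[..., 1:2], f[..., 2:3]
    # Gather the 8 corners of the enclosing LUT cell and blend.
    c000 = lut[r0, g0, b0]; c100 = lut[r1, g0, b0]
    c010 = lut[r0, g1, b0]; c110 = lut[r1, g1, b0]
    c001 = lut[r0, g0, b1]; c101 = lut[r1, g0, b1]
    c011 = lut[r0, g1, b1]; c111 = lut[r1, g1, b1]
    c00 = c000 * (1 - fr) + c100 * fr
    c10 = c010 * (1 - fr) + c110 * fr
    c01 = c001 * (1 - fr) + c101 * fr
    c11 = c011 * (1 - fr) + c111 * fr
    c0 = c00 * (1 - fg) + c10 * fg
    c1 = c01 * (1 - fg) + c11 * fg
    return c0 * (1 - fb) + c1 * fb

# Identity LUT: each entry maps a color to itself, so output == input.
S = 64
axis = np.linspace(0.0, 1.0, S)
identity = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1)

img = np.random.rand(4, 4, 3)
out = apply_3d_lut(img, identity)
```

Intensity control then reduces to blending, e.g. `(1 - alpha) * img + alpha * apply_3d_lut(img, lut)`, which is one way the "intensity‑controllable" property can be realized.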
2.3 High‑Speed Denoising
Stage 2 replaces the standard U‑Net with a RepVGG‑based denoising network, achieving ~300 FPS on an NVIDIA TITAN RTX while removing sensor noise and compression artifacts.
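The speed of a RepVGG‑style backbone comes from structural re‑parameterization: the multi‑branch training graph (3×3 conv + 1×1 conv + identity) collapses into a single 3×3 convolution at inference time. A NumPy sketch of the fold (biases and BatchNorm omitted for brevity; this illustrates the technique, not the team's exact code):

```python
import numpy as np

def reparam_repvgg(k3, k1, has_identity, channels):
    """Fold RepVGG's training-time branches into one 3x3 kernel:
    the 1x1 kernel lands on the centre tap, and the identity branch
    becomes a centred delta kernel on the diagonal."""
    fused = k3.copy()                           # (C_out, C_in, 3, 3)
    fused[:, :, 1, 1] += k1[:, :, 0, 0]
    if has_identity:
        for c in range(channels):
            fused[c, c, 1, 1] += 1.0
    return fused

def conv2d(x, k):
    """Naive 'same' 2D cross-correlation, used only for verification."""
    C_out, C_in, kh, kw = k.shape
    H, W = x.shape[1], x.shape[2]
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((C_out, H, W))
    for co in range(C_out):
        for ci in range(C_in):
            for i in range(kh):
                for j in range(kw):
                    out[co] += k[co, ci, i, j] * xp[ci, i:i + H, j:j + W]
    return out

rng = np.random.default_rng(0)
C = 2
k3 = rng.standard_normal((C, C, 3, 3))
k1 = rng.standard_normal((C, C, 1, 1))
x = rng.standard_normal((C, 5, 5))

# Training-time graph: 3x3 branch + 1x1 branch + identity.
multi_branch = conv2d(x, k3) + np.einsum('oi,ihw->ohw', k1[:, :, 0, 0], x) + x
# Inference-time graph: one fused 3x3 convolution.
fused_out = conv2d(x, reparam_repvgg(k3, k1, True, C))
```

The fused single‑branch network computes exactly the same function as the multi‑branch one, which is why the ~300 FPS figure costs no accuracy relative to the trained model.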
2.4 Temporal Stability & Redundancy Removal
Stage 3 extracts frame features with RepVGG, aligns them using RAFT optical flow, and applies a texture‑enhancement network. A joint loss combines AI‑encoder bitrate constraint and temporal coherence loss, preserving quality at 3000 kbps.
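A minimal sketch of the Stage 3 idea, assuming pixel‑space backward warping and a hinge‑style bitrate penalty. The real pipeline warps RAFT‑aligned deep features, not pixels, and constrains an AI encoder's estimated bitrate; `w_temp`, `w_rate`, and the exact loss shapes here are illustrative assumptions:

```python
import numpy as np

def warp(frame, flow):
    """Backward-warp a (H, W) frame by a dense (H, W, 2) flow field
    with bilinear sampling and clamped borders."""
    H, W = frame.shape
    ys, xs = np.mgrid[0:H, 0:W].astype(float)
    sx = np.clip(xs + flow[..., 0], 0, W - 1)
    sy = np.clip(ys + flow[..., 1], 0, H - 1)
    x0 = np.floor(sx).astype(int); x1 = np.minimum(x0 + 1, W - 1)
    y0 = np.floor(sy).astype(int); y1 = np.minimum(y0 + 1, H - 1)
    wx = sx - x0; wy = sy - y0
    top = frame[y0, x0] * (1 - wx) + frame[y0, x1] * wx
    bot = frame[y1, x0] * (1 - wx) + frame[y1, x1] * wx
    return top * (1 - wy) + bot * wy

def joint_loss(cur, prev, flow, bits, target_bits, w_temp=1.0, w_rate=0.1):
    """Joint objective: temporal coherence between the current output
    and the flow-aligned previous output, plus a bitrate penalty that
    fires only when the estimated bitrate overshoots the target."""
    temporal = np.mean((cur - warp(prev, flow)) ** 2)
    rate = max(0.0, bits - target_bits)
    return w_temp * temporal + w_rate * rate

H, W = 6, 8
prev = np.arange(H * W, dtype=float).reshape(H, W)
zero_flow = np.zeros((H, W, 2))
```

With perfectly aligned outputs and a bitrate under the 3000 kbps budget, both terms vanish, so the loss only pushes the network when flicker or bitrate overshoot actually occurs.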
3. Competition Results
The proposed method achieved the highest subjective scores on both the public and private test sets, with an 81 % win rate over the raw input. It secured the overall championship as well as two additional titles in the real‑time track.
4. Hardware‑Aware Inference Optimization
To meet real‑time requirements, the team performed assembly‑level operator optimization that cut memory‑bandwidth consumption by 50 %. They introduced customized layer fusion for pooling/resize operators, a REG BN Expand technique that reuses registers, and a space‑fusion implicit GEMM. JIT compilation then selects optimal kernel parameters automatically for each hardware target.
These optimizations yielded a 2.51× speed‑up over CUTLASS and 3.32× over cuDNN, with overall inference time reduced from 5.370 ms to 4.471 ms (≈19.8 % improvement).
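The team's kernels are proprietary and hand‑tuned, but the textbook instance of the kind of layer fusion described above is folding BatchNorm into the preceding convolution, turning two memory‑bound passes into one. A sketch (demonstrated on a 1×1 convolution expressed as a matrix product; not the team's actual kernels):

```python
import numpy as np

def fold_bn(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold inference-time BatchNorm (gamma, beta, running mean/var)
    into the preceding conv's weights and bias, channel-wise."""
    scale = gamma / np.sqrt(var + eps)
    shape = (-1,) + (1,) * (w.ndim - 1)         # broadcast over kernel dims
    return w * scale.reshape(shape), (b - mean) * scale + beta

rng = np.random.default_rng(1)
w = rng.standard_normal((3, 4))                 # 1x1 conv as a matrix
b = rng.standard_normal(3)
gamma, beta = rng.standard_normal(3), rng.standard_normal(3)
mean, var = rng.standard_normal(3), rng.random(3) + 0.5
x = rng.standard_normal((4, 5))

# Unfused: conv then BatchNorm, two passes over the activations.
y = w @ x + b[:, None]
ref = gamma[:, None] * (y - mean[:, None]) / np.sqrt(var[:, None] + 1e-5) + beta[:, None]

# Fused: one conv with rescaled weights, identical output.
wf, bf = fold_bn(w, b, gamma, beta, mean, var)
fused = wf @ x + bf[:, None]
```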
5. Quantization & Model Deployment
INT8 quantization with knowledge‑distillation, local adaptive distillation, and hierarchical feature distillation preserved 99 % of the full‑precision accuracy while doubling throughput. Additional techniques such as INT8 data‑flow alignment and REG BN Expand further accelerated the model.
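Symmetric per‑tensor INT8 quantization, the usual starting point for this kind of deployment, can be sketched as follows. The `distill_loss` shows the shape of a simple feature‑matching term; the paper's local adaptive and hierarchical variants refine where and how such a term is applied, so treat this as an assumption‑laden illustration:

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor INT8 quantization: scale from the max
    absolute value, round into [-127, 127]."""
    amax = np.abs(x).max()
    scale = amax / 127.0 if amax > 0 else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map INT8 codes back to float for accuracy checks."""
    return q.astype(np.float32) * scale

def distill_loss(student_feat, teacher_feat):
    """Feature-distillation term: push the INT8 student's activations
    toward the FP32 teacher's (a hierarchical scheme sums this over
    several layers)."""
    return np.mean((student_feat - teacher_feat) ** 2)

x = np.linspace(-2.0, 2.0, 101)
q, s = quantize_int8(x)
```

With symmetric rounding, the per‑element reconstruction error is bounded by half the scale, which is why a distillation objective that nudges the student back toward the FP32 teacher can recover most of the remaining gap.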
6. Future Directions
The team plans to explore diffusion‑based generative enhancement, integrating Control‑Net, LoRA, and large‑scale data to improve stability and fidelity of video restoration.
Paper: NTIRE 2025 UGC Video Enhancement Challenge
Competition: Codabench NTIRE 2025
Tencent Architect