How Kuaishou Boosts Short‑Video Quality with AI‑Driven Encoding and Enhancement
This article explores Kuaishou’s server‑side video pipeline, detailing why videos are re‑encoded, how AI‑based enhancement (De‑Art, SR, Deblur) and perceptual encoding techniques (CAQ, CARDO) improve subjective quality, reduce bitrate, and lower processing costs, while presenting algorithmic logic and real‑world results.
Why Server‑Side Re‑Encoding?
Because mobile devices have limited processing power, they cannot compress video efficiently at upload time, so uploaded files are large. Delivering these files directly would cause buffering, a poor playback experience, and higher bandwidth consumption, so the server re‑encodes each video into a more efficient format.
Why Video Enhancement?
Uploaded videos vary in quality; many have been compressed multiple times or captured with low‑end equipment, leading to artifacts, low resolution, or blur. Enhancing these videos on the server restores visual fidelity and ensures a consistent user experience.
Validating Quality Improvements
Quality gains are validated with subjective blind tests: multiple reviewers compare each video before and after optimization without knowing which version is which. An algorithm is considered effective when the optimized version is selected more often and reviewers find no noticeable bad cases.
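As a rough illustration of how such a blind test could be tallied, the sketch below computes selection rates from reviewer votes. The vote labels, the `min_margin` threshold, and both function names are assumptions made for this example; the article does not state Kuaishou's actual acceptance criteria.

```python
from collections import Counter

LABELS = ("enhanced", "original", "same")  # hypothetical vote labels


def selection_rates(votes):
    """Return the share of votes for each label."""
    counts = Counter(votes)
    total = sum(counts[label] for label in LABELS)
    if total == 0:
        raise ValueError("no votes recorded")
    return {label: counts[label] / total for label in LABELS}


def passes_blind_test(votes, bad_case_count, min_margin=0.1):
    """Accept only if the enhanced version wins clearly and no bad case was reported.

    min_margin is an illustrative threshold, not a value from the article.
    """
    rates = selection_rates(votes)
    wins_clearly = rates["enhanced"] - rates["original"] >= min_margin
    return bad_case_count == 0 and wins_clearly


votes = ["enhanced"] * 6 + ["original"] * 2 + ["same"] * 2
print(selection_rates(votes))
print(passes_blind_test(votes, bad_case_count=0))
```

A single reported bad case blocks acceptance in this sketch, which mirrors the requirement that the higher selection rate come without noticeable bad cases.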
AI Video Enhancement Pipeline
Kuaishou’s self‑developed analysis module, Capella, assesses the quality of each uploaded video and decides whether to trigger AI enhancement. The Atlas‑AI trigger strategy considers the video’s resolution, its source, and a no‑reference quality score covering compression level, blur, and noise.
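A trigger strategy of this kind can be sketched as a small rule set. Everything below is hypothetical: the `VideoAnalysis` fields, the 0 to 1 score scale, the thresholds, the source labels, and the order in which algorithms are applied are assumptions for illustration, not the behavior of Capella or Atlas‑AI.

```python
from dataclasses import dataclass


@dataclass
class VideoAnalysis:
    width: int
    height: int
    source: str         # hypothetical labels, e.g. "camera", "re-upload"
    compression: float  # assumed scale: 0 (clean) to 1 (heavily compressed)
    blur: float         # assumed scale: 0 (sharp) to 1 (very blurry)
    noise: float        # reported by the score; no listed algorithm targets it


def choose_enhancements(v, sr_below_short_side=720, threshold=0.5,
                        skip_sources=frozenset({"professional"})):
    """Return the enhancement steps to run, in an assumed processing order."""
    if v.source in skip_sources:
        return []  # assumed rule: trusted sources are left untouched
    steps = []
    if v.compression >= threshold:
        steps.append("De-Art")   # remove compression artifacts first
    if v.blur >= threshold:
        steps.append("Deblur")
    if min(v.width, v.height) < sr_below_short_side:
        steps.append("SR")       # upscale low-resolution input last
    return steps


video = VideoAnalysis(540, 960, "re-upload", compression=0.8, blur=0.2, noise=0.1)
print(choose_enhancements(video))
```

Keeping the decision in one pure function makes the thresholds easy to tune against blind-test results without touching the enhancement models themselves.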
Key AI Enhancement Algorithms
The AI suite targets low‑quality videos caused by compression distortion, low resolution, and blur. Specific algorithms are:
De‑Art – removes compression artifacts
SR – super‑resolution for up‑scaling
Deblur – restores sharpness
After deployment, these algorithms yielded noticeable gains in view count and watch time with a favorable ROI.
Left: Deblur before, Right: Deblur after.
Encoding Subjective Quality Optimization
Kuaishou’s custom encoder incorporates two perceptual algorithms: Content‑Adaptive Quantization (CAQ) and Content‑Adaptive Rate‑Distortion Optimization (CARDO). These improve subjective quality at the same bitrate.
CAQ Logic
CAQ analyzes each frame to compute a JND (Just‑Noticeable‑Difference) factor based on three components: average block brightness, texture strength (edge intensity and gradient), and texture type (smooth, regular, irregular). The JND factor determines a QP offset for each block.
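The mapping from block statistics to a QP offset might look like the sketch below. It assumes 8‑bit luma, a texture strength normalized to 0..1, equal weighting of the brightness and texture terms, and a symmetric offset range; these choices and the function names are illustrative and are not taken from Kuaishou's encoder.

```python
# Hypothetical masking weights: distortion hides better in irregular texture
# than in regular texture, and is most visible in smooth areas.
TEXTURE_TYPE_WEIGHT = {"smooth": 0.0, "regular": 0.5, "irregular": 1.0}


def jnd_factor(mean_luma, texture_strength, texture_type):
    """Return a 0..1 factor; higher means distortion is harder to notice."""
    # Luminance masking: errors are less visible in very dark or very bright blocks.
    luma_term = min(abs(mean_luma - 128) / 128, 1.0)
    # Texture masking: strong, irregular texture hides quantization error.
    texture_term = min(max(texture_strength, 0.0), 1.0) * TEXTURE_TYPE_WEIGHT[texture_type]
    return 0.5 * luma_term + 0.5 * texture_term


def qp_offset(jnd, max_offset=4):
    """Map the JND factor to an integer QP offset in [-max_offset, +max_offset]."""
    return round((2 * jnd - 1) * max_offset)


# A smooth mid-gray block is highly sensitive, so it gets a lower QP (finer quantization).
print(qp_offset(jnd_factor(128, 0.0, "smooth")))
# A dark block with strong irregular texture tolerates a higher QP.
print(qp_offset(jnd_factor(0, 1.0, "irregular")))
```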
To avoid large bitrate fluctuations, a model predicts the rate impact of the QP offset and adjusts it, balancing subjective gains with objective bitrate stability.
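One simple way to picture this rate correction uses the common approximation that bitrate roughly halves for every +6 QP in H.264/HEVC‑style codecs. The sketch below predicts the relative rate change caused by a set of block offsets and applies a uniform shift to bring the prediction back near the original rate. The article does not describe Kuaishou's actual rate model, so the formula, the uniform shift, and the function names are all assumptions.

```python
import math


def predicted_rate_ratio(offsets, block_bits):
    """Predicted bitrate after applying offsets, relative to the unmodified encode.

    Assumes each block's bits scale by 2 ** (-offset / 6).
    """
    total = sum(block_bits)
    scaled = sum(bits * 2 ** (-off / 6) for off, bits in zip(offsets, block_bits))
    return scaled / total


def rebalance(offsets, block_bits):
    """Shift every offset by the same integer so the predicted rate stays near 1.0."""
    ratio = predicted_rate_ratio(offsets, block_bits)
    # A uniform shift s multiplies the predicted rate by 2 ** (-s / 6).
    shift = round(6 * math.log2(ratio))
    return [off + shift for off in offsets]


offsets = [-6, 6]        # finer quantization on block 0, coarser on block 1
block_bits = [100, 100]  # bits each block used before the offsets
print(predicted_rate_ratio(offsets, block_bits))
adjusted = rebalance(offsets, block_bits)
print(adjusted, predicted_rate_ratio(adjusted, block_bits))
```

A uniform shift preserves the relative differences between blocks, so the perceptual ranking produced by the JND analysis is kept while the total rate stays close to its target.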
CARDO Logic
CARDO adds two enhancements: perceptual quantization that preserves high‑sensitivity regions, and edge‑based gradient‑difference rate‑distortion optimization, which incorporates edge gradient error into the cost function and adapts the λ parameter.
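The edge‑based part can be illustrated with a rate‑distortion cost that adds a gradient‑error term to the usual SSE + λ·R form. This is a minimal sketch under stated assumptions: the forward‑difference gradient, the `edge_weight` value, and the λ scaling rule are illustrative choices, not the definitions used in CARDO.

```python
def gradient_magnitude(block):
    """Per-pixel edge strength from forward differences (borders are clamped)."""
    h, w = len(block), len(block[0])
    grad = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = block[y][min(x + 1, w - 1)] - block[y][x]
            gy = block[min(y + 1, h - 1)][x] - block[y][x]
            grad[y][x] = abs(gx) + abs(gy)
    return grad


def sse(a, b):
    """Sum of squared differences between two equally sized 2-D blocks."""
    return sum((p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def perceptual_rd_cost(orig, recon, bits, lam, edge_weight=0.5):
    """J = SSE + edge_weight * (gradient error) + lam * bits."""
    edge_error = sse(gradient_magnitude(orig), gradient_magnitude(recon))
    return sse(orig, recon) + edge_weight * edge_error + lam * bits


def adapt_lambda(base_lam, orig, strength=1.0):
    """Lower lambda for edge-heavy blocks so the encoder spends more bits there."""
    grad = gradient_magnitude(orig)
    mean_edge = sum(map(sum, grad)) / (len(grad) * len(grad[0]))
    return base_lam / (1.0 + strength * mean_edge / 255.0)


orig = [[0, 0, 100, 100], [0, 0, 100, 100]]
blurred = [[0, 25, 75, 100], [0, 25, 75, 100]]      # lower SSE, but the edge is smeared
shifted = [[18, 18, 118, 118], [18, 18, 118, 118]]  # higher SSE, edge preserved
lam = adapt_lambda(2.0, orig)
print(perceptual_rd_cost(orig, blurred, bits=10, lam=lam))
print(perceptual_rd_cost(orig, shifted, bits=10, lam=lam))
```

In this example the blurred reconstruction has the lower plain SSE, yet the edge term makes it the more expensive choice, which is the kind of decision an edge‑aware cost is meant to steer.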
Left: before optimization, Right: after optimization.
Results and Impact
Comparing encodes before and after optimization shows that high‑complexity regions (e.g., crowns) had been consuming excessive bitrate without perceptual benefit. Redistributing that bitrate to faces and clothing improves overall subjective quality. Visualizations of CTU‑level bitrate changes illustrate the redistribution.
Kuaishou’s video quality algorithms have been iteratively refined across massive video scenarios, consistently delivering user‑centric visual improvements.
Kuaishou Audio & Video Technology