How AI Restores Blurry Faces: Inside Kuaishou’s Y‑Tech High‑Definition Portrait Project
Image clarity matters in daily life, from personal memories to security. Kuaishou’s Y‑Tech team tackles degradation by constructing paired low‑ and high‑quality datasets and training a style‑based model that uses face parsing masks to restore high‑definition portraits, preserving identity while enhancing detail.
Background
Image clarity affects many aspects of life, influencing viewing experience, the preservation of precious moments, historical research, child safety, and security investigations. Degraded images result from factors such as capture technique, equipment, imaging systems, and storage/transmission methods, which introduce artifacts including blur, noise, and compression damage.
Project Goal
The aim is to mitigate or eliminate quality loss in portrait images during acquisition and transmission, restoring their true appearance using AI to overcome external limitations, preserve valuable memories, and enhance security.
Problem Analysis and Data Construction
Image degradation can be modeled as a combination of blur, down‑sampling, noise, and compression. The team identified common degradation types and constructed paired high‑quality and low‑quality data by applying random combinations and levels of these degradations.
Noise: Gaussian, shot, impulse
Blur: Gaussian, motion, defocus, resize
Other: JPEG compression, pixelation
By varying proportions, degradation levels, and combinations, realistic paired datasets are generated to simulate real‑world distributions.
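The data-construction idea above can be sketched as a small synthesizer that applies a random subset of degradations to a clean image to produce its low-quality pair. This is an illustrative assumption of how such a pipeline might look: the degradation families follow the article's list, but the specific kernels, probabilities, and levels are placeholders (e.g., a mean filter standing in for Gaussian blur, quantization standing in for JPEG compression).

```python
import numpy as np

def degrade(hq, rng):
    """Apply a random combination of degradations to a high-quality image.

    hq: float32 array in [0, 1], shape (H, W, C).
    Degradation families follow the article; exact kernels, probabilities,
    and levels here are illustrative stand-ins.
    """
    lq = hq.copy()
    # Blur: mean filter as a simple stand-in for Gaussian/defocus blur
    if rng.random() < 0.7:
        k = int(rng.integers(1, 4))  # blur radius
        pad = np.pad(lq, ((k, k), (k, k), (0, 0)), mode="edge")
        lq = np.mean([pad[i:i + lq.shape[0], j:j + lq.shape[1]]
                      for i in range(2 * k + 1) for j in range(2 * k + 1)],
                     axis=0)
    # Resize degradation: down-sample then naive up-sample
    if rng.random() < 0.5:
        s = int(rng.integers(2, 5))
        small = lq[::s, ::s]
        lq = small.repeat(s, axis=0).repeat(s, axis=1)[:hq.shape[0], :hq.shape[1]]
    # Noise: additive Gaussian with a random level
    if rng.random() < 0.8:
        lq = lq + rng.normal(0.0, rng.uniform(0.01, 0.05), lq.shape)
    # Compression: coarse quantization as a crude stand-in for JPEG
    if rng.random() < 0.6:
        q = int(rng.integers(8, 32))
        lq = np.round(lq * 255 / q) * q / 255
    return np.clip(lq, 0.0, 1.0).astype(np.float32)

rng = np.random.default_rng(0)
hq = rng.random((64, 64, 3), dtype=np.float32)
lq = degrade(hq, rng)  # (lq, hq) form one training pair
```

Sampling the combination and severity independently for every image is what makes the synthetic pairs cover a broad distribution rather than a single fixed corruption.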
Technical Solution
The overall pipeline processes a user‑captured image in five steps:
Face detection and landmark localization
Face cropping and alignment
Face parsing to obtain a mask
Restoration by the high‑definition model, which takes the aligned face and its mask
Compositing the restored face back into the original background
Model Design
Leveraging the strong structural prior of faces, the face parsing mask is incorporated as a prior in a style‑based generative framework (image→style→image). At each level, a style map derived from the input image and mask modulates the generator’s feature maps, enabling fine‑grained control from coarse to fine details.
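A minimal sketch of the per-level modulation mechanism, under the assumption that the style map acts as a channel-wise affine transform on normalized generator features (the mechanism used by StyleGAN-family generators; the article does not spell out the exact form, so this is illustrative):

```python
import numpy as np

def style_modulate(feat, style_scale, style_shift):
    """Channel-wise modulation of a generator feature map by a style code.

    feat: (C, H, W) feature map at one generator level.
    style_scale, style_shift: (C,) vectors derived from the input image
    and its parsing mask (the derivation network is omitted here).
    """
    # Normalize each channel to zero mean, unit variance...
    mu = feat.mean(axis=(1, 2), keepdims=True)
    sigma = feat.std(axis=(1, 2), keepdims=True) + 1e-8
    normed = (feat - mu) / sigma
    # ...then apply the style-derived scale and shift per channel.
    return style_scale[:, None, None] * normed + style_shift[:, None, None]

feat = np.random.default_rng(1).normal(size=(4, 8, 8))
out = style_modulate(feat, np.full(4, 2.0), np.zeros(4))
```

Because a separate style is applied at every resolution level, coarse levels steer global structure (pose, face shape) while fine levels control texture detail such as skin and hair.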
Training
The model is optimized using L1 loss, perceptual loss, and adversarial loss, while a data pool introduces diverse degradation types within each batch. Fine‑tuning further reduces dependence on the face parsing model, allowing good results even with poor input quality or inaccurate masks. Background regions also receive denoising, deblurring, and JPEG artifact removal before final compositing.
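The training objective described above can be written as a weighted sum of the three loss terms. The weights, the choice of L1 distance on perceptual features, and the non-saturating form of the adversarial term are all illustrative assumptions; the perceptual features and discriminator logit would come from a pretrained network and the GAN discriminator, respectively.

```python
import numpy as np

def total_loss(pred, target, feat_pred, feat_target, disc_logit,
               w_l1=1.0, w_perc=1.0, w_adv=0.01):
    """Weighted sum of the three losses named in the article.

    pred, target: restored and ground-truth images.
    feat_pred, feat_target: features from a pretrained perceptual network.
    disc_logit: discriminator output for the restored image.
    Weights are illustrative placeholders.
    """
    l1 = np.abs(pred - target).mean()                 # pixel-wise L1 loss
    perceptual = np.abs(feat_pred - feat_target).mean()  # perceptual loss
    adv = np.log1p(np.exp(-disc_logit))               # -log(sigmoid(D(G(x))))
    return w_l1 * l1 + w_perc * perceptual + w_adv * adv

x = np.zeros((4, 4))
f = np.zeros(3)
val = total_loss(x, x, f, f, disc_logit=0.0)
```

The small adversarial weight reflects the usual practice of letting reconstruction terms anchor identity while the GAN term sharpens texture.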
Conclusion and Outlook
Kuaishou’s Y‑Tech portrait‑HD project restores high‑definition facial details while preserving identity, delivering impressive visual quality and performance now available in the “One Sweet Camera” product. The team will continue to refine algorithms and leverage broader computer‑vision expertise to meet evolving user needs.