How ResULIC Achieves Ultra‑Low‑Rate Image Compression with Semantic Residual Coding and Diffusion

The paper introduces ResULIC, a residual‑guided ultra‑low‑bitrate image compression framework that combines semantic residual coding, a compression‑aware diffusion model, and perceptual fidelity optimization to dramatically improve visual quality and outperform prior diffusion‑based methods on standard benchmarks.

Kuaishou Tech

Jul 9, 2025

Background

Learning‑based image compression has surpassed traditional codecs such as JPEG2000 and VVC in both objective and subjective metrics, but at extremely low bitrates it suffers from over‑smoothed textures and loss of structural detail. Recent diffusion models offer a promising alternative, yet existing diffusion‑based methods still show noticeable gaps in fidelity to the source image and in consistency.

Method

The paper proposes ResULIC (Residual‑guided Ultra Low‑rate Image Compression), which consists of three core components:

Feature compressor: maps the image into a latent space.

Semantic Residual Coding: captions both the original image and the decoded image, feeds the two captions into a large language model to obtain a concise description of the semantics the decoded image is missing, and encodes this semantic residual as additional bits.

Compression‑aware Diffusion Model: conditions a diffusion process on the compressed latent representation and the semantic residual, aligning compression ratio with diffusion timesteps to achieve high‑fidelity reconstruction at ultra‑low bitrates.
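As a rough illustration of the Semantic Residual Coding component listed above, the sketch below treats the residual as the words that appear in a caption of the original image but not in a caption of the decoded image, then counts the bits needed to store them. The two captions, the function names `semantic_residual` and `residual_bits`, the word-level set difference, and the use of zlib as the text codec are all assumptions made for illustration; the paper relies on a large language model to produce the residual description.

```python
import zlib


def semantic_residual(original_caption: str, decoded_caption: str) -> str:
    """Return the words of the original caption that the decoded caption lacks.

    Hypothetical stand-in for the LLM step: a plain word-level set difference.
    """
    decoded_words = set(decoded_caption.lower().split())
    missing = [w for w in original_caption.lower().split() if w not in decoded_words]
    return " ".join(missing)


def residual_bits(text: str) -> int:
    """Bits needed to store the text with zlib (an assumed text codec)."""
    if not text:
        return 0
    return 8 * len(zlib.compress(text.encode("utf-8"), 9))


# Made-up captions for the original image and the decoded image.
original = "a red vintage car parked beside a stone bridge"
decoded = "a car parked beside a bridge"

residual = semantic_residual(original, decoded)
print(residual)  # red vintage stone
print(residual_bits(residual), "bits for the residual")
print(residual_bits(original), "bits for the full caption")
```

The point of the sketch is only that transmitting what the decoder does not already know can cost fewer bits than transmitting the full description.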

Beyond these three components, Perceptual Fidelity Optimization refines the diffusion prompts using CLIP embeddings to reduce the fidelity gap.
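One simple way to picture how embeddings can guide a prompt is to score candidate prompts by cosine similarity against the embedding of the original image and keep the closest one. This is a simplification and not the paper's procedure, which refines the prompt itself. The 3-D vectors below are made-up stand-ins for real CLIP embeddings, and `pick_best_prompt` is a hypothetical helper.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pick_best_prompt(image_embedding: list[float],
                     prompt_embeddings: dict[str, list[float]]) -> str:
    """Return the prompt whose embedding is closest to the image embedding."""
    return max(
        prompt_embeddings,
        key=lambda prompt: cosine_similarity(image_embedding, prompt_embeddings[prompt]),
    )


# Toy vectors standing in for CLIP image and text embeddings (assumed values).
image_embedding = [0.9, 0.1, 0.4]
prompt_embeddings = {
    "a car": [0.2, 0.9, 0.1],
    "a red vintage car": [0.8, 0.2, 0.5],
    "a bridge": [0.1, 0.3, 0.9],
}

best = pick_best_prompt(image_embedding, prompt_embeddings)
print(best)  # a red vintage car
```

In a real pipeline the vectors would come from a CLIP image encoder and text encoder, and the search would be over a continuous prompt representation.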

Experiments

ResULIC is evaluated on the CLIC‑2020 dataset using PSNR, MS‑SSIM, LPIPS, DISTS, FID and KID. Compared with previous diffusion‑based approaches (e.g., PerCo), it reports BD‑rate savings of 80.7% on LPIPS and 66.3% on FID, and achieves state‑of‑the‑art performance across all metrics. Ablation studies show that adapting the number of diffusion steps to the bitrate further improves reconstruction quality.
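To make the link between bitrate and diffusion steps concrete, here is a minimal sketch that maps bits per pixel to a denoising step count by linear interpolation. It assumes that lower bitrates discard more information and therefore get more steps. The function name `steps_for_bitrate`, the bitrate range, and the step range are illustrative choices, not values from the paper.

```python
def steps_for_bitrate(bpp: float, bpp_min: float = 0.002, bpp_max: float = 0.1,
                      steps_min: int = 4, steps_max: int = 50) -> int:
    """Map bits per pixel to a denoising step count by linear interpolation.

    Assumption: the lowest bitrate gets steps_max, the highest gets steps_min.
    All default ranges are illustrative, not taken from the paper.
    """
    # Clamp so that out-of-range bitrates fall back to the nearest endpoint.
    clamped = min(max(bpp, bpp_min), bpp_max)
    fraction = (bpp_max - clamped) / (bpp_max - bpp_min)
    return steps_min + round(fraction * (steps_max - steps_min))


for bpp in (0.002, 0.05, 0.1):
    print(f"{bpp} bpp -> {steps_for_bitrate(bpp)} steps")
```

A learned or non-linear schedule could replace the linear rule; the sketch only shows the monotonic relationship the ablation refers to.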

Conclusion and Outlook

ResULIC demonstrates that integrating semantic residual coding with a compression‑aware diffusion model can dramatically improve visual quality at ultra‑low bitrates, providing a strong foundation for future video compression research at Kuaishou.
