SuanNi
Mar 26, 2026 · Artificial Intelligence

TurboQuant: Google’s 6× KV Cache Compression With Zero Accuracy Loss

TurboQuant, a new technique from Google Research, compresses key‑value caches by up to six times with no loss of precision. It combines the PolarQuant and QJL algorithms — transforming key vectors into polar coordinates and applying quantized Johnson‑Lindenstrauss transforms — to boost inference speed and enable longer context handling for large language models.

AI compression · KV cache · Performance
13 min read
Data Thinking Notes
May 19, 2025 · Artificial Intelligence

How Model Distillation Shrinks Giant AI Models Without Losing Performance

This article explains model distillation, a technique that transfers knowledge from large teacher models to compact student models. It covers the motivation, core principles, key steps, practical applications, and the advantages and limitations of distillation, showing how AI models can be made efficient without sacrificing performance.

AI compression · Knowledge Transfer · model distillation
10 min read
Kuaishou Audio & Video Technology
Aug 24, 2022 · Artificial Intelligence

Cutting Video Bitrate to 14.4 kbps: Inside Kuaishou’s AI‑Generated Compression

Kuaishou’s audio‑video team presents an AI‑driven compression algorithm and the KISC speech codec, which together deliver ultra‑low‑bitrate real‑time video and high‑quality voice transmission. The system enables smooth RTC experiences even on weak networks while supporting creative features such as viewpoint adjustment and background replacement.

AI compression · Real-time communication · low bitrate video
11 min read