Tencent Technical Engineering
Oct 10, 2025 · Artificial Intelligence

How Tequila’s 1.58‑Bit Quantization Overcomes the Dead‑Zone Trap in LLMs

Tequila introduces a 1.58‑bit ternary quantization scheme for large language models that escapes the dead‑zone trap by reactivating zero weights as dynamic biases, achieving near‑full‑precision performance, faster convergence, and up to three‑fold CPU inference speedups.

AI inference · LLM quantization · dynamic bias