Old Zhang's AI Learning
Feb 16, 2026 · Artificial Intelligence

A New Extreme Quantization Tool for Large Models: AngelSlim’s 2‑Bit Compression

AngelSlim is a full‑stack compression suite for large models. Using quantization‑aware training (QAT), it shrinks a 1.8B‑parameter LLM to 2‑bit precision with less than 4% accuracy loss. The suite supports a wide range of model families and speculative decoding, and provides end‑to‑end deployment instructions for both MacBook M4 and server environments.

AngelSlim · GGUF · QAT
13 min read