Old Zhang's AI Learning
Jan 31, 2026 · Artificial Intelligence
How a 0.1B‑Parameter OCR Model Beats Multi‑Billion‑Parameter Vision‑Language Models
UniRec‑0.1B, a lightweight OCR model with only 0.1B parameters, matches or exceeds the accuracy of multi‑billion‑parameter vision‑language models on text, formula, and mixed‑content recognition. The results are credited to hierarchical supervision training, a semantic‑decoupled tokenizer, and a 40M‑sample training dataset. The model also runs inference 2‑9× faster and is fully open source.
Hierarchical Supervision · OCR · Semantic Decoupled Tokenizer
12 min read
