How HY-MT1.5 Achieves 1 GB Mobile Translation with a 1.8B Model

This article explains how Tencent's open-source HY-MT1.5 tackles the high-cost, large-parameter barrier of neural machine translation. The 1.8B-parameter model runs in roughly 1 GB of RAM, processes 50 tokens in 0.18 s, supports 33 languages, and uses on-policy distillation to retain near-top-tier accuracy. The article also walks through a step-by-step online demo and notes free compute credits for new users.

HyperAI Super Neural

Problem

State-of-the-art machine-translation models fall into two camps: closed-source systems with billions of parameters served through expensive cloud APIs, and lightweight open-source models that perform poorly on low-resource languages and domain-specific terminology, often producing hallucinations or semantic bias.

HY‑MT1.5 family

Tencent released two open‑source models:

Tencent‑HY‑MT1.5‑1.8B – optimized for mobile/edge deployment.

Tencent‑HY‑MT1.5‑7B – high‑performance variant.

Both support bidirectional translation across 33 languages, including low-resource ones such as Czech and Icelandic, plus 5 Chinese minority languages.

1.8 B model

After 8‑bit quantization the model fits in ~1 GB RAM, enabling offline real‑time translation on smartphones.

Processes 50 tokens in ~0.18 s.
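As a quick sanity check on that latency figure, the implied decoding throughput is straightforward arithmetic (a rough sketch; real-world throughput also depends on batch size, prompt length, and hardware):

```python
def tokens_per_second(n_tokens: int, seconds: float) -> float:
    """Throughput implied by a single latency measurement."""
    return n_tokens / seconds

# 50 tokens in 0.18 s, as reported for the 1.8B model:
rate = tokens_per_second(50, 0.18)
print(round(rate))  # ~278 tokens/s
```

At roughly 278 tokens per second, a typical sentence is translated in well under a second, which is what makes interactive on-device use plausible.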

On the Flores200 benchmark it surpasses medium‑size open‑source models and mainstream commercial APIs, reaching the 90th percentile of top closed‑source systems.

7 B model

Derived from Tencent’s WMT25 champion that won 30 language pairs.

Improves translation accuracy and markedly reduces hallucinations and language‑mixing errors compared with the 1.8 B model.

Technical innovation: On‑Policy Distillation

The 7 B model acts as a teacher during training, continuously guiding the 1.8 B student model and correcting its prediction bias. This on‑policy distillation lets the smaller model inherit capabilities beyond its parameter budget.
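The core idea can be sketched in a few lines of pure Python (a toy illustration, not the actual training code, which the article does not describe): the student samples its own next token, and the loss measures how far the student's distribution is from the teacher's at that student-generated state. This differs from off-policy distillation, which trains only on teacher- or corpus-generated sequences.

```python
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def reverse_kl(student_p, teacher_p):
    # KL(student || teacher): penalizes probability mass the student
    # puts where the teacher assigns little -- a common distillation loss.
    return sum(p * math.log(p / q) for p, q in zip(student_p, teacher_p) if p > 0)

def on_policy_distillation_step(student_logits_fn, teacher_logits_fn, context, rng):
    # 1) The student rolls out on-policy: it samples from its OWN distribution.
    s_p = softmax(student_logits_fn(context))
    token = rng.choices(range(len(s_p)), weights=s_p)[0]
    # 2) The teacher scores the same context; the loss is computed on the
    #    student's own states rather than on a fixed pre-collected corpus.
    t_p = softmax(teacher_logits_fn(context))
    loss = reverse_kl(s_p, t_p)
    return token, loss
```

A real implementation would backpropagate this loss into the student's weights at every decoding step; the gradient update is omitted here.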

Demo access

The model can be run online via HyperAI’s tutorial at https://go.hyper.ai/I0pdR. The workflow consists of cloning the tutorial repository, selecting a GPU image (e.g., NVIDIA GeForce RTX 5090) with a PyTorch environment, launching the job, opening the Jupyter workspace, and executing the provided notebook to obtain translation results.
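Inside the notebook, inference presumably follows the standard Hugging Face `transformers` pattern. The sketch below is an assumption-laden illustration: the repository ID and the prompt template are hypothetical, so check the tutorial notebook for the real ones.

```python
def build_prompt(source_lang: str, target_lang: str, text: str) -> str:
    """Hypothetical prompt template -- the tutorial notebook defines the real one."""
    return f"Translate the following text from {source_lang} to {target_lang}:\n{text}"

# Inference sketch (not run here; the repo id "tencent/HY-MT1.5-1.8B" is an assumption):
# from transformers import AutoModelForCausalLM, AutoTokenizer
# tok = AutoTokenizer.from_pretrained("tencent/HY-MT1.5-1.8B")
# model = AutoModelForCausalLM.from_pretrained("tencent/HY-MT1.5-1.8B", device_map="auto")
# inputs = tok(build_prompt("English", "Czech", "Hello, world!"), return_tensors="pt").to(model.device)
# out = model.generate(**inputs, max_new_tokens=128)
# print(tok.decode(out[0], skip_special_tokens=True))

print(build_prompt("English", "Czech", "Hello, world!"))
```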

Tags: mobile AI, large language models, machine translation, Tencent, On-Policy Distillation, HY-MT1.5
Written by

HyperAI Super Neural

Deconstructing the sophistication and universality of technology, covering cutting-edge AI for Science case studies.
