How Suffix Prediction Boosts English‑Russian Neural Machine Translation Accuracy

Researchers introduce a suffix‑prediction mechanism for neural machine translation that generates stems and suffixes separately during decoding. The method reduces out‑of‑vocabulary errors and morphological mistakes in English‑Russian translation and yields consistent improvements for both RNN and Transformer models on large‑scale news and e‑commerce datasets.

Alibaba Cloud Developer

May 11, 2018

Abstract

Neural machine translation (NMT) models are limited by a fixed‑size vocabulary, which leads to many out‑of‑vocabulary (OOV) words, especially for morphologically rich languages such as Russian. Existing work mainly adjusts translation granularity or expands the vocabulary, but does not explicitly model morphology. This paper proposes a novel suffix‑prediction mechanism that predicts stems and suffixes separately during decoding, which reduces data sparsity and morphological errors. It demonstrates stable improvements on both RNN‑based and Transformer‑based NMT systems on large‑scale datasets.

Research Background

Recent advances in NMT have shown superior performance over statistical machine translation. However, the fixed target‑side vocabulary (typically 30k‑50k words) cannot cover all forms of a morphologically rich language, causing OOV problems that severely affect translation quality.
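To make the OOV problem concrete, here is a minimal sketch of how a fixed vocabulary misses inflected forms. The tiny corpus, the vocabulary size of 2, and the helper names `build_vocab` and `oov_rate` are illustrative assumptions, not part of the paper.

```python
from collections import Counter

def build_vocab(tokens, size):
    """Keep only the `size` most frequent word forms (a fixed-size vocabulary)."""
    return {word for word, _ in Counter(tokens).most_common(size)}

def oov_rate(tokens, vocab):
    """Fraction of tokens that are not covered by the vocabulary."""
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t not in vocab) / len(tokens)

# Toy training data: two lemmas, each seen in only a couple of inflected forms.
train = "книга книга книга лампа лампа книги лампы".split()
# Toy test data: the same lemmas, but in case forms the vocabulary never stored.
test = "книга книгой лампа лампами".split()

vocab = build_vocab(train, size=2)   # hypothetical, deliberately tiny vocabulary
rate = oov_rate(test, vocab)
print(sorted(vocab), rate)
```

In this toy setting, half of the test tokens are OOV even though both lemmas appear in the training data, because each inflected form counts as a separate vocabulary entry.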

Related Work

Previous approaches address OOV by adjusting translation granularity (subword or character‑level models) or by enlarging the target vocabulary with dynamic sub‑tables. While these methods reduce OOV rates, they do not explicitly model the morphological structure of the target language.

Neural Machine Translation

We evaluate our method on two mainstream NMT architectures: an RNN‑based encoder‑decoder (Bahdanau et al., 2015) and the Transformer (Vaswani et al., 2017).

Russian Stems and Suffixes

A Russian word can be split into a stem and a suffix; the suffix encodes number, case, gender, etc. Once stems and suffixes are separated, the number of unique stems is far smaller than the number of full word forms, and the suffix inventory contains only a few hundred types, which alleviates data sparsity.
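The effect of the split on vocabulary size can be sketched with a toy example. The suffix list and the longest-match rule below are simplifying assumptions made for illustration; the article does not say which stemmer the authors used, and a real system would rely on a proper morphological tool.

```python
# Hypothetical, deliberately small suffix inventory (illustration only).
SUFFIXES = ("ами", "ой", "ах", "а", "у", "ы", "и", "е")
NO_SUFFIX = "<none>"  # placeholder for words left unsplit

def split_word(word, suffixes=SUFFIXES, min_stem=3):
    """Split a word into (stem, suffix) using the longest matching suffix."""
    for suf in sorted(suffixes, key=len, reverse=True):
        # Require a minimum stem length so short words are not over-stripped.
        if word.endswith(suf) and len(word) - len(suf) >= min_stem:
            return word[:-len(suf)], suf
    return word, NO_SUFFIX

words = [
    "книга", "книги", "книгу", "книгой", "книгами", "книгах",
    "лампа", "лампы", "лампу", "лампой",
    "карта", "карты", "картой", "картами",
]

pairs = [split_word(w) for w in words]
forms = set(words)
stems = {stem for stem, _ in pairs}
suffixes = {suffix for _, suffix in pairs}

print(len(forms), len(stems), len(suffixes))
```

Here 14 distinct word forms reduce to 3 stems and 7 suffixes. Adding a new lemma adds one stem, while the suffix set is shared across lemmas and stays small.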

Suffix Prediction Network

During decoding, each step first generates a stem using the standard NMT decoder. Then, a feed‑forward network takes the generated stem, the decoder hidden state, and the source context to predict the suffix. The final word is obtained by concatenating the stem and suffix.
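The decoding step described above can be sketched as a small feed-forward classifier over suffixes. This is a minimal, untrained sketch in pure Python: the layer sizes, the single `tanh` hidden layer, the omission of bias terms, the random weights, and names such as `predict_suffix` and `compose` are all assumptions for illustration, not the authors' actual architecture.

```python
import math
import random

random.seed(0)

SUFFIXES = ["<none>", "а", "у", "ой", "ами"]  # toy suffix inventory (assumed)
STEMS = ["книг", "ламп"]                       # toy stem vocabulary (assumed)
D = 4  # size of each input vector (hypothetical)
H = 8  # hidden-layer width (hypothetical)

def rand_matrix(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def softmax(scores):
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Randomly initialised parameters; a real system would learn these
# jointly with the NMT model.
stem_emb = {s: [random.uniform(-0.1, 0.1) for _ in range(D)] for s in STEMS}
W1 = rand_matrix(H, 3 * D)
W2 = rand_matrix(len(SUFFIXES), H)

def predict_suffix(stem, dec_state, src_context):
    """Score every suffix given the stem, decoder state, and source context."""
    x = stem_emb[stem] + dec_state + src_context  # concatenate the three inputs
    hidden = [math.tanh(a) for a in matvec(W1, x)]
    probs = softmax(matvec(W2, hidden))
    best = max(range(len(SUFFIXES)), key=probs.__getitem__)
    return SUFFIXES[best], probs

def compose(stem, suffix):
    """Concatenate stem and suffix into the final word."""
    return stem if suffix == "<none>" else stem + suffix

# Stand-ins for the decoder hidden state and source context at one step.
dec_state = [0.1] * D
src_context = [-0.2] * D

stem = "книг"  # assume the decoder has just generated this stem
suffix, probs = predict_suffix(stem, dec_state, src_context)
word = compose(stem, suffix)
print(word, [round(p, 3) for p in probs])
```

Because the weights are random, the chosen suffix is arbitrary here; the point is the data flow: stem first, then a suffix distribution conditioned on that stem, then concatenation.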

Experiments

We conducted experiments on the WMT‑2017 English‑Russian news translation task (≈5.3M sentence pairs) and on a large e‑commerce dataset (≈50M sentence pairs). Results show that our suffix‑prediction system outperforms subword and character baselines on both RNN and Transformer models.

Conclusion

We present a simple yet effective method that improves NMT for morphologically rich target languages by explicitly modeling suffixes. The approach yields consistent gains on both RNN‑based and Transformer‑based systems across news and e‑commerce domains, and represents the first work to model suffixes directly in NMT.
