
Applying AI Technologies in the Youdao Dictionary Pen: Scanning, Offline Translation, and Edge ML Library

This article presents a technical overview of the Youdao Dictionary Pen, describing its hardware design, real‑time scanning and point‑query image processing, on‑device offline translation with model compression techniques, and the high‑performance Edge ML Library (EMLL) that enables efficient AI inference on constrained edge hardware.

DataFunTalk

Apr 5, 2022


The Youdao Dictionary Pen is a compact smart hardware device designed to help students look up words, translate text, and learn English, featuring a wide‑angle camera for scanning and a high‑definition display for interactive learning.

01 Youdao Dictionary Pen Introduction

The pen's tip houses a wide‑angle camera that captures images of text as input, while the screen on the body displays the queried words or translations and supports interactive picture reading.

02 Scanning and Point‑Query

Scanning differs from typical OCR because the camera captures a continuous stream of small windows at about 100 frames per second. A panoramic stitching algorithm extracts the relevant text lines from these frames in three stages: pixel‑level detection, central line grouping, and line correction, achieving roughly 98% accuracy.
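To give a feel for why a stream of small, overlapping windows can be merged into one text line, here is a minimal sketch. It is not the three‑stage pipeline described above: it swaps in a much simpler technique, estimating the horizontal shift between consecutive frames by brute‑force matching and appending only the newly revealed columns. The function names, the frame sizes, and the synthetic test strip are all assumptions made for illustration.

```python
def estimate_shift(prev, cur, max_shift):
    """Return the horizontal shift (pixels) that best aligns cur with prev."""
    height, width = len(prev), len(prev[0])
    best_shift, best_err = 0, float("inf")
    for shift in range(min(max_shift, width - 1) + 1):
        overlap = width - shift
        # Mean absolute difference over the columns both frames share.
        total = sum(
            abs(prev[y][x + shift] - cur[y][x])
            for y in range(height)
            for x in range(overlap)
        )
        err = total / (overlap * height)
        if err < best_err:
            best_shift, best_err = shift, err
    return best_shift


def stitch(frames, max_shift):
    """Merge overlapping frames into one wide image, left to right."""
    panorama = [list(row) for row in frames[0]]
    for prev, cur in zip(frames, frames[1:]):
        shift = estimate_shift(prev, cur, max_shift)
        if shift > 0:
            for pano_row, cur_row in zip(panorama, cur):
                # Only the last `shift` columns of the new frame are unseen.
                pano_row.extend(cur_row[len(cur_row) - shift:])
    return panorama


# Hypothetical test strip: a 4 x 40 image in which every column is distinct.
strip = [[(x * x + 7 * x + 3 * y * y) % 251 for x in range(40)] for y in range(4)]
offsets = [0, 3, 5, 9, 12]  # assumed pen position (in pixels) at each frame
frames = [[row[o:o + 8] for row in strip] for o in offsets]
panorama = stitch(frames, max_shift=4)
```

A real scanner also has to cope with vertical drift, rotation, blur, and varying pen speed, which is why the pipeline above works on detected text pixels and corrected lines rather than on raw pixel matching.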

The system also corrects wide‑angle distortion and uneven lighting by pre‑capturing calibration images, applying inverse geometric transforms, and enhancing illumination before OCR.
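The illumination part of this step can be illustrated with flat‑field correction, a common way to use a pre‑captured calibration image. This is a sketch under assumptions: the article does not say which enhancement method the pen uses, the function name is hypothetical, and the geometric (distortion) correction is not covered here.

```python
def flat_field_correct(raw, flat, eps=1e-6):
    """Divide out the lighting pattern recorded in a calibration image.

    raw  : captured frame, as rows of pixel intensities
    flat : calibration frame of a blank uniform surface, same size as raw
    """
    pixel_count = sum(len(row) for row in flat)
    mean_flat = sum(sum(row) for row in flat) / pixel_count
    corrected = []
    for raw_row, flat_row in zip(raw, flat):
        corrected.append([
            # Scale each pixel by how much brighter or darker its spot is
            # than the average of the calibration frame; clamp to 8 bits.
            min(255, round(p * mean_flat / max(f, eps)))
            for p, f in zip(raw_row, flat_row)
        ])
    return corrected


# Assumed example: the right half of the sensor receives twice the light.
flat = [[100, 200], [100, 200]]
raw = [[50, 100], [50, 100]]  # a uniformly gray surface seen under that light
corrected = flat_field_correct(raw, flat)
```

After correction, the uniformly gray surface has the same intensity in every pixel, so the later OCR stage no longer sees a brightness gradient that could be mistaken for ink.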

03 Offline Translation

To support offline use cases such as classrooms or travel, the translation service is deployed on‑device, eliminating network latency and preserving user privacy.

Because the original online NMT model contains over a hundred million parameters, several compression techniques are applied:

Depth and width pruning (favoring encoder layers, resulting in a 5‑layer encoder and 3‑layer decoder).

Embedding sharing across source and target vocabularies, reducing the embedding parameter count by about half.

Shared attention across layers to reuse query/key weights.

Quantization from float32 to int8 using linear mapping, dramatically reducing memory and compute.

Knowledge distillation, transferring the performance of a large teacher model to a smaller student model.
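The float32‑to‑int8 linear mapping mentioned in the list above can be sketched as follows. This is a minimal illustration of symmetric quantization with one shared scale per tensor; the article does not specify the exact scheme (per‑tensor or per‑channel, symmetric or asymmetric), so those choices and the function names are assumptions.

```python
def quantize(values):
    """Map floats to int8 with a single linear scale (symmetric scheme)."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest step and clamp to the symmetric int8 range.
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the int8 codes."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.0, 1.0]  # hypothetical float32 weights
codes, scale = quantize(weights)
restored = dequantize(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each weight now takes one byte instead of four, and the reconstruction error per weight is bounded by half a quantization step (`scale / 2`).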

These methods together maintain translation quality (BLEU drop ≤ 0.1) while cutting memory usage by 50‑60% and increasing inference speed by 45‑67%.
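Of the techniques listed above, embedding sharing is the easiest to check with simple arithmetic. The sketch below counts embedding parameters with and without sharing. The vocabulary sizes and model width are hypothetical, and it assumes the joint vocabulary is no larger than each separate one; a larger joint vocabulary would save less than half.

```python
def embedding_params(src_vocab, tgt_vocab, d_model, shared_vocab=None):
    """Count embedding-table parameters.

    With shared_vocab set, source and target use one joint table;
    otherwise each side keeps its own table.
    """
    if shared_vocab is None:
        return (src_vocab + tgt_vocab) * d_model
    return shared_vocab * d_model


# Hypothetical sizes, not taken from the article.
separate = embedding_params(32000, 32000, 512)
shared = embedding_params(32000, 32000, 512, shared_vocab=32000)
ratio = shared / separate
```

Under these assumed sizes the shared table holds half as many parameters as the two separate tables, which matches the "about half" reduction reported above.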

04 EMLL (Edge ML Library)

EMLL is a high‑performance edge‑side machine‑learning library designed to accelerate the "flat" matrix multiplications common in on‑device AI inference, where one dimension of a matrix is much smaller than the others (for example, a batch of one). It supports the fp32, fp16, and int8 data types and provides assembly‑level optimizations for ARM Cortex‑A35/A53/A55/A76/A77 cores.
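The kind of computation involved can be shown with a small reference example. The sketch below is plain Python and does not use EMLL's API: it multiplies a flat int8 matrix (a single row, as with a batch of one) by a weight matrix, accumulates in integers, and rescales the result to floats. The function names and scale values are assumptions for illustration only.

```python
def int8_matmul(a, b):
    """Multiply integer matrices a (M x K) and b (K x N).

    Products of int8 values are summed in a wider integer type (int32 on
    typical hardware; Python integers here) so the sums do not overflow.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(rows)
    ]


def dequantize_output(acc, scale_a, scale_b):
    """Convert integer accumulators back to floats using both input scales."""
    return [[v * scale_a * scale_b for v in row] for row in acc]


# A "flat" case: M = 1 (one input vector), K = 3, N = 2.
activations = [[1, -2, 3]]            # quantized input, assumed scale 0.5
weights = [[4, 5], [6, 7], [8, 9]]    # quantized weights, assumed scale 0.1
acc = int8_matmul(activations, weights)
output = dequantize_output(acc, scale_a=0.5, scale_b=0.1)
```

With M this small there is little data reuse per weight loaded, so performance depends heavily on memory access patterns and instruction scheduling, which is where hand‑tuned kernels for specific cores can pay off.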

Benchmarks show that EMLL delivers 30%‑100% speedup over Eigen and ARM Compute Library on typical OCR, ASR, and NMT workloads, reducing latency and enabling larger models on low‑power devices.

The library is open‑source (https://github.com/netease-youdao/EMLL) and works on Linux and Android.

05 Q&A

Q: Does sharing target‑side embedding matrices increase decoding time? A: In practice the shared vocabulary is trimmed, and with EMLL the impact on speed is negligible.

Q: Can parameter sharing, quantization, and distillation be used together? A: Yes. Sharing has little speed impact, quantization speeds up inference, and distillation improves quality without affecting speed.

Finally, the Youdao team is recruiting algorithm, product, and functional engineers; interested candidates can contact Zhang Guang Yong ([email protected]).
