Paddle.js OCR SDK: Text Recognition in Web Browsers

Paddle.js OCR SDK brings Baidu's lightweight PaddleOCR models to web browsers. It exposes two APIs, init() and recognize(); initialization loads the ch_PP-OCRv2 detection (DB) and recognition (CRNN with bidirectional LSTM) models in parallel. The reported benchmark shows 258 ms for detection, 60 ms for recognition, an F-score of 0.52, and a combined model size under 12 MB.

Baidu App Technology

This article introduces Paddle.js OCR SDK, a text recognition solution that runs in web browsers. It explains the basic concepts of OCR (Optical Character Recognition), which consists of two stages: text detection, which locates text regions in an image, and text recognition, which reads the characters in those regions. The SDK is built on PaddleOCR and Paddle.js.

PaddleOCR is an ultra-lightweight text recognition model suite from Baidu that provides dozens of text detection and recognition models. The SDK uses ch_PP-OCRv2_det_infer for text detection and ch_PP-OCRv2_rec_infer for text recognition. Compared with previous PP-OCR versions, PP-OCRv2 offers an improvement of over 7% in accuracy and over 220% in speed, and its compact 11.6 MB size makes it suitable for both server and mobile deployment.

The @paddlejs-models/ocr SDK provides two main APIs: init() for model initialization and recognize() for text recognition. The model conversion tool paddlejsconverter converts PaddlePaddle models to browser-friendly formats. The initialization process loads both detection and recognition models in parallel to reduce warm-up time.
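
The parallel loading described above can be illustrated with a minimal sketch. It does not reproduce the SDK's internals: `ModelLoader`, `fakeLoader`, `started`, and `initParallel` are hypothetical names introduced here, and the timers only stand in for fetching and compiling a model. The point is that both loads are started before either is awaited, so the waiting time is roughly that of the slower model rather than the sum of the two.

```typescript
// Hypothetical stand-in for whatever fetches and prepares one model.
type ModelLoader = () => Promise<string>;

// Records the order in which loads begin (for illustration only).
const started: string[] = [];

// Illustrative loader factory: the returned loader resolves after `ms` milliseconds.
function fakeLoader(name: string, ms: number): ModelLoader {
  return () => {
    started.push(name); // runs synchronously when the loader is called
    return new Promise<string>((resolve) => setTimeout(() => resolve(name), ms));
  };
}

// Start both loads first, then wait for both: the waits overlap.
function initParallel(det: ModelLoader, rec: ModelLoader): Promise<[string, string]> {
  return Promise.all([det(), rec()]);
}
```

A sequential version would await the detection loader before calling the recognition loader, which adds the two load times together. In application code, the SDK's own init() would be awaited once before the first recognize() call.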

The text detection model uses the DB (Differentiable Binarization) algorithm for post-processing. The text recognition model uses the CRNN (Convolutional Recurrent Neural Network) algorithm: a CNN extracts features, an RNN predicts the character sequence, and CTC transcribes the prediction into text. The RNN is a two-layer bidirectional LSTM, which improves performance.
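
To make the CTC transcription step concrete, here is a minimal sketch of greedy CTC decoding, a common way to turn per-time-step scores into text. Whether the SDK decodes exactly this way is an assumption; `ctcGreedyDecode`, the score matrix, and the three-entry character set are illustrative only, with index 0 reserved for the CTC blank.

```typescript
// Greedy CTC decoding: take the best class at each time step,
// collapse consecutive repeats, then drop blanks.
function ctcGreedyDecode(scores: number[][], charset: string[], blank = 0): string {
  let prev = -1; // class chosen at the previous time step
  let out = "";
  for (const step of scores) {
    // argmax over the classes of this time step
    let best = 0;
    for (let i = 1; i < step.length; i++) {
      if (step[i] > step[best]) best = i;
    }
    // emit only when the class is not blank and differs from the previous step
    if (best !== blank && best !== prev) out += charset[best];
    prev = best;
  }
  return out;
}

// Illustrative input: classes are [blank, "a", "b"]; per-step argmax is 1,1,0,1,2,2.
const demoCharset = ["", "a", "b"];
const demoScores = [
  [0.1, 0.8, 0.1],
  [0.2, 0.7, 0.1],
  [0.9, 0.05, 0.05],
  [0.1, 0.8, 0.1],
  [0.1, 0.2, 0.7],
  [0.1, 0.1, 0.8],
];
const decoded = ctcGreedyDecode(demoScores, demoCharset);
```

The blank between the second and fourth steps is what allows the same character to appear twice in the output; without it, the repeated "a" would be collapsed into one.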

Benchmark results show that, on the WebGL backend, the ch_PP-OCRv2 model takes 258 ms for detection and 60 ms for recognition, with an overall F-score of 0.5224. The detection model is 3 MB and the recognition model is 8.6 MB.
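
As a quick check on these figures, the sketch below adds up the two model sizes and shows how an F-score is conventionally computed. It assumes the reported F-score is the usual harmonic mean of precision and recall; the source does not give the precision and recall behind 0.5224, so `fScore` is shown only as the general formula, and the variable names are illustrative.

```typescript
// Model sizes reported above, in megabytes.
const detSizeMB = 3;
const recSizeMB = 8.6;

// Combined download size: about 11.6 MB, matching the figure quoted for PP-OCRv2.
const totalSizeMB = detSizeMB + recSizeMB;

// F-score as the harmonic mean of precision and recall (assumed definition).
function fScore(precision: number, recall: number): number {
  const sum = precision + recall;
  return sum === 0 ? 0 : (2 * precision * recall) / sum;
}
```

Because the harmonic mean is pulled toward the lower of its two inputs, a low F-score can come from either weak precision or weak recall, which the single number alone does not reveal.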

machine learning, AI, OCR, PaddleOCR, text recognition, Paddle.js, web browser