Paddle.js OCR SDK: Text Recognition in Web Browsers

Paddle.js OCR SDK brings Baidu’s lightweight PaddleOCR models to web browsers, offering init() and recognize() APIs that load the ch_PP-OCRv2 detection (DB) and recognition (CRNN with bidirectional LSTM) models in parallel, achieving 258 ms detection, 60 ms recognition, 0.52 F‑score, and a combined size under 12 MB.

Baidu App Technology

Dec 7, 2021

This article introduces the Paddle.js OCR SDK, a text recognition solution that runs in web browsers. It explains the basic concepts of OCR (Optical Character Recognition), which consists of two stages: text detection and text recognition. The SDK builds on PaddleOCR and Paddle.js.

PaddleOCR is an ultra-lightweight OCR model suite from Baidu that provides dozens of text detection and recognition models. The SDK uses ch_PP-OCRv2_det_infer for text detection and ch_PP-OCRv2_rec_infer for text recognition. Compared to previous PP-OCR versions, PP-OCRv2 offers over 7% better accuracy and over 220% higher speed, at a compact total size of 11.6 MB that suits both server and mobile deployment.

The @paddlejs-models/ocr SDK provides two main APIs: init() for model initialization and recognize() for text recognition. The model conversion tool paddlejsconverter converts PaddlePaddle models to browser-friendly formats. The initialization process loads both detection and recognition models in parallel to reduce warm-up time.
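The parallel-loading idea can be sketched as follows. This is a minimal illustration, not the SDK's actual implementation: `Model`, `ModelLoader`, `initModels`, and `fakeLoader` are hypothetical names, and the stand-in loaders only simulate an asynchronous fetch. The point is that both loads are started before either one is awaited.

```typescript
// Hypothetical stand-in for a loaded model; the real SDK's types are not shown here.
interface Model {
  name: string;
}

// Hypothetical loader: resolves once a model has been fetched and warmed up.
type ModelLoader = () => Promise<Model>;

// Start both loads before awaiting either, so fetch and warm-up overlap.
async function initModels(
  loadDetector: ModelLoader,
  loadRecognizer: ModelLoader,
): Promise<{ detector: Model; recognizer: Model }> {
  const [detector, recognizer] = await Promise.all([
    loadDetector(),
    loadRecognizer(),
  ]);
  return { detector, recognizer };
}

// Fake loaders that record when each load starts and finishes.
const events: string[] = [];

function fakeLoader(name: string): ModelLoader {
  return async () => {
    events.push(`start:${name}`);
    await Promise.resolve(); // simulates an asynchronous network fetch
    events.push(`end:${name}`);
    return { name };
  };
}
```

Calling `initModels(fakeLoader("det"), fakeLoader("rec"))` leaves `events` as start:det, start:rec, end:det, end:rec, so both loads are in flight together. Awaiting the loaders one after another would instead finish the detector before the recognizer even starts.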

The text detection model is based on the DB (Differentiable Binarization) algorithm and its post-processing. The text recognition model uses the CRNN (Convolutional Recurrent Neural Network) algorithm: a CNN for feature extraction, an RNN for sequence prediction, and CTC (Connectionist Temporal Classification) for transcription. The RNN is a two-layer bidirectional LSTM, chosen for improved performance.
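To make the CTC transcription step concrete, here is a minimal sketch of greedy CTC decoding. It is an illustration under stated assumptions, not the SDK's code: the function names, the `[timeSteps][numClasses]` layout of `probs`, and the choice of class 0 as the blank symbol are all assumptions. The decoder takes the most likely class at each time step, merges consecutive repeats, and then drops blanks.

```typescript
// Assumption: class 0 is the CTC blank; class k (k >= 1) maps to charset[k - 1].
const BLANK_INDEX = 0;

// Index of the largest value in a row of per-class scores.
function argmax(row: number[]): number {
  let bestIndex = 0;
  let bestValue = -Infinity;
  row.forEach((value, index) => {
    if (value > bestValue) {
      bestValue = value;
      bestIndex = index;
    }
  });
  return bestIndex;
}

// Greedy CTC decoding over a [timeSteps][numClasses] score matrix (assumed layout).
function ctcGreedyDecode(probs: number[][], charset: string[]): string {
  let text = "";
  let previous = -1;
  for (const row of probs) {
    const current = argmax(row);
    // Emit only when the class changes and is not the blank symbol.
    if (current !== previous && current !== BLANK_INDEX) {
      text += charset[current - 1] ?? "";
    }
    previous = current;
  }
  return text;
}
```

For example, with `charset = ["a", "b"]` and per-step best classes 1, 1, 0, 1, 2, 2, 0, the decoder returns "aab": the blank between the two runs of class 1 is what allows the repeated "a" to survive the merge.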

Benchmark results show that the ch_PP-OCRv2 models take 258 ms for detection and 60 ms for recognition on WebGL, with an overall F-score of 0.5224. The detection model is 3 MB and the recognition model is 8.6 MB, for a combined 11.6 MB.
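For readers unfamiliar with the metric, an F-score is commonly computed as the harmonic mean of precision and recall (the F1 definition). The sketch below assumes that definition; the source does not state which variant was used or report the underlying precision and recall, so `fScore` is an illustrative helper rather than the benchmark's own code.

```typescript
// F1-style score: harmonic mean of precision and recall, both in [0, 1].
function fScore(precision: number, recall: number): number {
  if (precision + recall === 0) {
    return 0; // avoid 0/0 when nothing was matched
  }
  return (2 * precision * recall) / (precision + recall);
}
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a high score requires both precision and recall to be high.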


