
Learning OCR for Game Text Recognition: From Data Preparation to CRNN Model Training

This article documents the author’s step‑by‑step journey of building an OCR system for recognizing Chinese characters in a card‑game UI, covering game selection, technical background, data generation, deep‑learning model training with CRNN, real‑image data collection, optimization attempts, and final performance evaluation.

NetEase LeiHuo Testing Center

Optical Character Recognition (OCR) traditionally processes scanned document images, but the author expands the definition to include any image‑based text detection and recognition. The article describes the author’s learning process focused on recognizing Chinese characters in the UI of the card game "忘川风华录" (Wangchuan).

1. Preparation – The author chose a card game with abundant UI text as a beginner‑friendly target and introduced the two‑step OCR pipeline: image text detection and image text recognition.

2. Technical Background – Two detection approaches were tested: (a) bounding‑box generation per character using Baidu OCR API, and (b) line‑level detection using EasyOCR’s CRAFT algorithm. For recognition, traditional methods (HOG, SIFT + SVM) and deep‑learning methods (CNN‑based models such as CRNN) were discussed, with the decision to focus on deep learning.

3. Goal Definition – After evaluating open‑source projects, the author selected EasyOCR as a baseline and the CRNN implementation (crnn.pytorch) for custom training, aiming to surpass EasyOCR’s accuracy on the game’s text.

4. Learning Process – Initial experiments with a small alphanumeric character set achieved ~0.91 accuracy after 90 epochs. Scaling to the game’s full character set (≈8,100 characters) required data‑generation optimizations, including pre‑generating one million synthetic images to halve training time.
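Pre-generating samples means the expensive synthesis runs once, and training epochs only read files from disk. The snippet below is a minimal sketch of that idea (not the author's actual code): it builds a manifest of file names and label strings drawn from a character set, which a separate rendering pass would then turn into images. The function name and manifest layout are illustrative assumptions.

```python
import random

def make_labels(charset, n_samples, min_len=2, max_len=10, seed=42):
    """Pre-generate text labels for synthetic OCR samples.

    Each entry pairs an image file name with the string that a later
    rendering pass (font, color, background) would draw into it, so
    the costly generation step runs only once before training.
    """
    rng = random.Random(seed)
    labels = []
    for i in range(n_samples):
        length = rng.randint(min_len, max_len)
        text = "".join(rng.choice(charset) for _ in range(length))
        labels.append((f"img_{i:07d}.png", text))
    return labels

# A tiny charset standing in for the game's ~8,100 characters.
labels = make_labels("忘川风华录0123456789", 5)
```

At the article's scale, the same loop would emit one million entries and the renderer would consume the manifest in bulk, which is what halves the per-epoch cost.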

5. Training Data Optimization – Analysis of synthetic data revealed uniform color, fixed font size, and variable length/position. To better match game conditions, the author adjusted color palettes, fixed horizontal alignment, limited font-size variation, and refined bounding-box generation. After these tweaks, accuracy on synthetic data reached 0.88, but accuracy on real screenshots remained low (~0.38).

6. Using Real Images – Approximately 500 game screenshots were captured (1 fps), segmented with EasyOCR, manually verified, and deduplicated to form a real dataset. Two padding strategies for resizing were explored (zero‑padding vs. edge‑pixel tiling), achieving training accuracies of 0.961 and 0.984 respectively, though the limited dataset constrained generalization.
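The two padding strategies can be illustrated with NumPy's `np.pad`, whose `"constant"` mode fills with zeros and whose `"edge"` mode repeats the outermost pixel column (a reasonable stand-in for the edge-pixel tiling the article describes; the function below is an illustrative sketch, not the author's code):

```python
import numpy as np

def pad_to_width(img, target_w, mode):
    """Pad a grayscale image of shape (H, W) on the right to target_w.

    mode="constant" -> zero-padding (black border)
    mode="edge"     -> repeat the right-most pixel column
    """
    h, w = img.shape
    if w >= target_w:
        return img[:, :target_w]
    return np.pad(img, ((0, 0), (0, target_w - w)), mode=mode)

img = np.array([[10, 20],
                [30, 40]], dtype=np.uint8)
zero_padded = pad_to_width(img, 4, "constant")  # new columns are 0
edge_padded = pad_to_width(img, 4, "edge")      # new columns repeat 20 / 40
```

Edge padding avoids introducing an artificial hard black border that the CNN would otherwise learn to treat as a feature, which is consistent with its higher accuracy (0.984 vs. 0.961) in the article.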

7. Revisiting Model Architecture – The author realized that random character order in synthetic data destroyed sequential patterns crucial for the CRNN’s Bi‑LSTM component. By constructing a text corpus from real game strings and generating images that preserve natural character sequences, the model’s accuracy on real data improved to 0.804, and with a larger 100k‑image corpus to about 0.816.
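The fix is to sample training strings as contiguous slices of real game text rather than shuffled bags of characters, so the character-order statistics the Bi-LSTM exploits are preserved. A minimal sketch of that sampling (names and corpus lines are illustrative, not from the article):

```python
import random

def sample_from_corpus(corpus_lines, n_samples, min_len=2, max_len=10, seed=0):
    """Draw training strings as contiguous substrings of real game text,
    preserving the natural character sequences a Bi-LSTM can model,
    instead of concatenating independently sampled characters."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        line = rng.choice(corpus_lines)
        length = min(rng.randint(min_len, max_len), len(line))
        start = rng.randint(0, len(line) - length)
        samples.append(line[start:start + length])
    return samples

corpus = ["千载风华一朝录", "忘川之上名士云集"]  # stand-in corpus lines
texts = sample_from_corpus(corpus, 3)
```

Each sampled string is then rendered to an image exactly as in the synthetic pipeline; only the text source changes.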

8. Error Analysis – Remaining errors stem from punctuation truncated at the right edge of cropped text lines, confusion between visually similar characters (e.g., "0", "O", "Q"), CTC decoding issues with repeated symbols, and insufficient training coverage of rare punctuation.
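The repeated-symbol failure follows directly from how CTC greedy decoding collapses the network's frame-wise output: adjacent repeats are merged first, then blanks are dropped, so two identical symbols in a row survive only if a blank frame separates them. A minimal sketch of that decoding rule:

```python
def ctc_greedy_decode(indices, blank=0):
    """Collapse a frame-wise argmax path into an output sequence:
    merge adjacent repeats, then drop blanks. Two identical output
    symbols in a row therefore require a blank frame between them,
    which is why repeated punctuation is a common failure mode."""
    out = []
    prev = None
    for idx in indices:
        if idx != prev and idx != blank:
            out.append(idx)
        prev = idx
    return out

# [1, 1, 0, 1] decodes to [1, 1] (the blank 0 separates the repeats),
# while [1, 1] alone collapses to a single [1].
```

If the model fails to emit a blank between two identical characters, the pair collapses to one, losing a symbol in strings like "……" or "!!".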

9. Conclusion – The final CRNN model achieved ~81.6% accuracy on the game’s text, surpassing EasyOCR’s 74.4% baseline and meeting the original learning goal. The author highlights the importance of high‑quality data, understanding deep‑learning model components, careful use of open‑source code, and balancing model complexity with practical constraints.

Tags: Data Augmentation, Deep Learning, Image Processing, OCR, CRNN, EasyOCR, Game Text Recognition
Written by

NetEase LeiHuo Testing Center

LeiHuo Testing Center provides high-quality, efficient QA services, striving to become a leading testing team in China.
