Improving Passport OCR: Process, Preprocessing, and Prior Knowledge Corrections
This article outlines a comprehensive OCR workflow for passport recognition, covering image acquisition, preprocessing techniques, engine integration, and prior‑knowledge corrections to enhance accuracy and user experience, while sharing practical insights and performance results.
The author describes two OCR modes—traditional pattern recognition and a machine‑learning neural‑network approach—and focuses on the former for passport recognition, detailing three iterative improvements that led to a satisfactory result.
Recognition Process consists of four stages: image acquisition, image preprocessing, engine recognition, and prior‑knowledge correction, all aimed at improving user experience.
Image Acquisition offers two methods: single‑shot capture, which is fast but lacks quality checks, and real‑time video streaming, which allows continuous verification and reduces wasted effort despite longer processing time.
Cropping Methods include manual target‑area alignment (high accuracy but slow), fully automatic region detection (simple but not 100% reliable), and a hybrid approach that combines automatic detection with manual adjustment for difficult edges, yielding the best overall experience.
User Experience emphasizes minimizing user effort while maximizing recognition accuracy; improving the algorithm can reduce required user actions.
Image Preprocessing involves passport‑type classification, interest‑region detection, and several image‑processing steps such as grayscale conversion, binarization, erosion, adaptive thresholding, noise removal, rotation correction, and trimming of white borders to obtain an optimal image.
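A few of the steps above (grayscale conversion, binarization, and white-border trimming) can be sketched in plain Python. This is a minimal illustration, not the article's implementation: the function names are hypothetical, the image is assumed to be a list of rows of (r, g, b) tuples, and Otsu's method is swapped in as one common way to choose a global threshold, since the article does not say how its thresholds are chosen. Erosion, adaptive thresholding, and rotation correction are left out.

```python
from typing import List, Tuple

Gray = List[List[int]]  # rows of 0-255 intensities


def to_gray(rgb: List[List[Tuple[int, int, int]]]) -> Gray:
    """Convert rows of (r, g, b) tuples to luminance with BT.601 weights."""
    return [[(299 * r + 587 * g + 114 * b) // 1000 for (r, g, b) in row]
            for row in rgb]


def otsu_threshold(gray: Gray) -> int:
    """Return the threshold that maximises between-class variance (Otsu)."""
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_bg = 0      # number of pixels at or below the candidate threshold
    sum_bg = 0    # sum of their intensities
    for t in range(256):
        w_bg += hist[t]
        if w_bg == 0:
            continue
        w_fg = total - w_bg
        if w_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(gray: Gray, threshold: int) -> Gray:
    """Map pixels at or below the threshold to 0 (ink), the rest to 255."""
    return [[0 if v <= threshold else 255 for v in row] for row in gray]


def trim_white_border(binary: Gray, margin: int = 0) -> Gray:
    """Crop to the bounding box of ink pixels, keeping an optional margin."""
    rows = [y for y, row in enumerate(binary) if 0 in row]
    if not rows:
        return binary  # nothing but white: leave the image unchanged
    width = len(binary[0])
    cols = [x for x in range(width) if any(row[x] == 0 for row in binary)]
    top = max(rows[0] - margin, 0)
    bottom = min(rows[-1] + margin, len(binary) - 1)
    left = max(cols[0] - margin, 0)
    right = min(cols[-1] + margin, width - 1)
    return [row[left:right + 1] for row in binary[top:bottom + 1]]
```

In practice an image library would do this work on arrays. The nested-list version only makes the order of operations explicit: gray first, then a threshold, then cropping to the remaining ink.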
Specific fields like MRZ1, MRZ2, region, and Chinese name are extracted using tailored strategies: multiple threshold levels for binarization, adaptive thresholds with morphological operations, and region‑based segmentation to handle both single‑line and multi‑line layouts.
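The multi-level threshold idea for MRZ lines could look roughly like the sketch below. The assumptions are loud ones: `recognize` is a hypothetical callback standing in for whatever OCR engine is used, the threshold values are illustrative rather than taken from the article, and "first plausible result wins" is only one possible selection rule. The one fixed fact relied on is that a passport (TD3) MRZ line has 44 characters drawn from A-Z, 0-9, and the filler `<`.

```python
import re
from typing import Callable, List, Optional, Sequence, Tuple

Gray = List[List[int]]  # rows of 0-255 intensities

# A passport (TD3) MRZ line: exactly 44 characters from A-Z, 0-9 and '<'.
MRZ_LINE = re.compile(r"[A-Z0-9<]{44}")


def binarize(gray: Gray, threshold: int) -> Gray:
    """Map pixels at or below the threshold to 0 (ink), the rest to 255."""
    return [[0 if v <= threshold else 255 for v in row] for row in gray]


def recognize_mrz_line(
    gray: Gray,
    recognize: Callable[[Gray], str],  # hypothetical OCR engine wrapper
    thresholds: Sequence[int] = (90, 110, 130, 150, 170),  # illustrative
) -> Optional[Tuple[str, int]]:
    """Binarize at each threshold in turn and return the first OCR result
    that has the shape of an MRZ line, with the threshold that produced it.
    Returns None when no threshold yields a plausible line."""
    for t in thresholds:
        text = recognize(binarize(gray, t)).strip().replace(" ", "")
        if MRZ_LINE.fullmatch(text):
            return text, t
    return None
```

Stopping at the first plausible line keeps the common case fast. A slower variant could run every threshold and keep the result that also passes the MRZ check digits.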
Engine Recognition leverages third‑party OCR engines with custom white‑list/black‑list configurations for different fields (e.g., Chinese names, locations, MRZ lines), using Chinese and English engines as appropriate.
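One way to organize per-field white-lists and black-lists is a small table of field profiles. The sketch below is engine-agnostic and hypothetical throughout: the `FieldProfile` type, the field names, the character sets, and the language codes `eng` and `chi_sim` are assumptions, since the article does not name its engines or list its exact settings. Many engines accept such lists directly as configuration; the `apply_profile` filter only shows the same rule applied to text after recognition.

```python
import string
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class FieldProfile:
    language: str                    # which engine or language model to use
    whitelist: Optional[str] = None  # if set, only these characters survive
    blacklist: str = ""              # these characters are always dropped


# Hypothetical profiles; the real character sets depend on the engine used.
PROFILES: Dict[str, FieldProfile] = {
    "mrz": FieldProfile(
        "eng", whitelist=string.ascii_uppercase + string.digits + "<"),
    "english_name": FieldProfile(
        "eng", whitelist=string.ascii_uppercase + " "),
    "chinese_name": FieldProfile(
        "chi_sim",
        blacklist=string.ascii_letters + string.digits + string.punctuation),
}


def apply_profile(text: str, profile: FieldProfile) -> str:
    """Drop every character the field's profile does not allow."""
    kept = []
    for ch in text:
        if profile.whitelist is not None and ch not in profile.whitelist:
            continue
        if ch in profile.blacklist:
            continue
        kept.append(ch)
    return "".join(kept)
```

For example, `apply_profile("P<CHN zhang|", PROFILES["mrz"])` keeps only `P<CHN`, because lowercase letters, spaces, and the stray `|` are outside the MRZ alphabet.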
Prior Knowledge Correction applies validation rules, cross‑checking between MRZ and name fields, fuzzy matching for English names, similarity scoring for locations, and multi‑pass recognition for Chinese names to resolve ambiguities and boost correctness.
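Three of these corrections can be sketched with the Python standard library. The check-digit rule is the standard one for machine-readable passports (weights 7, 3, 1 repeating; letters count as 10 to 35; the filler `<` counts as 0). Everything else is an assumption for illustration: the helper names, the 0.85 and 0.8 similarity cut-offs, and the use of `difflib` as the similarity measure are not taken from the article.

```python
import difflib
from typing import Iterable, Optional, Tuple


def mrz_check_digit(field: str) -> int:
    """Compute the MRZ check digit of a field (weights 7, 3, 1 repeating)."""
    total = 0
    for i, ch in enumerate(field):
        if ch in "0123456789":
            value = int(ch)
        elif "A" <= ch <= "Z":
            value = ord(ch) - ord("A") + 10
        elif ch == "<":
            value = 0
        else:
            raise ValueError(f"invalid MRZ character: {ch!r}")
        total += value * (7, 3, 1)[i % 3]
    return total % 10


def mrz_name(line1: str) -> Tuple[str, str]:
    """Split the name field of MRZ line 1 into (surname, given names)."""
    surname, _, given = line1[5:44].partition("<<")
    return surname.replace("<", " ").strip(), given.replace("<", " ").strip()


def _letters(text: str) -> str:
    """Uppercase the text and keep only A-Z for a fair comparison."""
    return "".join(ch for ch in text.upper() if "A" <= ch <= "Z")


def names_agree(printed: str, from_mrz: str, min_ratio: float = 0.85) -> bool:
    """Fuzzy cross-check between the printed name and the MRZ name."""
    ratio = difflib.SequenceMatcher(
        None, _letters(printed), _letters(from_mrz)).ratio()
    return ratio >= min_ratio


def snap_location(text: str, known: Iterable[str],
                  cutoff: float = 0.8) -> Optional[str]:
    """Return the most similar known place name, or None if none is close."""
    matches = difflib.get_close_matches(
        text.upper(), list(known), n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

A field whose computed check digit disagrees with the printed one is a candidate for re-recognition. With these cut-offs, a misread such as `SHAANX1` snaps to `SHAANXI` rather than `SHANXI`, because it is closer to the former in similarity.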
Final Recognition Results show that under good lighting and with clear images, the system initializes in about 1.7 s and completes each subsequent recognition in 0.7–0.9 s. It handles most passport conditions, except severely damaged documents and rare characters.
The article concludes with additional practical tricks and a short linguistic note on telling apart the province names “SHAANXI” and “SHANXI”.
Tongcheng Travel Technology Center