Unlock Text from Images: A Hands‑On Guide to EasyOCR in Python
This article explains what OCR is, introduces the EasyOCR Python library, shows how to install it, walks through step‑by‑step usage with code examples, and summarizes the underlying deep‑learning techniques powering the library.
What is OCR?
OCR (Optical Character Recognition) is a technology that analyzes image files of text and extracts the characters and layout information, turning scanned documents, license plates, ID cards, bank cards, receipts, and many other visual texts into editable digital text.
It is one of the most common and useful AI applications in everyday life.
About EasyOCR
EasyOCR is an open‑source OCR library for Python with over 9,700 stars on GitHub. It supports more than 80 languages, including English, Simplified and Traditional Chinese, Arabic, and Japanese, and more are being added.
https://github.com/JaidedAI/EasyOCR
Installing EasyOCR
Installation is straightforward using pip or conda. With pip:
pip install easyocr
If you use the default PyPI source, the download may be slow; switching to the Tsinghua mirror can finish it in seconds:
pip install easyocr -i https://pypi.tuna.tsinghua.edu.cn/simple
How to Use EasyOCR
Using EasyOCR involves three simple steps:
Create a reader object.
Read and recognize an image.
Export the extracted text.
Example with a road‑sign image:
# Import EasyOCR
import easyocr
# Create reader object for Chinese (simplified) and English
reader = easyocr.Reader(['ch_sim', 'en'])
# Read the image
result = reader.readtext('test.jpg')
# Show result
print(result)
The output shows the three road names on the sign together with their pinyin.
The result is a list of tuples, each containing the bounding‑box coordinates, the recognized text, and the confidence score.
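As a minimal sketch of working with that structure, the snippet below unpacks each tuple and keeps only the detections above a confidence threshold. The `sample_result` values, the helper name `confident_texts`, and the threshold of 0.5 are made up for illustration; they are not output from a real image or part of EasyOCR.

```python
# Hypothetical data shaped like the list returned by reader.readtext():
# each item is (bounding box corners, text, confidence). Values are illustrative only.
sample_result = [
    ([[10, 10], [120, 10], [120, 40], [10, 40]], "Main Street", 0.93),
    ([[10, 50], [120, 50], [120, 80], [10, 80]], "Ma1n Str", 0.21),
]


def confident_texts(result, min_confidence=0.5):
    """Return the text of every detection whose confidence meets the threshold."""
    texts = []
    for bbox, text, confidence in result:
        if confidence >= min_confidence:
            texts.append(text)
    return texts


print(confident_texts(sample_result))
```

With real output you would pass the `result` list from `reader.readtext(...)` in place of `sample_result` and tune the threshold to your images.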
About languages: The parameter ['ch_sim','en'] tells EasyOCR to look for Simplified Chinese and English, because the road sign contains both. Multiple languages can be passed together, but not all language pairs are compatible.
About image input: Besides a file path like 'test.jpg', you can pass an OpenCV image (numpy array), raw image bytes, or an image URL.
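The sketch below tries two of those input forms on the same file: raw bytes, and a NumPy array loaded with OpenCV. The helper names `read_image_bytes` and `recognize_from_memory` are hypothetical, and the example assumes that `easyocr` and `opencv-python` are installed.

```python
from pathlib import Path


def read_image_bytes(path):
    """Read an image file into memory as raw bytes."""
    return Path(path).read_bytes()


def recognize_from_memory(path, languages=("ch_sim", "en")):
    """Run OCR on one image twice: passed as raw bytes and as a NumPy array."""
    # Imported here so read_image_bytes() works without these packages installed.
    import cv2
    import easyocr

    reader = easyocr.Reader(list(languages))
    from_bytes = reader.readtext(read_image_bytes(path))
    # cv2.imread returns the decoded image as a numpy array.
    from_array = reader.readtext(cv2.imread(str(path)))
    return from_bytes, from_array
```

Calling `recognize_from_memory('test.jpg')` returns two result lists in the same tuple format as before, one for each input form.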
Another example with a news‑article image containing a lot of text:
# Import EasyOCR
import easyocr
reader = easyocr.Reader(['ch_sim', 'en'])
result = reader.readtext('test1.jpg')
print(result)
To extract only the text strings:
# Each item is (bounding box, text, confidence); index 1 is the text
for item in result:
    word = item[1]
    print(word)
Conclusion
The library is built on research papers: the detection part uses the CRAFT algorithm, and the recognition model is a CRNN consisting of feature extraction, an LSTM sequence labeler, and a CTC decoder, all implemented with PyTorch.
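To make the final decoding stage concrete, here is a simplified sketch of greedy CTC decoding: take the most likely class at each time step, collapse consecutive repeats, then drop the blank symbol. It illustrates the general technique in plain Python and is not EasyOCR's own code; the character set, the choice of class 0 as the blank, and the score values are all made up.

```python
BLANK = 0  # assumption: class 0 is the CTC blank symbol


def ctc_greedy_decode(scores, charset):
    """Decode per-time-step class scores into text with greedy CTC decoding.

    scores: list of rows, one per time step, each a list of class scores.
    charset: string where charset[i - 1] is the character for class i.
    """
    # Step 1: best class index at every time step.
    best_path = [max(range(len(row)), key=row.__getitem__) for row in scores]

    decoded = []
    previous = None
    for index in best_path:
        # Step 2: collapse consecutive repeats; step 3: skip blanks.
        if index != previous and index != BLANK:
            decoded.append(charset[index - 1])
        previous = index
    return "".join(decoded)


# Hypothetical scores over classes [blank, 'a', 'b'] for six time steps.
example_scores = [
    [0.1, 0.8, 0.1],    # a
    [0.1, 0.7, 0.2],    # a again (repeat, collapsed)
    [0.9, 0.05, 0.05],  # blank
    [0.1, 0.8, 0.1],    # a (kept: a blank separates it from the previous a)
    [0.2, 0.1, 0.7],    # b
    [0.6, 0.2, 0.2],    # blank
]
print(ctc_greedy_decode(example_scores, "ab"))  # prints "aab"
```

The blank symbol is what lets the decoder tell a genuinely doubled letter apart from one letter that spans several time steps.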
The author continues to improve EasyOCR, planning to add support for more languages (aiming to cover 80‑90% of the world’s population) and to introduce handwritten text recognition while speeding up processing.
