How to Build a Python OCR & Image Converter with Baidu API and Pillow
Learn step‑by‑step how to use Baidu’s OCR service to extract text from images and employ the Pillow library to convert image formats in Python, including code snippets, API parameter details, and practical tips for handling local and online files.
1. Text Recognition (OCR)
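First, open the Baidu AI platform, select the OCR service, and create an application to obtain an App ID, API Key, and Secret Key. The same Python SDK (the aip package, installed with pip install baidu-aip) also covers speech synthesis and image recognition.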
Initialize the client with your credentials:
from aip import AipOcr
APP_ID = 'your App ID'
API_KEY = 'your Api Key'
SECRET_KEY = 'your Secret Key'
client = AipOcr(APP_ID, API_KEY, SECRET_KEY)
Example of recognizing a local image:
# Assuming image_path is the path to the image file
with open(image_path, 'rb') as f:
    image = f.read()
result = client.basicGeneral(image)
print(result)
Key request parameters:
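image (string, required unless url is given): base64‑encoded image data, max 4 MB, shortest side at least 15 px and longest side at most 4096 px, formats jpg/png/bmp. The Python SDK takes the raw file bytes, as in the example above, and handles the encoding.
url (string, required unless image is given): image URL, same size and format limits; ignored if image is also provided.
language_type (string, optional): language to recognize, default CHN_ENG; options include ENG, POR, FRE, etc.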
detect_direction (bool, optional): whether to detect image orientation, default false.
detect_language (bool, optional): whether to detect language, default false.
probability (bool, optional): whether to return confidence for each line.
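The optional parameters above can be passed to basicGeneral as a dictionary in its second argument, and the SDK offers basicGeneralUrl for online images. The following is a minimal sketch rather than the article's own code: build_options and recognize are hypothetical helper names, client is assumed to be the AipOcr instance created earlier, and the flags are sent as the strings "true"/"false", following the style of Baidu's SDK examples.

```python
def build_options(language_type="CHN_ENG", detect_direction=False,
                  detect_language=False, probability=False):
    """Build the optional-parameter dict for a general OCR request."""
    def flag(value):
        # Boolean switches are sent as the strings "true" / "false".
        return "true" if value else "false"

    return {
        "language_type": language_type,
        "detect_direction": flag(detect_direction),
        "detect_language": flag(detect_language),
        "probability": flag(probability),
    }


def recognize(client, image_path=None, url=None, **kwargs):
    """Run general OCR on a local file or, if no path is given, on a URL.

    `client` is assumed to be an AipOcr instance (hypothetical helper).
    """
    options = build_options(**kwargs)
    if image_path is not None:
        with open(image_path, "rb") as f:
            # Raw bytes go in; the SDK takes care of base64 encoding.
            return client.basicGeneral(f.read(), options)
    if url is not None:
        return client.basicGeneralUrl(url, options)
    raise ValueError("provide either image_path or url")
```

A call such as recognize(client, image_path="scan.jpg", probability=True) would then request per-line confidence for a local file, while recognize(client, url=...) covers the online case.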
Important response fields:
direction (number, optional): image orientation code.
log_id (number, required): unique log identifier.
words_result_num (number, required): number of recognition results, i.e. the number of entries in words_result.
words_result (array, required): array of recognized text objects, each holding one line of text in its words field.
probability (object, optional): confidence information per line.
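In practice the response is a plain dictionary, so pulling out the text is a short loop over words_result. The sketch below assumes that a failed request carries error_code and error_msg keys instead of words_result; extract_lines is a hypothetical helper name and the sample response is made up for illustration.

```python
def extract_lines(result):
    """Return the recognized text lines from a general OCR response dict."""
    # Assumption: failed requests contain error_code / error_msg.
    if "error_code" in result:
        raise RuntimeError(
            f"OCR failed: {result['error_code']} {result.get('error_msg', '')}"
        )
    return [item["words"] for item in result.get("words_result", [])]


# Hypothetical response, shaped like the fields listed above.
sample = {
    "log_id": 1234567890,
    "words_result_num": 2,
    "words_result": [{"words": "Hello"}, {"words": "World"}],
}
print("\n".join(extract_lines(sample)))
```

Raising on error_code keeps quota or authentication failures from being mistaken for an image that simply contains no text.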
2. Image Format Converter
Use Pillow to re-encode image files into the target format instead of merely changing their file extensions, which leaves the underlying data unchanged.
Installation:
pip install pillow
Import the library:
from PIL import Image
Utility to check if an image file is corrupted:
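def isbad(path):
    # Despite its name, this returns True when the file opens and verifies cleanly
    try:
        with Image.open(path) as im:
            im.verify()
        return True
    except Exception:
        return False
Function to convert a file to PNG: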
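def translate(path):
    # isbad() returns True for a readable image, so stop early otherwise
    if not isbad(path):
        return False
    try:
        base, _ = path.rsplit(".", 1)
        output_path = base + ".png"
        # verify() leaves the image object unusable, so reopen the file before saving
        with Image.open(path) as im:
            im.save(output_path)
        return True
    except Exception:
        return False
Running the converter produces a usable PNG file.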
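The same idea extends to other target formats. The following is a sketch under stated assumptions, not the article's code: output_path_for and convert_image are hypothetical names, pathlib replaces the manual string split, and the RGB conversion is there because JPEG cannot store an alpha channel or a palette.

```python
from pathlib import Path


def output_path_for(path, fmt="png"):
    """Return the destination path: same folder and stem, new extension."""
    return Path(path).with_suffix("." + fmt.lower())


def convert_image(path, fmt="png"):
    """Re-encode the image at `path` into `fmt` and return the new path."""
    # Imported here so output_path_for() stays usable without Pillow installed.
    from PIL import Image

    target = output_path_for(path, fmt)
    with Image.open(path) as im:
        # JPEG has no alpha channel or palette; convert such images to RGB first.
        if fmt.lower() in ("jpg", "jpeg") and im.mode not in ("RGB", "L"):
            im = im.convert("RGB")
        # Pillow infers the output format from the file extension.
        im.save(target)
    return target
```

For example, convert_image("photo.bmp") would write photo.png next to the original, and convert_image("logo.png", "jpg") would write logo.jpg after dropping transparency.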
This project demonstrates practical Python techniques for OCR and image conversion, useful for automating file uploads and processing.
Python Crawling & Data Mining