How to Build a Python OCR & Image Converter with Baidu API and Pillow

Learn step‑by‑step how to use Baidu’s OCR service to extract text from images and employ the Pillow library to convert image formats in Python, including code snippets, API parameter details, and practical tips for handling local and online files.

Python Crawling & Data Mining

1. Text Recognition (OCR)

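First, open the Baidu AI open platform, select the OCR (text recognition) service, and create an application to obtain an App ID, API Key, and Secret Key. The Python SDK (installed with pip install baidu-aip and imported as aip) is the same one used for speech synthesis and image recognition.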

Initialize the client with your credentials:

from aip import AipOcr
APP_ID = 'your App ID'
API_KEY = 'your Api Key'
SECRET_KEY = 'your Secret Key'
client = AipOcr(APP_ID, API_KEY, SECRET_KEY)
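
Hard-coding keys in a script makes them easy to leak. One option is to read them from environment variables. The following is a minimal sketch: the variable names BAIDU_APP_ID, BAIDU_API_KEY, and BAIDU_SECRET_KEY and the helper load_credentials are assumptions made for illustration, not names defined by the SDK.

```python
import os


def load_credentials(env=os.environ):
    """Read the three Baidu credentials from environment variables.

    The variable names below are hypothetical; pick whatever fits your setup.
    """
    names = ("BAIDU_APP_ID", "BAIDU_API_KEY", "BAIDU_SECRET_KEY")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    # Returned in the order App ID, API Key, Secret Key.
    return tuple(env[name] for name in names)
```

The returned tuple can be unpacked straight into the constructor, for example client = AipOcr(*load_credentials()).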

Example of recognizing a local image:

# Assuming image_path is the path to the image file
with open(image_path, 'rb') as f:
    image = f.read()
result = client.basicGeneral(image)
print(result)
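
Before uploading, it can save a round trip to check the file against the limits given under the request parameters below (4 MB of base64 data, sides between 15 px and 4096 px, jpg/png/bmp). The sketch below does this with Pillow. The helper name check_for_ocr is hypothetical, and the limits are simply the figures quoted in this article, so treat it as an approximation of the service's own validation.

```python
import base64
import io

from PIL import Image

MAX_BASE64_BYTES = 4 * 1024 * 1024   # 4 MB, as quoted in this article
MIN_SIDE, MAX_SIDE = 15, 4096        # pixel limits, as quoted in this article
ALLOWED_FORMATS = {"JPEG", "PNG", "BMP"}


def check_for_ocr(image_bytes):
    """Return a list of problems found; an empty list means the image looks acceptable."""
    problems = []
    if len(base64.b64encode(image_bytes)) > MAX_BASE64_BYTES:
        problems.append("base64 data exceeds 4 MB")
    try:
        with Image.open(io.BytesIO(image_bytes)) as im:
            width, height = im.size
            fmt = im.format
    except Exception:
        # Pillow could not identify or read the data at all.
        return problems + ["not a readable image"]
    if fmt not in ALLOWED_FORMATS:
        problems.append("unsupported format: %s" % fmt)
    if min(width, height) < MIN_SIDE:
        problems.append("shortest side is below 15 px")
    if max(width, height) > MAX_SIDE:
        problems.append("longest side is above 4096 px")
    return problems
```

Calling check_for_ocr(image) on the bytes read above and sending the request only when the list is empty avoids spending quota on images the service is likely to reject.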

Key request parameters:

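image (string, required unless url is given): base64‑encoded image data, at most 4 MB, shortest side at least 15 px and longest side at most 4096 px, formats jpg/png/bmp. The Python SDK takes the raw bytes, as in the example above, and handles the encoding itself.

url (string, required unless image is given): image URL, same size limits; ignored if image is provided.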

language_type (string, optional): language to recognize, default CHN_ENG; options include ENG, POR, FRE, etc.

detect_direction (bool, optional): whether to detect image orientation, default false.

detect_language (bool, optional): whether to detect language, default false.

probability (bool, optional): whether to return confidence for each line.
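
With the Python SDK, these parameters are passed as an optional dictionary to basicGeneral (local image bytes) or basicGeneralUrl (image URL). A minimal sketch follows. The helper names build_options and recognize are hypothetical, and sending the boolean flags as the strings "true" and "false" is an assumption based on how Baidu's SDK samples write them, so check the current documentation before relying on it.

```python
def build_options(language_type="CHN_ENG", detect_direction=False,
                  detect_language=False, probability=False):
    """Build the optional-parameter dict for a general OCR request."""
    def flag(value):
        # Assumption: boolean options are sent as the strings "true"/"false".
        return "true" if value else "false"

    return {
        "language_type": language_type,
        "detect_direction": flag(detect_direction),
        "detect_language": flag(detect_language),
        "probability": flag(probability),
    }


def recognize(client, image=None, url=None, **kwargs):
    """Call the OCR client with image bytes if given, otherwise with a URL."""
    options = build_options(**kwargs)
    if image is not None:
        # image takes precedence, mirroring "url is ignored if image is provided".
        return client.basicGeneral(image, options)
    if url is not None:
        return client.basicGeneralUrl(url, options)
    raise ValueError("either image or url is required")
```

For example, result = recognize(client, image=image, detect_direction=True, probability=True), where client is the AipOcr instance created earlier.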

Important response fields:

direction (number, optional): image orientation code.

log_id (number, required): unique log identifier.

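words_result_num (number, required): number of recognition results, i.e. the number of entries in words_result.

words_result (array, required): array of recognized text objects, each with a words field holding the recognized string.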

probability (object, optional): confidence information per line.
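
In practice, the recognized text is read from the words field of each words_result entry. The sketch below pulls the lines out of a response dictionary. extract_lines is a hypothetical helper, the sample dictionary is made up for illustration, and the error_code/error_msg check assumes the error shape that Baidu's API returns for failed calls.

```python
def extract_lines(result):
    """Return the recognized lines of text from an OCR response dict."""
    if "error_code" in result:
        # Assumption: failed calls carry error_code and error_msg instead of results.
        raise RuntimeError(
            "OCR failed: %s %s" % (result["error_code"], result.get("error_msg", ""))
        )
    return [item["words"] for item in result.get("words_result", [])]


# Made-up response, shaped like the fields described above.
sample = {
    "log_id": 1234567890,
    "words_result_num": 2,
    "words_result": [{"words": "Hello"}, {"words": "world"}],
}
print("\n".join(extract_lines(sample)))
```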

2. Image Format Converter

Pillow converts an image file to another format properly by re-encoding its contents, instead of merely changing the file extension.

Installation: pip install pillow

Import the library: from PIL import Image

Utility to check whether an image file can be read or is corrupted:

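import os

from PIL import Image


def is_valid_image(path):
    """Return True if Pillow can open and verify the file, False otherwise."""
    try:
        with Image.open(path) as im:
            im.verify()  # raises if the file is truncated or corrupted
        return True
    except Exception:
        return False

Function to convert a file to PNG:

def translate(path):
    """Re-encode the image at path as a PNG file next to the original."""
    if not is_valid_image(path):
        return False
    try:
        base, _ = os.path.splitext(path)  # also handles names without a dot
        output_path = base + ".png"
        # verify() leaves the image object unusable, so open the file again
        with Image.open(path) as im:
            im.save(output_path)
        return True
    except Exception:
        return False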
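
The same idea extends to other target formats. The sketch below is one possible generalization, not the article's own code: convert_image and its target_ext parameter are hypothetical, and the RGB conversion is there because Pillow's JPEG writer does not accept images with an alpha channel or a palette.

```python
import os

from PIL import Image


def convert_image(path, target_ext=".png"):
    """Re-encode the image at path into the format implied by target_ext.

    Returns the output path. Errors from Pillow propagate to the caller.
    """
    base, _ = os.path.splitext(path)
    output_path = base + target_ext
    with Image.open(path) as im:
        if target_ext.lower() in (".jpg", ".jpeg") and im.mode not in ("RGB", "L"):
            # JPEG cannot store alpha or palette data, so flatten to RGB first.
            im = im.convert("RGB")
        im.save(output_path)  # Pillow picks the encoder from the file extension
    return output_path
```

For example, convert_image("photo.png", ".jpg") would write photo.jpg next to the original, where photo.png stands for any readable image file.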

Running the converter produces a usable PNG file alongside the original.

This project demonstrates practical Python techniques for OCR and image conversion, useful for automating file uploads and processing.
