Batch Image Translation Demo Using Youdao OCR API with Python

This article demonstrates how to build a Python desktop application that batch‑processes cosmetic product images, sends them to Youdao's OCR translation service, and displays the translated text, covering API preparation, request parameters, signature generation, and full source code.

Python Programming Learning Circle

Dec 13, 2024

The author describes a personal need to translate cosmetic product labels for a girlfriend and decides to develop a batch image translation demo using Youdao's AI‑powered OCR translation service.

Effect Demonstration

Several screenshots show successful translation results for different product images, including mixed Korean/English text.

Preparing API Call – Generating App ID and Secret

Instructions explain how to create an instance and application on the Youdao platform to obtain the required appKey and appSecret.

API Interface Introduction

The OCR translation endpoint is https://openapi.youdao.com/ocrtransapi. It accepts POST requests with form-encoded data and returns a JSON response.
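
A minimal sketch of that call shape, using only the standard library. It builds the POST request without sending it. The name `build_ocr_request` and the sample field values are illustrative and not part of the original project, which sends the request through its own `do_request` helper.

```python
from urllib.parse import urlencode
from urllib.request import Request

YOUDAO_OCR_URL = "https://openapi.youdao.com/ocrtransapi"


def build_ocr_request(fields):
    """Return a POST request carrying `fields` as URL-encoded form data."""
    body = urlencode(fields).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(YOUDAO_OCR_URL, data=body, headers=headers, method="POST")


# Illustrative fields only; a real call also needs appKey, salt, sign, and q.
request = build_ocr_request({"from": "auto", "to": "auto", "type": "1"})
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it; the response body can then be decoded with `json.loads`.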

API Call Parameters

| Field Name | Type | Meaning | Required | Notes |
|---|---|---|---|---|
| type | text | File upload type | True | Set to 1 for Base64 |
| from | text | Source language | True | Can be "auto" |
| to | text | Target language | True | Can be "auto" |
| appKey | text | Application ID | True | Find in application management |
| salt | text | UUID | True | e.g., 1995882C5064805BC30A39829B779D7B |
| sign | text | Signature | True | MD5(appKey+q+salt+appSecret) |
| ext | text | Audio format for result | False | mp3 |
| q | text | Image to recognize | True | Base64 of image when type=1 |
| docType | text | Response type | False | json |
| render | text | Return rendered image | False | 0 (no) or 1 (yes) |
| nullIsError | text | Return error if no text detected | False | "false" or "true" |

Signature generation steps: concatenate appKey, q, salt, and appSecret in that order, then compute the MD5 hash of the result to obtain sign.
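
The signature steps can be sketched as follows, together with the required form fields for a Base64 upload. The function names `make_sign` and `build_form_fields` and the demo key and secret are hypothetical, and the sketch uses `uuid.uuid4` for the salt, which is an assumption; any unique string should serve.

```python
import base64
import hashlib
import uuid


def make_sign(app_key, q, salt, app_secret):
    """MD5 hex digest of appKey + q + salt + appSecret, in that order."""
    raw = app_key + q + salt + app_secret
    return hashlib.md5(raw.encode("utf-8")).hexdigest()


def build_form_fields(image_bytes, app_key, app_secret, src="auto", dst="auto"):
    """Assemble the required form fields for a Base64 upload (type=1)."""
    q = base64.b64encode(image_bytes).decode("utf-8")
    salt = uuid.uuid4().hex.upper()  # 32 hex characters, like the example salt
    return {
        "type": "1",
        "from": src,
        "to": dst,
        "appKey": app_key,
        "salt": salt,
        "sign": make_sign(app_key, q, salt, app_secret),
        "q": q,
    }


# Placeholder credentials and image bytes, for illustration only.
fields = build_form_fields(b"fake image bytes", "demo-key", "demo-secret")
print(sorted(fields))
```

Because q is the full Base64 string of the image, the signature covers the image content as well as the credentials, so it has to be recomputed for every file in a batch.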

Development Process

1. API Interface Overview

The core of the project is calling the Youdao OCR translation API.

2. Detailed Development

The demo consists of three Python files: maindow.py (Tkinter UI), transclass.py (image handling), and pictranslate.py (API wrapper).

UI Code (maindow.py)

import tkinter as tk
from tkinter import filedialog, messagebox

# The callbacks get_files, set_result_path, and translate_files must be
# defined before this code runs.
root = tk.Tk()
root.title("netease youdao translation test")
frm = tk.Frame(root)
frm.grid(padx='50', pady='50')
# Button label: "Select images to translate"
btn_get_file = tk.Button(frm, text='选择待翻译图片', command=get_files)
btn_get_file.grid(row=0, column=0, ipadx='3', ipady='3', padx='10', pady='20')
text1 = tk.Text(frm, width='40', height='10')
text1.grid(row=0, column=1)
# Button label: "Select result folder"
btn_get_result_path = tk.Button(frm, text='选择翻译结果路径', command=set_result_path)
btn_get_result_path.grid(row=1, column=0)
text2 = tk.Text(frm, width='40', height='2')
text2.grid(row=1, column=1)
# Button label: "Translate"
btn_sure = tk.Button(frm, text="翻译", command=translate_files)
btn_sure.grid(row=2, column=1)
root.mainloop()

File selection, result path selection, and translation trigger are implemented with Tkinter dialogs.

# `translate` is the shared Translate instance (see transclass.py) that the
# callbacks below read from and write to.

def get_files():
    files = filedialog.askopenfilenames(filetypes=[('JPEG images', '.jpg')])
    translate.file_paths = files
    if files:
        for file in files:
            text1.insert(tk.END, file + '\n')
            text1.update()
    else:
        print('你没有选择任何文件')  # "You did not select any files"

def set_result_path():
    result_path = filedialog.askdirectory()
    translate.result_root_path = result_path
    text2.insert(tk.END, result_path)

def translate_files():
    if translate.file_paths:
        translate.translate_files()
        tk.messagebox.showinfo("提示", "搞定")  # "Notice", "Done"
    else:
        tk.messagebox.showinfo("提示", "无文件")  # "Notice", "No files"

Batch Image Processing (transclass.py)

import os

from pictranslate import connect


class Translate:
    def __init__(self, name, file_paths, result_root_path, trans_type):
        self.name = name
        self.file_paths = file_paths
        self.result_root_path = result_root_path
        self.trans_type = trans_type

    def translate_files(self):
        for file_path in self.file_paths:
            file_name = os.path.basename(file_path)
            print('===========' + file_path + '===========')
            trans_result = self.translate_use_netease(file_path)
            # splitext keeps dots inside the file name, unlike split('.')[0]
            result_name = 'result_' + os.path.splitext(file_name)[0] + '.txt'
            result_file = os.path.join(self.result_root_path, result_name)
            with open(result_file, 'w', encoding='utf-8') as f:
                f.write(trans_result)

    def translate_use_netease(self, file_content):
        # file_content is the image path; both languages are auto-detected
        result = connect(file_content, 'auto', 'auto')
        return result
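
The loop above stops at the first image that raises an error. One way to keep the rest of the batch going is sketched here. It is not taken from the original project: `translate_batch` and `fake_translate` are hypothetical names, and `fake_translate` stands in for the real API call so the example runs offline.

```python
import os
import tempfile


def translate_batch(file_paths, result_dir, translate_one):
    """Translate each image; return (written_files, failures)."""
    written, failures = [], []
    for path in file_paths:
        stem = os.path.splitext(os.path.basename(path))[0]
        out_path = os.path.join(result_dir, "result_" + stem + ".txt")
        try:
            text = translate_one(path)
        except Exception as exc:  # record the failure and move on
            failures.append((path, str(exc)))
            continue
        with open(out_path, "w", encoding="utf-8") as out:
            out.write(text)
        written.append(out_path)
    return written, failures


def fake_translate(path):
    # Stand-in for the real API call; fails on one file to show the handling.
    if path.endswith("bad.jpg"):
        raise ValueError("no text detected")
    return "translated: " + os.path.basename(path)


with tempfile.TemporaryDirectory() as tmp:
    done, failed = translate_batch(["a.b.jpg", "bad.jpg"], tmp, fake_translate)
    print([os.path.basename(p) for p in done], failed)
```

Returning the failures instead of printing them lets the UI decide how to report them, for example in the final message box.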

Calling Youdao API (pictranslate.py)

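import base64
import json
import uuid

# APP_KEY, APP_SECRET, encrypt() (the MD5 signature helper), and do_request()
# (the POST helper) are defined elsewhere in pictranslate.py and not shown here.

def connect(file_content, fromLan='auto', toLan='auto'):
    # file_content is the path of the image to translate
    with open(file_content, 'rb') as f:
        q = base64.b64encode(f.read()).decode('utf-8')
    data = {}
    data['from'] = fromLan
    data['to'] = toLan
    data['type'] = '1'
    data['q'] = q
    salt = str(uuid.uuid1())
    signStr = APP_KEY + q + salt + APP_SECRET
    sign = encrypt(signStr)
    data['appKey'] = APP_KEY
    data['salt'] = salt
    data['sign'] = sign
    response = do_request(data)
    result = json.loads(str(response.content, encoding="utf-8"))
    # Each region holds the translated text for one block of the image
    translateResults = result['resRegions']
    pictransresult = ""
    for i in translateResults:
        pictransresult = pictransresult + i['tranContent'] + "\n"
    return pictransresult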

Result Summary

The JSON response contains fields such as orientation, lanFrom, lanTo, and resRegions with detailed translation content, bounding boxes, and layout information.

| Field | Description |
|---|---|
| orientation | Image orientation |
| lanFrom | Detected source language |
| textAngle | Image tilt angle |
| errorCode | Error code |
| lanTo | Target language |
| resRegions | Translated content per region |
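
A minimal sketch of reading these fields, assuming an errorCode of "0" signals success (the article does not state the success value) and that each entry in resRegions carries a tranContent string, as in the connect function. The helper name `extract_translations` and the sample payload are hypothetical.

```python
import json


def extract_translations(response_text):
    """Return the translated text of each region, one line per region."""
    result = json.loads(response_text)
    code = str(result.get("errorCode", ""))
    if code != "0":  # assumption: "0" means the call succeeded
        raise RuntimeError("Youdao API returned errorCode " + code)
    regions = result.get("resRegions", [])
    return "\n".join(region["tranContent"] for region in regions)


# Made-up payload for illustration; real responses carry more fields.
sample = json.dumps({
    "errorCode": "0",
    "lanFrom": "ko",
    "lanTo": "en",
    "resRegions": [
        {"tranContent": "Moisturizing cream"},
        {"tranContent": "Apply evenly"},
    ],
})
print(extract_translations(sample))
```

Checking errorCode before touching resRegions avoids a KeyError when the service rejects a request, for example because of a bad signature.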

The author concludes that leveraging an open AI platform makes image recognition and natural language processing straightforward, allowing more time for personal enjoyment.

Project repository: https://github.com/LemonQH/BatchPicTranslate
