Batch Image Translation Demo Using Youdao OCR API with Python

This article presents a step‑by‑step Python demo that uses Youdao's OCR translation API to batch‑process cosmetic product images, covering API key setup, request parameters, signature generation, GUI implementation with Tkinter, and code snippets for file selection, result storage, and API invocation.

Python Programming Learning Circle

Jun 25, 2021

The author describes a personal need to translate cosmetic product labels for a girlfriend and decides to build a batch image translation tool using Youdao's OCR translation service instead of training a model from scratch.

The effect demonstration shows several screenshots of translated results for different product images, indicating that keywords and mixed-language text are recognized correctly.

API preparation explains that an application ID and secret must be created on the Youdao platform, then bound to an instance to obtain the credentials needed for API calls.
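One way to keep these credentials out of the source files is to read them from environment variables. This is a minimal sketch; the variable names YOUDAO_APP_KEY and YOUDAO_APP_SECRET are assumptions made for illustration and are not defined by Youdao or by the original article.

```python
import os


def load_credentials(env=os.environ):
    """Return (app_key, app_secret) read from a mapping of environment variables.

    YOUDAO_APP_KEY and YOUDAO_APP_SECRET are hypothetical names chosen
    for this sketch.
    """
    app_key = env.get("YOUDAO_APP_KEY", "")
    app_secret = env.get("YOUDAO_APP_SECRET", "")
    if not app_key or not app_secret:
        raise RuntimeError("Set YOUDAO_APP_KEY and YOUDAO_APP_SECRET first")
    return app_key, app_secret
```

Failing early when either value is missing avoids sending a request that the service would reject anyway.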

API interface description lists the endpoint (https://openapi.youdao.com/ocrtransapi), request method (POST), request format (form), and response format (JSON). A table of request parameters follows, including type, from, to, appKey, salt, sign, and q, with notes on which are required and what their defaults are.
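The request described above can be sketched with the standard library alone. The function names build_request and send_form are hypothetical, and urllib is used here in place of whatever HTTP client the article's own helper relies on; the endpoint, method, and form encoding come from the description above.

```python
import urllib.parse
import urllib.request

YOUDAO_URL = "https://openapi.youdao.com/ocrtransapi"


def build_request(data):
    """Build (but do not send) a form-encoded POST request for the endpoint."""
    body = urllib.parse.urlencode(data).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return urllib.request.Request(YOUDAO_URL, data=body, headers=headers, method="POST")


def send_form(data, timeout=30):
    """Send the request and return the raw response body (expected to be JSON)."""
    with urllib.request.urlopen(build_request(data), timeout=timeout) as resp:
        return resp.read()
```

Splitting construction from sending makes it possible to inspect the outgoing form fields without touching the network.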

Signature generation is detailed in a blockquote: concatenate appKey, the Base64-encoded image q, salt, and the secret key, in that order, then compute the MD5 hash of the resulting string to obtain sign.
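The signing rule can be sketched in a few lines with hashlib. The function names make_sign and new_salt are illustrative; the concatenation order follows the rule stated above, and the UUID-based salt mirrors what the article's connect function does.

```python
import hashlib
import uuid


def make_sign(app_key, q, salt, app_secret):
    """Return the hex MD5 digest of appKey + q + salt + secret key."""
    sign_str = app_key + q + salt + app_secret
    return hashlib.md5(sign_str.encode("utf-8")).hexdigest()


def new_salt():
    """Return a fresh random-looking salt string for one request."""
    return str(uuid.uuid1())
```

The salt must be the same value in both the signed string and the salt form field, so generate it once per request and reuse it.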

Output result describes the JSON fields returned by the API, such as orientation, lanFrom, textAngle, errorCode, lanTo, and the resRegions array containing bounding boxes, line counts, original text, and translated content.
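A response of this shape can be unpacked as follows. The sample JSON is made up for illustration; errorCode, resRegions, and tranContent appear in the article, while the field names boundingBox, linesCount, and context, and the convention that "0" signals success, are assumptions that should be checked against Youdao's documentation.

```python
import json

# Made-up sample for illustration only; not a real API response.
SAMPLE_RESPONSE = """{
  "errorCode": "0",
  "lanFrom": "en",
  "lanTo": "zh-CHS",
  "orientation": "Up",
  "textAngle": "0",
  "resRegions": [
    {"boundingBox": "10,10,200,40", "linesCount": 1,
     "context": "Moisturizing cream", "tranContent": "保湿霜"}
  ]
}"""


def extract_translations(raw):
    """Return the translated text of every region, or raise on an error code."""
    result = json.loads(raw)
    # Assumption: the string "0" means the call succeeded.
    if result.get("errorCode") != "0":
        raise RuntimeError(f"API returned errorCode {result.get('errorCode')}")
    return [region["tranContent"] for region in result.get("resRegions", [])]
```

Checking errorCode before reading resRegions avoids a KeyError when the service reports a failure and omits the region list.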

Development process outlines three Python files: maindow.py (Tkinter GUI), transclass.py (image handling and logic), and pictranslate.py (API call). The GUI includes buttons for selecting images, choosing a result folder, and triggering translation.
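Independent of the GUI, the batch step amounts to a loop that translates each image and writes one text file per image. The sketch below shows that flow with the API call injected as a function; the names result_path_for, translate_batch, and translate_one are hypothetical and do not come from the article.

```python
import os


def result_path_for(image_path, result_dir):
    """Map an image path to result_<image name>.txt inside result_dir."""
    stem = os.path.splitext(os.path.basename(image_path))[0]
    return os.path.join(result_dir, "result_" + stem + ".txt")


def translate_batch(image_paths, result_dir, translate_one):
    """Translate every image with translate_one and save each result as text.

    translate_one is any callable that takes an image path and returns
    the translated text as a string.
    """
    written = []
    for image_path in image_paths:
        text = translate_one(image_path)
        out_path = result_path_for(image_path, result_dir)
        with open(out_path, "w", encoding="utf-8") as out:
            out.write(text)
        written.append(out_path)
    return written
```

Passing the translation function as an argument keeps the loop testable without network access or credentials.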

Code snippets:

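# maindow.py: builds the Tkinter window. The handlers get_files,
# set_result_path and translate_files must be defined before this block runs.
import tkinter as tk
from tkinter import filedialog, messagebox

root = tk.Tk()
root.title("netease youdao translation test")
frm = tk.Frame(root)
frm.grid(padx=50, pady=50)
# Row 0: pick the images to translate and list the chosen paths
btn_get_file = tk.Button(frm, text='Select images to translate', command=get_files)
btn_get_file.grid(row=0, column=0, ipadx=3, ipady=3, padx=10, pady=20)
text1 = tk.Text(frm, width=40, height=10)
text1.grid(row=0, column=1)
# Row 1: pick the folder that receives the result files
btn_get_result_path = tk.Button(frm, text='Select result folder', command=set_result_path)
btn_get_result_path.grid(row=1, column=0)
text2 = tk.Text(frm, width=40, height=2)
text2.grid(row=1, column=1)
# Row 2: start the batch translation
btn_sure = tk.Button(frm, text="Translate", command=translate_files)
btn_sure.grid(row=2, column=1)
root.mainloop()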

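def get_files():
    # Let the user pick one or more JPEG images and list them in the first text box.
    # translate is assumed to be the shared Translate instance created elsewhere.
    files = filedialog.askopenfilenames(filetypes=[('JPEG images', '.jpg')])
    translate.file_paths = files
    if files:
        for file in files:
            text1.insert(tk.END, file + '\n')
            text1.update()
    else:
        print('No files were selected')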

def set_result_path():
    # Remember the chosen output folder and show it in the second text box
    result_path = filedialog.askdirectory()
    translate.result_root_path = result_path
    text2.insert(tk.END, result_path)

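def translate_files():
    # Run the batch only when at least one image has been selected.
    # tk.messagebox is available once tkinter.messagebox has been imported.
    if translate.file_paths:
        translate.translate_files()
        tk.messagebox.showinfo("Notice", "Done")
    else:
        tk.messagebox.showinfo("Notice", "No files selected")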

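import os
from pictranslate import connect  # API call module described above

class Translate():
    def __init__(self, name, file_paths, result_root_path, trans_type):
        self.name = name
        self.file_paths = file_paths  # paths of the images to translate
        self.result_root_path = result_root_path  # folder where results are saved
        self.trans_type = trans_type

    def translate_files(self):
        # Translate each image and write the text to result_<image name>.txt
        for file_path in self.file_paths:
            file_name = os.path.basename(file_path)
            print('===========' + file_path + '===========')
            trans_result = self.translate_use_netease(file_path)
            result_file = os.path.join(self.result_root_path,
                                       'result_' + os.path.splitext(file_name)[0] + '.txt')
            with open(result_file, 'w', encoding='utf-8') as f:
                f.write(trans_result)

    def translate_use_netease(self, file_path):
        # connect() expects source and target languages; 'auto' lets the API detect them
        return connect(file_path, 'auto', 'auto')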

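import base64
import json
import uuid

def connect(file_content, fromLan='auto', toLan='auto'):
    # file_content is the path of the image. APP_KEY, APP_SECRET, encrypt()
    # and do_request() are not shown in this excerpt.
    with open(file_content, 'rb') as f:  # open the image in binary mode
        q = base64.b64encode(f.read()).decode('utf-8')  # Base64-encode the image bytes
    data = {}
    data['from'] = fromLan
    data['to'] = toLan
    data['type'] = '1'
    data['q'] = q
    salt = str(uuid.uuid1())
    signStr = APP_KEY + q + salt + APP_SECRET  # appKey + q + salt + secret key
    sign = encrypt(signStr)  # MD5 hash of the concatenated string
    data['appKey'] = APP_KEY
    data['salt'] = salt
    data['sign'] = sign
    response = do_request(data)
    result = json.loads(str(response.content, encoding="utf-8"))
    translateResults = result['resRegions']
    # Join the translated text of every recognized region, one per line
    pictransresult = ""
    for i in translateResults:
        pictransresult = pictransresult + i['tranContent'] + "\n"
    return pictransresult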

The conclusion remarks that leveraging an open platform made image recognition and natural language processing straightforward, allowing the developer to focus on showcasing the results rather than building models.
