Build a Batch Image Translation Tool with Youdao OCR API in Python

This article walks through creating a Python desktop demo that uses Youdao's OCR translation API to batch‑process cosmetic product label images, covering API credential setup, request parameters, signature generation, core code snippets, and a summary of the translation results.

MaGe Linux Operations

Jul 13, 2021

The author needed to translate cosmetic product labels for a girlfriend and built a Python demo that leverages Youdao's OCR translation service to batch‑process images.

Demo Effect

The demo shows successful translation of various product images, handling mixed Korean/English text and preserving key terms such as "long‑term moisturizing" and "fixed spray".

Preparing API Credentials

Before calling the API, create an application in the Youdao AI Cloud console to obtain an appKey and appSecret. This involves three steps: creating a service instance, creating an application, and binding the two together.

API Interface

Endpoint: https://openapi.youdao.com/ocrtransapi

Method: POST

Request format: form data

Response format: JSON
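
A minimal sketch of how a form-encoded POST to this endpoint could be assembled with the standard library alone. The helper names `build_request` and `do_request` and the 30-second timeout are assumptions for illustration; the article does not show its own `do_request`.

```python
import urllib.parse
import urllib.request

YOUDAO_URL = "https://openapi.youdao.com/ocrtransapi"  # endpoint listed above


def build_request(data):
    """Wrap a dict of form fields in a POST request (hypothetical helper)."""
    body = urllib.parse.urlencode(data).encode("utf-8")
    return urllib.request.Request(
        YOUDAO_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def do_request(data, timeout=30):
    """Send the form and return the raw response body (JSON as bytes)."""
    with urllib.request.urlopen(build_request(data), timeout=timeout) as resp:
        return resp.read()
```

Splitting request construction from sending keeps the first half checkable without network access. The article's `connect` function reads `response.content`, which suggests it used a requests-style client; this sketch returns the bytes directly instead.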

Request Parameters

type (text, required): file upload type, set to 1 for Base64.

from (text, required): source language, e.g., auto.

to (text, required): target language, e.g., auto.

appKey (text, required): application ID.

salt (text, required): UUID.

sign (text, required): MD5 of appKey+q+salt+appSecret.

ext (text, optional): audio format for translation result, supports mp3.

q (text, required when type=1): Base64‑encoded image data.

docType (text, optional): response type, currently only json.

render (text, optional): whether to return a rendered image (0 or 1).

nullIsError (text, optional): return error if no text detected ("false" or "true").
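
The required fields above can be collected into a plain dictionary before posting. The sketch below assumes hypothetical helper names (`encode_image`, `new_salt`, `build_form`) and takes the signature as an argument, since it is computed separately.

```python
import base64
import uuid


def encode_image(image_bytes):
    """Base64-encode raw image bytes for the q field (used with type=1)."""
    return base64.b64encode(image_bytes).decode("utf-8")


def new_salt():
    """Return a fresh UUID string for the salt field."""
    # The article's own code uses uuid1; uuid4 is used here as an assumption
    # that any UUID string is acceptable.
    return str(uuid.uuid4())


def build_form(q, app_key, salt, sign, from_lang="auto", to_lang="auto"):
    """Assemble the required request fields into a form dictionary."""
    return {
        "type": "1",       # 1 = image passed as Base64 in q
        "from": from_lang,
        "to": to_lang,
        "appKey": app_key,
        "salt": salt,
        "sign": sign,
        "q": q,
    }
```

Optional fields such as `render` or `nullIsError` could be added to the returned dictionary when needed.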

Signature Generation

Concatenate appKey, q, salt, and appSecret in that order to form a string, then compute its MD5 hash (uppercase) to obtain the sign parameter.
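
As a sketch of the rule just described, the signature could be computed with `hashlib`. The function name `make_sign` is hypothetical, and the uppercase conversion follows this article's description rather than a verified requirement of the service.

```python
import hashlib


def make_sign(app_key, q, salt, app_secret):
    """MD5 over appKey + q + salt + appSecret, returned as uppercase hex."""
    raw = app_key + q + salt + app_secret
    return hashlib.md5(raw.encode("utf-8")).hexdigest().upper()
```

Here `q` is the same Base64 string that is sent in the request, so the image should be encoded once and reused for both the signature and the form field.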

Result Fields

orientation – image orientation.

lanFrom – detected source language.

textAngle – image tilt angle.

errorCode – error code.

lanTo – target language.

resRegions – array of translation regions, each containing:

boundingBox – coordinates of the region.

linesCount – number of lines.

lineheight – line height.

context – original text.

linespace – line spacing.

tranContent – translated text.
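
To illustrate how these fields might be consumed, the sketch below pulls the original and translated text out of each region. The sample response is made up and only shaped after the field list above, and treating an errorCode of "0" as success is an assumption.

```python
import json


def extract_translations(raw_json):
    """Return (original, translated) text pairs from a response body."""
    result = json.loads(raw_json)
    # Assumption: errorCode "0" signals success; anything else is an error.
    if str(result.get("errorCode")) != "0":
        raise RuntimeError("Youdao API error: %s" % result.get("errorCode"))
    return [
        (region.get("context", ""), region.get("tranContent", ""))
        for region in result.get("resRegions", [])
    ]


# Made-up sample response, for illustration only.
sample = json.dumps({
    "errorCode": "0",
    "lanFrom": "ko",
    "lanTo": "en",
    "resRegions": [
        {"context": "original line", "tranContent": "translated line"},
    ],
})
pairs = extract_translations(sample)
```

Checking errorCode before reading resRegions avoids a confusing KeyError when the service rejects a request, for example because of a bad signature.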

Development Details

UI (tkinter)

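import tkinter as tk
from tkinter import filedialog, messagebox

# get_files, set_result_path and translate_files are the callbacks shown in
# the following sections; they must be defined before this code runs.
root = tk.Tk()
root.title("netease youdao translation test")
frm = tk.Frame(root)
frm.grid(padx='50', pady='50')
# Row 0: choose the images to translate and list their paths
btn_get_file = tk.Button(frm, text='Select images to translate', command=get_files)
btn_get_file.grid(row=0, column=0, ipadx='3', ipady='3', padx='10', pady='20')
text1 = tk.Text(frm, width='40', height='10')
text1.grid(row=0, column=1)
# Row 1: choose the folder where the result files are written
btn_get_result_path = tk.Button(frm, text='Select result folder', command=set_result_path)
btn_get_result_path.grid(row=1, column=0)
text2 = tk.Text(frm, width='40', height='2')
text2.grid(row=1, column=1)
# Row 2: start the batch translation
btn_sure = tk.Button(frm, text="Translate", command=translate_files)
btn_sure.grid(row=2, column=1)
root.mainloop()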

File Selection and Result Path

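# translate is the shared Translate instance; its creation is not shown in the article.
def get_files():
    # Let the user pick one or more JPEG images
    files = filedialog.askopenfilenames(filetypes=[('JPEG images', '.jpg')])
    translate.file_paths = files
    if files:
        for file in files:
            # Show each selected path on its own line
            text1.insert(tk.END, file + '\n')
            text1.update()
    else:
        print('No files selected')

def set_result_path():
    # Remember the output folder and display it in the second text box
    result_path = filedialog.askdirectory()
    translate.result_root_path = result_path
    text2.insert(tk.END, result_path)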

Translation Workflow

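from tkinter import messagebox  # tk.messagebox is not available unless the submodule is imported

def translate_files():
    # Run the batch only when at least one image has been selected
    if translate.file_paths:
        translate.translate_files()
        messagebox.showinfo("Notice", "Done")
    else:
        messagebox.showinfo("Notice", "No files selected")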

Batch Processing Class

import os

class Translate:
    def __init__(self, name, file_paths, result_root_path, trans_type):
        self.name = name
        self.file_paths = file_paths  # paths of the images to translate
        self.result_root_path = result_root_path  # folder for the result files
        self.trans_type = trans_type

    def translate_files(self):
        for file_path in self.file_paths:
            print('===========' + file_path + '===========')
            trans_result = self.translate_use_netease(file_path)
            # Write result_<image name>.txt; splitext keeps names that contain extra dots
            stem = os.path.splitext(os.path.basename(file_path))[0]
            result_file = os.path.join(self.result_root_path, 'result_' + stem + '.txt')
            with open(result_file, 'w', encoding='utf-8') as f:
                f.write(trans_result)

    def translate_use_netease(self, file_path):
        # Delegate to connect(), which calls the Youdao API
        return connect(file_path)
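
The output naming rule (one `result_<image name>.txt` file per image in the chosen folder) can be isolated in a small helper, which makes it easy to check on its own. This is a sketch; `result_path_for` is a hypothetical name, and `os.path.splitext` is used so that image names containing extra dots keep their full stem.

```python
import os


def result_path_for(image_path, result_root):
    """Map an image path to result_<stem>.txt inside result_root."""
    stem = os.path.splitext(os.path.basename(image_path))[0]
    return os.path.join(result_root, "result_" + stem + ".txt")
```

For an image named `toner.v2.jpg`, this yields `result_toner.v2.txt` in the result folder.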

API Call Implementation

import base64
import json
import uuid

# APP_KEY and APP_SECRET hold the credentials from the Youdao console.
# encrypt() (computes the sign) and do_request() (sends the POST) are not shown in this article.
def connect(file_path, fromLan='auto', toLan='auto'):
    # Read the image and Base64-encode it for the q parameter
    with open(file_path, 'rb') as f:
        q = base64.b64encode(f.read()).decode('utf-8')
    salt = str(uuid.uuid1())
    # Signature input: appKey + q + salt + appSecret
    sign = encrypt(APP_KEY + q + salt + APP_SECRET)
    data = {
        'from': fromLan, 'to': toLan, 'type': '1', 'q': q,
        'appKey': APP_KEY, 'salt': salt, 'sign': sign,
    }
    response = do_request(data)
    result = json.loads(response.content.decode('utf-8'))
    # Collect the translated text of every region, one region per line
    pictransresult = ""
    for region in result['resRegions']:
        pictransresult += region['tranContent'] + "\n"
    return pictransresult

Conclusion

The demo shows that an open AI platform can take most of the work out of image OCR and translation. Once the request is correctly formed and signed, the service returns the translated text for each detected region, leaving the developer free to focus on the UI and integration.
