Build a Batch Image Translation Tool with Youdao OCR API in Python
This article walks through creating a Python desktop demo that uses Youdao's OCR translation API to batch‑process cosmetic product label images, covering API credential setup, request parameters, signature generation, core code snippets, and a summary of the translation results.
The author needed to translate cosmetic product labels for a girlfriend and built a Python demo that leverages Youdao's OCR translation service to batch‑process images.
Demo Effect
The demo shows successful translation of various product images, handling mixed Korean/English text and preserving key terms such as "long‑term moisturizing" and "fixed spray".
Preparing API Credentials
Before calling the API, you must create an application on the Youdao AI Cloud console to obtain an appKey and appSecret. The process consists of creating an instance, creating an application, and binding the two together.
API Interface
Endpoint: https://openapi.youdao.com/ocrtransapi
Method: POST
Request format: form data
Response format: JSON
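As a rough sketch of what a form-encoded POST to this endpoint might look like, the snippet below builds (but does not send) the request using only Python's standard library. The helper name `build_ocr_request` is hypothetical, and the article's own `do_request` helper is not shown in the source, so this is just one possible shape for it.

```python
import urllib.parse
import urllib.request

YOUDAO_URL = "https://openapi.youdao.com/ocrtransapi"


def build_ocr_request(fields):
    """Return a urllib Request that POSTs `fields` as form data.

    `build_ocr_request` is a hypothetical helper, not part of any Youdao SDK.
    """
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(
        YOUDAO_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Sending it needs network access and valid credentials:
# with urllib.request.urlopen(build_ocr_request(fields)) as resp:
#     raw_json = resp.read().decode("utf-8")
```

Keeping request construction separate from sending makes it possible to inspect the encoded body before any credentials leave the machine.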
Request Parameters
type (text, required): file upload type, set to 1 for Base64.
from (text, required): source language, e.g., auto.
to (text, required): target language, e.g., auto.
appKey (text, required): application ID.
salt (text, required): UUID.
sign (text, required): MD5 of appKey+q+salt+appSecret.
ext (text, optional): audio format for translation result, supports mp3.
q (text, required when type=1): Base64‑encoded image data.
docType (text, optional): response type, currently only json.
render (text, optional): whether to return a rendered image (0 or 1).
nullIsError (text, optional): return error if no text detected ("false" or "true").
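The required fields above can be assembled roughly as follows. This is a minimal sketch: `build_unsigned_fields` is a hypothetical name, the optional parameters are omitted, and the `sign` field is deliberately left out because it depends on the application secret.

```python
import base64
import uuid


def build_unsigned_fields(image_bytes, app_key, from_lang="auto", to_lang="auto"):
    """Assemble every required form field except `sign` (hypothetical helper)."""
    return {
        "type": "1",  # 1 = the image is passed as Base64 text in `q`
        "from": from_lang,
        "to": to_lang,
        "appKey": app_key,
        "salt": str(uuid.uuid4()),  # a fresh UUID string for each request
        "q": base64.b64encode(image_bytes).decode("ascii"),
    }


# Example with placeholder bytes instead of a real image file:
fields = build_unsigned_fields(b"fake image bytes", "demo-app-key")
print(sorted(fields))
```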
Signature Generation
Concatenate appKey, q, salt, and appSecret in that order to form a string, then compute its MD5 hash (uppercase) to obtain the sign parameter.
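The rule above might be sketched with `hashlib` as shown here. The function name `make_sign` is an assumption for illustration; the article's own `encrypt` helper is not shown in the source and may differ in detail.

```python
import hashlib


def make_sign(app_key, q, salt, app_secret):
    """Return the uppercase MD5 hex digest of appKey + q + salt + appSecret.

    `make_sign` is a hypothetical stand-in for the demo's `encrypt` helper.
    """
    raw = app_key + q + salt + app_secret
    return hashlib.md5(raw.encode("utf-8")).hexdigest().upper()


# Placeholder values only; real calls use the Base64 image string as `q`.
print(make_sign("demo-key", "BASE64DATA", "demo-salt", "demo-secret"))
```

Because `q` is the full Base64 image string, the signature changes with every image even when the salt is reused.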
Result Fields
orientation – image orientation.
lanFrom – detected source language.
textAngle – image tilt angle.
errorCode – error code.
lanTo – target language.
resRegions – array of translation regions, each containing:
  boundingBox – coordinates of the region.
  linesCount – number of lines.
  lineheight – line height.
  context – original text.
  linespace – line spacing.
  tranContent – translated text.
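To illustrate how these fields might be consumed, the sketch below pairs each region's original text with its translation. The sample payload is invented for illustration, `extract_translations` is a hypothetical helper, and treating an errorCode of "0" as success is an assumption that should be checked against Youdao's error-code table.

```python
import json


def extract_translations(response_text):
    """Return (context, tranContent) pairs from a response body.

    Assumption: errorCode "0" means success; any other value is an error.
    """
    result = json.loads(response_text)
    if str(result.get("errorCode")) != "0":
        raise RuntimeError("API returned errorCode " + str(result.get("errorCode")))
    return [
        (region.get("context", ""), region.get("tranContent", ""))
        for region in result.get("resRegions", [])
    ]


# Invented sample payload, not real API output:
sample = json.dumps({
    "errorCode": "0",
    "lanFrom": "ko",
    "lanTo": "en",
    "resRegions": [{"context": "original text", "tranContent": "translated text"}],
})
print(extract_translations(sample))
```

Checking errorCode before reading resRegions avoids a KeyError when the service rejects a request.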
Development Details
UI (tkinter)
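import tkinter as tk

# get_files, set_result_path, and translate_files are defined in the sections below.
root = tk.Tk()
root.title("netease youdao translation test")
frm = tk.Frame(root)
frm.grid(padx=50, pady=50)

# Row 0: button to choose the images, plus a text box listing the chosen paths.
btn_get_file = tk.Button(frm, text='Select images to translate', command=get_files)
btn_get_file.grid(row=0, column=0, ipadx=3, ipady=3, padx=10, pady=20)
text1 = tk.Text(frm, width=40, height=10)
text1.grid(row=0, column=1)

# Row 1: button to choose the output folder, plus a text box showing it.
btn_get_result_path = tk.Button(frm, text='Select output folder', command=set_result_path)
btn_get_result_path.grid(row=1, column=0)
text2 = tk.Text(frm, width=40, height=2)
text2.grid(row=1, column=1)

# Row 2: button that starts the batch translation.
btn_sure = tk.Button(frm, text="Translate", command=translate_files)
btn_sure.grid(row=2, column=1)
root.mainloop()

File Selection and Result Path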
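from tkinter import filedialog

# `translate` is a module-level Translate instance; the source does not show where it is created.
def get_files():
    # Let the user pick one or more JPEG images and list them in the first text box.
    files = filedialog.askopenfilenames(filetypes=[('JPEG images', '.jpg')])
    translate.file_paths = files
    if files:
        for file in files:
            text1.insert(tk.END, file + '\n')
        text1.update()
    else:
        print('No files selected')

def set_result_path():
    # Let the user pick the output directory and show it in the second text box.
    result_path = filedialog.askdirectory()
    translate.result_root_path = result_path
    text2.insert(tk.END, result_path)

Translation Workflow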
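from tkinter import messagebox

def translate_files():
    # Run the batch only when at least one file has been selected.
    if translate.file_paths:
        translate.translate_files()
        messagebox.showinfo("Notice", "Done")
    else:
        messagebox.showinfo("Notice", "No files selected")

Batch Processing Class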
import os

class Translate():
    def __init__(self, name, file_paths, result_root_path, trans_type):
        self.name = name
        self.file_paths = file_paths              # paths of the images to translate
        self.result_root_path = result_root_path  # directory for the result files
        self.trans_type = trans_type

    def translate_files(self):
        for file_path in self.file_paths:
            file_name = os.path.basename(file_path)
            print('===========' + file_path + '===========')
            trans_result = self.translate_use_netease(file_path)
            # Write one result_<image name>.txt per image; UTF-8 keeps non-ASCII output intact.
            result_name = 'result_' + os.path.splitext(file_name)[0] + '.txt'
            with open(os.path.join(self.result_root_path, result_name), 'w', encoding='utf-8') as out:
                out.write(trans_result)

    def translate_use_netease(self, file_content):
        # file_content is the image path; connect() reads and uploads the file.
        result = connect(file_content)
        return result

API Call Implementation
import base64
import json
import uuid

# APP_KEY, APP_SECRET, encrypt() and do_request() belong to the demo but are not shown in the source.
def connect(file_content, fromLan='auto', toLan='auto'):
    # file_content is the path of the image to translate.
    with open(file_content, 'rb') as f:
        q = base64.b64encode(f.read()).decode('utf-8')
    data = {}
    data['from'] = fromLan
    data['to'] = toLan
    data['type'] = '1'
    data['q'] = q
    salt = str(uuid.uuid1())
    signStr = APP_KEY + q + salt + APP_SECRET
    sign = encrypt(signStr)
    data['appKey'] = APP_KEY
    data['salt'] = salt
    data['sign'] = sign
    response = do_request(data)
    result = json.loads(str(response.content, encoding='utf-8'))
    # resRegions is absent when the call fails, so fall back to an empty list.
    translateResults = result.get('resRegions', [])
    pictransresult = ""
    for i in translateResults:
        pictransresult += i['tranContent'] + "\n"
    return pictransresult

Conclusion
The demo shows that with an open AI platform, image OCR and translation take very little code: once the request is correctly formed, the service handles recognition and translation, leaving developers free to focus on UI and integration.