How to Use the Open‑Source OCR Translator for Videos, Games, and PDFs
This guide explains how to set up and operate a free, open-source OCR-based translator that captures on-screen text from videos, games, or PDFs. It covers registering the required Baidu AI API keys, configuring translation sources, and testing the tool on real content.
Overview
The Dango Translator is an OCR‑based tool that captures a user‑defined screen region, sends the screenshot to Baidu AI’s text‑recognition API, and forwards the recognized text to a selected translation service. It works with video subtitles, game text, animated captions, PDFs, and other on‑screen foreign language content.
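The OCR step described above can be sketched as two request builders and a response parser. This is a minimal sketch, not the tool's actual code: the endpoint URLs, the language_type parameter, and the words_result response field follow Baidu AI's public OCR documentation as commonly described, and should be checked against the current Baidu docs before use. The function names are hypothetical.

```python
import base64
from urllib.parse import urlencode

# Assumed endpoints, based on Baidu AI's public documentation; verify before use.
TOKEN_URL = "https://aip.baidubce.com/oauth/2.0/token"
OCR_URL = "https://aip.baidubce.com/rest/2.0/ocr/v1/general_basic"


def build_token_url(api_key: str, secret_key: str) -> str:
    """Build the URL that exchanges an API Key and Secret for an access token."""
    query = urlencode({
        "grant_type": "client_credentials",
        "client_id": api_key,
        "client_secret": secret_key,
    })
    return f"{TOKEN_URL}?{query}"


def build_ocr_request(image_bytes: bytes, access_token: str,
                      language_type: str = "ENG"):
    """Return (url, body, headers) for a form-encoded OCR request.

    The screenshot is sent as base64 text in the 'image' form field.
    """
    url = f"{OCR_URL}?{urlencode({'access_token': access_token})}"
    body = urlencode({
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "language_type": language_type,  # assumed values: ENG, JAP, KOR
    })
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return url, body, headers


def extract_text(ocr_response: dict) -> str:
    """Join the recognized lines from an OCR response into one string."""
    lines = ocr_response.get("words_result", [])
    return "\n".join(item["words"] for item in lines)
```

The builders only assemble the request, so they can be inspected or tested offline; sending it is left to whichever HTTP client is preferred.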
Initial Setup
After cloning the repository from GitHub (https://github.com/PantsuDango/Dango-Translator) and launching the application, configure the following parameters:
Open the Settings panel (second button on the left).
On the first page, register your own Baidu OCR API Key and Secret, then click Save. The OCR service will not work without these credentials.
On the second page, choose a translation engine. Twelve engines are supported, including public services (Youdao, web‑based Tencent) and private services (Baidu). Private engines also require their own API keys.
Select the source language – currently supported languages are English, Japanese, and Korean.
Configure automatic translation timing, hotkeys for screenshot and translation, and visual style options such as text colour, font size, and font style.
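The settings above can be modelled as a small configuration object with a validation pass. This is an illustrative sketch only: the class, field names, engine identifiers, and default style values are hypothetical and do not reflect how the tool stores its settings. Only the rules themselves (OCR credentials are mandatory, private engines need their own keys, three source languages are supported) come from the steps above.

```python
from dataclasses import dataclass, field
from typing import Dict, List

SUPPORTED_LANGUAGES = {"English", "Japanese", "Korean"}
# Hypothetical engine identifiers; the guide names Baidu as a private engine.
PRIVATE_ENGINES = {"baidu"}


@dataclass
class TranslatorSettings:
    ocr_api_key: str
    ocr_secret_key: str
    source_language: str = "English"
    engines: List[str] = field(default_factory=lambda: ["youdao"])
    engine_keys: Dict[str, str] = field(default_factory=dict)
    screenshot_hotkey: str = "A"
    translate_hotkey: str = "S"
    auto_translate_seconds: float = 0.0  # 0 means manual translation only
    font_size: int = 14                  # placeholder style values
    text_colour: str = "#FFFFFF"


def validate(settings: TranslatorSettings) -> List[str]:
    """Return a list of human-readable problems; empty means usable."""
    problems = []
    if not settings.ocr_api_key or not settings.ocr_secret_key:
        problems.append("Baidu OCR API Key and Secret are required")
    if settings.source_language not in SUPPORTED_LANGUAGES:
        problems.append(f"unsupported source language: {settings.source_language}")
    if not settings.engines:
        problems.append("select at least one translation engine")
    for engine in settings.engines:
        if engine in PRIVATE_ENGINES and engine not in settings.engine_keys:
            problems.append(f"private engine '{engine}' needs its own API key")
    return problems
```

Returning a list of problems rather than raising on the first one lets a settings panel show every missing field at once.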
Basic Operation
Define the screen region to translate by taking a screenshot (hotkey A by default).
The tool captures the region (automatically if configured) and sends the image to Baidu AI’s OCR endpoint.
The OCR result is sent to the chosen translation service (e.g., Baidu, Tencent, or Youdao).
The translated text is overlaid on the GUI according to the visual style settings.
In practice, the user only needs to screenshot the desired area; all subsequent steps are automated.
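The capture, OCR, and translate steps above can be sketched as a single function that takes each stage as a callable. This is an assumed structure for illustration, not the tool's real architecture; translate_region and its parameters are hypothetical names, and the capture, OCR, and translation backends are supplied by the caller.

```python
from typing import Callable, Dict, Tuple

Region = Tuple[int, int, int, int]  # left, top, width, height in pixels


def translate_region(
    region: Region,
    capture: Callable[[Region], bytes],
    recognize: Callable[[bytes], str],
    translators: Dict[str, Callable[[str], str]],
) -> Dict[str, str]:
    """Run capture -> OCR -> translation for one screen region.

    Returns the original text plus one entry per translation engine.
    An empty dict means the OCR step found no text.
    """
    image = capture(region)
    text = recognize(image).strip()
    if not text:
        return {}

    results = {"original": text}
    for name, translate in translators.items():
        try:
            results[name] = translate(text)
        except Exception as exc:
            # One failing engine should not hide results from the others.
            results[name] = f"[{name} failed: {exc}]"
    return results


if __name__ == "__main__":
    # Stub stages stand in for the real screenshot, OCR, and engine calls.
    demo = translate_region(
        (0, 0, 400, 80),
        capture=lambda region: b"fake-image",
        recognize=lambda image: "Hello",
        translators={"upper": lambda text: text.upper()},
    )
    print(demo)
```

Passing the stages in as callables keeps the flow testable without network access and makes it straightforward to swap one translation engine for another.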
Additional Feature
A red “Music” button plays audio of the original captured text, which allows a quick check of the source-language pronunciation.
Practical Test on a PDF Document
The translator was tested on a PDF of the machine‑learning paper “Review of Text Style Transfer Based on Deep Learning”. The following configuration was used:
Translation engines: public Youdao, web‑based Tencent, and private Baidu.
Source language set to English.
Hotkeys: A for screenshot, S for translation.
Test steps:
Capture the title area of the PDF, press S, and view translations from the three engines side by side.
Repeat for the first three lines of the “Introduction” section, adjusting overlay position and opacity for readability.
The output displayed the original text followed by translations from Baidu (private), Tencent (web), and Youdao (public), highlighting noticeable differences among the services.
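To put a rough number on how much the engines disagree, the translations can be compared pairwise with Python's standard difflib module. This is a sketch for readers who want to repeat the comparison; it was not part of the test above, pairwise_similarity is a hypothetical helper, and the strings in the usage example are placeholders rather than actual engine output.

```python
from difflib import SequenceMatcher
from itertools import combinations
from typing import Dict, Tuple


def pairwise_similarity(translations: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """Return a similarity ratio in [0, 1] for every pair of engines.

    1.0 means the two translations are identical; lower values mean
    the engines produced more divergent text.
    """
    scores = {}
    for first, second in combinations(sorted(translations), 2):
        matcher = SequenceMatcher(None, translations[first], translations[second])
        scores[(first, second)] = matcher.ratio()
    return scores


if __name__ == "__main__":
    # Placeholder strings, not real translations from any engine.
    sample = {
        "baidu": "placeholder translation one",
        "tencent": "placeholder translation two",
        "youdao": "a differently worded placeholder",
    }
    for pair, score in pairwise_similarity(sample).items():
        print(pair, round(score, 2))
```

The ratio measures character-level overlap only, so it indicates how differently the engines phrase a sentence, not which translation is more accurate.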
Resources
Source code and releases are available on GitHub:
https://github.com/PantsuDango/Dango-Translator
Official ITPUB account sharing technical insights, community news, and exciting events.
