PaddleOCR 3.1 Unveils Multilingual PP‑OCRv5, Document Translation, and MCP Server Integration

PaddleOCR 3.1 introduces three major upgrades: a multilingual PP‑OCRv5 model that supports 37 languages and improves average recognition accuracy by more than 30%, a PP‑DocTranslation pipeline for high‑quality multilingual document translation, and MCP server support for flexible integration with AI applications. This article covers each upgrade along with CLI usage, demo scenarios, and open‑source resources.

Baidu Geek Talk

Jul 9, 2025

PaddleOCR 3.1 was released shortly after the launch of version 3.0 and brings three major upgrades:

Multilingual PP‑OCRv5 model: Supports 37 languages (including French, Spanish, Portuguese, Russian, and Korean), with average recognition accuracy more than 30% higher than the previous‑generation multilingual model. Training data is generated automatically using the multimodal capabilities of Wenxin 4.5 (known in English as ERNIE 4.5), which addresses data scarcity and annotation cost.

PP‑DocTranslation pipeline: Built on PP‑StructureV3 and Wenxin 4.5, it translates Markdown, PDF, and image documents. Users can supply custom terminology tables to control how specific terms are translated.

MCP server support: Users can quickly set up an MCP server that exposes PaddleOCR's core capabilities (text detection, OCR, document parsing) to downstream AI applications. The server can be backed by the local Python library, a cloud service, or a self‑hosted deployment.

Key Technical Steps

The automatic generation of training data relies on two steps:

Automatic line detection and cropping: The PP‑OCRv5 detection model locates and crops each text line, so the recognizer receives standardized input.

High‑confidence text recognition: Wenxin 4.5 recognizes each line several times independently, and only results that agree across runs are kept. This improves annotation accuracy and reduces human bias.
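The consistency check in the second step can be pictured as a simple vote over repeated recognitions. The sketch below is an illustration only: the article does not say how agreement is measured, so the exact-match rule, the `min_agreement` threshold, and the function name `consensus_label` are all assumptions.

```python
from collections import Counter
from typing import Optional, Sequence


def consensus_label(candidates: Sequence[str], min_agreement: float = 0.6) -> Optional[str]:
    """Return the transcription most runs agree on, or None if agreement is too low.

    Hypothetical helper: the threshold and exact-match voting are assumptions,
    not the documented PaddleOCR data pipeline.
    """
    # Collapse whitespace so trivially different outputs count as the same vote.
    normalised = [" ".join(c.split()) for c in candidates if c.strip()]
    if not normalised:
        return None
    text, votes = Counter(normalised).most_common(1)[0]
    # Divide by all runs, including empty ones, so failed runs lower agreement.
    return text if votes / len(candidates) >= min_agreement else None


# Five independent recognitions of the same cropped text line.
runs = [
    "Bonjour le monde",
    "Bonjour le monde",
    "Bonjour  le monde",   # extra space, same text after normalisation
    "Bonjour le rnonde",   # one run misread "m" as "rn"
    "Bonjour le monde",
]
print(consensus_label(runs))
```

Lines that fail the vote return `None` and would be dropped or sent for manual review rather than used as labels.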

CLI Usage Examples

# Use the --lang parameter to run OCR for French
paddleocr ocr -i https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_french01.png \
    --lang fr \
    --use_doc_orientation_classify False \
    --use_doc_unwarping False \
    --use_textline_orientation False \
    --save_path ./output \
    --device gpu:0

For document translation, the --target_language flag specifies the output language:

paddleocr pp_doctranslation -i vehicle_certificate-1.png --target_language en --qianfan_api_key your_api_key

MCP Server Capabilities

Text recognition: Detects and recognizes text in images and PDFs, returning JSON with coordinates and content.

Document parsing: Extracts titles, paragraphs, images, tables, and other layout blocks, and outputs structured Markdown and JSON.
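As an illustration of consuming the "JSON with coordinates and content" described above, the sketch below sorts recognized lines into rough reading order. The payload shape (a `lines` array whose items carry `text` and a four‑point `box`) is a hypothetical schema invented for this example; the real response format should be taken from the PaddleOCR MCP server documentation.

```python
import json
from typing import List

# Hypothetical payload: field names are assumptions, not the server's real schema.
SAMPLE = json.dumps({
    "lines": [
        {"text": "42.00", "box": [[120, 40], [180, 40], [180, 60], [120, 60]]},
        {"text": "Invoice", "box": [[10, 10], [110, 10], [110, 30], [10, 30]]},
        {"text": "Total", "box": [[10, 40], [90, 40], [90, 60], [10, 60]]},
    ]
})


def reading_order(payload: str) -> List[str]:
    """Return recognized text sorted top-to-bottom, then left-to-right."""
    lines = json.loads(payload)["lines"]

    def top_left(line: dict) -> tuple:
        # Use the smallest y and x of the four corner points as the sort key.
        xs = [point[0] for point in line["box"]]
        ys = [point[1] for point in line["box"]]
        return (min(ys), min(xs))

    return [line["text"] for line in sorted(lines, key=top_left)]


print(reading_order(SAMPLE))
```

Sorting by the top‑left corner is a deliberate simplification; skewed scans or multi‑column layouts would need a more tolerant grouping of lines.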

The server supports three deployment modes: the local Python library, the Baidu AI Studio (Xinghe, literally "Star River") community service, and a self‑hosted service. It supports both stdio and Streamable HTTP as transport mechanisms.
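For the stdio transport, MCP clients such as Claude for Desktop are typically pointed at a server through an `mcpServers` entry in their configuration file. The sketch below generates such an entry for the local‑library mode. The command name `paddleocr_mcp` and the two `PADDLEOCR_MCP_*` environment variable names are assumptions recalled from the project's MCP documentation and should be verified there before use; the function name `mcp_server_entry` is hypothetical.

```python
import json
from typing import Dict


def mcp_server_entry(pipeline: str = "OCR", source: str = "local") -> Dict:
    """Build an `mcpServers` config entry for an MCP client.

    Assumptions (verify against the PaddleOCR MCP docs): the executable is
    `paddleocr_mcp`, and the pipeline and backend are chosen via the
    environment variables used below.
    """
    return {
        "mcpServers": {
            "paddleocr": {
                "command": "paddleocr_mcp",  # assumed executable name
                "args": [],
                "env": {
                    "PADDLEOCR_MCP_PIPELINE": pipeline,      # assumed variable name
                    "PADDLEOCR_MCP_PPOCR_SOURCE": source,    # assumed variable name
                },
            }
        }
    }


# Print the JSON to paste into the client's configuration file.
print(json.dumps(mcp_server_entry(), indent=2))
```

Generating the entry from code rather than editing JSON by hand makes it easy to switch between deployment modes by changing a single argument.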

Demo Scenarios

Demo 1: In Claude for Desktop, extract handwritten content from images and sync it to Notion using the MCP server.

Demo 2: In VSCode, convert handwritten sketches or pseudo‑code into style‑compliant Python scripts and push them to a GitHub repository.

Demo 3: Transform PDFs or images containing complex tables, formulas, and handwritten text into editable Word or Excel files.

Conclusion

Since the release of PaddleOCR 3.0, extensive feedback on multilingual recognition and MCP support has driven the development of PaddleOCR 3.1. Developers, researchers, and industry users are encouraged to try the new version, provide feedback, and contribute to the open‑source repository.

Open‑source repository: https://github.com/PaddlePaddle/PaddleOCR
