Top 10 Open-Source OCR Projects on GitHub Ranked by Stars

This article ranks ten popular open-source OCR projects on GitHub by star count, listed from fewest to most stars. For each tool it summarizes the key capabilities, such as multimodal text extraction, PDF linearization, layout analysis, and multilingual support, and gives the star count and a direct repository link for developers looking for ready-to-use OCR solutions.

Liangxu Linux

Apr 22, 2025

1. GOT‑OCR 2.0

GOT‑OCR 2.0 is an open-source end-to-end multimodal OCR model (1.43 GB) that can recognize text, mathematical formulas, molecular structures, charts, music scores, and geometric shapes. It has 7.2 K stars on GitHub.

https://github.com/Ucas-HaoranWei/GOT-OCR2.0

2. InternVL

InternVL, developed by OpenGVLab, is a large multimodal model that its authors present as an open-source alternative to GPT‑4V. It handles general image understanding and can also be used for OCR text extraction. It has 7.2 K stars.

https://github.com/OpenGVLab/InternVL

3. olmOCR

olmOCR, from AllenAI, focuses on linearizing PDF documents for large-language-model training, converting complex layouts into structured text. It requires a recent NVIDIA GPU with at least 20 GB of GPU memory and about 30 GB of free disk space, and has 9.8 K stars.

https://github.com/allenai/olmocr

https://olmocr.allenai.org/
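The hardware requirements above can be checked before installing. The following is a minimal sketch, not part of olmOCR itself: it assumes the 20 GB figure refers to GPU memory, treats it as roughly 20,000 MiB (the unit `nvidia-smi` reports), and uses hypothetical helper names (`gpu_memory_mib`, `meets_requirements`).

```python
import shutil
import subprocess

# Assumed thresholds, taken from the figures quoted in the article.
MIN_GPU_MIB = 20_000       # "20 GB" approximated in MiB (assumption)
MIN_FREE_DISK_GB = 30      # about 30 GB of free disk space


def gpu_memory_mib():
    """Return the total memory (MiB) of each NVIDIA GPU via nvidia-smi."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total",
         "--format=csv,noheader,nounits"],
        check=True, capture_output=True, text=True,
    ).stdout
    return [int(line) for line in out.splitlines() if line.strip()]


def meets_requirements(gpu_mib, free_disk_bytes):
    """True if at least one GPU and the free disk space reach the thresholds."""
    has_gpu = any(mib >= MIN_GPU_MIB for mib in gpu_mib)
    has_disk = free_disk_bytes >= MIN_FREE_DISK_GB * 10**9
    return has_gpu and has_disk


if __name__ == "__main__":
    try:
        gpus = gpu_memory_mib()
    except (FileNotFoundError, subprocess.CalledProcessError):
        gpus = []  # nvidia-smi missing or failing: treat as no usable GPU
    free = shutil.disk_usage(".").free
    print("OK" if meets_requirements(gpus, free) else "Below stated requirements")
```

Keeping the threshold check separate from the `nvidia-smi` call means the logic can be exercised on machines without a GPU.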

4. Zerox

Zerox, from OmniAI, extracts text from PDFs, images, DOCX, and other document formats, and outputs structured Markdown with no model training required. It relies on vision models such as GPT‑4o‑mini and has 10.3 K stars.

https://github.com/getomni-ai/zerox

https://getomni.ai/ocr-demo

5. Surya

Surya specializes in line‑level text detection, layout analysis, reading order, and table recognition, supporting 90+ languages. It excels at table structure extraction and complex document parsing, with 16.8 K stars.

https://github.com/VikParuchuri/surya

6. OCRmyPDF

OCRmyPDF adds a searchable text layer to image‑only PDFs using the Tesseract engine, supporting 100+ languages, batch processing, image de‑skewing, and cross‑platform deployment (Linux, Windows, macOS, Docker). It has 20.7 K stars.

https://github.com/ocrmypdf/OCRmyPDF

https://ocrmypdf.readthedocs.io/en/latest/
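Batch processing with OCRmyPDF can be sketched by calling its command-line tool from Python. This is an illustrative sketch that assumes `ocrmypdf` is installed and on the PATH; the helper names (`build_ocrmypdf_command`, `ocr_folder`) and the file names are hypothetical, while `-l` and `--deskew` are options of the OCRmyPDF CLI.

```python
import subprocess
from pathlib import Path


def build_ocrmypdf_command(src, dst, languages=("eng",), deskew=True):
    """Assemble an ocrmypdf command line as a list of arguments."""
    # Multiple Tesseract languages are joined with "+", e.g. "eng+deu".
    cmd = ["ocrmypdf", "-l", "+".join(languages)]
    if deskew:
        cmd.append("--deskew")  # straighten skewed page images before OCR
    cmd += [str(src), str(dst)]
    return cmd


def ocr_folder(in_dir, out_dir, **options):
    """Add a text layer to every PDF in in_dir, writing results to out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for pdf in sorted(Path(in_dir).glob("*.pdf")):
        cmd = build_ocrmypdf_command(pdf, out_dir / pdf.name, **options)
        subprocess.run(cmd, check=True)  # raises if ocrmypdf reports an error


if __name__ == "__main__":
    # Only prints the command; call ocr_folder(...) to actually run OCR.
    print(build_ocrmypdf_command("scan.pdf", "scan_ocr.pdf",
                                 languages=("eng", "deu")))
```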

7. Marker

Marker converts PDFs, images, Office files, and EPUBs into Markdown, JSON, or HTML, preserving tables, formulas, and code blocks. It supports GPU acceleration for faster processing, and optional LLM post‑processing can improve accuracy. It has 22.8 K stars.

https://github.com/vikParuchuri/marker

8. EasyOCR

EasyOCR, from JaidedAI, is a PyTorch‑based library that returns extracted text, bounding boxes, and confidence scores for over 80 languages. It works on CPU/GPU and offers a simple API. It has 26 K stars.

https://github.com/JaidedAI/EasyOCR
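The text, bounding-box, and confidence output described above lends itself to simple post-filtering. The sketch below assumes EasyOCR's default `readtext()` result shape, a list of `(bounding_box, text, confidence)` tuples; the helper names and the 0.5 threshold are illustrative choices, and the sample data is synthetic rather than real OCR output.

```python
def filter_confident(results, min_confidence=0.5):
    """Keep (text, confidence) pairs at or above the threshold.

    `results` is a list of (bounding_box, text, confidence) tuples,
    the shape EasyOCR's readtext() returns by default.
    """
    return [(text, conf) for _bbox, text, conf in results
            if conf >= min_confidence]


def read_image(path, languages=("en",), use_gpu=False):
    """Run EasyOCR on one image (requires `pip install easyocr`)."""
    import easyocr  # imported lazily so filter_confident works without it

    reader = easyocr.Reader(list(languages), gpu=use_gpu)
    return reader.readtext(path)


if __name__ == "__main__":
    # Synthetic results in EasyOCR's output shape, for illustration only.
    sample = [
        ([[0, 0], [80, 0], [80, 20], [0, 20]], "Invoice", 0.97),
        ([[0, 30], [60, 30], [60, 50], [0, 50]], "T0tal", 0.31),
    ]
    print(filter_confident(sample))
```

With a real image, `filter_confident(read_image("page.png"))` would apply the same filter to actual results; `page.png` is a placeholder file name.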

9. Umi‑OCR

Umi‑OCR is a free, offline OCR tool for Windows 7+ x64 and Linux x64, requiring no internet connection. It supports batch processing, screenshot OCR, and various document types, with 30.8 K stars.

https://github.com/hiroi-sora/Umi-OCR

10. Tesseract

Tesseract is a widely used OCR engine originally developed at Hewlett‑Packard, open‑sourced by HP in 2005, and later developed for many years under Google's sponsorship. It supports more than 100 languages and serves as the backend for many other projects, including OCRmyPDF. It has 65.3 K stars.

https://github.com/tesseract-ocr/tesseract

https://github.com/naptha/tesseract.js
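One way other projects build on Tesseract is by calling its command-line tool and parsing the TSV output, which carries word text, confidence, and position. The following is a rough sketch that assumes `tesseract` is on the PATH; the function names are hypothetical and `SAMPLE_TSV` is synthetic data in Tesseract's TSV column layout, not real output.

```python
import csv
import io
import subprocess

# Synthetic example in Tesseract's TSV column layout (illustration only).
SAMPLE_TSV = (
    "level\tpage_num\tblock_num\tpar_num\tline_num\tword_num\t"
    "left\ttop\twidth\theight\tconf\ttext\n"
    "1\t1\t0\t0\t0\t0\t0\t0\t200\t50\t-1\t\n"
    "5\t1\t1\t1\t1\t1\t10\t12\t60\t18\t96.5\tHello\n"
    "5\t1\t1\t1\t1\t2\t80\t12\t70\t18\t42.0\tw0rld\n"
)


def run_tesseract_tsv(image_path, lang="eng"):
    """Call the tesseract CLI and return its TSV output as a string."""
    result = subprocess.run(
        ["tesseract", str(image_path), "stdout", "-l", lang, "tsv"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


def words_from_tsv(tsv_text, min_conf=0.0):
    """Return (text, confidence, (left, top, width, height)) per word row."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    words = []
    for row in reader:
        text = (row.get("text") or "").strip()
        conf = float(row["conf"])
        # Page, block, and line rows carry conf == -1 and no text.
        if text and conf >= min_conf:
            box = tuple(int(row[k]) for k in ("left", "top", "width", "height"))
            words.append((text, conf, box))
    return words


if __name__ == "__main__":
    print(words_from_tsv(SAMPLE_TSV, min_conf=60))
```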

