Top 10 Open-Source OCR Projects on GitHub Ranked by Stars
This article lists ten popular open-source OCR projects on GitHub, ordered from fewest to most stars. For each tool it summarizes the key capabilities (such as multimodal text extraction, PDF linearization, layout analysis, and multilingual support) and gives the star count and direct repository links for developers looking for ready-to-use OCR solutions.
1. GOT‑OCR 2.0
GOT‑OCR 2.0 is an open-source end-to-end multimodal OCR model (1.43 GB) that can recognize text, mathematical formulas, molecular structures, charts, music scores, and geometric shapes. It has 7.2 K stars on GitHub.
https://github.com/Ucas-HaoranWei/GOT-OCR2.0
2. InternVL
InternVL, developed by OpenGVLab, is a large multimodal model comparable to GPT‑4V. It handles general image understanding and can also be used for OCR text extraction. It has 7.2 K stars.
https://github.com/OpenGVLab/InternVL
3. olmOCR
olmOCR, from AllenAI, focuses on linearizing PDF documents for large-language-model training, converting complex layouts into structured text. It requires a recent NVIDIA GPU with at least 20 GB of GPU memory and about 30 GB of free disk space, and has 9.8 K stars.
https://github.com/allenai/olmocr https://olmocr.allenai.org/
4. Zerox
Zerox (Omni‑AI) extracts text from PDFs, images, DOCX, etc., and outputs structured Markdown without any prior training. It leverages visual models such as GPT‑4o‑mini and has 10.3 K stars.
https://github.com/getomni-ai/zerox https://getomni.ai/ocr-demo
5. Surya
Surya specializes in line‑level text detection, layout analysis, reading order, and table recognition, supporting 90+ languages. It excels at table structure extraction and complex document parsing, with 16.8 K stars.
https://github.com/VikParuchuri/surya
6. OCRmyPDF
OCRmyPDF adds a searchable text layer to image‑only PDFs using the Tesseract engine, supporting 100+ languages, batch processing, image de‑skewing, and cross‑platform deployment (Linux, Windows, macOS, Docker). It has 20.7 K stars.
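As a rough illustration of the batch-processing and de-skewing workflow described above, the sketch below builds an `ocrmypdf` command line and runs it over a folder of scanned PDFs. It assumes the `ocrmypdf` command is installed and on the PATH. The helper names `build_ocrmypdf_cmd` and `ocr_folder` and the folder layout are hypothetical, chosen only for this example.

```python
import subprocess
from pathlib import Path
from typing import List, Sequence


def build_ocrmypdf_cmd(
    src: Path,
    dst: Path,
    languages: Sequence[str] = ("eng",),
    deskew: bool = True,
) -> List[str]:
    """Build an ocrmypdf command line (hypothetical helper for this sketch)."""
    # Tesseract language codes are joined with "+", e.g. "eng+deu".
    cmd = ["ocrmypdf", "--language", "+".join(languages)]
    if deskew:
        cmd.append("--deskew")  # straighten crooked scans before OCR
    cmd += [str(src), str(dst)]
    return cmd


def ocr_folder(in_dir: Path, out_dir: Path, languages: Sequence[str] = ("eng",)) -> List[Path]:
    """Add a text layer to every PDF in in_dir, writing results to out_dir."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(in_dir.glob("*.pdf")):
        dst = out_dir / src.name
        # check=True raises CalledProcessError if ocrmypdf exits with an error.
        subprocess.run(build_ocrmypdf_cmd(src, dst, languages), check=True)
        written.append(dst)
    return written
```

Calling `ocr_folder(Path("scans"), Path("searchable"), ("eng", "deu"))` would process each file in turn. Keeping the command construction in its own function makes it easy to inspect or log the exact invocation before running it.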
https://github.com/ocrmypdf/OCRmyPDF https://ocrmypdf.readthedocs.io/en/latest/
7. Marker
Marker converts PDFs, images, Office files, and EPUBs into Markdown, JSON, or HTML, preserving tables, formulas, and code blocks. GPU acceleration and optional LLM post‑processing improve accuracy; it has 22.8 K stars.
https://github.com/vikParuchuri/marker
8. EasyOCR
EasyOCR, from JaidedAI, is a PyTorch‑based library that returns extracted text, bounding boxes, and confidence scores for over 80 languages. It works on CPU/GPU and offers a simple API. It has 26 K stars.
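To make the "text, bounding boxes, and confidence scores" output concrete, here is a minimal sketch of reading an image with EasyOCR and keeping only the confident detections. It assumes `easyocr` is installed with `pip install easyocr`. The helper names `keep_confident` and `read_image` and the 0.5 threshold are illustrative choices, not part of the library.

```python
from typing import List, Sequence, Tuple

# One EasyOCR detection: (four [x, y] corner points, recognized text, confidence).
Detection = Tuple[Sequence[Sequence[float]], str, float]


def keep_confident(detections: List[Detection], min_conf: float = 0.5) -> List[str]:
    """Return the text of detections whose confidence is at least min_conf."""
    return [text for _bbox, text, conf in detections if conf >= min_conf]


def read_image(path: str, languages: Sequence[str] = ("en",), use_gpu: bool = False) -> List[Detection]:
    """Run EasyOCR on one image file (hypothetical wrapper for this sketch)."""
    import easyocr  # imported lazily so keep_confident works without the library

    # The Reader loads its models once; reuse it when processing many images.
    reader = easyocr.Reader(list(languages), gpu=use_gpu)
    return reader.readtext(path)
```

A typical call would be `keep_confident(read_image("receipt.png"), min_conf=0.6)`, where `receipt.png` is a placeholder file name. Filtering on the confidence score is a common first step for discarding noise from low-quality scans.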
https://github.com/JaidedAI/EasyOCR
9. Umi‑OCR
Umi‑OCR is a free, offline OCR tool for Windows 7+ x64 and Linux x64, requiring no internet connection. It supports batch processing, screenshot OCR, and various document types, with 30.8 K stars.
https://github.com/hiroi-sora/Umi-OCR
10. Tesseract
Tesseract is a widely used OCR engine originally developed at Hewlett‑Packard, released as open source in 2005, and later developed with sponsorship from Google. It supports more than 100 languages and serves as the backend for many other projects. It has 65.3 K stars.
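Since Tesseract often sits behind other tools, a short sketch of calling it from Python may help. This example uses the third-party `pytesseract` wrapper together with Pillow, neither of which is covered in this article, and assumes the Tesseract binary and the requested language data are installed. The helper names `tesseract_lang` and `image_to_text` are hypothetical.

```python
from typing import Sequence


def tesseract_lang(languages: Sequence[str]) -> str:
    """Join language codes the way Tesseract expects, e.g. 'eng+fra'."""
    if not languages:
        raise ValueError("at least one language code is required")
    return "+".join(languages)


def image_to_text(path: str, languages: Sequence[str] = ("eng",)) -> str:
    """Recognize text in one image via pytesseract (assumed to be installed)."""
    # Lazy imports keep tesseract_lang usable without the optional dependencies.
    import pytesseract
    from PIL import Image

    with Image.open(path) as img:
        return pytesseract.image_to_string(img, lang=tesseract_lang(languages))
```

For example, `image_to_text("page.png", ("eng", "fra"))` would attempt recognition with both English and French models, where `page.png` is a placeholder file name.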
https://github.com/tesseract-ocr/tesseract https://github.com/naptha/tesseract.js
Liangxu Linux
Liangxu, a self‑taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge, including fundamentals, applications, and tools, as well as Git, databases, and Raspberry Pi.
