Tagged articles

OCR

241 articles · Page 2 of 3

Sep 12, 2024 · Artificial Intelligence

Master Double-Digit OCR with ddddocr: Deep Learning Library for PHP & Python

This article introduces ddddocr, an open‑source deep‑learning OCR library for recognizing double‑digit numbers, explains its background, key features, installation steps, and provides detailed PHP examples for basic OCR, target detection, and slider detection functionalities.

OCR · PHP · Python

0 likes · 9 min read

DeWu Technology

Sep 11, 2024 · Frontend Development

Advanced Watermark Techniques and OCR Integration for Front-End Applications

The article details progressive front‑end watermark schemes—from a basic canvas overlay to mutation‑observer‑protected, hide‑ and cover‑resistant, and low‑opacity dark watermarks—and explains how adaptive tone handling, contrast tuning, region cropping, and a hybrid OCR pipeline (internal service with tesseract.js fallback) ensure robust, invisible data protection and accurate screenshot analysis.

Canvas · Front-end · Image processing

0 likes · 20 min read

Java Architect Essentials

Sep 6, 2024 · Artificial Intelligence

Integrating Tess4J OCR into a Spring Boot Application

This guide explains how to set up a Spring Boot project, add the Tess4J dependency, configure language data, implement an OCR service and REST controller, and test both local file uploads and remote image URLs for text recognition.

Image processing · Java · OCR

0 likes · 6 min read

Python Programming Learning Circle

Sep 4, 2024 · Artificial Intelligence

Building an Automatic Math Grading System with Python: Data Generation, CNN Training, Image Segmentation, and Result Feedback

This tutorial explains how to create an automatic math‑grading tool in Python by generating synthetic digit images, training a small CNN on the data, segmenting handwritten equations with projection techniques, recognizing characters, evaluating the expressions, and overlaying the results back onto the original image.

Automation · CNN · Image processing

0 likes · 30 min read

Full-Stack Cultivation Path

Aug 8, 2024 · Artificial Intelligence

MegaParse: A Precision Document Parser Built for LLMs

MegaParse is an open‑source document parser that transforms PDFs, Word, PPT, Excel and CSV files into LLM‑friendly formats, preserving full information, boosting processing efficiency, and enabling deeper semantic analysis, with quick‑start installation steps and a roadmap for future features.

AI tools · Document Parsing · LLM

0 likes · 4 min read

Full-Stack Cultivation Path

Jul 17, 2024 · Artificial Intelligence

Open-Source PDF Toolkit Delivers High-Accuracy Layout and Formula Detection

PDF‑Extract‑Kit is an open‑source toolkit that combines high‑accuracy layout detection, formula detection, formula recognition, and OCR for PDFs, and the article details its model comparisons, evaluation on academic and textbook datasets, and step‑by‑step instructions for running it on Windows or macOS, including Apple Silicon.

OCR · PDF-Extract-Kit · computer vision

0 likes · 6 min read

Meituan Technology Team

Jun 13, 2024 · Artificial Intelligence

Overview of Meituan's Selected CVPR 2024 Papers and Online Sharing Event

Meituan's tech team highlights seven CVPR 2024 papers—spanning OCR pre‑training, long‑tail semi‑supervised learning, visual AIGC, audio‑visual segmentation and synthetic‑data detection—provides detailed abstracts and experimental results, and announces an online author‑talk session on June 27.

Audio-Visual Segmentation · CVPR 2024 · OCR

0 likes · 18 min read

LuTiao Programming

Apr 24, 2024 · Backend Development

Building a License Plate Recognition Service with Spring Boot 3.x and OCR

This article walks through creating a server‑side license‑plate recognition system using Spring Boot 3.x, the open‑source Tesseract OCR library, and OpenCV for image preprocessing, covering project goals, Maven dependencies, core service implementations, special‑plate handling, and a REST API controller.

Java · OCR · backend

0 likes · 8 min read

Python Programming Learning Circle

Apr 18, 2024 · Artificial Intelligence

Implementing an Automatic Math Expression Grading System with Python and Convolutional Neural Networks

This tutorial walks through building a self‑trained OCR pipeline that generates synthetic digit images, trains a CNN model, segments handwritten math expressions, predicts each character, evaluates the arithmetic result, and overlays checkmarks, crosses or answers onto the original image.

Automation · CNN · Image processing

0 likes · 28 min read

The Dominant Programmer

Mar 30, 2024 · Backend Development

Implement OCR in Spring Boot with Tess4J for Image Text Recognition

This guide shows how to integrate the open‑source Tesseract OCR engine into a Spring Boot application using the Tess4J Java wrapper, covering Chinese language data setup, Maven dependency configuration, bean creation, service implementation, and a unit test to verify image text extraction.

OCR · Spring Boot · image recognition

0 likes · 6 min read

Top Architect

Mar 13, 2024 · Backend Development

Integrating Tess4J OCR into a Spring Boot Backend Service

This tutorial walks through setting up a Spring Boot backend, adding the Tess4J OCR library, creating a service and REST controller to recognize text from both local files and remote image URLs, and provides testing steps and deployment tips.

Java · OCR · REST API

0 likes · 8 min read

Top Architect

Mar 6, 2024 · Backend Development

Integrating Tess4J OCR into a Spring Boot Backend Service

This guide demonstrates how to integrate Tess4J OCR into a Spring Boot application, covering environment setup, Maven dependencies, adding language data, creating an OCR service class, building REST endpoints for local and remote image processing, and testing the solution.

Java · OCR · REST

0 likes · 8 min read

Code Ape Tech Column

Feb 2, 2024 · Artificial Intelligence

Integrating Tess4J OCR into a Spring Boot Application

This guide walks through setting up a Spring Boot project, adding Tess4J dependencies, configuring language data, implementing an OCR service class, exposing REST endpoints for local and remote image recognition, and testing the OCR functionality end‑to‑end.

Java · OCR · REST API

0 likes · 6 min read

Test Development Learning Exchange

Jan 21, 2024 · Fundamentals

How to Extract MP3 Files from a PDF Using Python

This guide explains step‑by‑step how to install required Python libraries, extract text and images from a PDF, perform OCR on the images, locate embedded MP3 data in the combined text, and save the audio file, providing complete sample code for each stage.

MP3 extraction · OCR · Python

0 likes · 4 min read

Open Source Tech Hub

Jan 20, 2024 · Artificial Intelligence

How to Set Up ModelScope with Anaconda and Run OCR Inference via PHP

This guide walks through installing Anaconda, creating a Python 3.10 conda environment, adding PyTorch and ModelScope libraries, installing domain-specific dependencies, verifying NLP pipelines, and using PHPY to call ModelScope's OCR model from PHP, complete with code snippets and troubleshooting tips.

AI inference · Anaconda · ModelScope

0 likes · 10 min read

Test Development Learning Exchange

Jan 4, 2024 · Artificial Intelligence

Solving Image Captchas in Selenium Automation with Python and OCR

This tutorial demonstrates how to use Python's urllib to download captcha images, apply pytesseract OCR for text extraction, and integrate the result into Selenium scripts to automate the entry of image captchas during web testing.

Automation · OCR · Python

0 likes · 4 min read

Sohu Tech Products

Dec 27, 2023 · Artificial Intelligence

OCR-Based Video Review System: Technology Selection, Optimization, and Model Fine-Tuning

This article describes an OCR‑based video review system built on PaddleOCR’s DB detector and SVTR recognizer, combined with multi‑level frame deduplication, message‑queue task decoupling, Redis prioritization, and dynamic thread‑pool scheduling. The models were fine‑tuned on 5,000 samples, and the system cuts daily frames from 794 million to 3.6 million, automatically detects over 230 abnormal videos per day, and replaces three manual reviewers. Future plans include GPU acceleration and cross‑instance gRPC dispatch.

AI · OCR · PaddleOCR

0 likes · 20 min read

Tencent Tech

Oct 20, 2023 · Artificial Intelligence

Tencent OCR's AI Triumph at ICDAR 2023: Four Championship Wins

At ICDAR 2023, Tencent's OCR team leveraged self‑developed algorithms and large‑model backbones to clinch four official championship titles across the DSText and SVRD tracks, showcasing breakthroughs in dense video text detection, tracking, end‑to‑end recognition, and structured information extraction.

ICDAR 2023 · OCR · Structured Information Extraction

0 likes · 14 min read

ZhongAn Tech Team

Oct 20, 2023 · Artificial Intelligence

Document Analytics & Anti‑Fraud Support Platform for Hong Kong Virtual Banking

This article describes the design and implementation of a Document Analytics & Anti‑Fraud Support platform for Hong Kong virtual banking, detailing its OCR/NLP‑driven pipeline, dynamic rule engine, multi‑template PDF processing, model training, and the resulting improvements in fraud detection and operational efficiency.

NLP · OCR · anti-fraud

0 likes · 18 min read

Bilibili Tech

Oct 13, 2023 · Artificial Intelligence

Multimodal Video High‑Energy Segment Extraction for Dynamic Video Covers

The authors present a multimodal system that automatically extracts high‑energy video segments for dynamic covers by analyzing subtitles, audio, visual frames, and danmu, employing LLM prompt‑tuning, scene‑cut detection, and aesthetic scoring to reduce manual effort and boost click‑through rates.

ASR · Large Language Model · Multimodal AI

0 likes · 14 min read

Rare Earth Juejin Tech Community

Aug 16, 2023 · Artificial Intelligence

Deep Dive into OCR – Chapter 2: Development and Classification of OCR Technology

This article provides a comprehensive overview of OCR technology, detailing the evolution from traditional hand‑crafted methods to modern deep‑learning approaches, describing image preprocessing, text detection and recognition pipelines, summarizing classic machine‑learning algorithms, and presenting a practical OpenCV implementation with Python code.

OCR · Python · computer vision

0 likes · 23 min read

Rare Earth Juejin Tech Community

Aug 12, 2023 · Artificial Intelligence

An Introduction to OCR: Concepts, History, Applications, Datasets, and Technical Workflow

This article provides a comprehensive overview of Optical Character Recognition (OCR), covering its definition, historical development, classification, real‑world applications, technical pipeline, common challenges, mitigation strategies, popular datasets, model performance comparisons, and leading open‑source platforms.

OCR · Optical Character Recognition · computer vision

0 likes · 16 min read

Rare Earth Juejin Tech Community

Jul 27, 2023 · Artificial Intelligence

Implementing Text‑Based Image Search Using OCR, Transformers, and Vector Databases

This article explains how to build a text‑to‑image search system by first extracting text with OCR, then storing image paths and textual embeddings in a SQLite or Milvus vector database, and finally improving retrieval with Transformer‑based sentence embeddings and image‑captioning models.

Milvus · OCR · Python

0 likes · 16 min read

Test Development Learning Exchange

Jul 24, 2023 · Backend Development

Automating Arithmetic Captcha Solving with Python, Requests, pytesseract, and Selenium

This guide explains how to programmatically download arithmetic captcha images, use OCR to extract and compute the expression, and automatically click the correct image on a website by combining Python requests, pytesseract, and Selenium for web automation.

OCR · Python · Selenium

0 likes · 8 min read

Test Development Learning Exchange

Jul 16, 2023 · Artificial Intelligence

Comparing Python OCR Libraries: pyocr, pytesseract, and python‑tesseract for Interface Automation

This article compares three popular Python OCR libraries—pyocr, pytesseract, and python‑tesseract—explaining their installation, basic usage, and how they can be applied in interface automation tasks to extract text from images, with code examples for each.

OCR · Python · pyocr

0 likes · 6 min read

php Courses

Jun 29, 2023 · Backend Development

How to Extract Text from Images Using PHP and Tesseract OCR

This tutorial demonstrates how to install the Tesseract OCR library via Composer, set up a PHP script to load an image, create a TesseractOCR instance, run the OCR process, and output the extracted text, providing complete sample code for each step.

OCR · backend · image-processing

0 likes · 3 min read

Test Development Learning Exchange

Jun 22, 2023 · Frontend Development

Building a Chrome Extension for Image OCR Using Python and Tesseract

This tutorial walks through creating a Chrome extension that captures images from web pages, sends them to a Python‑backed Tesseract OCR engine, and displays the recognized text, covering the plugin's file structure, manifest configuration, JavaScript code, HTML UI, CSS styling, and installation steps.

Chrome Extension · JavaScript · OCR

0 likes · 7 min read

Python Crawling & Data Mining

Jun 21, 2023 · Backend Development

How to Bypass Captchas with Python Selenium and OCR – A Step‑by‑Step Guide

This article walks through solving Python web‑scraping captcha challenges by using Selenium to capture the image, applying OCR for recognition, and offering alternative request‑based methods, while also addressing common driver version mismatches.

OCR · Python · captcha

0 likes · 7 min read

High Availability Architecture

Jun 15, 2023 · Artificial Intelligence

InferX Inference Framework: Challenges, Architecture, Optimizations, and Triton Integration

The article presents the background, challenges, and objectives of Bilibili's AI services, introduces the self‑developed InferX inference framework with its quantization and sparsity optimizations, details OCR‑specific enhancements, and describes how integrating InferX with Nvidia Triton dramatically improves throughput, latency, and GPU utilization.

CUDA · Model Quantization · OCR

0 likes · 10 min read

Test Development Learning Exchange

May 22, 2023 · Fundamentals

Python 3 Practical Projects: PDF/Word Conversion, Image Processing, and OCR Tools

This tutorial presents seven Python3 utilities—including PDF‑to‑Word, image‑to‑PDF/Word, image compression, filtering, Excel conversion, and OCR—detailing required libraries, step‑by‑step procedures, and complete code examples to streamline everyday file‑format tasks.

OCR · file conversion · image-processing

0 likes · 7 min read

DataFunTalk

May 13, 2023 · Artificial Intelligence

Multimedia Content Understanding at Weibo: Video Summarization, Quality Assessment, OCR, Embedding, and CV‑CUDA Optimization

This article presents Weibo's comprehensive multimedia content understanding pipeline, covering video summarization techniques, quality assessment models, OCR advancements, video embedding strategies, and the performance benefits of CV‑CUDA acceleration, while highlighting real‑world applications and engineering trade‑offs.

CV-CUDA · Embedding · OCR

0 likes · 32 min read

DataFunSummit

Apr 7, 2023 · Artificial Intelligence

Comprehensive Overview of OCR: Types, Models, Pre‑training Techniques, and DIY Pipelines on ModelScope

This article provides a detailed introduction to OCR technology, covering its fundamental concepts, major categories (document, scene, and handwritten OCR), typical processing pipelines, a suite of open‑source models on ModelScope—including detection, recognition, and table OCR—and recent multimodal pre‑training methods such as VLDoc and VLPT.

ModelScope · OCR · Table OCR

0 likes · 15 min read

Python Programming Learning Circle

Mar 8, 2023 · Artificial Intelligence

Using ddddocr SDK for Captcha Recognition in Python

This article introduces the open‑source ddddocr SDK, demonstrates how to install it and use it in Python to automatically solve three common captcha types—slider, click‑based, and alphanumeric—providing code examples and result explanations for each.

OCR · captcha · computer vision

0 likes · 4 min read

ELab Team

Feb 20, 2023 · Artificial Intelligence

How MegaPortal Brings Stable Diffusion to iOS: A Hands‑On Guide

MegaPortal is an easy‑to‑use AI model loader for Apple devices that lets users configure visual‑block Snippets for tasks such as face‑filtering, Genshin Impact gacha recommendation, and Stable Diffusion image generation, with step‑by‑step tutorials, system requirements, cache clearing, model downloads, and a call for iOS‑dev help.

@snippet · AI Model Loader · MegaPortal

0 likes · 20 min read

DataFunSummit

Jan 23, 2023 · Artificial Intelligence

Intelligent Document Processing: Core Technologies, Techniques, and Practical Insights

This article explains intelligent document processing (IDP) by describing its core components—OCR, document parsing, and information extraction—detailing various OCR and text‑detection algorithms, discussing document layout reconstruction, table parsing, domain‑specific model adaptation, system optimization, and productization challenges, and outlining future research directions.

AI · Document Parsing · Intelligent Document Processing

0 likes · 27 min read

Laiye Technology Team

Dec 16, 2022 · Artificial Intelligence

Efficient Production of Scene-specific OCR Models Using an AI Platform

This article explains how a unified AI platform enables rapid, data‑driven creation, training, deployment, and evaluation of OCR models for visually distinct text regions such as seals, meter readings, license plates, and VIN codes, while minimizing hardware and annotation costs.

AI platform · Kubeflow · Model Training

0 likes · 7 min read

Tencent Cloud Developer

Dec 12, 2022 · Artificial Intelligence

Performance Optimization of Tencent Cloud OCR Service: Reducing Latency and Improving Throughput

Tencent Cloud’s OCR team cut average response time from 1.8 seconds to under one second and boosted throughput by over 50% by redesigning the model with self‑attention, accelerating inference with a Tensor‑Network accelerator, shrinking RPC payloads, enabling asynchronous logging, and optimizing multi‑region GPU memory utilization.

AI model · Cloud Services · Latency Reduction

0 likes · 13 min read

Laiye Technology Team

Nov 23, 2022 · Artificial Intelligence

Design and Practices of a Data‑Driven OCR Testing System

The article describes Laiye's shift to a data‑driven deep‑learning workflow and presents the design, macro‑ and micro‑analysis features, visual diff tools, distributed tracing, and code examples of their OCR testing system that accelerate model evaluation and iterative optimization.

AI · Data‑Driven · MLOps

0 likes · 11 min read

Shopee Tech Team

Nov 10, 2022 · Artificial Intelligence

ShopeeVideo OCR: Multi-language Text Recognition System for E-commerce Video

ShopeeVideo OCR is a multi‑language text‑recognition system for Southeast Asian e‑commerce videos that unifies detection, Transformer‑based recognition, layout analysis, and large‑scale synthetic data generation to handle Indonesian, Filipino, English, Vietnamese, Thai and Chinese scripts, delivering industry‑leading accuracy and winning thirteen ICDAR first‑place awards.

Data Synthesis · Multi-language OCR · OCR

0 likes · 15 min read

DataFunTalk

Nov 10, 2022 · Artificial Intelligence

A Comprehensive Overview of OCR Technology Development and Engineering Practices

This article reviews the 40‑year evolution of Optical Character Recognition, discusses its integration with Intelligent Document Processing, outlines recent research hotspots such as scene text recognition and domain‑specific symbol detection, and shares practical engineering experiences and future directions from Datagrand.

Document processing · Intelligent Document Processing · OCR

0 likes · 24 min read

Zhuanzhuan Tech

Nov 9, 2022 · Artificial Intelligence

Applying OCR to Game Skin Recognition: Filtering Owned Skins and Tolerant Text Matching

This article describes how OCR technology is used in a game marketplace to automatically extract skin parameters from user‑uploaded images, outlines methods for separating owned skin regions from background using color analysis, and presents a tolerant matching solution based on Rabin‑Karp hashing to handle OCR errors.

Game Development · Image processing · Java

0 likes · 10 min read

Baidu Geek Talk

Oct 17, 2022 · Artificial Intelligence

OCR Technology: PaddleOCR and Paddle.js Integration

The article explains OCR fundamentals and details how Baidu’s open‑source PaddleOCR suite can be converted and run in browsers via the @paddlejs‑models/ocr SDK, describing model initialization, detection and CRNN‑based recognition pipelines, and presenting benchmark results that show the newer ch_PP‑OCRv2 model achieving higher accuracy and faster inference than the mobile variant.

AI · OCR · Paddle.js

0 likes · 9 min read

Rare Earth Juejin Tech Community

Oct 10, 2022 · Artificial Intelligence

Practical Guide to OCR Text Recognition, Message Push, Image Processing, and Android UI Dump for Automation

This tutorial walks through OCR fundamentals using pytesseract and chineseocr_lite, demonstrates how to push notifications via Server酱, provides reusable Python image‑processing utilities, and shows how to dump and parse Android UI XML for automated interactions.

Android · Automation · ImageProcessing

0 likes · 18 min read

HaoDF Tech Team

Oct 8, 2022 · Artificial Intelligence

Exploring Transformer Technology and Its Applications in NLP, Computer Vision, and OCR at Haodf.com

This article introduces the Transformer architecture, explains its attention mechanism, details its adaptations for natural language processing, computer vision, and OCR tasks, and presents experimental results of various models such as BERT, ELECTRA, Swin Transformer, and CRNN-BCN on large-scale medical data from Haodf.com.

NLP · OCR · Swin Transformer

0 likes · 39 min read

DataFunSummit

Sep 6, 2022 · Artificial Intelligence

Recent Advances in Self‑Supervised Learning for Text Recognition (OCR)

This article reviews recent progress in applying self‑supervised learning to OCR text recognition, covering mainstream model architectures, key considerations for self‑supervised tasks on text images, and detailed analyses of representative papers such as SeqCLR, SimAN, and DiG, highlighting their designs, experiments, and results.

OCR · computer vision · contrastive learning

0 likes · 20 min read

DevOps

Aug 23, 2022 · Artificial Intelligence

Intelligent Automation Testing: Self‑Healing and Machine‑Learning Techniques

This article reviews the evolution of automated testing toward intelligent solutions, explaining self‑healing mechanisms, machine‑learning‑driven object recognition, computer‑vision and OCR approaches, industry tools such as Healenium and Airtest, and future prospects for zero‑code AI‑powered test automation.

AI · Automation testing · OCR

0 likes · 13 min read

Python Crawling & Data Mining

Aug 18, 2022 · Artificial Intelligence

How to Quickly Extract Text from Images in Python Using ddddocr and OpenCV

This article walks through a Python OCR solution for a blank image output problem, demonstrates a working ddddocr code snippet, and suggests an alternative OpenCV preprocessing step, providing clear screenshots and concise explanations for effective image text extraction.

Image processing · OCR · computer vision

0 likes · 3 min read

Laiye Technology Team

Aug 15, 2022 · Artificial Intelligence

Recent Advances in Self‑Supervised Learning for Text Recognition

This article reviews recent self‑supervised learning approaches for optical character recognition, covering mainstream OCR model architectures, key factors for applying contrastive and masked image modeling methods to text images, and detailed analyses of representative works such as SeqCLR, SimAN, and DiG, including their designs and experimental results.

OCR · contrastive learning · masked image modeling

0 likes · 19 min read

DataFunSummit

Jul 27, 2022 · Artificial Intelligence

Intelligent Creative Advertising: Content Understanding, Generation, and Distribution at JD.com

This article presents JD.com's end‑to‑end intelligent creative system, covering the background of content‑driven e‑commerce, a multi‑stage content understanding pipeline, AI‑powered video, image and copy generation, multimodal creative selection and distribution, and real‑world business impact.

AI · Advertising · Multimodal

0 likes · 27 min read

Python Programming Learning Circle

Jul 21, 2022 · Artificial Intelligence

Building an Automatic Math Problem Grading System with Python and Convolutional Neural Networks

This tutorial explains how to generate synthetic digit images, train a CNN model to recognize handwritten numbers and operators, segment scanned math worksheets using projection techniques, evaluate each expression with Python's eval, and overlay the results on the original image to provide automatic grading feedback.

Automation · CNN · OCR

0 likes · 26 min read

Sohu Tech Products

Jul 20, 2022 · Mobile Development

Building a Mobile Paper‑Reading App with OpenCV OCR and Text‑to‑Speech

A middle‑aged Android developer recounts breaking his child's "Niu Ting Ting" device, then details how he recreated its functionality by integrating OpenCV‑based paper detection, OCR, and TTS into a mobile app, complete with code snippets and performance results.

Android · Image processing · Mobile Development

0 likes · 14 min read

Laiye Technology Team

Jul 16, 2022 · Artificial Intelligence

Seal (Stamp) Recognition in Intelligent Document Processing: Challenges, Methods, and Experiments

This article explains how intelligent document processing uses deep‑learning‑based seal detection and OCR techniques—enhanced YOLOv5, multi‑label loss, combined NMS, and end‑to‑end models such as Mask‑TextSpotter, ABCNet, PGNet, and TrOCR—to overcome diverse stamp styles, background interference, and image quality issues, presenting experimental results that surpass commercial OCR vendors.

AI · Document processing · OCR

0 likes · 13 min read

MaGe Linux Operations

Jul 3, 2022 · Backend Development

How to Automate 10,000 Video‑Channel Posts with Python and OCR for Massive Traffic

This guide shows how to use Python to scrape high‑quality chat screenshots, apply OCR, generate silent chat videos, batch‑download matching audio from short‑video platforms, and combine them into thousands of unique WeChat Video Channel clips, leveraging volume to outsmart recommendation algorithms and boost traffic.

Automation · OCR · Python

0 likes · 11 min read

Python Programming Learning Circle

Apr 27, 2022 · Fundamentals

Python Parking Lot Management Application – Project Structure and Core Code

This article documents a Python parking lot management application, detailing its project directory layout, describing each module such as button handling, OCR, and time utilities, and presenting key pygame-based code that renders parking statistics and vehicle information from an Excel file.

Excel · OCR · data-processing

0 likes · 5 min read

Programmer DD

Apr 18, 2022 · Artificial Intelligence

Unlocking Captcha Secrets: How the Open‑Source ddddocr Python Library Works

This article introduces the open‑source Python library ddddocr, explains its evolution from version 1.2.0 to 1.4.3—including OCR, target detection, and slider recognition features—and shows how it leverages deep‑learning and OpenCV to simplify captcha solving for developers.

OCR · captcha · deep learning

0 likes · 4 min read

DataFunTalk

Apr 5, 2022 · Artificial Intelligence

Applying AI Technologies in the Youdao Dictionary Pen: Scanning, Offline Translation, and Edge ML Library

This article presents a technical overview of the Youdao Dictionary Pen, describing its hardware design, real‑time scanning and point‑query image processing, on‑device offline translation with model compression techniques, and the high‑performance Edge ML Library (EMLL) that enables efficient AI inference on constrained edge hardware.

AIEdge ML LibraryOCR

0 likes · 18 min read

NetEase LeiHuo Testing Center

Apr 1, 2022 · Artificial Intelligence

Learning OCR for Game Text Recognition: From Data Preparation to CRNN Model Training

This article documents the author’s step‑by‑step journey of building an OCR system for recognizing Chinese characters in a card‑game UI, covering game selection, technical background, data generation, deep‑learning model training with CRNN, real‑image data collection, optimization attempts, and final performance evaluation.

CRNNData AugmentationEasyOCR

0 likes · 15 min read

Laiye Technology Team

Mar 25, 2022 · Artificial Intelligence

Laiye OCR Error‑Correction Model: Architecture, Implementation, and Evaluation

This article describes Laiye's OCR error‑correction system, detailing the background challenges of Chinese character recognition, the analysis of three possible solutions, the chosen post‑processing approach, model architecture, training data, loss design, online inference, and experimental results showing a measurable performance boost.

Chinese textError CorrectionOCR

0 likes · 13 min read

Python Programming Learning Circle

Mar 3, 2022 · Artificial Intelligence

Ten‑Line Python Projects: QR Code, Word Cloud, Image Segmentation, Sentiment Analysis, Mask Detection, Message Spam, OCR, and a Simple Game

This article presents a series of concise Python examples—each under ten lines—demonstrating how to generate QR codes, create word clouds, perform image segmentation, conduct sentiment analysis, detect masks, automate message sending, extract text with OCR, and build a basic number‑guessing game, showcasing the versatility of Python for quick prototyping across AI and utility tasks.

GameOCRQR code

0 likes · 10 min read

DataFunSummit

Jan 5, 2022 · Artificial Intelligence

Improving Financial Micro‑Business Efficiency with OCR: Challenges, Applications, and an Intelligent Platform

This article explores how optical character recognition (OCR) technology can address the financing pain points of micro‑enterprises by automating document verification, enhancing risk assessment, and enabling an end‑to‑end intelligent OCR platform built on deep‑learning models, data pipelines, and deployment automation.

Document AutomationMicro BusinessOCR

0 likes · 15 min read

Yiche Technology

Jan 4, 2022 · Artificial Intelligence

Yiche OCR System: Architecture, Data Expansion, Multi‑Branch Optimization, and Server Migration

The Yiche OCR system combines a DBNet‑based text detector and a CRNN recognizer, enhances performance on natural‑scene texts through data expansion, multi‑branch dictionaries, distribution‑aware weighting, and accelerates training via IPEX and parallel processing on CPU servers.

CRNNDBNetOCR

0 likes · 11 min read

Laiye Technology Team

Dec 31, 2021 · Artificial Intelligence

Overview of Table Recognition Techniques and Practical Implementation

This article reviews the challenges of extracting structured table data from images, compares two‑stage and end‑to‑end OCR approaches, evaluates four state‑of‑the‑art table‑recognition models (SPLERGE, CascadeTabNet, TableMASTER, UnetTable), and presents a practical deployment workflow with performance metrics.

AIOCRStructured Data

0 likes · 14 min read

MaGe Linux Operations

Dec 29, 2021 · Mobile Development

Automate Princess Connect with Python, ADB, and OpenCV: A Step‑by‑Step Guide

This tutorial shows how to use Python, ADB, and OpenCV to automate the mobile game Princess Connect, covering environment setup, device communication, screen capture, image matching, OCR, and script snippets for clicking, typing, and switching accounts.

ADBGame ScriptingImage processing

0 likes · 9 min read

Python Crawling & Data Mining

Dec 17, 2021 · Artificial Intelligence

Decoding Randomized Custom Fonts with Python: Glyph Matching and OCR Techniques

This article explains how to handle custom web fonts whose glyph order or shapes are randomized by extracting glyph metadata with FontTools, creating binary signatures for reliable matching, and applying image‑recognition OCR to decode characters when glyph contours also change, complete with code examples and step‑by‑step instructions.

OCRcustom fontsfontTools

0 likes · 32 min read

Baidu App Technology

Dec 7, 2021 · Artificial Intelligence

Paddle.js OCR SDK: Text Recognition in Web Browsers

Paddle.js OCR SDK brings Baidu’s lightweight PaddleOCR models to web browsers, offering init() and recognize() APIs that load the ch_PP-OCRv2 detection (DB) and recognition (CRNN with bidirectional LSTM) models in parallel, achieving 258 ms detection, 60 ms recognition, 0.52 F‑score, and a combined size under 12 MB.

AIOCRPaddle.js

0 likes · 7 min read

Python Crawling & Data Mining

Nov 26, 2021 · Fundamentals

How to Bypass Anti‑Scraping Defenses and Extract Hidden Prices with Selenium and OCR

This article demonstrates step‑by‑step how to overcome a website’s anti‑scraping defenses using Selenium with stealth options, retrieve CSS‑based price images, reconstruct the digits, and apply Tesseract OCR to accurately extract numeric data, providing complete Python code snippets throughout.

Image processingOCRSelenium

0 likes · 12 min read

Cyber Elephant Tech Team

Oct 14, 2021 · Artificial Intelligence

Mastering OCR: From Traditional Techniques to Deep Learning Solutions

This article provides a comprehensive overview of Optical Character Recognition, covering its traditional applications, the evolution to deep learning methods, key datasets, popular tools, and practical strategies for tackling diverse OCR challenges in real-world scenarios.

CRNNEASTOCR

0 likes · 18 min read

DataFunTalk

Sep 29, 2021 · Artificial Intelligence

Self‑Supervised Learning and Contrastive Learning for Computer Vision and OCR Applications

This article reviews self‑supervised learning techniques, common computer‑vision pretext tasks, contrastive loss functions, popular frameworks such as SimCLR, MoCo and SimSiam, and demonstrates their application to OCR captcha recognition with detailed implementation and experimental results.

OCRPyTorchTensorFlow

0 likes · 22 min read

Laiye Technology Team

Sep 24, 2021 · Artificial Intelligence

Self‑Supervised Learning and Contrastive Methods for Computer Vision and OCR Applications

This article surveys self‑supervised learning techniques for computer‑vision tasks, explains common pretext tasks and contrastive loss designs, reviews representative models such as SimCLR, MoCo, SwAV and SimSiam, and demonstrates their practical impact on a captcha‑OCR system with measurable accuracy gains.

OCRSimCLRSimSiam

0 likes · 23 min read

DataFunTalk

Sep 21, 2021 · Artificial Intelligence

Text Recognition Techniques for Content Safety: Risks, Workflow, Algorithms, and Deployment Optimization

This article explains how OCR-based text recognition is applied to content safety, detailing common risk categories, a step‑by‑step detection and recognition pipeline, mainstream detection and recognition algorithms such as regression‑based and segmentation‑based methods, and practical deployment and performance optimization strategies.

AIContent SafetyOCR

0 likes · 15 min read

Baidu Geek Talk

Sep 8, 2021 · Artificial Intelligence

How PP‑OCRv2 Boosts OCR Speed and Accuracy with Five Key Innovations

The article provides a comprehensive technical overview of PaddleOCR's PP‑OCRv2, detailing its five major algorithmic enhancements, performance improvements over previous versions, historical milestones, core capabilities, and links to the open‑source repositories for developers interested in state‑of‑the‑art OCR solutions.

Data AugmentationModel OptimizationOCR

0 likes · 10 min read

Python Crawling & Data Mining

Aug 25, 2021 · Artificial Intelligence

Quickly Solve Captchas with the Lightweight ddddocr Python Library

This article introduces the ddddocr Python library for fast, code‑light captcha recognition, compares it with Pillow + pytesseract and the Baidu API, provides installation steps, usage examples, performance tips, and discusses its accuracy limits.

OCRcaptchacomputer vision

0 likes · 4 min read

ByteDance SE Lab

Jul 23, 2021 · Mobile Development

How to Accurately Measure Mobile App Response Time Using Video Frame Detection and OCR

This article presents a method for precisely measuring mobile app response latency by extracting video frames, detecting start and end frames through image markers and OCR, and calculating the time difference, offering a high‑precision, customizable solution for performance evaluation across diverse app scenarios.

OCRapp latencyframe detection

0 likes · 12 min read

MaGe Linux Operations

Jul 13, 2021 · Artificial Intelligence

Build a Batch Image Translation Tool with Youdao OCR API in Python

This article walks through creating a Python desktop demo that uses Youdao's OCR translation API to batch‑process cosmetic product label images, covering API credential setup, request parameters, signature generation, core code snippets, and a summary of the translation results.

APIOCRPython

0 likes · 10 min read

Python Programming Learning Circle

Jul 3, 2021 · Artificial Intelligence

Automatic PDF Slide Transcription Using Deep Learning OCR

This article demonstrates how to automatically convert PDF slide decks into editable markdown text by first converting each page to images, then applying a deep‑learning OCR pipeline (CTPN for detection and CRNN for recognition) with Python code examples, achieving high transcription accuracy.

Image processingOCRPDF conversion

0 likes · 6 min read

TiPaiPai Technical Team

Jun 28, 2021 · Artificial Intelligence

How Deep Learning Unwarps Twisted Document Images: DocUNet & DewarpNet Explained

This article reviews two end‑to‑end deep‑learning approaches—DocUNet (CVPR 2018) and DewarpNet (ICCV 2019)—for correcting warped document images, detailing their network architectures, synthetic data generation, loss functions, experimental results, and the remaining challenges in document dewarping.

Image processingOCRcomputer vision

0 likes · 14 min read

Python Programming Learning Circle

Jun 25, 2021 · Artificial Intelligence

Batch Image Translation Demo Using Youdao OCR API with Python

This article presents a step‑by‑step Python demo that uses Youdao's OCR translation API to batch‑process cosmetic product images, covering API key setup, request parameters, signature generation, GUI implementation with Tkinter, and code snippets for file selection, result storage, and API invocation.

AIBatch ProcessingOCR

0 likes · 10 min read

TiPaiPai Technical Team

Jun 18, 2021 · Artificial Intelligence

Mastering Text Recognition: Encoder & Decoder Strategies Explained

This article reviews modern text‑recognition systems, detailing how encoders such as CNN, CNN‑BiLSTM, and Transformer‑based models extract visual features, and how decoders like Position Attention, Transformer decoders, and RNN Seq2Seq align variable‑length text, while also discussing CTC loss and practical design choices.

CNNCTCEncoder

0 likes · 9 min read

TiPaiPai Technical Team

Jun 17, 2021 · Artificial Intelligence

From Pixels to Words: The Evolution and Challenges of Text Detection

This article traces the origins, unique difficulties, method classifications, and current advancements of scene text detection, highlighting how AI has enabled computers to read images and the ongoing research to improve accuracy, speed, and multilingual support.

AIOCRcomputer vision

0 likes · 8 min read

Python Crawling & Data Mining

Jun 13, 2021 · Artificial Intelligence

How to Crack Different Captcha Types with Python OCR and Selenium

This article explains the main captcha varieties—input, sliding, grid, and click‑based—and provides step‑by‑step Python solutions using OCR libraries, image preprocessing, and Selenium automation to bypass them effectively.

Image processingOCRPython

0 likes · 14 min read

Xianyu Technology

Jun 3, 2021 · Mobile Development

Extending Flutter UI Automation: Analysis of Flutter Driver, Integration Test, and Xianyu's Hybrid Approach

The article explains that Flutter Driver and Integration Test struggle to locate elements in hybrid native‑Flutter apps, then describes Xianyu’s approach of extending native UI automation with OCR, image‑matching, and a layered page‑object architecture, achieving over 98% success across 500+ runs.

FlutterImage processingOCR

0 likes · 9 min read

TiPaiPai Technical Team

May 21, 2021 · Artificial Intelligence

How AI Powers Automatic Homework Grading: Challenges and Solutions

Automatic homework grading leverages AI to transform captured images into graded results through preprocessing, layout analysis, OCR, answer matching, and strategy modules, while addressing three question categories—logical, text‑rich, and graphic—each presenting distinct technical challenges and future research directions.

AIEducation TechnologyImage processing

0 likes · 7 min read

iQIYI Technical Product Team

Mar 26, 2021 · Artificial Intelligence

Insights into OCR Technology at iQIYI: Development, Challenges, and Applications

iQIYI’s OCR journey, explained by researcher Harlon, covers the evolution from separate detection and recognition pipelines to end‑to‑end models, key algorithms like CTPN, DB and CRNN, large‑scale simulated training, diverse video‑text applications, and future goals such as mobile deployment and tighter NLP integration.

AIOCRPaddleOCR

0 likes · 21 min read

Huawei Cloud Developer Alliance

Mar 23, 2021 · Artificial Intelligence

How to Recognize Credit Card Numbers with OpenCV: A Step‑by‑Step Tutorial

This tutorial walks through a project‑based OpenCV workflow that reads a digit template, preprocesses both template and credit‑card images, extracts individual numbers, matches them against the template, and finally overlays the recognized digits onto the original image, illustrating core computer‑vision techniques.

Image processingOCRPython

0 likes · 10 min read

Amap Tech

Mar 22, 2021 · Artificial Intelligence

Visual Technology for Automated POI Name Generation: STR, Text Detection, and Naming Practices

Amap’s visual‑technology pipeline automatically generates and updates POI names by crowdsourcing street‑level images, applying deep‑learning scene‑text recognition, dual‑branch classification of text attributes, and a BERT‑plus‑graph‑attention model that selects and orders recognized text, achieving about 95% naming accuracy.

Name GenerationOCRPOI

0 likes · 14 min read

58 Tech

Mar 17, 2021 · Artificial Intelligence

Practical Applications of OCR Technology in 58 Information Security Scenarios: Layout Analysis

This article presents the practical deployment of OCR technology within 58’s information‑security workflows, focusing on layout‑analysis techniques for document and credential recognition, detailing rule‑based, template‑matching, object‑detection, and image‑segmentation methods, their implementation steps, experimental results, advantages, limitations, and future directions.

Document RecognitionLayout AnalysisOCR

0 likes · 18 min read

21CTO

Mar 12, 2021 · Artificial Intelligence

How AI Is Revolutionizing Buddhist Scriptures: Automatic Punctuation, OCR, and Translation

This article describes how a team at Longquan Temple leverages deep‑learning, NLP, and OCR technologies to automatically punctuate, recognize, and translate the massive Buddhist canon, achieving near‑human accuracy and dramatically accelerating scholarly work on ancient texts.

AIAutomatic PunctuationBuddhist Texts

0 likes · 10 min read

Java Architect Essentials

Mar 7, 2021 · Artificial Intelligence

ID Card OCR Project Using JavaCPP, OpenCV, and Tess4j

This article describes a Java-based ID card number recognition project that integrates Tess4j with JavaCPP to leverage OpenCV functionality without requiring a separate OpenCV installation, outlines required software, troubleshooting steps, recent updates, and provides the project repository link.

ID CardJavaCPPOCR

0 likes · 4 min read

Tencent Cloud Developer

Mar 4, 2021 · Artificial Intelligence

WeChat OCR: Implementation of Image Text Extraction Feature

WeChat’s 8.0 update introduced an OCR pipeline that first quickly detects text in images, classifies the image type, applies a lightweight multi‑language detection network and a MobileNetV3‑based DBNet recognizer with a multi‑task CTC/Attention model, then merges results via a rule‑based layout analyzer to deliver accurate, well‑formatted extracted text across diverse languages and document types.

DBNetLayout AnalysisOCR

0 likes · 13 min read

DataFunTalk

Feb 16, 2021 · Artificial Intelligence

Multimedia Content Understanding in Meitu Community: Video Classification, Fingerprinting, and OCR

This article presents Meitu Community's AI‑driven multimedia content analysis pipeline, covering short‑video classification, video fingerprinting, and OCR, detailing model choices, experimental results, and future directions for improving content audit, quality, tagging, and feature engineering.

AIFingerprintingMeitu

0 likes · 18 min read

DataFunTalk

Feb 12, 2021 · Artificial Intelligence

PlugNet: A Plug‑in Super‑Resolution Unit for Low‑Quality Text Recognition in Natural Scene OCR

This article introduces ImageDT's PlugNet, which combines deep‑learning OCR and super‑resolution techniques to improve low‑quality text recognition in natural scenes, detailing the company's background, OCR challenges, deep‑learning approaches, super‑resolution methods, the PlugNet architecture, experimental results, and future research directions.

AILow-Quality TextOCR

0 likes · 16 min read

php Courses

Feb 4, 2021 · Information Security

Analyzing and Decoding CAPTCHA Images Using PHP

This article explains how to extract RGB values from a CAPTCHA image with PHP, convert the pixel data into binary patterns, map those patterns to digits using a predefined dictionary, and achieve 100% recognition accuracy, illustrating a practical backend security technique.

Image processingOCRPHP

0 likes · 4 min read

HaoDF Tech Team

Feb 2, 2021 · Artificial Intelligence

AI‑Based Structuring of Medical Examination Reports: OCR, Text Detection, Classification, and NER

This article describes how a Chinese online medical platform tackled the large‑scale extraction and structuring of hospital report images by combining OCR, deep‑learning text‑region detection, fast text classification, and advanced NER techniques, detailing challenges, algorithm choices, performance results, and remaining issues.

AINERNLP

0 likes · 19 min read

Python Crawling & Data Mining

Jan 11, 2021 · Artificial Intelligence

Unlock Text from Images: A Hands‑On Guide to EasyOCR in Python

This article explains what OCR is, introduces the EasyOCR Python library, shows how to install it, walks through step‑by‑step usage with code examples, and summarizes the underlying deep‑learning techniques powering the library.

EasyOCROCRPython

0 likes · 6 min read

Programmer DD

Dec 17, 2020 · Artificial Intelligence

Turn Screenshots into Editable Text Instantly with TextShot – A Simple OCR Tool

TextShot, a newly released open‑source Python utility by GitHub user ianzhao05, lets you capture any screen region and instantly convert the image to editable text using Tesseract OCR, with multilingual support, hotkey integration, and step‑by‑step installation guidance for Windows and Linux.

OCRTextShotmultilingual

0 likes · 6 min read

Python Crawling & Data Mining

Dec 10, 2020 · Artificial Intelligence

How to Build a Python OCR & Image Converter with Baidu API and Pillow

Learn step‑by‑step how to use Baidu’s OCR service to extract text from images and employ the Pillow library to convert image formats in Python, including code snippets, API parameter details, and practical tips for handling local and online files.

Baidu APIOCRimage-processing

0 likes · 7 min read

Top Architect

Dec 4, 2020 · Artificial Intelligence

Java-based ID Card OCR Project Using OpenCV, JavaCPP, and Tess4J

This article introduces a Java OCR project for ID cards that integrates OpenCV, JavaCPP, and Tess4J to perform image preprocessing, region cropping, and character recognition without requiring OpenCV installation, and details its features, encountered issues, system requirements, updates, and source repository.

ID CardJavaJavaCPP

0 likes · 4 min read

New Oriental Technology

Nov 23, 2020 · Artificial Intelligence

A Seq2Seq Deep Learning Approach for Recognizing Mathematical Formulas in Images

This article presents a deep‑learning Seq2Seq model that converts images of mathematical formulas—including matrices, equations, fractions, and radicals—into LaTeX sequences with over 95% accuracy, detailing data preparation, LaTeX normalization, model architecture, training, inference, and post‑processing techniques.

Formula RecognitionLaTeXOCR

0 likes · 9 min read

Java Captain

Nov 9, 2020 · Artificial Intelligence

ID Card Number Recognition Project Using JavaCV, OpenCV, and Tess4J

This article introduces a Java-based ID card number recognition project that integrates OpenCV, Tess4J, and JavaCPP to perform OCR without prior training, outlines the encountered library linking issue, lists required software, and details recent updates such as chunked uploads and OpenCV version upgrade.

ID CardJavaOCR

0 likes · 3 min read

DataFunTalk

Sep 23, 2020 · Artificial Intelligence

PaddleOCR: 2020’s Outstanding Open‑Source OCR Suite with a 3.5 MB Ultra‑Light Model

PaddleOCR, the 2020 breakthrough in open‑source OCR, offers ultra‑light 3.5 MB multilingual models, high F1‑score performance across diverse scenarios, easy installation via pip, comprehensive documentation, custom training support, and deployment options for both server and mobile platforms, all backed by detailed benchmarks and code examples.

OCRPaddleOCRPython

0 likes · 8 min read
