Tagged articles

OCR

241 articles · Page 3 of 3

Jul 30, 2020 · Artificial Intelligence

Evolution and Practice of Scene Text Recognition Technology in Gaode Map Data Production

Gaode Maps leverages advanced scene text recognition, evolving from traditional image processing to deep‑learning based detection and recognition pipelines, integrating multi‑stage models, data augmentation, and synthetic sample generation to achieve high‑accuracy, fast POI and road data automation.

Gaode · OCR · deep learning

0 likes · 18 min read

Amap Tech

Jul 30, 2020 · Artificial Intelligence

Evolution and Practice of Scene Text Recognition Technology in Amap Map Data Production

Amap uses advanced scene text recognition combining detection and recognition modules, deep learning, data synthesis, and result fusion to automate map data production, achieving state-of-the-art performance and automating the majority of POI and road updates, significantly reducing labor costs.

OCR · computer vision · deep learning

0 likes · 18 min read

Alibaba Cloud Developer

Jul 30, 2020 · Artificial Intelligence

How Amap’s Scene Text Recognition Powers Accurate Maps: Evolution and Future Challenges

This article explains how Amap leverages scene text recognition to automate map data production, detailing the evolution from traditional image algorithms to deep‑learning models, the current detection and recognition framework, performance results, and future research directions for handling blur, data scarcity, and semantic understanding.

Amap · OCR · computer vision

0 likes · 19 min read

Alibaba Cloud Developer

Jul 29, 2020 · Artificial Intelligence

How Gaode Maps Boosts Accuracy with Advanced Scene Text Recognition

This article explains how Gaode Maps leverages traditional and deep‑learning based scene text recognition techniques—including character detection, sequence models, data synthesis, and multi‑stage frameworks—to automate POI and road data production with high precision and speed.

OCR · computer vision · deep learning

0 likes · 20 min read

Programmer DD

Jun 8, 2020 · Artificial Intelligence

Turn Screenshots into Editable Text Instantly with TextShot – A Python OCR Tool

TextShot is a Python-based OCR utility that captures a screen region and instantly converts the image into editable text, leveraging Tesseract and optional language parameters, with installation steps, hotkey integration, and guidance on image preprocessing for improved accuracy.

Image-to-Text · OCR · Python

0 likes · 7 min read

ITPUB

Jun 6, 2020 · Artificial Intelligence

How to Use the Open‑Source OCR Translator for Videos, Games, and PDFs

This guide explains how to set up and operate a free open‑source OCR‑based translator that captures on‑screen text from videos, games, or PDFs, registers the required Baidu AI API keys, configures translation sources, and demonstrates its performance on real content.

Baidu AI · GitHub · OCR

0 likes · 5 min read

Alibaba Cloud Developer

May 28, 2020 · Artificial Intelligence

Inside Alibaba Cloud’s 5‑Day Vision AI Bootcamp: Projects, Code & Insights

The article recaps Alibaba Cloud’s five‑day Vision AI training camp, detailing the core technologies, two hands‑on projects (a smart photo album and an ID‑card OCR system), student showcase code snippets, personal reflections, and the announcement of the next session.

AI · Alibaba Cloud · Back-end

0 likes · 13 min read

Programmer DD

May 9, 2020 · Artificial Intelligence

ChineseOCR Lite: Ultra‑Lightweight OCR Engine for Vertical Chinese Text

ChineseOCR Lite is an open‑source, ultra‑lightweight OCR solution that supports vertical Chinese text, runs on Linux/macOS via ncnn inference, and packs detection, recognition, and angle classification models into a total of just 17 MB, offering fast and accurate scene‑text processing.

Chinese OCR · OCR · computer vision

0 likes · 4 min read

Python Programming Learning Circle

Feb 28, 2020 · Artificial Intelligence

TensorFlow CNN for Fixed‑Length ID Card Number OCR

This article demonstrates how to build a TensorFlow‑based CNN to recognize fixed‑length 18‑digit Chinese ID card numbers, covering environment setup, synthetic data generation, model architecture, training procedure, and achieved accuracy of over 84%.

CNN · ID Card · OCR

0 likes · 18 min read

Python Programming Learning Circle

Feb 14, 2020 · Artificial Intelligence

Using pytesseract for Image‑to‑Text Conversion with Python

This tutorial introduces OCR basics, explains the Tesseract engine, and demonstrates how to install and use the Python pytesseract library to convert images into editable text with just a few lines of code, including practical tips for handling file paths and language settings.

Image-to-Text · OCR · computer-vision

0 likes · 4 min read

360 Quality & Efficiency

Jan 2, 2020 · Mobile Development

Common Element Locating Strategies in Appium for Mobile Automation

This article introduces Appium's basic element locating techniques—including id, name, class name, XPath, UIAutomator, and relative coordinates—explains how to handle non‑unique elements through iteration or OCR, and demonstrates image‑based locating with OpenCV and screenshot code examples.

Appium · Element Locating · OCR

0 likes · 5 min read

21CTO

Sep 28, 2019 · Backend Development

Cracking Dazhong Dianping’s CSS Encryption: A Step‑by‑Step Web Scraping Guide

This article walks through the challenges of scraping Dazhong Dianping, explains how the site hides numeric data with custom CSS fonts, and provides a complete Python workflow—including HTTP requests, font extraction, glyph rendering, and OCR—to decode and retrieve the protected information.

CSS encryption · OCR · Python

0 likes · 13 min read

Tencent Cloud Developer

Sep 19, 2019 · Artificial Intelligence

Inside Tencent Cloud OCR: Architecture, Performance, and Integration Guide

The article provides a comprehensive overview of Tencent Cloud’s OCR platform, detailing its service architecture, product capabilities, integration methods, performance metrics, engineering improvements, testing automation, and operational considerations, offering developers practical insights into building and deploying OCR solutions on the cloud.

OCR · Service Architecture · Tencent Cloud

0 likes · 10 min read

iQIYI Technical Product Team

Aug 23, 2019 · Mobile Development

iQIYI One‑Click Android App Health Check: Architecture, Implementation and Key Technologies

iQIYI’s one‑click Android app health check provides a lightweight, universal solution that automatically installs, traverses, and analyzes apps on a 100‑device cloud farm using ATX‑based drivers, OCR‑driven UI interaction, deep‑learning UI anomaly detection, static security analysis, performance metrics, and crash/ANR reporting, seamlessly integrating into CI pipelines.

ATX · Android · App Inspection

0 likes · 19 min read

Suning Technology

Aug 2, 2019 · Artificial Intelligence

How Suning’s OCR “Fire Eye” Robot Revolutionized Financial Invoice Processing

Suning’s OCR “Fire Eye” robot grew from a two‑year project started from scratch into an industry‑leading AI solution that automates invoice code extraction and verification across hundreds of financial document types, dramatically cutting manual effort and boosting accuracy.

Case Study · Finance Automation · OCR

0 likes · 9 min read

Tencent Cloud Developer

Jun 5, 2019 · Artificial Intelligence

Tencent Cloud OCR Technology: Principles, Challenges, and Industry Applications

Tencent Cloud OCR leverages deep‑learning‑based text detection and recognition, including Compact Inception and multi‑layer RNN refinements, to overcome challenges such as complex backgrounds, low resolution, and multilingual layouts, delivering over 90% accuracy for ID cards, bank cards, business licenses, handwritten text, and powering fast, cost‑saving applications in logistics, QQ, and WeChat Work.

Image processing · OCR · Optical Character Recognition

0 likes · 7 min read

Youku Technology

May 13, 2019 · Artificial Intelligence

How Youku Tackles Multimodal Video Understanding and Quality Control

This article outlines Youku's multimodal video content understanding pipeline, covering business needs, problem decomposition, data construction, model selection, OCR subtitle extraction, scene and action recognition, sample augmentation, noise handling, and multimodal fusion strategies for robust content moderation.

AI · OCR · action recognition

0 likes · 11 min read

MaGe Linux Operations

Apr 9, 2019 · Artificial Intelligence

How to Build and Crack Image Captchas with Python and Tesserocr

This tutorial explains the types of captchas, demonstrates how to generate image captchas using the Claptcha library, outlines preprocessing steps such as grayscale conversion, binarization, and denoising, and shows how to recognize them with the Tesserocr OCR engine, including handling noise and interference lines.

Image processing · OCR · Python

0 likes · 7 min read

MaGe Linux Operations

Mar 11, 2019 · Artificial Intelligence

How to Crack Image CAPTCHAs with Python: From PIL to pytesser

This guide explains the fundamentals of CAPTCHA recognition, covering computer graphics basics, image denoising, segmentation, binary conversion, and using Python's PIL and pytesser libraries to perform OCR on captcha images.

OCR · PIL · Python

0 likes · 7 min read

Alibaba Cloud Developer

Mar 6, 2019 · Artificial Intelligence

How Deep Learning Unwarps Curved Document Images for Better OCR

This article explores how deep‑learning‑based image dewarping techniques, from traditional hardware methods to modern U‑Net, Stacked U‑Net and Dilated U‑Net architectures, can correct warped document photos, improve OCR accuracy, and support intelligent verification in high‑throughput business scenarios.

OCR · U‑Net · deep learning

0 likes · 19 min read

Ctrip Technology

Feb 28, 2019 · Artificial Intelligence

OCR Techniques and Solutions for Ctrip Business: Deep Learning Based Text Detection and Recognition

This article presents an overview of computer‑vision based OCR in Ctrip's operations, detailing deep‑learning text detection methods for controlled and uncontrolled scenarios, sequence‑based recognition models, training strategies with synthetic data, and performance results, while discussing current challenges and future improvements.

AI · Ctrip · OCR

0 likes · 11 min read

Meituan Technology Team

Feb 28, 2019 · Artificial Intelligence

ICDAR 2019 Chinese Signboard Text Recognition Challenge (Meituan Dataset)

Meituan’s new 25,000‑image Chinese signboard dataset—captured across varied devices, locations, lighting and angles—serves as the benchmark for the ICDAR 2019 Robust Reading Challenge, which features four tasks (end‑to‑end recognition, text‑line localization, single‑character and string recognition) with a 20k/2k/3k train/validation/test split, character‑level annotations, and participation from academic and industry partners, with registration opening March 1 and the competition concluding at the September conference in Sydney.

Chinese text · ICDAR2019 · OCR

0 likes · 5 min read

Alibaba Cloud Developer

Jan 2, 2019 · Artificial Intelligence

How AI Detects Screenshot Bugs: From CNN Models to Image Clustering

Leveraging TensorFlow's CNN and OCR‑LSTM models, this article details how AI can automatically spot blank pages, UI anomalies, and garbled text in app screenshots, and describes a Jenkins‑driven retraining pipeline and hierarchical clustering to de‑duplicate images and boost manual review efficiency.

AI · CNN · Clustering

0 likes · 7 min read

iQIYI Technical Product Team

Dec 28, 2018 · Artificial Intelligence

AI‑Driven Visual Automation Testing Frameworks: Challenges, Opportunities, and the Aion Solution

The article examines shortcomings of traditional visual automation frameworks—weak cross‑platform support, ID dependence, and fragile screenshot matching—and shows how Aion’s hybrid approach, merging image‑processing segmentation with deep‑learning classification and OCR, delivers a more stable, cross‑platform, “what you see is what you get” testing solution while acknowledging remaining accuracy challenges.

AI testing · OCR · UI2Code

0 likes · 11 min read

MaGe Linux Operations

Nov 18, 2018 · Artificial Intelligence

How to Crack Image Captchas with Python: Generation, Pre‑processing, and OCR

This tutorial walks through the four main captcha types, focuses on image captchas, explains generation with the Claptcha library, details preprocessing steps such as grayscale conversion, binarization, denoising, and character segmentation, and demonstrates recognition using tesserocr, while showing the impact of noise and interference lines.

Image processing · OCR · captcha

0 likes · 6 min read

Youku Technology

Nov 2, 2018 · Artificial Intelligence

How AI Powers Next‑Gen Multimedia Content Retrieval: From OCR to Knowledge Graphs

This article examines the evolution of search, defines multimedia content retrieval, explores user scenarios such as voice, image, and video input, and details key AI techniques—including OCR, face recognition, and content knowledge graphs—that enable semantic understanding and ranking of video content.

Knowledge Graph · OCR · face recognition

0 likes · 12 min read

Xianyu Technology

Nov 2, 2018 · Artificial Intelligence

FireEye AI-Powered Automated Testing Framework: Architecture, Model Selection, and Retraining

FireEye is an AI‑driven automated UI testing framework that ingests simulated and real screenshots, preprocesses images and OCR text, and employs a CNN for page anomalies, an SSD detector for control anomalies, and an LSTM‑based classifier for text anomalies, with Jenkins‑triggered retraining, cloud model storage, and API serving, aiming to simplify testing and enable future AutoML enhancements.

AI · Keras · OCR

0 likes · 9 min read

Alibaba Cloud Developer

Sep 25, 2018 · Artificial Intelligence

How Deep Learning Unwarps Curved Document Images for Better OCR

This article explores the challenges of OCR on warped document images, reviews traditional and deep‑learning‑based correction methods, describes a synthetic dataset generation pipeline, proposes enhanced U‑Net architectures including stacked and dilated variants, evaluates them with MS‑SSIM, and outlines future research directions.

OCR · U-Net · deep learning

0 likes · 18 min read

Tencent Cloud Developer

Aug 10, 2018 · Artificial Intelligence

Overview of OCR Technology and Its Applications on Tencent Cloud

The talk outlines OCR’s evolution from early postal-code readers to modern deep‑learning models, explains Tencent Cloud’s fast, accurate services for printed and handwritten text—including table‑structured and general OCR—and showcases real‑world applications such as ID cards, business cards, license plates, checks, and medical documents while highlighting ongoing challenges and future enhancements.

AI · Cloud Services · OCR

0 likes · 19 min read

Tencent Cloud Developer

Aug 1, 2018 · Artificial Intelligence

How AI Powers Real-World Apps: From Face Filters to Medical Imaging

The July 28 Tencent Cloud community salon in Beijing gathered five AI experts who demonstrated practical AI applications—including computer‑vision face filters, OCR services, smart construction attendance, game AI, and breast‑cancer detection—showing how cloud‑based models, data pipelines, and deployment strategies turn research into usable products.

AI · OCR · cloud AI

0 likes · 21 min read

Meituan Technology Team

Jun 28, 2018 · Artificial Intelligence

Deep Learning-Based OCR Techniques at Meituan

Meituan’s OCR system replaces the classic preprocess‑segment‑recognize pipeline with deep‑learning components—CNN‑based text detection, synthetic‑data‑trained character models, and BLSTM‑CTC sequence recognition—delivering far higher accuracy on noisy, varied real‑world images such as menus, receipts, and IDs, though further integration with layout analysis remains needed.

OCR · Sequence Learning · computer vision

0 likes · 22 min read

Ctrip Technology

May 2, 2018 · Artificial Intelligence

Document OCR: From Computer Vision Fundamentals to Ctrip's Full-Text OCR Implementation

This article explains the evolution of optical character recognition, outlines the complete OCR processing pipeline—including image input, preprocessing, binarization, noise removal, tilt correction, layout analysis, character segmentation, recognition, and post‑processing—while showcasing Ctrip's real‑world OCR project, its architecture, accuracy metrics, and key computer‑vision techniques such as CNN, HSV, HOG, LBP, and Haar features.

CNN · Image processing · OCR

0 likes · 13 min read

MaGe Linux Operations

Mar 4, 2018 · Artificial Intelligence

How to Crack Image CAPTCHAs with Python: From Noise Reduction to OCR

This guide walks through the complete process of recognizing image CAPTCHAs using Python, covering graphics fundamentals, noise reduction, grayscale conversion, binarization, image segmentation, and OCR with PIL and pytesser, complete with installation steps and code examples.

OCR · PIL · Python

0 likes · 7 min read

MaGe Linux Operations

Oct 7, 2017 · Artificial Intelligence

How to Crack Image Captchas with Python: Grayscale, Binarization, and Tesserocr

This tutorial explains the four main captcha types, focuses on image‑based captchas, and walks through generating, preprocessing (grayscale, contrast, binarization, denoising, skew correction), and recognizing them with Python's Claptcha library and the Tesserocr OCR engine.

Image processing · OCR · Python

0 likes · 7 min read

Baidu Intelligent Testing

Sep 14, 2017 · Information Security

Automating Security Detection with Image Recognition: Workflow and Techniques

This article explains why security detection needs automation, compares static and dynamic analysis, and details an image‑recognition‑based pipeline—including grayscale conversion, edge detection, contour extraction, and OCR—to automatically identify risky app pop‑up warnings.

Automation · Dynamic Analysis · OCR

0 likes · 8 min read

Alibaba Cloud Developer

Jul 24, 2017 · Artificial Intelligence

How Alibaba’s AI Beats the KITTI Benchmark and Revolutionizes Visual Shopping

Alibaba’s AI breakthroughs—from a foot‑scanning shopping demo that lets a Google engineer instantly find matching shoes, to a record‑setting vehicle detection model on KITTI and world‑leading OCR for real‑time image review—showcase the power and commercial potential of modern computer‑vision research.

AI · OCR · computer vision

0 likes · 5 min read

Tongcheng Travel Technology Center

May 5, 2017 · Artificial Intelligence

Improving Passport OCR: Process, Preprocessing, and Prior Knowledge Corrections

This article outlines a comprehensive OCR workflow for passport recognition, covering image acquisition, preprocessing techniques, engine integration, and prior‑knowledge corrections to enhance accuracy and user experience, while sharing practical insights and performance results.

AI · Image processing · OCR

0 likes · 8 min read

Meituan Technology Team

Feb 10, 2017 · Artificial Intelligence

Deep Learning Applications in Semantic Matching, Image Quality Ranking, and OCR at Meituan-Dianping

Meituan‑Dianping leverages deep‑learning models—including ClickNet for semantic search matching, an AlexNet‑based image‑quality ranker, and a Faster‑RCNN/FCN‑driven OCR pipeline—to personalize results, select attractive POI images, and extract text, achieving higher click‑through rates, conversions, and operational efficiency across its O2O services.

AI Applications · Meituan · OCR

0 likes · 13 min read

Qunar Tech Salon

Dec 5, 2016 · Artificial Intelligence

Understanding Convolutional Neural Networks for OCR and CAPTCHA Recognition

This article introduces the fundamentals of neural networks for image recognition, explains regression vs classification, describes convolution, pooling and fully connected layers, illustrates the classic LeNet‑5 model on the MNIST dataset, and shows how a TensorFlow‑based CNN can be trained to recognize CAPTCHA images, achieving high accuracy.

CNN · LeNet-5 · OCR

0 likes · 10 min read

Qunar Tech Salon

Aug 8, 2016 · Artificial Intelligence

OCR Technology Overview and Implementation Steps for Card Number Recognition

This article provides a comprehensive overview of OCR technology, explains its definition and application scenarios, and details a five‑step workflow—including target extraction, preprocessing, character localization, digit matching, and format validation—specifically illustrated with bank card number recognition.

Bank Card Recognition · Image processing · Morphological Operations

0 likes · 9 min read

Ctrip Technology

Jun 29, 2015 · Artificial Intelligence

Bank Card Scanning and Recognition: Extending Support for Chinese Debit Cards

This article describes a project that enhances an open‑source card‑number scanning solution to recognize 19‑digit Chinese debit cards, addressing challenges such as black‑printed fonts, light‑colored embossed fonts, background filtering, single‑character OCR, and Luhn‑based checksum verification.

Bank Card Recognition · Image processing · OCR

0 likes · 6 min read
