Tagged articles
228 articles
iQIYI Technical Product Team
Aug 23, 2019 · Mobile Development

iQIYI One‑Click Android App Health Check: Architecture, Implementation and Key Technologies

iQIYI’s one‑click Android app health check provides a lightweight, universal solution that automatically installs, traverses, and analyzes apps on a 100‑device cloud farm using ATX‑based drivers, OCR‑driven UI interaction, deep‑learning UI anomaly detection, static security analysis, performance metrics, and crash/ANR reporting, seamlessly integrating into CI pipelines.

ATX · Android · App Inspection
19 min read
Tencent Cloud Developer
Jun 5, 2019 · Artificial Intelligence

Tencent Cloud OCR Technology: Principles, Challenges, and Industry Applications

Tencent Cloud OCR leverages deep‑learning‑based text detection and recognition, including Compact Inception and multi‑layer RNN refinements, to overcome challenges such as complex backgrounds, low resolution, and multilingual layouts. It delivers over 90% accuracy on ID cards, bank cards, business licenses, and handwritten text, and powers fast, cost‑saving applications in logistics, QQ, and WeChat Work.

Deep Learning · Image Processing · OCR
7 min read
Youku Technology
May 13, 2019 · Artificial Intelligence

How Youku Tackles Multimodal Video Understanding and Quality Control

This article outlines Youku's multimodal video content understanding pipeline, covering business needs, problem decomposition, data construction, model selection, OCR subtitle extraction, scene and action recognition, sample augmentation, noise handling, and multimodal fusion strategies for robust content moderation.

AI · Computer Vision · OCR
11 min read
MaGe Linux Operations
Apr 9, 2019 · Artificial Intelligence

How to Build and Crack Image Captchas with Python and Tesserocr

This tutorial explains the types of captchas, demonstrates how to generate image captchas using the Claptcha library, outlines preprocessing steps such as grayscale conversion, binarization, and denoising, and shows how to recognize them with the Tesserocr OCR engine, including handling noise and interference lines.

Captcha · Image Processing · OCR
7 min read
Alibaba Cloud Developer
Mar 6, 2019 · Artificial Intelligence

How Deep Learning Unwarps Curved Document Images for Better OCR

This article explores how deep‑learning‑based image dewarping techniques, from traditional hardware methods to modern U‑Net, Stacked U‑Net and Dilated U‑Net architectures, can correct warped document photos, improve OCR accuracy, and support intelligent verification in high‑throughput business scenarios.

Deep Learning · Model Evaluation · OCR
19 min read
Ctrip Technology
Feb 28, 2019 · Artificial Intelligence

OCR Techniques and Solutions for Ctrip Business: Deep Learning Based Text Detection and Recognition

This article presents an overview of computer‑vision based OCR in Ctrip's operations, detailing deep‑learning text detection methods for controlled and uncontrolled scenarios, sequence‑based recognition models, training strategies with synthetic data, and performance results, while discussing current challenges and future improvements.

AI · Computer Vision · Ctrip
11 min read
Meituan Technology Team
Feb 28, 2019 · Artificial Intelligence

ICDAR 2019 Chinese Signboard Text Recognition Challenge (Meituan Dataset)

Meituan’s new 25,000‑image Chinese signboard dataset, captured across varied devices, locations, lighting, and angles, serves as the benchmark for the ICDAR 2019 Robust Reading Challenge. The challenge features four tasks (end‑to‑end recognition, text‑line localization, single‑character recognition, and string recognition), a 20k/2k/3k train/validation/test split, and character‑level annotations, with participation from academic and industry partners. Registration opens March 1, and the competition concludes at the September conference in Sydney.

Chinese text · ICDAR2019 · OCR
5 min read
Alibaba Cloud Developer
Jan 2, 2019 · Artificial Intelligence

How AI Detects Screenshot Bugs: From CNN Models to Image Clustering

Leveraging TensorFlow's CNN and OCR‑LSTM models, this article details how AI can automatically spot blank pages, UI anomalies, and garbled text in app screenshots, and describes a Jenkins‑driven retraining pipeline and hierarchical clustering to de‑duplicate images and boost manual review efficiency.

AI · CNN · OCR
7 min read
iQIYI Technical Product Team
Dec 28, 2018 · Artificial Intelligence

AI‑Driven Visual Automation Testing Frameworks: Challenges, Opportunities, and the Aion Solution

The article examines the shortcomings of traditional visual automation frameworks (weak cross‑platform support, dependence on element IDs, and fragile screenshot matching) and shows how Aion’s hybrid approach, which merges image‑processing segmentation with deep‑learning classification and OCR, delivers a more stable, cross‑platform, what‑you‑see‑is‑what‑you‑get testing solution, while acknowledging remaining accuracy challenges.

AI testing · OCR · UI2Code
11 min read
MaGe Linux Operations
Nov 18, 2018 · Artificial Intelligence

How to Crack Image Captchas with Python: Generation, Pre‑processing, and OCR

This tutorial walks through the four main captcha types, focuses on image captchas, explains generation with the Claptcha library, details preprocessing steps such as grayscale conversion, binarization, denoising, and character segmentation, and demonstrates recognition using tesserocr, while showing the impact of noise and interference lines.

Captcha · Image Processing · OCR
6 min read
Youku Technology
Nov 2, 2018 · Artificial Intelligence

How AI Powers Next‑Gen Multimedia Content Retrieval: From OCR to Knowledge Graphs

This article examines the evolution of search, defines multimedia content retrieval, explores user scenarios such as voice, image, and video input, and details key AI techniques—including OCR, face recognition, and content knowledge graphs—that enable semantic understanding and ranking of video content.

Knowledge Graph · OCR · face recognition
12 min read
Xianyu Technology
Nov 2, 2018 · Artificial Intelligence

FireEye AI-Powered Automated Testing Framework: Architecture, Model Selection, and Retraining

FireEye is an AI‑driven automated UI testing framework that ingests simulated and real screenshots, preprocesses images and OCR text, and employs a CNN for page anomalies, an SSD detector for control anomalies, and an LSTM‑based classifier for text anomalies, with Jenkins‑triggered retraining, cloud model storage, and API serving, aiming to simplify testing and enable future AutoML enhancements.

AI · Automated Testing · Keras
9 min read
Alibaba Cloud Developer
Sep 25, 2018 · Artificial Intelligence

How Deep Learning Unwarps Curved Document Images for Better OCR

This article explores the challenges of OCR on warped document images, reviews traditional and deep‑learning‑based correction methods, describes a synthetic dataset generation pipeline, proposes enhanced U‑Net architectures including stacked and dilated variants, evaluates them with MS‑SSIM, and outlines future research directions.

Deep Learning · OCR · U-Net
18 min read
Tencent Cloud Developer
Aug 10, 2018 · Artificial Intelligence

Overview of OCR Technology and Its Applications on Tencent Cloud

The talk outlines OCR’s evolution from early postal-code readers to modern deep‑learning models, explains Tencent Cloud’s fast, accurate services for printed and handwritten text—including table‑structured and general OCR—and showcases real‑world applications such as ID cards, business cards, license plates, checks, and medical documents while highlighting ongoing challenges and future enhancements.

AI · Cloud Services · Deep Learning
19 min read
Tencent Cloud Developer
Aug 1, 2018 · Artificial Intelligence

How AI Powers Real-World Apps: From Face Filters to Medical Imaging

The July 28 Tencent Cloud community salon in Beijing gathered five AI experts who demonstrated practical AI applications—including computer‑vision face filters, OCR services, smart construction attendance, game AI, and breast‑cancer detection—showing how cloud‑based models, data pipelines, and deployment strategies turn research into usable products.

AI · Cloud AI · Computer Vision
21 min read
Meituan Technology Team
Jun 28, 2018 · Artificial Intelligence

Deep Learning-Based OCR Techniques at Meituan

Meituan’s OCR system replaces the classic preprocess‑segment‑recognize pipeline with deep‑learning components—CNN‑based text detection, synthetic‑data‑trained character models, and BLSTM‑CTC sequence recognition—delivering far higher accuracy on noisy, varied real‑world images such as menus, receipts, and IDs, though further integration with layout analysis remains needed.

Computer Vision · OCR · Sequence Learning
22 min read
Ctrip Technology
May 2, 2018 · Artificial Intelligence

Document OCR: From Computer Vision Fundamentals to Ctrip's Full-Text OCR Implementation

This article explains the evolution of optical character recognition, outlines the complete OCR processing pipeline—including image input, preprocessing, binarization, noise removal, tilt correction, layout analysis, character segmentation, recognition, and post‑processing—while showcasing Ctrip's real‑world OCR project, its architecture, accuracy metrics, and key computer‑vision techniques such as CNN, HSV, HOG, LBP, and Haar features.

CNN · Computer Vision · Image Processing
13 min read
Alibaba Cloud Developer
Jul 24, 2017 · Artificial Intelligence

How Alibaba’s AI Beats the KITTI Benchmark and Revolutionizes Visual Shopping

Alibaba’s AI breakthroughs—from a foot‑scanning shopping demo that lets a Google engineer instantly find matching shoes, to a record‑setting vehicle detection model on KITTI and world‑leading OCR for real‑time image review—showcase the power and commercial potential of modern computer‑vision research.

AI · Computer Vision · Deep Learning
5 min read
Meituan Technology Team
Feb 10, 2017 · Artificial Intelligence

Deep Learning Applications in Semantic Matching, Image Quality Ranking, and OCR at Meituan-Dianping

Meituan‑Dianping leverages deep‑learning models—including ClickNet for semantic search matching, an AlexNet‑based image‑quality ranker, and a Faster‑RCNN/FCN‑driven OCR pipeline—to personalize results, select attractive POI images, and extract text, achieving higher click‑through rates, conversions, and operational efficiency across its O2O services.

AI applications · Meituan · OCR
13 min read
Qunar Tech Salon
Dec 5, 2016 · Artificial Intelligence

Understanding Convolutional Neural Networks for OCR and CAPTCHA Recognition

This article introduces the fundamentals of neural networks for image recognition, explains regression vs classification, describes convolution, pooling and fully connected layers, illustrates the classic LeNet‑5 model on the MNIST dataset, and shows how a TensorFlow‑based CNN can be trained to recognize CAPTCHA images, achieving high accuracy.

CNN · Captcha · LeNet-5
10 min read
Qunar Tech Salon
Aug 8, 2016 · Artificial Intelligence

OCR Technology Overview and Implementation Steps for Card Number Recognition

This article provides a comprehensive overview of OCR technology, explains its definition and application scenarios, and details a five‑step workflow—including target extraction, preprocessing, character localization, digit matching, and format validation—specifically illustrated with bank card number recognition.

Bank Card Recognition · Computer Vision · Image Processing
9 min read
Ctrip Technology
Jun 29, 2015 · Artificial Intelligence

Bank Card Scanning and Recognition: Extending Support for Chinese Debit Cards

This article describes a project that enhances an open‑source card‑number scanning solution to recognize 19‑digit Chinese debit cards, addressing challenges such as black‑printed fonts, light‑colored embossed fonts, background filtering, single‑character OCR, and Luhn‑based checksum verification.

Bank Card Recognition · Computer Vision · Image Processing
6 min read