Tagged articles

image recognition

77 articles · Page 1 of 1

Jun 18, 2026 · Artificial Intelligence

DeepSeek’s New Image‑Recognition Mode Struggles to Identify Its Own CEO

After DeepSeek fully launched its image‑recognition mode, a hands‑on test found that the model can spot well‑known figures such as Jensen Huang, but it misreads text, fails on Chinese handwriting, cannot recognize its own CEO, Liang Wenfeng, and lags behind Gemini, GPT 5.5, and Claude in music‑theory reasoning.

AI comparison · Benchmark · DeepSeek

0 likes · 6 min read

Java Architect Essentials

Apr 17, 2026 · Backend Development

How to Integrate Tess4J OCR into a Spring Boot Application

This article explains OCR fundamentals, introduces Tesseract and its Java wrapper Tess4J, guides you through downloading language data, shows step‑by‑step Spring Boot integration with Maven dependencies and configuration classes, and provides test code for Chinese, English, and mixed‑language image recognition.

Language Data · OCR · Spring Boot

0 likes · 9 min read

Java Architect Handbook

Apr 1, 2026 · Backend Development

Integrating Tess4j OCR into a Spring Boot 3 Project

This guide explains OCR fundamentals, introduces Tesseract and Tess4j, shows how to download the required language data files, and provides step‑by‑step instructions with Maven configuration, Spring Boot properties, Java code, and test examples for Chinese, English, and mixed‑language image recognition.

OCR · Spring Boot · image recognition

0 likes · 11 min read

Architecture Digest

Mar 26, 2026 · Artificial Intelligence

How to Integrate Tess4j OCR into a Spring Boot 3 Application

This guide explains the fundamentals of OCR, introduces Tesseract and its Java wrapper Tess4j, shows how to download language data files, configure a Spring Boot 3 project with Maven dependencies and YAML settings, and provides comprehensive test code for Chinese, English, and mixed‑language image recognition.

Artificial Intelligence · OCR · Spring Boot

0 likes · 9 min read

SpringMeng

Mar 25, 2026 · Backend Development

How to Perform OCR in SpringBoot Using Tess4j

This tutorial explains OCR fundamentals, introduces Tesseract and its Java wrapper Tess4j, shows how to download language data, integrate Tess4j into a SpringBoot 3 project with Maven configuration, and provides test code for Chinese, English, and mixed‑language image recognition while highlighting performance considerations.

Configuration · OCR · image recognition

0 likes · 9 min read

java1234

Mar 24, 2026 · Backend Development

How to Elegantly Perform OCR in Spring Boot 3 Using Tess4J

This tutorial explains OCR fundamentals, introduces the open‑source Tesseract engine and its Java wrapper Tess4J, shows how to download the required traineddata files, and provides step‑by‑step Spring Boot 3 integration, configuration, and test code for Chinese, English, and mixed‑language image recognition, plus important usage notes.

OCR · Spring Boot · image recognition

0 likes · 8 min read

Java Companion

Mar 22, 2026 · Backend Development

How to Seamlessly Integrate Tess4j OCR into a SpringBoot Application

This tutorial walks through the fundamentals of OCR, explains how to download the required Tesseract traineddata files, shows how to add Tess4j as a Maven dependency, configure SpringBoot with custom properties, and provides complete Java test code for Chinese, English, and mixed‑language image recognition, highlighting performance considerations and file‑naming requirements.

OCR · backend · image recognition

0 likes · 9 min read

Woodpecker Software Testing

Jan 21, 2026 · Artificial Intelligence

Build an AI Agent with FastAPI & Alibaba Cloud: Text Q&A, Image Recognition, and Text‑to‑Image

This guide walks through designing and implementing an AI assistant that connects FastAPI to Alibaba Cloud large‑model services, supports streaming text Q&A, image understanding, text‑to‑image generation, network search, and MCP‑based map queries, with full front‑end and back‑end code examples.

AI Chatbot · Alibaba Cloud · FastAPI

0 likes · 38 min read

Python Programming Learning Circle

Nov 18, 2025 · Game Development

Build a Python Recoil‑Compensation Bot for PUBG Using Image Recognition

This guide explains how to create a non‑intrusive Python bot that automatically compensates weapon recoil in PUBG by capturing the screen, recognizing equipped gear with OpenCV and SSIM, and moving the mouse via pydirectinput based on weapon‑specific recoil tables.

Game Automation · PUBG · Python

0 likes · 10 min read

Python Programming Learning Circle

Oct 31, 2025 · Backend Development

How to Automate 12306 Login with Selenium and Image Recognition (Full Code)

This tutorial explains how to use Python, Selenium, and an image‑recognition service to bypass 12306's captcha and slider verification, providing a complete code example and detailing the challenges of image verification and anti‑automation measures.

Python · Selenium · Web Automation

0 likes · 10 min read

Java Architecture Diary

May 19, 2025 · Artificial Intelligence

How Ollama 0.7 Unlocks Local Multimodal AI with One Command

Ollama 0.7 introduces a fully re‑engineered core that brings seamless multimodal model support, lists top visual models, showcases OCR and image analysis capabilities, explains technical breakthroughs, and provides a quick three‑step guide to deploy powerful local AI vision.

AI Engineering · AI models · Ollama

0 likes · 7 min read

Spring Full-Stack Practical Cases

May 7, 2025 · Artificial Intelligence

Unlock Multimodal AI with Spring AI: Hands‑On Image & ID Recognition Cases

This article introduces Spring AI's multimodal capabilities, explains the Message API for handling text, image, audio, and video inputs, and provides step‑by‑step Spring Boot examples for image analysis, ID card extraction, and structured JSON output of car‑color counts.

Artificial Intelligence · Multimodal · Spring AI

0 likes · 8 min read

php Courses

Apr 15, 2025 · Artificial Intelligence

Using PHP to Access a Camera and Perform Image Recognition

This article explains how to use PHP to control a camera via extensions such as OpenCV or FFmpeg, integrate image‑recognition libraries like Tesseract OCR, and apply these techniques to scenarios such as security monitoring, object detection, and facial‑recognition login, enhancing application intelligence.

.ai · Camera · PHP

0 likes · 6 min read

Full-Stack Cultivation Path

Mar 7, 2025 · Artificial Intelligence

How AI Turned My Chaotic Home Inventory into an Organized System

The author describes the problems of wasted storage, expired food, hard‑to‑locate items, and duplicate purchases after moving house, then details an AI‑driven home inventory app built with Cursor, Trae, and large‑vision models that digitizes, classifies, and reminds about household goods, complete with architecture, implementation steps, and a comparative review of the AI tools used.

.ai · Cursor · GPC classification

0 likes · 15 min read

ByteFE

Mar 7, 2025 · Artificial Intelligence

AI-Powered Home Inventory Management Application: Design, Implementation, and Experience

This article describes the development of an AI-driven home inventory management tool that addresses storage waste, food expiration, item locating, and duplicate purchases by integrating barcode scanning, image recognition, intelligent classification, and multimodal models, while also comparing the performance of Cursor and Trae IDEs and Claude‑3.5‑sonnet versus deepseek‑r1 models.

.ai · Software Development · barcode

0 likes · 17 min read

Baobao Algorithm Notes

Dec 25, 2024 · Artificial Intelligence

Create a Free Multimodal Calorie Counter with GLM‑4V‑Flash in Minutes

This guide shows how to install the ZhipuAI SDK, obtain a free GLM‑4V‑Flash API key, craft prompts for image‑based calorie estimation, and build a Python demo that calculates food calories, BMI, and personalized diet advice using a multimodal large model.

GLM-4V-Flash · Multimodal AI · Python

0 likes · 9 min read

Full-Stack Cultivation Path

Nov 25, 2024 · Artificial Intelligence

Get High-Quality OCR with Ollama-OCR in Just a Few Lines of Code

This guide shows how to set up the open‑source Ollama‑OCR tool, which leverages the Llama 3.2‑Vision multimodal model to perform high‑quality OCR, covering installation of Ollama, the vision model, the OCR package, and example code for plain‑text and Markdown outputs.

Llama 3.2-Vision · Node.js · OCR

0 likes · 6 min read

Baidu Geek Talk

Nov 25, 2024 · Artificial Intelligence

PP-ShiTuV2: A General Image Recognition Pipeline in PaddleX

PP‑ShiTuV2, a PaddleX pipeline that integrates subject detection, deep feature encoding, and vector retrieval, delivers 91 % recall@1 on AliProducts, surpasses earlier models by over 20 points, runs efficiently on GPU and CPU, and offers simple installation, quick‑start code, and full fine‑tuning support.

Deep Learning · Model Deployment · PP-ShiTuV2

0 likes · 8 min read

Python Programming Learning Circle

Aug 24, 2024 · Game Development

Python-Based Non-Intrusive Recoil Compensation for PUBG Using Image Recognition

This article explains how to create a Python script that automatically compensates weapon recoil in PUBG by capturing the screen, recognizing equipment with OpenCV and SSIM, and moving the mouse via pynput and pydirectinput without modifying the game memory.

Game Automation · PUBG · Python

0 likes · 10 min read

DaTaobao Tech

May 17, 2024 · Artificial Intelligence

Understanding Convolutional Neural Networks: Theory, Architecture, and Practical Techniques

The article explains CNN fundamentals—convolution, pooling, and fully‑connected layers—illustrates their implementation for American Sign Language letter recognition, details parameter calculations, demonstrates data augmentation and transfer learning techniques, and highlights how these methods boost image‑classification accuracy to around 92%.

CNN · data augmentation · image recognition

0 likes · 19 min read

php Courses

May 10, 2024 · Artificial Intelligence

Using PHP to Operate a Camera and Perform Image Recognition

This article explains how to use PHP together with camera control libraries and image‑recognition tools such as OpenCV and Tesseract OCR to build intelligent applications, providing code examples and discussing practical use cases like security monitoring and face‑login.

Camera · PHP · image recognition

0 likes · 5 min read

The Dominant Programmer

Mar 30, 2024 · Backend Development

Implement OCR in Spring Boot with Tess4J for Image Text Recognition

This guide shows how to integrate the open‑source Tesseract OCR engine into a Spring Boot application using the Tess4J Java wrapper, covering Chinese language data setup, Maven dependency configuration, bean creation, service implementation, and a unit test to verify image text extraction.

OCR · Spring Boot · image recognition

0 likes · 6 min read

Open Source Tech Hub

Mar 13, 2024 · Artificial Intelligence

How to Use Google Gemini AI in PHP to Solve Image CAPTCHAs

This guide shows how to set up a PHP project, install the Gemini PHP client, and use Google Gemini's multimodal model to recognize text and solve image CAPTCHAs, providing complete code examples, dependency instructions, and sample outputs.

Artificial Intelligence · Gemini AI · PHP

0 likes · 6 min read

NetEase Cloud Music Tech Team

Dec 21, 2023 · Artificial Intelligence

Video and Image Technologies in NetEase Cloud Music: Architecture, Algorithms, and Applications

The article examines NetEase Cloud Music’s video and image technology stack—covering a four‑module architecture, algorithms for content understanding, intelligent production, moderation, and interactive effects—and explains how these systems enhance user experience, streamline backend processing, and position the platform for future AIGC‑driven innovations.

AI Algorithms · Multimodal Learning · Video Processing

0 likes · 11 min read

Rare Earth Juejin Tech Community

Dec 11, 2023 · Frontend Development

Bypassing Juejin Slider Captcha with Puppeteer and Canvas Image Recognition

This article demonstrates how to use Puppeteer and the Canvas API to automate login on Juejin, extract the slider captcha image, apply grayscale and binarization processing to locate the gap, calculate the required drag distance, and simulate human‑like mouse movements with easing functions for successful verification.

captcha · image recognition · web-scraping

0 likes · 17 min read

MaGe Linux Operations

Nov 19, 2023 · Artificial Intelligence

Build and Train a Python CNN for Image & Face Recognition with TensorFlow

Learn step-by-step how to create, compile, train, evaluate, and deploy convolutional neural networks in Python using TensorFlow and Keras for general image classification and a practical face‑recognition example, complete with code snippets and data‑preprocessing techniques.

CNN · Deep Learning · TensorFlow

0 likes · 7 min read

MoonWebTeam

Nov 9, 2023 · Mobile Development

Master Mobile E2E Testing with Appium: Setup, Principles, and Real‑World Examples

This comprehensive guide explains Appium’s cross‑platform architecture, walks through setting up an Android testing environment on macOS, demonstrates a full‑stack test case for an in‑app H5 page, and shares advanced techniques like a WebSocket‑based JS agent and OpenCV image‑recognition for challenging hybrid scenarios.

Android · Appium · E2E automation

0 likes · 16 min read

Huolala Tech

Sep 28, 2023 · Artificial Intelligence

How Mobile AI Transforms Logistics: Real‑World Image Algorithms at Huolala

This article explores Huolala's deployment of mobile AI image algorithms for driver document verification and vehicle sticker inspection, detailing model design, lightweighting, hybrid processing, data stream handling, and on‑device deployment that boost efficiency, privacy, and real‑time performance in logistics operations.

edge computing · image recognition · logistics

0 likes · 13 min read

Rare Earth Juejin Tech Community

Aug 6, 2023 · Artificial Intelligence

Explaining Image Recognition: Logistic Regression and Convolutional Neural Networks

This article introduces the principles of image recognition, compares traditional logistic regression with convolutional neural networks, demonstrates their implementation using Python code, visualizes model weights, and explains key concepts such as padding, convolution, pooling, receptive fields, and multi‑layer feature extraction.

convolutional neural network · explainable AI · image recognition

0 likes · 12 min read

php Courses

Jun 21, 2023 · Backend Development

Using PHP to Recognize QR Codes and Output Their Content

This article explains how to use the PHP library phpqrcode (via Zxing) to read QR code images, extract their text content, and display it in a web browser, including installation steps and sample code.

PHP · QR code · image recognition

0 likes · 5 min read

Python Programming Learning Circle

Jun 6, 2023 · Game Development

Python‑Based Recoil Compensation for PUBG Using Image Recognition and Mouse Automation

This article explains how to build a Python tool that automatically compensates weapon recoil in PUBG by capturing the screen, recognizing equipment with OpenCV and SSIM, and moving the mouse via pynput, pyautogui, and pydirectinput based on weapon data and user input.

Game Automation · Recoil Compensation · image recognition

0 likes · 11 min read

Python Programming Learning Circle

Mar 21, 2023 · Artificial Intelligence

Analyzing WeChat Friend Data with Python: Gender, Avatar, Signature, and Location Insights

This tutorial demonstrates how to use Python libraries such as itchat, jieba, matplotlib, SnowNLP, and Tencent Youtu SDK to collect WeChat friend information and perform data analysis on gender distribution, avatar characteristics, signature text (including word‑cloud and sentiment analysis), and geographic location, presenting the results with visual charts and maps.

NLP · WeChat · data-analysis

0 likes · 14 min read

Zhuanzhuan Tech

Oct 20, 2022 · Artificial Intelligence

Automated Image Review System for Second‑Hand Product Listings on the Zhuanzhuan Platform

This article describes how Zhuanzhuan’s B2C marketplace implemented an automated image review system that uses computer‑vision techniques such as image matching, regression, and detection to verify product‑image consistency, clarity, anti‑tamper labels, cleanliness, and centering, achieving a 50% reduction in manual workload.

image recognition · product verification

0 likes · 16 min read

Huolala Tech

Sep 10, 2022 · Artificial Intelligence

How AI Transforms Freight Safety: Real‑Time Risk Detection and Intervention

This article explains how AI technologies enable end‑to‑end freight safety monitoring, from pre‑trip and in‑trip risk identification to targeted interventions and governance, addressing challenges such as long‑tail data, small‑sample learning, fine‑grained classification, and multi‑level filtering.

.ai · Deep Learning · Risk Detection

0 likes · 12 min read

DataFunTalk

Jul 12, 2022 · Artificial Intelligence

Applying Computer Vision for Content Safety in Live Streaming: Practices and Future Directions

This presentation details how Huya leverages computer‑vision algorithms to detect and mitigate risky content such as political, pornographic, and violent material in live‑streaming and short‑video platforms, describing system architecture, labeling strategies, algorithmic pipelines, real‑time moderation techniques, and future research directions.

AI safety · Live Streaming · Risk Detection

0 likes · 11 min read

ITPUB

Jun 9, 2022 · Artificial Intelligence

How 58’s Multi‑Label Image Recognition Boosts Semantic Search and Recommendations

This article details the design, data pipeline, model architecture, loss functions, and evaluation metrics of a large‑scale multi‑label image classification system built for 58.com, showing how it improves semantic similarity detection, recommendation, and content moderation across diverse business domains.

Deep Learning · Large-Scale Data · asymmetric loss

0 likes · 18 min read

DataFunTalk

May 28, 2022 · Artificial Intelligence

Adversarial Examples for Captcha: Techniques, Applications, and Future Directions

This article presents a comprehensive overview of adversarial example research applied to captcha systems, covering the definition and history of adversarial attacks, geometric‑aware generation frameworks, FGSM‑based attack variants, experimental results, trade‑offs between image quality and attack strength, and future work such as AdvGAN integration.

AI safety · Deep Learning · FGSM

0 likes · 14 min read

Code DAO

Dec 2, 2021 · Artificial Intelligence

Transfer Learning with ShuffleNetV2 for Flower Classification

This article walks through building a PyTorch ShuffleNetV2 model, preparing the Kaggle Flowers dataset, training with transfer learning on a GPU, visualizing loss and accuracy, and performing inference on five test images, achieving nearly 90% validation accuracy after 95 epochs.

CNN · PyTorch · ShuffleNetV2

0 likes · 19 min read

Python Programming Learning Circle

Nov 26, 2021 · Mobile Development

Automating Princess Connect on Android with Python, ADB, and OpenCV

This tutorial demonstrates how to use Python, ADB, and OpenCV to automate gameplay tasks in the mobile game Princess Connect, covering environment setup, device interaction, screenshot handling, image template matching, cropping, and OCR for extracting in‑game information.

ADB · Android automation · Mobile Development

0 likes · 9 min read

Youzan Coder

Nov 5, 2021 · Artificial Intelligence

AI-Powered Image Recognition for Fresh Produce Retail: System Design and Implementation

An AI‑driven image‑recognition system using TensorFlow Lite cameras on checkout scales replaces barcode PLU lookup with hierarchical product categories, caches offline selections for incremental model updates, and delivers instant, offline‑capable identification, dramatically speeding fresh produce checkout, cutting labor costs, and offering a reusable framework for other retail sectors.

.ai · Automation · Retail

0 likes · 8 min read

Python Crawling & Data Mining

Sep 21, 2021 · Artificial Intelligence

Deploy a Full-Stack Captcha Recognition System: From Vue Frontend to Python AI Model

This tutorial walks you through deploying a complete captcha labeling and recognition solution, covering Vue front‑end setup, Java back‑end packaging, and Python CNN model serving with Flask, complete with code snippets, configuration details, and deployment screenshots.

CNN · Deployment · Vue

0 likes · 7 min read

Python Crawling & Data Mining

Sep 9, 2021 · Artificial Intelligence

How to Build a High‑Accuracy Image CAPTCHA Recognition System with CNNs

This article walks through the complete workflow of creating a robust image CAPTCHA labeling and recognition solution—from background concepts and motivation, through data collection, annotation, CNN model training, and deployment—highlighting practical lessons and code resources for Python and OpenCV.

CNN · image recognition · opencv

0 likes · 9 min read

Tencent Cloud Developer

Jun 29, 2021 · Information Security

Tencent Cloud Object Storage Content Security: Comprehensive Multi-Modal Content Moderation Solution

Tencent Cloud Object Storage Content Security offers a comprehensive, multi‑modal moderation solution—leveraging YouTu Lab’s advanced image, video, audio and text analysis—to automatically detect and handle prohibited material across hundreds of violation types, providing one‑click task initiation, configurable callbacks, and visual tracking for platforms such as social media, online education, e‑commerce, and gaming.

AI content moderation · Audio Analysis · Tencent Cloud

0 likes · 6 min read

Baidu Geek Talk

Jun 21, 2021 · Artificial Intelligence

Detecting Pornographic Videos with Dual‑Modal AI: Images + Audio

This article presents a technical overview of a multimodal AI framework that combines image and audio analysis to identify pornographic video content, detailing model architectures, feature extraction methods, and experimental results achieving 93.4% accuracy on a 3,000‑sample test set.

Audio Analysis · Deep Learning · Multimodal AI

0 likes · 6 min read

Youku Technology

Mar 12, 2021 · Mobile Development

Intelligent Component Testing Solution for Youku Mobile App

Youku’s intelligent component‑testing solution for its mobile app combines mock‑driven data factories, image‑recognition layout verification, and a data‑driven automation framework to dramatically cut regression effort, boost test stability, and now automates over 60% of component cases while covering more than 90% of frequently used UI components.

UI verification · component automation · image recognition

0 likes · 10 min read

Youku Technology

Mar 9, 2021 · Mobile Development

Design and Implementation of a Mobile Automation Testing Framework for Youku APP

The article describes how a three‑layer, cross‑platform mobile automation framework was designed and implemented for the Youku app, integrating driver, encapsulation, and test‑case layers with utilities, logging, image‑recognition and platform reporting to streamline regression testing, cut labor costs, and guide future enhancements.

Testing framework · UI testing · Youku app

0 likes · 9 min read

Baidu Intelligent Testing

Jan 27, 2021 · Artificial Intelligence

Baidu Mini‑Program Online Quality Assurance System: AI‑Driven Automated Traversal, Page Anomaly Detection, and Cloud‑Phone Cluster

This article describes how Baidu built an end‑to‑end online quality‑assurance platform for its mini‑program ecosystem, leveraging AI‑powered automated traversal, intelligent page‑exception detection, and a scalable cloud‑phone cluster to identify red‑line issues, improve audit efficiency, and reduce manual effort.

.ai · cloud phone · image recognition

0 likes · 20 min read

Huawei Cloud Developer Alliance

Dec 10, 2020 · Artificial Intelligence

Can AI Revolutionize Waste Sorting? Market Trends, Challenges, and Fast‑Track Solutions

This article explores how AI is being applied to waste classification—from smart trash cans and autonomous garbage trucks to deep‑learning models—while highlighting data‑labeling hurdles, model selection pitfalls, and how platforms like Huawei Cloud ModelArts can streamline development.

.ai · Deep Learning · ModelArts

0 likes · 6 min read

DataFunTalk

Dec 9, 2020 · Artificial Intelligence

WeChat Identify: From Object Detection to Large‑Scale Image Search – Technical Overview

This article details the evolution of WeChat’s Identify product, explaining its end‑to‑end image recognition pipeline—including object detection, multi‑label classification, mobile‑side detection, large‑scale retrieval, unsupervised clustering, and system architecture—while showcasing various application scenarios such as product, plant, and landmark recognition.

WeChat · computer vision · image recognition

0 likes · 12 min read

21CTO

Nov 3, 2020 · Artificial Intelligence

How Does Image Recognition Work? A Simple Guide to Core Principles

This article explains the fundamental principles of image recognition, covering how images are converted to numeric arrays, processed by scanning matrix blocks, and matched against patterns to identify objects such as text, faces, cats, dogs, or mice.

AI basics · Convolution · computer vision

0 likes · 4 min read

Python Crawling & Data Mining

Jun 22, 2020 · Mobile Development

Automate WeChat’s “Poke” Feature with Python, Appium, and OpenCV in 30 Lines

This tutorial explains how to use Python, Appium, and OpenCV to automatically perform WeChat’s “Poke” (拍一拍) action by locating a friend’s avatar via image recognition and simulating a double‑tap, all within roughly 30 lines of code.

Appium · Python · WeChat Automation

0 likes · 5 min read

360 Quality & Efficiency

Apr 10, 2020 · Artificial Intelligence

Handling Android Permission Dialogs Using Template Matching and SIFT Feature Matching

The article describes a system that automates Android permission dialog handling by employing template matching and SIFT‑based image recognition, discusses their limitations, outlines the end‑to‑end workflow, and proposes future enhancements using OCR and BERT for intelligent button selection.

Android · Automation · Permission Dialog

0 likes · 5 min read

Tencent Cloud Developer

Mar 30, 2020 · Information Security

How AI Powers Real-Time Content Moderation for Live Streams

With the surge in online content, Tencent Cloud’s content security team outlines a multi‑layered AI approach—ranging from MD5 matching to deep‑learning multi‑label and fine‑grained image analysis, audio VAD and speech models, and adaptive text filtering—to detect and mitigate unsafe live‑stream material.

.ai · Audio Detection · Text Filtering

0 likes · 17 min read

Huajiao Technology

Mar 3, 2020 · Mobile Development

Why UI Automation Matters for Mobile Apps and Using Appium with Cucumber

This article explains why UI automation testing is crucial for complex mobile apps, introduces Appium as a cross‑platform open‑source solution, demonstrates organizing test cases with Cucumber and Page Object patterns, details element locating strategies, custom steps, workflow architecture, and discusses current limitations and improvement plans.

Appium · Cucumber · Page Object

0 likes · 18 min read

360 Quality & Efficiency

Jan 2, 2020 · Mobile Development

Common Element Locating Strategies in Appium for Mobile Automation

This article introduces Appium's basic element locating techniques—including id, name, class name, XPath, UIAutomator, and relative coordinates—explains how to handle non‑unique elements through iteration or OCR, and demonstrates image‑based locating with OpenCV and screenshot code examples.

Appium · Element Locating · OCR

0 likes · 5 min read

Tencent Cloud Developer

Dec 26, 2019 · Artificial Intelligence

WeChat Scan-to-Identify (Scan Object) Feature: Overview, Technical Architecture, Data Construction, and Algorithmic Advances

WeChat’s iOS Scan‑to‑Identify feature lets users point a camera at any product or scene to instantly retrieve related e‑commerce, encyclopedia or news content, using a four‑pipeline architecture that builds massive annotated and deduplicated databases, advanced RetinaNet‑based detection, multi‑task metric learning, and scalable training, deployment and scheduling platforms, with plans to extend into domains like facial, vehicle and plant recognition.

.aiWeChatcomputer vision

0 likes · 34 min read

Python Programming Learning Circle

Oct 16, 2019 · Mobile Development

Automate Android Game Card Pulls with ADB and Image Recognition

This guide explains how to build an Android automation script that uses ADB screenshots and OpenCV template matching to detect and repeatedly pull cards in a game, handling start, SSR/SR detection, and completion without manual interaction.

ADBAndroidGame Scripting

0 likes · 3 min read

Tencent Cloud Developer

Sep 19, 2019 · Artificial Intelligence

Inside Tencent Cloud OCR: Architecture, Performance, and Integration Guide

The article provides a comprehensive overview of Tencent Cloud’s OCR platform, detailing its service architecture, product capabilities, integration methods, performance metrics, engineering improvements, testing automation, and operational considerations, offering developers practical insights into building and deploying OCR solutions on the cloud.

OCRService ArchitectureTencent Cloud

0 likes · 10 min read

360 Quality & Efficiency

Jun 28, 2019 · Operations

Using Sikuli for GUI Automation: Installation, Python Integration, and Practical Tips

This article introduces Sikuli, an image‑based GUI automation tool, explains its origins, provides download links, details installation steps, demonstrates Python integration via the Lackey library and SikuliX API, shares useful code snippets, and highlights common pitfalls and overall considerations for test automation.

GUI automationLackeyPython

0 likes · 6 min read

Tencent Cloud Developer

Apr 16, 2019 · Artificial Intelligence

Building Image Recognition Systems: From Basics to Advanced AI Techniques

This article summarizes a computer‑vision salon where Dr. Ji Yongnan explains imaging pipelines, traditional feature‑based methods, deep‑learning breakthroughs, Tencent Cloud AI services, real‑world case studies, and answers audience questions about machine‑vision versus computer‑vision and data‑scarcity challenges.

AI ApplicationsDeep LearningSegmentation

0 likes · 18 min read

iQIYI Technical Product Team

Dec 28, 2018 · Artificial Intelligence

AI‑Driven Visual Automation Testing Frameworks: Challenges, Opportunities, and the Aion Solution

The article examines the shortcomings of traditional visual automation frameworks (weak cross‑platform support, ID dependence, and fragile screenshot matching) and shows how Aion’s hybrid approach, which merges image‑processing segmentation with deep‑learning classification and OCR, delivers a more stable, cross‑platform, what‑you‑see‑is‑what‑you‑get testing solution, while acknowledging remaining accuracy challenges.

AI testingOCRUI2Code

0 likes · 11 min read

360 Quality & Efficiency

Nov 23, 2018 · Artificial Intelligence

Using Image Recognition for UI Automation with Sikuli: Principles, Functions, and Code Examples

This article explains how image‑recognition techniques, particularly via the Sikuli tool, can be applied to UI automation testing, covering underlying principles, a comprehensive list of built‑in functions, sample code snippets, and the advantages and limitations of this approach.

JythonSikuliTesting

0 likes · 7 min read

Java Captain

Sep 26, 2018 · Artificial Intelligence

Step-by-Step Guide to Using Baidu OCR API with Java

This article provides a comprehensive Java tutorial for accessing Baidu's OCR service, covering prerequisite setup, Maven dependencies, token acquisition, image-to‑Base64 conversion, HTTP request construction, and performance observations for Chinese, English, and mixed‑language image recognition.

APIAccess TokenBaidu OCR

0 likes · 9 min read

360 Tech Engineering

May 17, 2018 · Artificial Intelligence

Applying Image Recognition in UI Automation Testing with Sikuli

This article introduces how image‑recognition techniques, particularly using the Sikuli tool, can be applied to UI automation testing for both web and mobile applications, covering practical scenarios, core principles, a suite of useful functions, example code, and the advantages and limitations of the approach.

SikuliTestingUI automation

0 likes · 7 min read

360 Zhihui Cloud Developer

May 17, 2018 · Operations

Boost UI Test Automation with Sikuli’s Image Recognition: A Practical Guide

This article explains how image recognition can enhance UI automation testing for web and mobile applications, introduces Sikuli as a tool, details its core functions, provides code examples, and discusses the advantages and limitations of using visual‑based testing approaches.

JythonSikuliTesting

0 likes · 8 min read

360 Quality & Efficiency

May 16, 2018 · Fundamentals

Applying Image Recognition in UI Automation Testing with Sikuli

This article introduces the use of image‑recognition techniques, particularly the Sikuli tool, for UI automation testing. It covers typical scenarios, the underlying principles, key functions such as find, click, wait, and type, and example code, and discusses the advantages and limitations of this approach.

JythonSikuliUI automation

0 likes · 7 min read

AntTech

Mar 23, 2018 · Artificial Intelligence

Technical Overview of Alipay's 2018 Spring Festival “Scan Fu” Image Recognition System

The article details Alipay's 2018 Spring Festival "Scan Fu" initiative, describing the challenges of high‑volume Chinese character detection, the client‑server architecture, the lightweight xFuNet deep‑learning model, training strategies, performance results, and future AR extensions.

AlipayDeep Learningclient-server

0 likes · 9 min read

Ctrip Technology

Mar 22, 2018 · Artificial Intelligence

Poetry Generation from Images: Design, Implementation, and Evaluation of Ctrip’s “Xiao Shi Ji” System

The article presents Ctrip’s “Xiao Shi Ji” system that combines large‑scale tourism knowledge graphs, image recognition, and deep‑learning‑based poetry generation to automatically compose Chinese classical poems from photos, evaluates its performance against human poets, and discusses the underlying AI techniques.

Poetry Generationimage recognition

0 likes · 14 min read

21CTO

Jan 6, 2018 · Artificial Intelligence

How Image Recognition Transforms Our World: Principles, Processes, and Future

This article explains the fundamentals of image recognition technology, its underlying principles, processing steps, neural‑network and nonlinear‑dimensionality‑reduction approaches, and highlights its wide‑range applications and future potential across many industries.

.aicomputer visiondimensionality reduction

0 likes · 11 min read

21CTO

Dec 19, 2017 · Artificial Intelligence

How Deep Neural Networks Decode Images: From CNNs to RNNs

This article explains the fundamental principles behind deep neural networks for image recognition, covering convolutional and recurrent architectures, their training processes, feature extraction mechanisms, and the emerging ability to generate automatic image captions.

Deep LearningRecurrent Neural Networkconvolutional neural network

0 likes · 13 min read

Baidu Intelligent Testing

Oct 27, 2017 · Mobile Development

From Zero to a Universal Android Script Testing Solution: Mixed‑Script Automation, Image‑Recognition, and Recording Tools

The article details how Baidu MTC designed and implemented a universal Android script testing platform that combines UIAutomator, a custom Clean‑SDK for popup handling, image‑recognition algorithms, and a recording‑playback tool to enable robust, non‑native mobile automated testing across thousands of devices.

AndroidScript RecordingUIAutomator

0 likes · 12 min read

Architecture Digest

Sep 30, 2017 · Artificial Intelligence

Overview of Prominent Deep Learning Architectures for Computer Vision

This article surveys recent progress in deep learning by presenting key computer‑vision architectures such as AlexNet, VGG, GoogLeNet, ResNet, ResNeXt, R‑CNN, YOLO, SqueezeNet, SegNet and GANs, providing brief descriptions, their advantages, and links to original papers and Keras implementations.

Deep LearningKerascomputer vision

0 likes · 16 min read

Qunar Tech Salon

Dec 5, 2016 · Artificial Intelligence

Understanding Convolutional Neural Networks for OCR and CAPTCHA Recognition

This article introduces the fundamentals of neural networks for image recognition, explains regression vs classification, describes convolution, pooling and fully connected layers, illustrates the classic LeNet‑5 model on the MNIST dataset, and shows how a TensorFlow‑based CNN can be trained to recognize CAPTCHA images, achieving high accuracy.

CNNLeNet-5OCR

0 likes · 10 min read

360 Quality & Efficiency

May 12, 2016 · Fundamentals

Introduction to Sikuli: Image‑Based UI Automation Tool, Installation, Usage, and Demo

This article introduces Sikuli, an image‑recognition based UI automation tool, explains its advantages over traditional UI testing, details supported operating systems and required prerequisites, guides through installation steps, and demonstrates a simple Sikuli script for automating basic tasks.

PythonSikuliSikuliX

0 likes · 6 min read

Baidu Intelligent Testing

Apr 21, 2016 · Mobile Development

Integrating OpenCV with Appium for Automated Game Testing on Mobile Devices

This article describes how the MMGame testing team combined the open‑source Appium automation framework with OpenCV's image‑recognition capabilities to enable coordinate‑based testing of third‑party mobile games that lack accessible UI elements, detailing the workflow, implementation, results, and a comparison with other mobile testing tools.

AkazeAppiumgame testing

0 likes · 16 min read

Ctrip Technology

Jun 19, 2015 · Artificial Intelligence

Bank Card Scanning and Recognition Project Overview

This article describes a mobile payment‑focused bank card OCR project that extends an open‑source solution to support Chinese 19‑digit debit cards by introducing new algorithms for vertical coordinate detection, background filtering, single‑character recognition, and Luhn‑based checksum validation.

.aiLuhn algorithmbank card OCR

0 likes · 7 min read

Baidu Tech Salon

May 9, 2014 · Artificial Intelligence

Connecting People and Services Through Visual Recognition: Insights from Baidu's Tech Salon

At Baidu’s Xierqi Night Talk, senior developers learned how the company’s new “Light Tap” visual‑recognition platform and open cloud services aim to link people with everyday services through camera‑based interactions, positioning image recognition as the leading O2O connection method over QR codes, NFC, and voice.

Artificial IntelligenceBaidu technologyCloud Computing

0 likes · 10 min read
