Tagged articles
108 articles
AIWalker
Mar 16, 2026 · Artificial Intelligence

DETR Drops Hungarian Matching: Double Training Speed, +4.2 AP on Large Objects

Beyond-Hungarian replaces the costly Hungarian assignment in DETR with a differentiable, query‑free matching scheme that halves training latency and boosts large‑object AP by 4.2 points. The article introduces the GT‑Probe module and dual‑loss framework and details trade‑offs, ablations, and future challenges.

DETR · GT-Probe · Hungarian matching
0 likes · 14 min read
AIWalker
Mar 11, 2026 · Artificial Intelligence

Why 90% of DETR Queries Stay Idle and How PaQ‑DETR Boosts mAP by 4.2%

The article dissects the query‑activation imbalance in DETR‑based detectors, explains PaQ‑DETR’s pattern‑sharing and quality‑aware assignment mechanisms, and shows how these jointly raise detection mAP by up to 4.2% on COCO with less than 5% extra FLOPs.

DETR · PaQ-DETR · object detection
0 likes · 15 min read
Code Mala Tang
Mar 5, 2026 · Artificial Intelligence

Master YOLOv12: A Step‑by‑Step Guide to Build, Train, and Deploy Custom Models

This tutorial walks readers through the fundamentals of YOLOv12, covering model variants, dataset preparation with Roboflow, optional FlashAttention acceleration, installation, model selection, training commands, post‑training tasks such as tracking, validation, inference, exporting to ONNX, and benchmarking, all with concrete code snippets and practical tips.

Computer Vision · FlashAttention · Model Training
0 likes · 8 min read
Code Mala Tang
Mar 1, 2026 · Artificial Intelligence

Why YOLO Dominates Real-Time Object Detection: A Complete Guide

This article provides a comprehensive overview of the YOLO (You Only Look Once) algorithm, explaining its core principles, architecture, version history, training workflow, real‑world applications, strengths, and current limitations for modern computer‑vision tasks.

Computer Vision · Deep Learning · Real-Time
0 likes · 9 min read
AI Frontier Lectures
Jan 15, 2026 · Artificial Intelligence

What Makes YOLO26 the Next Leap in Edge AI Object Detection?

YOLO26, the latest Ultralytics release, introduces a unified model family with five sizes, removes distribution focal loss, offers end‑to‑end inference without NMS, adds progressive loss balancing and the MuSGD optimizer, and delivers up to 43% faster CPU performance, making it ideal for edge and real‑world vision applications.

Model Optimization · YOLO26 · edge AI
0 likes · 12 min read
Liangxu Linux
Nov 6, 2025 · Artificial Intelligence

8 Must‑Explore Open‑Source Projects: AI Prompt Tools, Voice Transcription, Browser Engine & More

This article introduces eight noteworthy open‑source projects—including an interactive prompt‑engineering tutorial, Claude Cookbooks, an offline speech‑to‑text tool, an eBook‑to‑audiobook converter, the Servo browser engine, a free programming‑books collection, a real‑time object‑detection model, and other popular repositories—each with brief descriptions and GitHub links.

AI tools · GitHub · Prompt engineering
0 likes · 7 min read
HyperAI Super Neural
Sep 29, 2025 · Artificial Intelligence

8 Popular Remote Sensing Object Detection Datasets with One-Click Downloads

This article presents a curated list of eight widely used remote sensing object detection datasets covering indoor scenes, landslides, drone imagery, crop diseases, safety vests, human fractures, urban issues, and plant diseases, each with size estimates and direct download links for researchers.

Computer Vision · Datasets · ai
0 likes · 10 min read
AIWalker
Sep 24, 2025 · Artificial Intelligence

Top 2025 Object Detection Research Paths: From Grounding DINO 1.5 to Open‑Set Breakthroughs

The article outlines four key innovation avenues—architecture redesign, task expansion, information fusion, and paradigm shift—highlighting recent works such as Mr. DETR, Grounding DINO 1.5, SM3Det, and RoboFusion, and offers a curated list of 176 cutting‑edge object‑detection papers with code and datasets for free.

Deep Learning · Model architecture · object detection
0 likes · 8 min read
php Courses
Aug 22, 2025 · Backend Development

How to Use PHP’s is_object() to Distinguish Objects from Other Types

This article explains PHP’s is_object() function, detailing its syntax, parameters, and return values, and demonstrates through code examples how to check whether a variable (for example an object or an array) is an object, helping developers avoid type errors at runtime.

PHP · is_object · object detection
0 likes · 3 min read
AIWalker
Aug 19, 2025 · Artificial Intelligence

Easy Ways to Boost YOLO: Systematic Review of Versions and Use Cases

This article systematically reviews every YOLO version and classifies improvements into five major directions (architecture enhancements, efficiency optimizations, multi‑task learning, temporal modeling, and domain‑specific customizations), with concrete paper references, code links, and dataset resources to help researchers and engineers quickly locate and apply the most effective techniques.

Deep Learning · YOLO · model improvement
0 likes · 8 min read
Amap Tech
Jul 14, 2025 · Artificial Intelligence

How UPRE Achieves Zero-Shot Domain Adaptation for Object Detection with Unified Prompts

The UPRE paper, presented at ICCV, introduces a multi‑view domain prompt and a unified representation enhancement to enable zero‑shot domain adaptation for object detection, achieving state‑of‑the‑art performance across diverse weather, geographic, and synthetic‑to‑real scenarios.

Computer Vision · Prompt engineering · object detection
0 likes · 10 min read
Rare Earth Juejin Tech Community
Jun 27, 2025 · Artificial Intelligence

Image Encryption, Watermarking, Detection & Green Screen Removal in Python

This tutorial walks through Python-based computer‑vision techniques—including XOR‑based image encryption, mask and ROI methods, digital watermark embedding via bit‑plane and LSB, sensitivity‑driven object detection, and HSV‑based green‑screen removal—providing complete code snippets and practical guidance for rapid AI‑assisted learning.

Computer Vision · OpenCV · Python
0 likes · 17 min read
AIWalker
May 26, 2025 · Artificial Intelligence

VisionReasoner: RL‑Unified Model Beats YOLO‑World Detection, Segmentation, Counting

VisionReasoner presents a reinforcement‑learning‑driven unified framework that simultaneously tackles detection, segmentation, and counting tasks, employing a novel multi‑target cognition strategy and efficient Hungarian‑based matching, and demonstrates substantial gains—29.1% on COCO detection, 22.1% on ReasonSeg, and 15.3% on CountBench—using only 7,000 training samples.

Reinforcement Learning · Segmentation · VisionReasoner
0 likes · 20 min read
AIWalker
May 22, 2025 · Artificial Intelligence

VisionReasoner: RL‑Unified System Beats YOLO‑World on Detection, Segmentation, Counting

VisionReasoner introduces a reinforcement‑learning‑driven unified framework that simultaneously handles detection, segmentation, and counting tasks within a single model, achieving 29.1% higher COCO detection AP, 22.1% better ReasonSeg segmentation, and 15.3% improvement on CountBench, while requiring only 7,000 training samples and offering efficient multi‑target matching via batch computation and the Hungarian algorithm.

LVLM · Object Counting · Reinforcement Learning
0 likes · 19 min read
AIWalker
May 18, 2025 · Artificial Intelligence

YOLOE: Open‑Source Real‑Time Anything Detector Beats YOLO‑World v2

YOLOE unifies object detection and segmentation in a single efficient model that supports text, visual, and prompt‑free inference, introduces RepRTA, SAVPE, and LRPC strategies, and achieves higher AP with up to three‑fold lower training cost and 1.4× faster inference on GPUs and mobile devices, as demonstrated by extensive LVIS and COCO experiments.

Computer Vision · Prompt engineering · Real-Time
0 likes · 29 min read
AIWalker
May 14, 2025 · Artificial Intelligence

How HGO‑YOLO Achieves 87.4% Accuracy at 56 FPS with Only 4.6 MB Parameters

This paper presents HGO‑YOLO, a lightweight real‑time anomaly‑behavior detector that integrates HGNetv2 and GhostConv into YOLOv8, achieving 87.4% mAP with just 4.6 MB of parameters and 56 FPS on CPU, and validates its performance across multiple datasets and hardware platforms.

Computer Vision · Lightweight Models · YOLO
0 likes · 25 min read
AIWalker
Mar 13, 2025 · Artificial Intelligence

YOLOE: Real‑Time Open‑World Object Detection and Segmentation Unveiled

The paper introduces YOLOE, a new YOLO‑based model that supports text, visual, and no‑prompt open‑world detection and segmentation, detailing its lightweight RepRTA, SAVPE, and LRPC modules and showing benchmark gains in speed and zero‑shot performance on LVIS and COCO.

Computer Vision · YOLOE · benchmark
0 likes · 9 min read
AIWalker
Mar 1, 2025 · Artificial Intelligence

Lightweight Remote Sensing Backbone LSKNet and Strip R-CNN: Design, Benchmarks, and Open‑Source Release

The NK‑Remote repository introduces LSKNet and Strip R‑CNN, two lightweight yet powerful models for remote‑sensing object detection that dynamically adjust receptive fields and combine square‑and‑strip convolutions, achieving state‑of‑the‑art performance on benchmarks such as DOTA, FAIR1M, HRSC2016, and DIOR.

Deep Learning · JDet · LSKNet
0 likes · 9 min read
AsiaInfo Technology: New Tech Exploration
Feb 24, 2025 · Artificial Intelligence

Can Multi‑Teacher Distillation Overcome Catastrophic Forgetting in Continual Learning?

This paper proposes a multi‑teacher distillation framework for continual learning that combines active data rehearsal with feature‑decoupled distillation, demonstrating superior performance on PASCAL VOC and COCO benchmarks while mitigating catastrophic forgetting and balancing stability‑plasticity trade‑offs.

Catastrophic Forgetting · active rehearsal · ai
0 likes · 12 min read
AIWalker
Feb 19, 2025 · Artificial Intelligence

YOLOv12 Unveiled: Boosted Performance and Speed for Real‑Time Detection

YOLOv12 introduces an attention‑centric architecture, a lightweight regional attention module, and the R‑ELAN aggregation network, delivering consistent mAP gains and lower latency across N, S, M, L and X model scales while surpassing previous YOLO versions and other real‑time detectors.

Attention Mechanism · Computer Vision · Real-Time
0 likes · 8 min read
Python Programming Learning Circle
Dec 19, 2024 · Artificial Intelligence

Overview of Microsoft’s Open‑Source Computer Vision Recipes Library

The article introduces Microsoft’s open‑source Computer Vision Recipes library, describing its purpose, target audience, repository links, supported vision scenarios such as image classification, similarity, detection, key‑point, segmentation, action recognition, multi‑object tracking and crowd counting, and provides guidance on using PyTorch, Azure and GPU resources.

Azure · Image Classification · PyTorch
0 likes · 7 min read
php Courses
Dec 18, 2024 · Artificial Intelligence

Using PHP to Access the Camera and Perform Face Detection with OpenCV

This article explains how to install OpenCV and php-facedetect libraries, write PHP code to capture images from a webcam, perform face detection using the pico library, and display the results, providing a step‑by‑step guide for object detection with PHP.

Camera · Computer Vision · Face Detection
0 likes · 5 min read
Rare Earth Juejin Tech Community
Aug 22, 2024 · Artificial Intelligence

Understanding Faster R-CNN: Architecture, Training, and Experimental Results

This article provides an in‑depth overview of the Faster R‑CNN object detection framework, covering its background, key innovations such as the Region Proposal Network, detailed algorithmic principles, training procedures, experimental results on PASCAL VOC and MS COCO, and a reproducible PyTorch implementation.

Computer Vision · Deep Learning · Faster R-CNN
0 likes · 14 min read
160 Technical Team
Jul 29, 2024 · Artificial Intelligence

How YOLO Transforms Medical Report Screening and Occlusion Detection

Leveraging the YOLO family of deep‑learning models, this study demonstrates efficient filtering of irrelevant medical images, accurate classification of textual reports, and robust detection of occluding objects, achieving high precision and speed on both CPU and GPU, while outlining training details, performance metrics, and future improvements.

Deep Learning · YOLO · medical imaging
0 likes · 17 min read
Rare Earth Juejin Tech Community
May 10, 2024 · Artificial Intelligence

Real-Time Dog Detection in Browser Using TensorFlow.js and MobileNet V2

This guide demonstrates how to build a web‑based real‑time dog detector that accesses the phone camera via the browser, processes video frames with TensorFlow.js and a pre‑trained COCO‑SSD MobileNet V2 model, and plays an audio alert when a dog is recognized, all deployed on an Android device using Termux.

Android · MobileNet · TensorFlow.js
0 likes · 8 min read
php Courses
Apr 16, 2024 · Artificial Intelligence

Using PHP and OpenCV for Camera‑Based Object Detection

This tutorial explains how to install required libraries, write PHP code that captures images from a webcam, uses OpenCV and php‑facedetect to detect faces, and displays the results with annotated bounding boxes, providing a foundation for further object detection projects.

Camera · Computer Vision · Face Detection
0 likes · 6 min read
Huolala Tech
Jan 25, 2024 · Artificial Intelligence

How Open‑Vocabulary Detection and Segment‑Anything Are Revolutionizing Visual AI at Huolala

This article reviews traditional computer‑vision tasks—classification, detection, and segmentation—highlights their limitations, introduces open‑vocabulary detection and segment‑anything models such as GLIP, Grounding DINO, and SAM, and details how Huolala applies these advances to driver‑license, packing, and vehicle‑sticker inspections for safer, more efficient AI‑driven operations.

Computer Vision · Segmentation · object detection
0 likes · 20 min read
DataFunTalk
Nov 24, 2023 · Artificial Intelligence

Open Vocabulary Detection Contest 2023: Summary of Winning Teams' Technical Solutions

The article reviews the Open Vocabulary Detection Contest organized by the Chinese Society of Image and Graphics and 360 AI Institute, describing the competition setup, dataset characteristics, and detailed winning approaches that combine Detic, CLIP, prompt learning, and multi‑stage pipelines to achieve strong few‑shot and zero‑shot object detection performance.

CLIP · Computer Vision · competition
0 likes · 17 min read
Alibaba Cloud Big Data AI Platform
Oct 8, 2023 · Artificial Intelligence

Why the Scale‑Aware Modulation Transformer Outperforms CNNs and Vision Transformers with Fewer Parameters

The Scale‑Aware Modulation Transformer (SMT) introduces a lightweight SAM module and an Evolutionary Hybrid Network that together achieve higher accuracy on ImageNet, COCO, and ADE20K while using significantly fewer parameters and FLOPs than existing CNN and Transformer baselines.

Image Classification · SMT · Scale‑Aware Modulation
0 likes · 12 min read
Huolala Tech
Sep 28, 2023 · Artificial Intelligence

How Mobile AI Transforms Logistics: Real‑World Image Algorithms at Huolala

This article explores Huolala's deployment of mobile AI image algorithms for driver document verification and vehicle sticker inspection, detailing model design, lightweighting, hybrid processing, data stream handling, and on‑device deployment that boost efficiency, privacy, and real‑time performance in logistics operations.

Edge Computing · Logistics · Mobile AI
0 likes · 13 min read
Rare Earth Juejin Tech Community
Aug 17, 2023 · Artificial Intelligence

Getting Started with YOLOv8 on the Ultralytics Platform: Installation, Command‑Line Usage, and Model Training

This article introduces the YOLOv8 object‑detection framework on the Ultralytics platform, covering environment setup, command‑line and Python APIs for inference, model‑file options, result interpretation, data annotation, training procedures, and exporting models to various deployment formats.

Computer Vision · Model Training · Python
0 likes · 14 min read
Xiaohongshu Tech REDtech
Jun 20, 2023 · Artificial Intelligence

Open-Vocabulary Object Attribute Recognition with OvarNet: A Unified Framework for Detection and Attribute Classification

At CVPR 2023 the Xiaohongshu team presented OvarNet, a unified one‑stage Faster‑RCNN model built on CLIP that uses prompt learning and knowledge distillation to jointly detect objects and recognize open‑vocabulary attributes, achieving state‑of‑the‑art results on VAW, MS‑COCO, LSA and OVAD datasets.

Computer Vision · Multimodal Learning · attribute recognition
0 likes · 12 min read
Network Intelligence Research Center (NIRC)
Jun 5, 2023 · Artificial Intelligence

How DETR and Its Successors Evolve: A Deep Dive into the DETR Series for Object Detection

This article reviews the original DETR model, analyzes its strengths and weaknesses, and then examines two major follow‑up works, Deformable‑DETR and DAB‑DETR, explaining how they modify the attention mechanism by introducing deformable attention and dynamic anchor boxes to accelerate convergence and improve small‑object detection.

DAB-DETR · DETR · Deformable-DETR
0 likes · 12 min read
DataFunTalk
Apr 25, 2023 · Artificial Intelligence

DAMO-YOLO: An Efficient Target Detection Framework with NAS, Multi‑Scale Fusion, and Full‑Scale Distillation

This article introduces DAMO‑YOLO, a high‑performance object detection framework that combines low‑cost model customization via MAE‑NAS, an Efficient RepGFPN with HeavyNeck for superior multi‑scale detection, and a full‑scale distillation technique, delivering faster inference, lower FLOPs, and higher accuracy across diverse industrial scenarios.

Distillation · Model Optimization · NAS
0 likes · 15 min read
DataFunSummit
Apr 13, 2023 · Artificial Intelligence

ModelScope CV Model Overview: Visual Detection and Keypoint Applications

This article presents a comprehensive overview of ModelScope's computer‑vision models, detailing visual detection and keypoint solutions—including VitDet, YOLOX, res2net, HRNet, and 3D pose models—their architectures, performance highlights, real‑world applications, and future development plans.

AI models · ModelScope · keypoint detection
0 likes · 11 min read
Baidu Tech Salon
Apr 7, 2023 · Artificial Intelligence

Ambiguity-Resistant Semi-supervised Learning (ARSL) for Single-stage Object Detection

ARSL, an ambiguity‑resistant semi‑supervised learning framework for single‑stage object detection, introduces Joint‑Confidence Estimation and Task‑Separation Assignment to resolve selection and assignment ambiguities in pseudo‑labels, thereby markedly improving pseudo‑label quality and achieving state‑of‑the‑art AP gains on COCO benchmarks.

ARSL · Computer Vision · Semi-supervised Learning
0 likes · 8 min read
Baidu Geek Talk
Mar 16, 2023 · Artificial Intelligence

PaddleDetection v2.6 Release: PP-YOLOE Family Expansion and Advanced Detection Algorithms

PaddleDetection v2.6 expands the PP‑YOLOE family with rotating, small‑object, dense‑object, and ultra‑lightweight edge‑GPU models, upgrades PP‑Human and PP‑Vehicle toolboxes, releases semi‑supervised, few‑shot and distillation learning methods, adds numerous state‑of‑the‑art algorithms, and improves infrastructure with Python 3.10, EMA filtering and AdamW support.

Baidu · Computer Vision · Deep Learning
0 likes · 14 min read
Zhengcaiyun Technology
Mar 9, 2023 · Artificial Intelligence

Comprehensive Overview of Object Detection: From Traditional Methods to Modern Deep Learning Models

This article provides a comprehensive overview of object detection, describing traditional sliding‑window approaches, deep‑learning based two‑stage and one‑stage models such as R‑CNN, Faster R‑CNN, YOLO series, and discusses current challenges, improvement directions, and future research trends in the field.

Computer Vision · Deep Learning · R-CNN
0 likes · 29 min read
Meituan Technology Team
Mar 2, 2023 · Artificial Intelligence

Technical Innovations in YOLOv6 3.0 for Real‑Time Object Detection

YOLOv6 3.0 raises real‑time object detection performance to a new peak with 57.2% AP and 29 FPS on a T4 GPU, surpassing YOLOv7‑E6E, and introduces RepBi‑PAN Neck, Anchor‑Aided Training, and Decoupled Location Distillation to boost accuracy and efficiency.

Anchor-Aided Training · Decoupled Location Distillation · RepBi-PAN
0 likes · 13 min read
Alibaba Cloud Developer
Dec 19, 2022 · Artificial Intelligence

How AI Transforms Football Video Analysis: Detection, Tracking, and Event Recognition

This article explores how artificial intelligence techniques such as deep learning, object detection, multi‑object tracking, and coordinate projection are applied to football video analysis to automatically detect the ball and players, map their positions onto the field, and recognize key events like shots and goals.

Computer Vision · Sports Analytics · ai
0 likes · 16 min read
ELab Team
Dec 6, 2022 · Artificial Intelligence

Mastering CreateML: From Data Prep to Object Detection Models on iOS

This article introduces Apple’s CreateML tool, explains its supported model types, shows how to prepare and augment data, provides a Node.js script for generating synthetic training sets, and walks through training, testing, and integrating an object‑detection model into an iOS app.

CreateML · Swift · data augmentation
0 likes · 17 min read
Alibaba Cloud Big Data AI Platform
Oct 12, 2022 · Artificial Intelligence

Unlock Vision AI: How EasyCV Streamlines Datasets and Model Training

This article introduces EasyCV, an open‑source all‑in‑one visual algorithm platform that abstracts diverse data sources, provides SOTA self‑supervised models, and offers ready‑to‑download datasets for image classification, object detection, segmentation, and pose estimation, complete with configuration examples.

Computer Vision · Datasets · Deep Learning
0 likes · 9 min read
Laiye Technology Team
Sep 28, 2022 · Artificial Intelligence

Checkbox Detection and State Classification Using YOLOv5

This article describes a comprehensive solution for detecting checkboxes in document images and determining their selected or unselected status by combining YOLOv5 object detection, synthetic and semi‑synthetic data generation, specialized post‑processing, and association logic to handle varied shapes, positions, and markings.

YOLOv5 · checkbox detection · data synthesis
0 likes · 13 min read
Zhengcaiyun Technology
Aug 11, 2022 · Artificial Intelligence

Semi‑Automatic Annotation with Label Studio and YOLOv5: Installation, Project Setup, and Model Training

This guide explains how to combine the open‑source labeling platform Label Studio with the YOLOv5 object‑detection model to achieve semi‑automatic annotation, covering installation of both tools, project creation, dataset configuration, and training a custom YOLOv5 model on your own data.

Label Studio · Python · Semi-Automatic Annotation
0 likes · 11 min read
Meituan Technology Team
Jun 23, 2022 · Artificial Intelligence

YOLOv6: An Efficient Industrial Object Detection Framework

YOLOv6, developed by Meituan's Vision Intelligence team, introduces a hardware‑friendly backbone, an efficient decoupled head, and advanced training strategies that together achieve up to 35.0% AP at 1242 FPS on COCO while outperforming YOLOv5, YOLOX and other same‑size models across multiple deployment platforms.

SIoU loss · SimOTA · YOLOv6
0 likes · 15 min read
Code DAO
Dec 22, 2021 · Artificial Intelligence

How Context R-CNN Leverages Temporal Context to Detect Occluded Objects

The article reviews the Context R-CNN paper, which introduces short‑term and long‑term memory banks and an attention mechanism to incorporate temporal context from multiple frames captured by a fixed camera, enabling robust detection of partially occluded, low‑light, distant, or background‑cluttered objects, and shows quantitative gains over standard Faster R‑CNN.

Attention Mechanism · Context R-CNN · Faster R-CNN
0 likes · 6 min read
Code DAO
Nov 30, 2021 · Artificial Intelligence

How to Train a Custom Object Detector with PyTorch Faster R‑CNN

This article provides a step‑by‑step guide to building, training, and evaluating a custom object detection model using PyTorch Faster R‑CNN on a microcontroller dataset, covering data preparation, configuration, model modification, training loops, loss visualization, and inference on new images.

Faster R-CNN · PyTorch · Python
0 likes · 23 min read
Python Programming Learning Circle
Nov 8, 2021 · Artificial Intelligence

YOLOv5 Tutorial: From YOLOv3 to YOLOv5, Code Walkthrough, Model Export (JIT & ONNX) and Usage

This article provides a comprehensive guide on YOLOv5, covering its background from YOLOv3, detailed code analysis of the model architecture, step‑by‑step instructions for running detect.py, configuring yolov5s.yaml, exporting the model to TorchScript JIT and ONNX formats, and practical inference examples using PyTorch and ONNX Runtime.

JIT · ONNX · PyTorch
0 likes · 16 min read
Youku Technology
Jul 8, 2021 · Artificial Intelligence

Key Findings from Alibaba Moku Lab at ACM MM 2021

At ACM MM 2021, Alibaba’s Moku Lab presented four cutting‑edge studies: an interactive video inpainting system using user doodles, a decoupled IoU regression model for object detection, a spatio‑temporal distortion‑aware video quality assessment framework, and a multimodal emotional relationship recognition dataset and benchmark.

Computer Vision · Video Inpainting · multimodal emotion recognition
0 likes · 8 min read
Miss Fresh Tech Team
Jul 8, 2021 · Artificial Intelligence

How AI Powers Smart Vending Cabinets: From RFID to Deep Learning Detection

This article details the evolution of intelligent vending cabinets, comparing RFID, gravity, dynamic and static vision solutions, and explains how deep‑learning models, data pipelines, and system architectures enable high‑accuracy, low‑loss product detection and automated operations in modern unmanned retail.

Computer Vision · Neural Networks · Smart Vending
0 likes · 36 min read
TiPaiPai Technical Team
Jul 2, 2021 · Artificial Intelligence

How ContourNet and CenterNet Revolutionize Text Detection

This article explains the challenges of scene text detection and introduces two state‑of‑the‑art models, ContourNet and CenterNet, detailing their architectural innovations, loss functions, and how they overcome issues like extreme aspect ratios and anchor‑based inefficiencies.

CenterNet · Computer Vision · ContourNet
0 likes · 7 min read
Alimama Tech
May 20, 2021 · Artificial Intelligence

How Alibaba’s AI Powers Brand Risk Detection: Models, Data, and Results

This article details Alibaba's AliMama brand risk identification system, covering the challenges of counterfeit detection, the construction of large‑scale brand datasets, the design of classification, logo detection, and variation models, their optimization, evaluation metrics, and future directions for AI‑driven brand protection.

Alibaba · Computer Vision · Deep Learning
0 likes · 22 min read
360 Tech Engineering
Apr 16, 2021 · Artificial Intelligence

Applying YOLOv5 Object Detection for Black, Color, and Normal Screen Classification in Video Frames

This article presents a method that replaces traditional manual video frame quality checks with an automated YOLOv5‑based object detection pipeline, detailing data labeling, model training, loss computation, inference code, and experimental results that show higher accuracy than ResNet for classifying black, color‑screen, and normal frames.

Image Classification · Python · YOLOv5
0 likes · 12 min read
360 Quality & Efficiency
Apr 16, 2021 · Artificial Intelligence

Applying YOLOv5 Object Detection for Black, Color, and Blank Screen Classification in Video Frames

This article presents a method that replaces manual visual inspection with an automated YOLOv5‑based object detection pipeline to classify video frames as normal, colorful, or black screens, detailing data annotation, training, loss calculation, inference code, and showing a 97% accuracy improvement over ResNet.

Computer Vision · Deep Learning · Image Classification
0 likes · 11 min read
58 Tech
Mar 24, 2021 · Artificial Intelligence

Automated Detection of Illegal Watermarks in Images Using Deep Learning at 58.com

This article describes how 58.com built an end‑to‑end deep‑learning watermark detection service, covering business needs, data collection and augmentation, model selection and iterative improvements (Faster R‑CNN, SSD, YOLOv3, anchor‑free methods), deployment results, and future research directions.

Computer Vision · Image Moderation · Model Optimization
0 likes · 14 min read
New Oriental Technology
Nov 9, 2020 · Artificial Intelligence

Understanding YOLOv4 and YOLOv5: Core Elements and Innovations in Object Detection

This article introduces the fundamentals of object detection, explains the latest YOLOv4 and YOLOv5 architectures, and details the essential components—including data preparation, regularization, backbone, neck, and prediction innovations—along with label smoothing and advanced loss functions for improved detection performance.

Computer Vision · YOLOv4 · YOLOv5
0 likes · 9 min read
Suning Technology
Oct 15, 2020 · Artificial Intelligence

How AI Powers Offline Product Recognition in Smart Retail Stores

This lecture details the evolution of product recognition algorithms from traditional image classification to deep‑learning‑based object detection, discusses challenges in dense retail scenes, presents solutions like rotated bounding boxes and multi‑source sensor fusion, and explains practical deployment in digital and unmanned stores.

Deep Learning · dense scene · object detection
0 likes · 18 min read
Suning Technology
Oct 2, 2020 · Artificial Intelligence

How Precise Customer‑Flow Algorithms Transform Retail with AI Vision

This article explains how AI‑driven precise customer‑flow algorithms—covering pedestrian detection, full‑scene tracking, and person re‑identification—enable accurate offline traffic analysis, real‑time shopper profiling, and data‑driven store management for modern retail environments.

customer flow · multi-camera tracking · object detection
0 likes · 18 min read
360 Quality & Efficiency
Sep 18, 2020 · Artificial Intelligence

Data Augmentation Techniques for Improving Object Detection Model Robustness

To enhance object detection robustness, the article discusses various data augmentation methods—including rotation, flipping, random cropping, scaling, color jitter, blurring, transparency adjustment, and image partitioning—providing code examples and illustrating their impact on model performance with before‑and‑after results.

Computer Vision · Python · data augmentation
0 likes · 7 min read
Taobao Frontend Technology
Jun 2, 2020 · Artificial Intelligence

How AI Can Auto‑Detect UI Components for Seamless Front‑End Code Generation

This article explains how to use deep‑learning object detection to automatically recognize UI components in design drafts, generate a smart JSON description, and convert it into component‑based front‑end code, covering problem analysis, dataset preparation, algorithm selection, model training, evaluation, and deployment.

Pipcook · UI detection · ai
0 likes · 30 min read
Meituan Technology Team
May 21, 2020 · Artificial Intelligence

CenterMask: Single-Shot Instance Segmentation with Point Representation

CenterMask is a single‑shot, anchor‑free instance segmentation framework that predicts a coarse shape from each object’s center point and a full‑image saliency map, multiplies them to produce precise masks, and achieves competitive COCO AP while running faster than two‑stage methods like Mask R-CNN.

CenterMask · Deep Learning · object detection
0 likes · 15 min read
HomeTech
Apr 8, 2020 · Artificial Intelligence

Application of Deep Learning for Cover Image Selection in Autohome Forum Articles

This paper presents a deep learning-based approach for selecting cover images in Autohome forum articles, employing Faster R-CNN for object detection, Mask R-CNN for human keypoint detection, and MobileNetV2 for attribute recognition, achieving an overall accuracy of 81.5%.

Cover Image Selection · MobileNet · keypoint detection
0 likes · 15 min read
Tencent Cloud Developer
Mar 6, 2020 · Artificial Intelligence

WeChat "Scan" Object Detection: Mobile AI Model Design, Optimization, and Deployment

The paper presents a lightweight, anchor‑free CenterNet‑based object‑ness detector for WeChat’s Scan feature, built on a ShuffleNetV2 backbone with enlarged 5×5 depth‑wise convolutions, a streamlined detection head, and a Pyramid Interpolation Module, then quantized, ONNX‑converted and NCNN‑deployed to achieve a 436 KB model running in ~15 ms per frame on an iPhone 8 CPU.

CenterNet · Mobile AI · Model Optimization
0 likes · 12 min read
DataFunTalk
Feb 20, 2020 · Artificial Intelligence

Perception Technology for Autonomous Heavy Trucks: Methods, Challenges, and Production Considerations

This article reviews perception technologies used in autonomous heavy‑truck systems—including lane‑line detection, obstacle detection, and LiDAR sensing—detailing traditional and deep‑learning approaches, practical challenges on high‑speed highways, and the cost, performance, and reliability issues faced when moving these solutions to mass production.

Deep Learning · LiDAR · autonomous driving
0 likes · 16 min read
58 Tech
Jan 15, 2020 · Artificial Intelligence

Mobile AI Vehicle and VIN Recognition: From TensorFlow to TensorFlow Lite Deployment on Android and iOS

This article details how the 58 Used‑Car mobile team built, trained, and optimized TensorFlow‑based object‑detection models for on‑device vehicle and VIN code recognition, covering data preparation, model conversion to TF‑Lite, performance improvements, engineering integration on Android/iOS, and real‑world deployment results.

Android · Mobile AI · TensorFlow
0 likes · 14 min read
Tencent Cloud Developer
Dec 26, 2019 · Artificial Intelligence

WeChat Scan-to-Identify (Scan Object) Feature: Overview, Technical Architecture, Data Construction, and Algorithmic Advances

WeChat’s iOS Scan‑to‑Identify feature lets users point a camera at any product or scene to instantly retrieve related e‑commerce, encyclopedia, or news content. It relies on a four‑pipeline architecture that combines massive annotated and deduplicated databases, RetinaNet‑based detection, multi‑task metric learning, and scalable training, deployment, and scheduling platforms, with plans to extend into domains such as facial, vehicle, and plant recognition.

Computer Vision · WeChat · ai
0 likes · 34 min read
Alibaba Cloud Developer
Dec 20, 2019 · Artificial Intelligence

How AI-Powered Hand Gesture Detection Drove a Double‑11 Celebrity Rock‑Paper‑Scissors Game

This article details how Alibaba leveraged AI-driven hand‑gesture detection and a lightweight SSD‑based object detection model to create an interactive rock‑paper‑scissors game for Double‑11, addressing challenges of undefined gestures, real‑time mobile performance, and data collection, and achieving over 16 million page views and high accuracy.

Mobile AI · Real-time inference · SSD
0 likes · 22 min read
DataFunTalk
Nov 27, 2019 · Artificial Intelligence

Front‑Fusion Based Recognition Pipeline for High‑Precision Map Static Obstacle Detection

This article presents a comprehensive front‑fusion recognition pipeline for high‑definition map static obstacle detection, detailing depth‑aware mapping, precise multi‑sensor calibration, point‑cloud registration, and semi‑supervised learning techniques that improve detection accuracy over traditional image‑only methods.

HD map · Semi-supervised Learning · Sensor Fusion
0 likes · 11 min read
DataFunTalk
Nov 14, 2019 · Artificial Intelligence

Sample Imbalance and Importance in Object Detection: IoU‑Balanced Sampling and Prime Sample Attention

The talk analyzes sample imbalance and importance in object detection, proposes IoU‑balanced negative sampling and instance‑balanced positive sampling, introduces the Prime Sample concept with Hierarchical Local Rank, and presents Importance‑based Sample Reweighting and Classification‑Aware Regression Loss, achieving consistent mAP gains without extra overhead.

Computer Vision · IoU-balanced sampling · MAP
0 likes · 22 min read
Amap Tech
Nov 14, 2019 · Artificial Intelligence

Technical Evolution of Ground Marking Recognition for High‑Precision Maps

AMap’s ground‑marking recognition has progressed from simple threshold methods to advanced deep‑learning pipelines—including two‑stage R‑FCN, cascade detectors with local regression, corner‑point and segmentation hybrids, and LiDAR‑based 3‑D PointRCNN—achieving over 99 % recall and sub‑5 cm positional accuracy for high‑precision map production.

Computer Vision · Deep Learning · ground marking
0 likes · 15 min read
Xianyu Technology
Sep 12, 2019 · Artificial Intelligence

Deep Learning for Automated Module Detection in Taobao 99 Promotion Pages

This study presents a deep‑learning pipeline that employs a Cascade‑RCNN with Feature Pyramid Network to automatically detect and refine modules and their internal elements on Taobao’s 99‑promotion pages, achieving roughly 98 % precision and recall on a thousand‑image validation set and paving the way for broader e‑commerce event applications.

Cascade R-CNN · Computer Vision · Deep Learning
0 likes · 7 min read
Huawei Cloud Developer Alliance
Aug 30, 2019 · Artificial Intelligence

Distinguish Leopards vs Jaguars Using ModelArts AutoML

This guide walks you through configuring Huawei Cloud ModelArts, creating an AutoML image‑classification project, labeling leopard and jaguar photos, training a model, deploying it as an online service, and testing predictions to accurately differentiate between leopards and jaguars.

Huawei Cloud · Image Classification · ModelArts
0 likes · 5 min read
Xianyu Technology
Jul 9, 2019 · Artificial Intelligence

Complex Background Content Extraction Using Detection and GAN Networks

The proposed UI2CODE pipeline first recalls UI elements with an object detector, then uses gradient cues to separate simple from complex regions and applies an SRGAN to restore foreground details in challenging backgrounds, achieving higher precision, recall, and localization accuracy than GrabCut and DeepLab, though it demands extensive multi‑scale training data.

GAN · Image Processing · ai
0 likes · 4 min read
DataFunTalk
Jun 26, 2019 · Artificial Intelligence

Pony.ai Perception System: Combining Traditional and Deep Learning Methods for 2D and 3D Object Detection

This article outlines Pony.ai's perception pipeline, comparing traditional and deep‑learning approaches for 2D and 3D object detection, detailing sensor fusion, detection methods, challenges such as occlusion and distance estimation, and how hybrid techniques improve accuracy for autonomous driving.

3D detection · Sensor Fusion · autonomous driving
0 likes · 11 min read
NetEase Media Technology Team
Apr 26, 2019 · Artificial Intelligence

Intelligent Cover Image Selection System for News Articles: Image Quality Assessment and Smart Cropping

The article describes an intelligent cover‑image selection system for NetEase News that automatically filters unsuitable illustrations, assesses image quality with a pairwise‑trained deep model across clarity, color and composition, and smartly crops images using aspect‑ratio‑aware object detection, dramatically cutting manual editing and enabling confidence‑based automatic publishing.

Computer Vision · Image Cropping · Neural Network
0 likes · 11 min read
HomeTech
Apr 18, 2019 · Artificial Intelligence

An Overview of Image Processing Techniques and Common Tools for Beginners

This article provides a concise introduction to image processing, covering its hierarchical structure, fundamental techniques such as classification, detection, segmentation, geometric transformation, and the most widely used libraries and deep‑learning frameworks for newcomers.

Computer Vision · Image Classification · Image Processing
0 likes · 9 min read
Tencent Cloud Developer
Apr 16, 2019 · Artificial Intelligence

Building Image Recognition Systems: From Basics to Advanced AI Techniques

This article summarizes a computer‑vision salon where Dr. Ji Yongnan explains imaging pipelines, traditional feature‑based methods, deep‑learning breakthroughs, Tencent Cloud AI services, real‑world case studies, and answers audience questions about machine‑vision versus computer‑vision and data‑scarcity challenges.

AI applications · Computer Vision · Deep Learning
0 likes · 18 min read
Hulu Beijing
Apr 2, 2019 · Artificial Intelligence

From Object Detection to Language Models: A Deep Dive into AI Advances

This article surveys the evolution of object detection models—comparing one‑stage and two‑stage approaches, their performance trade‑offs, and recent state‑of‑the‑art methods—while also outlining key concepts and breakthroughs in natural language processing, highlighting the impact of deep‑learning models such as BERT.

AI research · BERT · Deep Learning
0 likes · 14 min read
DataFunTalk
Mar 15, 2019 · Artificial Intelligence

A Comprehensive Overview of Deep Learning Applications in Computer Vision

This article provides an extensive review of deep learning techniques applied to computer vision, covering the evolution of CNN architectures, image and video processing tasks, 2.5‑D and 3‑D reconstruction, object detection, segmentation, tracking, SLAM, and various practical applications such as AR, content retrieval, and autonomous driving.

CNN · Computer Vision · Image Processing
0 likes · 22 min read
MaGe Linux Operations
Nov 16, 2018 · Artificial Intelligence

Real-Time Object Detection with OpenCV, Python, and Deep Learning

This tutorial walks through extending a deep‑learning object detector to process live video streams using OpenCV and Python, covering setup, command‑line arguments, model loading, frame‑by‑frame detection, drawing bounding boxes, FPS measurement, and performance tips.

Computer Vision · Video Stream · object detection
0 likes · 9 min read
Sohu Tech Products
Oct 24, 2018 · Artificial Intelligence

Intelligent News Image Formatter: AI‑Based Cropping and Selection System for News List Images

This article introduces the Intelligent News Formatter, an AI‑driven system that tackles news‑app list‑image problems by using face detection, object detection, deep‑learning based cropping, image quality filtering, and similarity removal to automatically produce aesthetically pleasing and information‑rich thumbnails.

Face Detection · Image Processing · ai
0 likes · 14 min read
Qunar Tech Salon
Sep 11, 2018 · Artificial Intelligence

Overview of Deep Learning Object Detection Methods and Detailed Implementation of Faster R‑CNN

This article reviews major deep‑learning object detection approaches, including the one‑stage YOLO and SSD and the two‑stage R‑CNN, Fast R‑CNN, and Faster R‑CNN, then provides a step‑by‑step explanation of Faster R‑CNN’s architecture, region‑proposal network, RoI pooling, loss functions, and sample PyTorch code.

Computer Vision · Faster R-CNN · PyTorch
0 likes · 20 min read