Tagged articles

object detection

112 articles · Page 1 of 2

Apr 1, 2026 · Artificial Intelligence

Achieving Pro‑Level Vision Detection with Minimal Cost: Fine‑Tuning Amazon Nova Lite

By fine‑tuning Amazon Nova Lite 1.0 on Amazon Bedrock, the study shows how a small training dataset can markedly improve instruction following and cut the number of detection boxes by up to 92%, while reaching Pro‑grade accuracy in aerial group detection and low‑light monitoring at a fraction of the cost.

Amazon Bedrock · Amazon Nova Lite · computer vision

0 likes · 20 min read

AIWalker

Mar 16, 2026 · Artificial Intelligence

DETR Drops Hungarian Matching: Double Training Speed, +4.2 AP on Large Objects

Beyond-Hungarian replaces the costly Hungarian assignment in DETR with a differentiable, query‑free matching scheme that halves training latency, boosts large‑object AP by 4.2 points, and introduces a GT‑Probe module and dual‑loss framework, while detailing trade‑offs, ablations, and future challenges.

DETR · GT-Probe · Hungarian matching

0 likes · 14 min read

AIWalker

Mar 11, 2026 · Artificial Intelligence

Why 90% of DETR Queries Stay Idle and How PaQ‑DETR Boosts mAP by 4.2%

The article dissects the query‑activation imbalance in DETR‑based detectors, explains PaQ‑DETR’s pattern‑sharing and quality‑aware assignment mechanisms, and shows how these jointly raise detection mAP by up to 4.2% on COCO with less than 5% extra FLOPs.

DETR · PaQ-DETR · object detection

0 likes · 15 min read

Code Mala Tang

Mar 5, 2026 · Artificial Intelligence

Master YOLOv12: A Step‑by‑Step Guide to Build, Train, and Deploy Custom Models

This tutorial walks readers through the fundamentals of YOLOv12, covering model variants, dataset preparation with Roboflow, optional FlashAttention acceleration, installation, model selection, training commands, post‑training tasks such as tracking, validation, inference, exporting to ONNX, and benchmarking, all with concrete code snippets and practical tips.

FlashAttention · Model Training · Python

0 likes · 8 min read

Code Mala Tang

Mar 1, 2026 · Artificial Intelligence

Why YOLO Dominates Real-Time Object Detection: A Complete Guide

This article provides a comprehensive overview of the YOLO (You Only Look Once) algorithm, explaining its core principles, architecture, version history, training workflow, real‑world applications, strengths, and current limitations for modern computer‑vision tasks.

Deep Learning · Real-time · YOLO

0 likes · 9 min read

xkx's Tech General Store

Jan 27, 2026 · Artificial Intelligence

AI Era Survival: Using YOLOv3 for Accurate Pig Detection

The article explains how YOLOv3’s architectural upgrades—Darknet‑53 backbone, three‑scale feature fusion, refined anchors and multi‑label classification, plus dynamic input sizing—enable a pig‑recognition model trained on 2,456 images to achieve up to 20% higher detection rates and AP scores of 0.673–0.981.

Deep Learning · Model Training · Pig Detection

0 likes · 8 min read

AI Frontier Lectures

Jan 15, 2026 · Artificial Intelligence

What Makes YOLO26 the Next Leap in Edge AI Object Detection?

YOLO26, the latest Ultralytics release, introduces a unified model family with five sizes, removes distribution focal loss, offers end‑to‑end inference without NMS, adds progressive loss balancing and the MuSGD optimizer, and delivers up to 43% faster CPU performance, making it ideal for edge and real‑world vision applications.

Model Optimization · YOLO26 · edge AI

0 likes · 12 min read

xkx's Tech General Store

Jan 12, 2026 · Artificial Intelligence

How Traditional Programmers Can Thrive in the AI Era: Understanding YOLOv2 Architecture and Implementation

This article walks through YOLOv2’s eight core upgrades over YOLOv1, explains the design rationale behind each change, provides detailed PyTorch code for the backbone, neck, head and prediction layers, demonstrates training on COCO, and outlines further optimization directions for real‑world object detection.

PyTorch · ResNet · YOLOv2

0 likes · 16 min read

php Courses

Jan 7, 2026 · Backend Development

How to Use PHP and OpenCV for Real-Time Face Detection with a Webcam

This guide walks through installing OpenCV and php-facedetect, writing PHP code to capture webcam images, crop them, run Pico face detection, and overlay bounding boxes, providing a complete example for object detection using PHP.

Webcam · face detection · object detection

0 likes · 5 min read

xkx's Tech General Store

Dec 30, 2025 · Artificial Intelligence

From Theory to Practice: Reproducing YOLOv1 – A Step‑by‑Step Guide for Traditional Programmers

This article provides a comprehensive, hands‑on walkthrough of YOLOv1—from its single‑stage detection principles and core architectural questions to a full PyTorch implementation, training pipeline, common pitfalls, and a live camera demo—targeted at developers transitioning into AI.

Deep Learning · PyTorch · ResNet

0 likes · 10 min read

Liangxu Linux

Nov 6, 2025 · Artificial Intelligence

8 Must‑Explore Open‑Source Projects: AI Prompt Tools, Voice Transcription, Browser Engine & More

This article introduces eight noteworthy open‑source projects—including an interactive prompt‑engineering tutorial, Claude Cookbooks, an offline speech‑to‑text tool, an eBook‑to‑audiobook converter, the Servo browser engine, a free programming‑books collection, a real‑time object‑detection model, and other popular repositories—each with brief descriptions and GitHub links.

AI tools · GitHub · Prompt engineering

0 likes · 7 min read

HyperAI Super Neural

Sep 29, 2025 · Artificial Intelligence

8 Popular Remote Sensing Object Detection Datasets with One-Click Downloads

This article presents a curated list of eight widely used remote sensing object detection datasets covering indoor scenes, landslides, drone imagery, crop diseases, safety vests, human fractures, urban issues, and plant diseases, each with size estimates and direct download links for researchers.

AI · computer vision · datasets

0 likes · 10 min read

AIWalker

Sep 24, 2025 · Artificial Intelligence

Top 2025 Object Detection Research Paths: From Grounding DINO 1.5 to Open‑Set Breakthroughs

The article outlines four key innovation avenues—architecture redesign, task expansion, information fusion, and paradigm shift—highlighting recent works such as Mr. DETR, Grounding DINO 1.5, SM3Det, and RoboFusion, and offers a curated list of 176 cutting‑edge object‑detection papers with code and datasets for free.

Deep Learning · model architecture · object detection

0 likes · 8 min read

php Courses

Aug 22, 2025 · Backend Development

How to Use PHP’s is_object() to Distinguish Objects from Other Types

This article explains PHP’s is_object() function, detailing its syntax, parameters, and return values, and demonstrates through code examples how to check whether variables such as objects and arrays are objects, helping developers avoid type errors at runtime.

PHP · Type Checking · is_object

0 likes · 3 min read

AIWalker

Aug 19, 2025 · Artificial Intelligence

Easy Ways to Boost YOLO: Systematic Review of Versions and Use Cases

This article systematically reviews every YOLO version, classifies five major improvement directions—architecture enhancements, efficiency optimizations, multi‑task learning, temporal modeling, and domain‑specific customizations—provides concrete paper references, code links, and dataset resources to help researchers and engineers quickly locate and apply the most effective techniques.

Deep Learning · YOLO · model improvement

0 likes · 8 min read

Amap Tech

Jul 14, 2025 · Artificial Intelligence

Zero-Shot Domain Adaptation for Object Detection: How UPRE Boosts Cross-Domain Performance

The UPRE framework introduces multi‑view domain prompts and unified representation enhancement to achieve zero‑shot domain adaptation for object detection, dramatically improving detection accuracy on unseen target domains across diverse visual scenarios.

Prompt engineering · cross-domain learning · object detection

0 likes · 10 min read

Amap Tech

Jul 14, 2025 · Artificial Intelligence

How UPRE Achieves Zero-Shot Domain Adaptation for Object Detection with Unified Prompts

The UPRE paper, presented at ICCV, introduces a multi‑view domain prompt and a unified representation enhancement to enable zero‑shot domain adaptation for object detection, achieving state‑of‑the‑art performance across diverse weather, geographic, and synthetic‑to‑real scenarios.

Prompt engineering · computer vision · object detection

0 likes · 10 min read

Rare Earth Juejin Tech Community

Jun 27, 2025 · Artificial Intelligence

Image Encryption, Watermarking, Detection & Green Screen Removal in Python

This tutorial walks through Python-based computer‑vision techniques—including XOR‑based image encryption, mask and ROI methods, digital watermark embedding via bit‑plane and LSB, sensitivity‑driven object detection, and HSV‑based green‑screen removal—providing complete code snippets and practical guidance for rapid AI‑assisted learning.

Python · computer vision · green screen removal

0 likes · 17 min read

AIWalker

May 26, 2025 · Artificial Intelligence

VisionReasoner: RL‑Unified Model Beats YOLO‑World Detection, Segmentation, Counting

VisionReasoner presents a reinforcement‑learning‑driven unified framework that simultaneously tackles detection, segmentation, and counting tasks, employing a novel multi‑target cognition strategy and efficient Hungarian‑based matching, and demonstrates substantial gains—29.1% on COCO detection, 22.1% on ReasonSeg, and 15.3% on CountBench—using only 7,000 training samples.

Multi-Task Learning · Segmentation · VisionReasoner

0 likes · 20 min read

AIWalker

May 22, 2025 · Artificial Intelligence

VisionReasoner: RL‑Unified System Beats YOLO‑World on Detection, Segmentation, Counting

VisionReasoner introduces a reinforcement‑learning‑driven unified framework that simultaneously handles detection, segmentation, and counting tasks within a single model, achieving 29.1% higher COCO detection AP, 22.1% better ReasonSeg segmentation, and 15.3% improvement on CountBench, while requiring only 7,000 training samples and offering efficient multi‑target matching via batch computation and the Hungarian algorithm.

LVLM · Object Counting · VisionReasoner

0 likes · 19 min read

AIWalker

May 18, 2025 · Artificial Intelligence

YOLOE: Open‑Source Real‑Time Anything Detector Beats YOLO‑World v2

YOLOE unifies object detection and segmentation in a single efficient model that supports text, visual, and prompt‑free inference, introduces RepRTA, SAVPE, and LRPC strategies, and achieves higher AP with up to three‑fold lower training cost and 1.4× faster inference on GPUs and mobile devices, as demonstrated by extensive LVIS and COCO experiments.

Prompt engineering · Real-time · YOLOE

0 likes · 29 min read

AIWalker

May 14, 2025 · Artificial Intelligence

How HGO‑YOLO Achieves 87.4% Accuracy at 56 FPS with Only 4.6 MB Parameters

This paper presents HGO‑YOLO, a lightweight real‑time anomaly‑behavior detector that integrates HGNetv2 and GhostConv into YOLOv8, achieving 87.4% mAP with just 4.6 MB of parameters and 56 FPS on CPU, and validates its performance across multiple datasets and hardware platforms.

Anomaly Detection · Lightweight Models · YOLO

0 likes · 25 min read

AIWalker

May 12, 2025 · Artificial Intelligence

DefMamba: A Deformable Multi‑Scale Visual Foundation Model that Boosts Vision Tasks

DefMamba introduces a multi‑scale backbone, deformable Mamba modules, and a dynamic scanning strategy to preserve image spatial structure, achieving state‑of‑the‑art performance on image classification, object detection, and semantic segmentation benchmarks.

DefMamba · Semantic Segmentation · computer vision

0 likes · 23 min read

DataFunTalk

Apr 18, 2025 · Artificial Intelligence

Applying ByteDance’s Doubao‑1.5 Vision Model for Image Counting and Automated Annotation

The article demonstrates how ByteDance’s new Doubao‑1.5 multimodal model can be used to locate and count objects in images—such as sushi plates, street signs, and cartoon hats—by generating coordinates and overlaying visual annotations through a concise Python script.

AI · Doubao · Image Annotation

0 likes · 5 min read

AIWalker

Mar 13, 2025 · Artificial Intelligence

YOLOE: Real‑Time Open‑World Object Detection and Segmentation Unveiled

The paper introduces YOLOE, a new YOLO‑based model that supports text, visual, and no‑prompt open‑world detection and segmentation, detailing its lightweight RepRTA, SAVPE, and LRPC modules and showing benchmark gains in speed and zero‑shot performance on LVIS and COCO.

Benchmark · YOLOE · computer vision

0 likes · 9 min read

AIWalker

Mar 1, 2025 · Artificial Intelligence

Lightweight Remote Sensing Backbone LSKNet and Strip R-CNN: Design, Benchmarks, and Open‑Source Release

The NK‑Remote repository introduces LSKNet and Strip R‑CNN, two lightweight yet powerful models for remote‑sensing object detection that dynamically adjust receptive fields and combine square‑and‑strip convolutions, achieving state‑of‑the‑art performance on benchmarks such as DOTA, FAIR1M, HRSC2016, and DIOR.

Benchmark · Deep Learning · JDet

0 likes · 9 min read

AsiaInfo Technology: New Tech Exploration

Feb 24, 2025 · Artificial Intelligence

Can Multi‑Teacher Distillation Overcome Catastrophic Forgetting in Continual Learning?

This paper proposes a multi‑teacher distillation framework for continual learning that combines active data rehearsal with feature‑decoupled distillation, demonstrating superior performance on PASCAL VOC and COCO benchmarks while mitigating catastrophic forgetting and balancing stability‑plasticity trade‑offs.

AI · Catastrophic Forgetting · Continual Learning

0 likes · 12 min read

AIWalker

Feb 23, 2025 · Artificial Intelligence

D-FINE Redefines Bounding-Box Regression to Reach State-of-the-Art Real-Time Detection

D-FINE introduces Fine-grained Distribution Refinement and Global Optimal Localization Self-Distillation to overhaul DETR's bounding-box regression, achieving 54‑59% AP on COCO and Objects365 at 78‑124 FPS while surpassing YOLO and RT-DETR in both accuracy and speed.

DETR · Real-time · Self‑Distillation

0 likes · 25 min read

AIWalker

Feb 19, 2025 · Artificial Intelligence

YOLOv12 Unveiled: Boosted Performance and Speed for Real‑Time Detection

YOLOv12 introduces an attention‑centric architecture, a lightweight regional attention module, and the R‑ELAN aggregation network, delivering consistent mAP gains and lower latency across N, S, M, L and X model scales while surpassing previous YOLO versions and other real‑time detectors.

Attention Mechanism · Benchmark · Real-time

0 likes · 8 min read

Python Programming Learning Circle

Dec 19, 2024 · Artificial Intelligence

Overview of Microsoft’s Open‑Source Computer Vision Recipes Library

The article introduces Microsoft’s open‑source Computer Vision Recipes library, describing its purpose, target audience, repository links, supported vision scenarios such as image classification, similarity, detection, key‑point, segmentation, action recognition, multi‑object tracking and crowd counting, and provides guidance on using PyTorch, Azure and GPU resources.

Azure · Open-source · PyTorch

0 likes · 7 min read

php Courses

Dec 18, 2024 · Artificial Intelligence

Using PHP to Access the Camera and Perform Face Detection with OpenCV

This article explains how to install OpenCV and php-facedetect libraries, write PHP code to capture images from a webcam, perform face detection using the pico library, and display the results, providing a step‑by‑step guide for object detection with PHP.

Camera · PHP · computer vision

0 likes · 5 min read

Alibaba Cloud Developer

Nov 22, 2024 · Artificial Intelligence

Master YOLOv8: End-to-End Guide to Object Detection, Training, and Deployment

This comprehensive tutorial walks you through YOLOv8 object detection—from environment setup and dataset preparation to model training, validation, testing, and conversion to ONNX and TensorRT—providing clear commands, code snippets, and visual results for each step.

Model Training · ONNX · TensorRT

0 likes · 8 min read

php Courses

Oct 11, 2024 · Artificial Intelligence

Using PHP to Access a Webcam and Perform Object (Face) Detection with OpenCV

This tutorial explains how to install OpenCV and php-facedetect, write PHP code to capture images from a webcam, perform face detection, and display the results, providing step‑by‑step commands and a complete example script.

PHP · computer vision · face detection

0 likes · 6 min read

Rare Earth Juejin Tech Community

Aug 22, 2024 · Artificial Intelligence

Understanding Faster R-CNN: Architecture, Training, and Experimental Results

This article provides an in‑depth overview of the Faster R‑CNN object detection framework, covering its background, key innovations such as the Region Proposal Network, detailed algorithmic principles, training procedures, experimental results on PASCAL VOC and MS COCO, and a reproducible PyTorch implementation.

Deep Learning · Faster R-CNN · PyTorch

0 likes · 14 min read

160 Technical Team

Jul 29, 2024 · Artificial Intelligence

How YOLO Transforms Medical Report Screening and Occlusion Detection

Leveraging the YOLO family of deep‑learning models, this study demonstrates efficient filtering of irrelevant medical images, accurate classification of textual reports, and robust detection of occluding objects, achieving high precision and speed on both CPU and GPU, while outlining training details, performance metrics, and future improvements.

Deep Learning · YOLO · medical imaging

0 likes · 17 min read

Rare Earth Juejin Tech Community

May 10, 2024 · Artificial Intelligence

Real-Time Dog Detection in Browser Using TensorFlow.js and MobileNet V2

This guide demonstrates how to build a web‑based real‑time dog detector that accesses the phone camera via the browser, processes video frames with TensorFlow.js and a pre‑trained COCO‑SSD MobileNet V2 model, and plays an audio alert when a dog is recognized, all deployed on an Android device using Termux.

Android · MobileNet · TensorFlow.js

0 likes · 8 min read

php Courses

Apr 16, 2024 · Artificial Intelligence

Using PHP and OpenCV for Camera‑Based Object Detection

This tutorial explains how to install required libraries, write PHP code that captures images from a webcam, uses OpenCV and php‑facedetect to detect faces, and displays the results with annotated bounding boxes, providing a foundation for further object detection projects.

Camera · PHP · computer vision

0 likes · 6 min read

Alibaba Cloud Big Data AI Platform

Mar 20, 2024 · Artificial Intelligence

How M2Doc Boosts Document Layout Analysis with Plug‑in Multimodal Fusion

This article introduces M2Doc, a plug‑in multimodal fusion approach that equips visual‑only object detectors with textual and semantic awareness, detailing its early‑ and late‑fusion modules, experimental validation on DocLayNet, M6Doc and PubLayNet, and future research directions.

AI · M2Doc · document layout analysis

0 likes · 8 min read

Huolala Tech

Jan 25, 2024 · Artificial Intelligence

How Open‑Vocabulary Detection and Segment‑Anything Are Revolutionizing Visual AI at Huolala

This article reviews traditional computer‑vision tasks—classification, detection, and segmentation—highlights their limitations, introduces open‑vocabulary detection and segment‑anything models such as GLIP, Grounding DINO, and SAM, and details how Huolala applies these advances to driver‑license, packing, and vehicle‑sticker inspections for safer, more efficient AI‑driven operations.

Segmentation · computer vision · object detection

0 likes · 20 min read

DataFunTalk

Nov 24, 2023 · Artificial Intelligence

Open Vocabulary Detection Contest 2023: Summary of Winning Teams' Technical Solutions

The article reviews the Open Vocabulary Detection Contest organized by the Chinese Society of Image and Graphics and 360 AI Institute, describing the competition setup, dataset characteristics, and detailed winning approaches that combine Detic, CLIP, prompt learning, and multi‑stage pipelines to achieve strong few‑shot and zero‑shot object detection performance.

CLIP · Open-Vocabulary Detection · competition

0 likes · 17 min read

Alibaba Cloud Big Data AI Platform

Oct 8, 2023 · Artificial Intelligence

Why the Scale‑Aware Modulation Transformer Outperforms CNNs and Vision Transformers with Fewer Parameters

The Scale‑Aware Modulation Transformer (SMT) introduces a lightweight SAM module and an Evolutionary Hybrid Network that together achieve higher accuracy on ImageNet, COCO, and ADE20K while using significantly fewer parameters and FLOPs than existing CNN and Transformer baselines.

SMT · Scale‑Aware Modulation · Semantic Segmentation

0 likes · 12 min read

Huolala Tech

Sep 28, 2023 · Artificial Intelligence

How Mobile AI Transforms Logistics: Real‑World Image Algorithms at Huolala

This article explores Huolala's deployment of mobile AI image algorithms for driver document verification and vehicle sticker inspection, detailing model design, lightweighting, hybrid processing, data stream handling, and on‑device deployment that boost efficiency, privacy, and real‑time performance in logistics operations.

edge computing · image recognition · logistics

0 likes · 13 min read

Rare Earth Juejin Tech Community

Aug 17, 2023 · Artificial Intelligence

Getting Started with YOLOv8 on the Ultralytics Platform: Installation, Command‑Line Usage, and Model Training

This article introduces the YOLOv8 object‑detection framework on the Ultralytics platform, covering environment setup, command‑line and Python APIs for inference, model‑file options, result interpretation, data annotation, training procedures, and exporting models to various deployment formats.

Model Training · Python · Ultralytics

0 likes · 14 min read

Xiaohongshu Tech REDtech

Jun 20, 2023 · Artificial Intelligence

Open-Vocabulary Object Attribute Recognition with OvarNet: A Unified Framework for Detection and Attribute Classification

At CVPR 2023 the Xiaohongshu team presented OvarNet, a unified one‑stage Faster‑RCNN model built on CLIP that uses prompt learning and knowledge distillation to jointly detect objects and recognize open‑vocabulary attributes, achieving state‑of‑the‑art results on VAW, MS‑COCO, LSA and OVAD datasets.

Knowledge Distillation · Multimodal Learning · attribute recognition

0 likes · 12 min read

Network Intelligence Research Center (NIRC)

Jun 5, 2023 · Artificial Intelligence

How DETR and Its Successors Evolve: A Deep Dive into the DETR Series for Object Detection

This article reviews the original DETR model, analyzes its strengths and weaknesses, and then examines two major follow‑up works, Deformable‑DETR and DAB‑DETR, explaining how they rework the attention mechanism with deformable attention and dynamic anchor boxes to accelerate convergence and improve small‑object detection.

DAB-DETR · DETR · Deformable-DETR

0 likes · 12 min read

360 Tech Engineering

May 6, 2023 · Artificial Intelligence

Open‑Vocabulary Object Detection: Overview of OVR‑CNN, RegionCLIP, and CORA

This article reviews the evolution of open‑vocabulary object detection, describing the OVR‑CNN paradigm, the RegionCLIP enhancements, and the CORA model with region prompting and anchor pre‑matching, and discusses their impact on future multimodal AI systems.

CLIP · CORA · OVR-CNN

0 likes · 14 min read

DataFunTalk

Apr 25, 2023 · Artificial Intelligence

DAMO-YOLO: An Efficient Target Detection Framework with NAS, Multi‑Scale Fusion, and Full‑Scale Distillation

This article introduces DAMO‑YOLO, a high‑performance object detection framework that combines low‑cost model customization via MAE‑NAS, an Efficient RepGFPN with HeavyNeck for superior multi‑scale detection, and a full‑scale distillation technique, delivering faster inference, lower FLOPs, and higher accuracy across diverse industrial scenarios.

Distillation · Model Optimization · NAS

0 likes · 15 min read

DataFunSummit

Apr 13, 2023 · Artificial Intelligence

ModelScope CV Model Overview: Visual Detection and Keypoint Applications

This article presents a comprehensive overview of ModelScope's computer‑vision models, detailing visual detection and keypoint solutions—including VitDet, YOLOX, res2net, HRNet, and 3D pose models—their architectures, performance highlights, real‑world applications, and future development plans.

AI models · ModelScope · keypoint detection

0 likes · 11 min read

Sohu Tech Products

Apr 12, 2023 · Artificial Intelligence

Using Apple CreateML for Object Detection: From Data Annotation to Model Deployment

This article walks through the complete workflow of building an iOS object‑detection model with Apple’s CreateML, covering data collection, JSON annotation, using Roboflow for labeling, configuring training parameters, exporting the model, and integrating it into a Swift app via the Vision framework.

CreateML · Swift · Vision

0 likes · 11 min read

Baidu Tech Salon

Apr 7, 2023 · Artificial Intelligence

Ambiguity-Resistant Semi-supervised Learning (ARSL) for Single-stage Object Detection

ARSL, an ambiguity‑resistant semi‑supervised learning framework for single‑stage object detection, introduces Joint‑Confidence Estimation and Task‑Separation Assignment to resolve selection and assignment ambiguities in pseudo‑labels, thereby markedly improving pseudo‑label quality and achieving state‑of‑the‑art AP gains on COCO benchmarks.

ARSL · Semi-supervised Learning · computer vision

0 likes · 8 min read

Sohu Tech Products

Mar 22, 2023 · Artificial Intelligence

Using Apple CreateML for Object Detection: From Data Annotation to Model Deployment

This article walks through the complete workflow of building an iOS object‑detection model with Apple CreateML, covering data collection, JSON‑based annotation (using Roboflow), training configuration, evaluation metrics, model export, and integration into a Swift app via the Vision framework.

CreateML · Vision · data annotation

0 likes · 11 min read

Baidu Geek Talk

Mar 16, 2023 · Artificial Intelligence

PaddleDetection v2.6 Release: PP-YOLOE Family Expansion and Advanced Detection Algorithms

PaddleDetection v2.6 expands the PP‑YOLOE family with rotated‑object, small‑object, dense‑object, and ultra‑lightweight edge‑GPU models, upgrades the PP‑Human and PP‑Vehicle toolboxes, releases semi‑supervised, few‑shot and distillation learning methods, adds numerous state‑of‑the‑art algorithms, and improves infrastructure with Python 3.10, EMA filtering and AdamW support.

Baidu · Deep Learning · PP-YOLOE

0 likes · 14 min read

Zhengcaiyun Technology (政采云技术)

Mar 9, 2023 · Artificial Intelligence

Comprehensive Overview of Object Detection: From Traditional Methods to Modern Deep Learning Models

This article provides a comprehensive overview of object detection, describing traditional sliding‑window approaches, deep‑learning based two‑stage and one‑stage models such as R‑CNN, Faster R‑CNN, YOLO series, and discusses current challenges, improvement directions, and future research trends in the field.

Deep Learning · R-CNN · YOLO

0 likes · 29 min read

Meituan Technology Team

Mar 2, 2023 · Artificial Intelligence

Technical Innovations in YOLOv6 3.0 for Real‑Time Object Detection

YOLOv6 3.0 raises real‑time object detection performance to a new peak with 57.2% AP and 29 FPS on a T4 GPU, surpassing YOLOv7‑E6E, and introduces RepBi‑PAN Neck, Anchor‑Aided Training, and Decoupled Location Distillation to boost accuracy and efficiency.

Anchor-Aided Training · Decoupled Location Distillation · RepBi-PAN

0 likes · 13 min read

Alibaba Cloud Developer

Dec 19, 2022 · Artificial Intelligence

How AI Transforms Football Video Analysis: Detection, Tracking, and Event Recognition

This article explores how artificial intelligence techniques such as deep learning, object detection, multi‑object tracking, and coordinate projection are applied to football video analysis to automatically detect the ball and players, map their positions onto the field, and recognize key events like shots and goals.

AI · computer vision · object detection

0 likes · 16 min read

ELab Team

Dec 6, 2022 · Artificial Intelligence

Mastering CreateML: From Data Prep to Object Detection Models on iOS

This article introduces Apple’s CreateML tool, explains its supported model types, shows how to prepare and augment data, provides a Node.js script for generating synthetic training sets, and walks through training, testing, and integrating an object‑detection model into an iOS app.

CreateML · Swift · data augmentation

0 likes · 17 min read

Alibaba Cloud Big Data AI Platform

Oct 12, 2022 · Artificial Intelligence

Unlock Vision AI: How EasyCV Streamlines Datasets and Model Training

This article introduces EasyCV, an open‑source all‑in‑one visual algorithm platform that abstracts diverse data sources, provides SOTA self‑supervised models, and offers ready‑to‑download datasets for image classification, object detection, segmentation, and pose estimation, complete with configuration examples.

Deep Learning · EasyCV · computer vision

0 likes · 9 min read

Laiye Technology Team

Sep 28, 2022 · Artificial Intelligence

Checkbox Detection and State Classification Using YOLOv5

This article describes a comprehensive solution for detecting checkboxes in document images and determining their selected or unselected status by combining YOLOv5 object detection, synthetic and semi‑synthetic data generation, specialized post‑processing, and association logic to handle varied shapes, positions, and markings.

Data Synthesis · YOLOv5 · checkbox detection

0 likes · 13 min read

Meituan Technology Team

Sep 15, 2022 · Artificial Intelligence

YOLOv6 2.0: Enhanced Object Detection Models and Quantization Solutions

The new YOLOv6 2.0 release upgrades lightweight and medium‑large models with a CSPStackRep backbone, self‑distillation, and a custom quantization pipeline, delivering up to 869 FPS for the quantized YOLOv6‑S and achieving 49.5%/52.5% AP on COCO while halving training time.

COCO benchmark · CSPStackRep · Quantization

0 likes · 6 min read

政采云技术

Aug 11, 2022 · Artificial Intelligence

Semi‑Automatic Annotation with Label Studio and YOLOv5: Installation, Project Setup, and Model Training

This guide explains how to combine the open‑source labeling platform Label Studio with the YOLOv5 object‑detection model to achieve semi‑automatic annotation, covering installation of both tools, project creation, dataset configuration, and training a custom YOLOv5 model on your own data.

Label Studio · Python · Semi-Automatic Annotation

0 likes · 11 min read

Meituan Technology Team

Jun 23, 2022 · Artificial Intelligence

YOLOv6: An Efficient Industrial Object Detection Framework

YOLOv6, developed by Meituan's Vision Intelligence team, introduces a hardware‑friendly backbone, an efficient decoupled head, and advanced training strategies that together achieve up to 35.0% AP at 1242 FPS on COCO while outperforming YOLOv5, YOLOX and other same‑size models across multiple deployment platforms.

SIoU loss · SimOTA · YOLOv6

0 likes · 15 min read

Python Programming Learning Circle

Feb 28, 2022 · Artificial Intelligence

Integrating MobileNet Series into YOLOv4 for Efficient Object Detection

This guide explains how to replace YOLOv4's CSPDarknet53 backbone with MobileNetV1, V2, or V3 networks, detailing the architecture analysis, code implementations, training setup, dataset preparation, and inference procedures for building a lightweight object detection model.

Deep Learning · MobileNet · YOLOv4

0 likes · 26 min read

Code DAO

Dec 22, 2021 · Artificial Intelligence

How Context R-CNN Leverages Temporal Context to Detect Occluded Objects

The article reviews the Context R-CNN paper, which introduces short‑term and long‑term memory banks and an attention mechanism to incorporate temporal context from multiple frames captured by a fixed camera, enabling robust detection of partially occluded, low‑light, distant, or background‑cluttered objects, and shows quantitative gains over standard Faster R‑CNN.

Attention Mechanism · Context R-CNN · Faster R-CNN

0 likes · 6 min read

Code DAO

Nov 30, 2021 · Artificial Intelligence

How to Train a Custom Object Detector with PyTorch Faster R‑CNN

This article provides a step‑by‑step guide to building, training, and evaluating a custom object detection model using PyTorch Faster R‑CNN on a microcontroller dataset, covering data preparation, configuration, model modification, training loops, loss visualization, and inference on new images.

Faster R-CNN · PyTorch · Python

0 likes · 23 min read

Python Programming Learning Circle

Nov 8, 2021 · Artificial Intelligence

YOLOv5 Tutorial: From YOLOv3 to YOLOv5, Code Walkthrough, Model Export (JIT & ONNX) and Usage

This article provides a comprehensive guide on YOLOv5, covering its background from YOLOv3, detailed code analysis of the model architecture, step‑by‑step instructions for running detect.py, configuring yolov5s.yaml, exporting the model to TorchScript JIT and ONNX formats, and practical inference examples using PyTorch and ONNX Runtime.

JIT · ONNX · PyTorch

0 likes · 16 min read

Python Programming Learning Circle

Jul 17, 2021 · Artificial Intelligence

Implementing Object Detection with ImageAI in Just 10 Lines of Python

This tutorial demonstrates how to perform modern object detection using the ImageAI library with only ten lines of Python code, covering the underlying computer‑vision concepts, required dependencies, step‑by‑step installation, and detailed explanation of each code segment.

ImageAI · object detection

0 likes · 8 min read

Youku Technology

Jul 8, 2021 · Artificial Intelligence

Key Findings from Alibaba Moku Lab at ACM MM 2021

At ACM MM 2021, Alibaba’s Moku Lab presented four cutting‑edge studies: an interactive video inpainting system using user doodles, a decoupled IoU regression model for object detection, a spatio‑temporal distortion‑aware video quality assessment framework, and a multimodal emotional relationship recognition dataset and benchmark.

Video Inpainting · computer vision · multimodal emotion recognition

0 likes · 8 min read

Miss Fresh Tech Team

Jul 8, 2021 · Artificial Intelligence

How AI Powers Smart Vending Cabinets: From RFID to Deep Learning Detection

This article details the evolution of intelligent vending cabinets, comparing RFID, gravity, dynamic and static vision solutions, and explains how deep‑learning models, data pipelines, and system architectures enable high‑accuracy, low‑loss product detection and automated operations in modern unmanned retail.

AI · computer vision · neural networks

0 likes · 36 min read

TiPaiPai Technical Team

Jul 2, 2021 · Artificial Intelligence

How ContourNet and CenterNet Revolutionize Text Detection

This article explains the challenges of scene text detection and introduces two state‑of‑the‑art models, ContourNet and CenterNet, detailing their architectural innovations, loss functions, and how they overcome issues like extreme aspect ratios and anchor‑based inefficiencies.

CenterNet · ContourNet · Deep Learning

0 likes · 7 min read

Alimama Tech

May 20, 2021 · Artificial Intelligence

How Alibaba’s AI Powers Brand Risk Detection: Models, Data, and Results

This article details Alibaba's AliMama brand risk identification system, covering the challenges of counterfeit detection, the construction of large‑scale brand datasets, the design of classification, logo detection, and variation models, their optimization, evaluation metrics, and future directions for AI‑driven brand protection.

AI · Alibaba · Deep Learning

0 likes · 22 min read

360 Tech Engineering

Apr 16, 2021 · Artificial Intelligence

Applying YOLOv5 Object Detection for Black, Color, and Normal Screen Classification in Video Frames

This article presents a method that replaces traditional manual video frame quality checks with an automated YOLOv5‑based object detection pipeline, detailing data labeling, model training, loss computation, inference code, and experimental results that show higher accuracy than ResNet for classifying black, color‑screen, and normal frames.

Python · YOLOv5 · image classification

0 likes · 12 min read

360 Quality & Efficiency

Apr 16, 2021 · Artificial Intelligence

Applying YOLOv5 Object Detection for Black, Color, and Blank Screen Classification in Video Frames

This article presents a method that replaces manual visual inspection with an automated YOLOv5‑based object detection pipeline to classify video frames as normal, colorful, or black screens, detailing data annotation, training, loss calculation, inference code, and showing a 97% accuracy improvement over ResNet.

Deep Learning · Python · YOLOv5

0 likes · 11 min read

58 Tech

Mar 24, 2021 · Artificial Intelligence

Automated Detection of Illegal Watermarks in Images Using Deep Learning at 58.com

This article describes how 58.com built an end‑to‑end deep‑learning watermark detection service, covering business needs, data collection and augmentation, model selection and iterative improvements (Faster R‑CNN, SSD, YOLOv3, anchor‑free methods), deployment results, and future research directions.

Image Moderation · Model Optimization · computer vision

0 likes · 14 min read

New Oriental Technology

Nov 9, 2020 · Artificial Intelligence

Understanding YOLOv4 and YOLOv5: Core Elements and Innovations in Object Detection

This article introduces the fundamentals of object detection, explains the latest YOLOv4 and YOLOv5 architectures, and details the essential components—including data preparation, regularization, backbone, neck, and prediction innovations—along with label smoothing and advanced loss functions for improved detection performance.

AI · YOLOv4 · YOLOv5

0 likes · 9 min read

Suning Technology

Oct 15, 2020 · Artificial Intelligence

How AI Powers Offline Product Recognition in Smart Retail Stores

This lecture details the evolution of product recognition algorithms from traditional image classification to deep‑learning‑based object detection, discusses challenges in dense retail scenes, presents solutions like rotated bounding boxes and multi‑source sensor fusion, and explains practical deployment in digital and unmanned stores.

Deep Learning · dense scene · object detection

0 likes · 18 min read

Suning Technology

Oct 2, 2020 · Artificial Intelligence

How Precise Customer‑Flow Algorithms Transform Retail with AI Vision

This article explains how AI‑driven precise customer‑flow algorithms—covering pedestrian detection, full‑scene tracking, and person re‑identification—enable accurate offline traffic analysis, real‑time shopper profiling, and data‑driven store management for modern retail environments.

customer flow · multi-camera tracking · object detection

0 likes · 18 min read

360 Quality & Efficiency

Sep 18, 2020 · Artificial Intelligence

Data Augmentation Techniques for Improving Object Detection Model Robustness

To enhance object detection robustness, the article discusses various data augmentation methods—including rotation, flipping, random cropping, scaling, color jitter, blurring, transparency adjustment, and image partitioning—providing code examples and illustrating their impact on model performance with before‑and‑after results.

Python · computer vision · data augmentation

0 likes · 7 min read

Alibaba Terminal Technology

Jul 1, 2020 · Artificial Intelligence

Detect Front‑End UI Components with Pipcook: A Complete Object‑Detection Guide

This tutorial walks you through using Pipcook to train an object‑detection model that automatically identifies and locates front‑end UI components in screenshots, covering data preparation in Pascal VOC format, pipeline configuration, model training, and inference with sample code.

Pascal VOC · Pipcook · frontend components

0 likes · 12 min read

Taobao Frontend Technology

Jun 2, 2020 · Artificial Intelligence

How AI Can Auto‑Detect UI Components for Seamless Front‑End Code Generation

This article explains how to use deep‑learning object detection to automatically recognize UI components in design drafts, generate a smart JSON description, and convert it into component‑based front‑end code, covering problem analysis, dataset preparation, algorithm selection, model training, evaluation, and deployment.

AI · Pipcook · UI detection

0 likes · 30 min read

Meituan Technology Team

May 21, 2020 · Artificial Intelligence

CenterMask: Single-Shot Instance Segmentation with Point Representation

CenterMask is a single‑shot, anchor‑free instance segmentation framework that predicts a coarse shape from each object’s center point and a full‑image saliency map, multiplies them to produce precise masks, and achieves competitive COCO AP while running faster than two‑stage methods like Mask R-CNN.

CenterMask · Deep Learning · object detection

0 likes · 15 min read

HomeTech

Apr 8, 2020 · Artificial Intelligence

Application of Deep Learning for Cover Image Selection in Autohome Forum Articles

This paper presents a deep learning-based approach for selecting cover images in Autohome forum articles, employing Faster R-CNN for object detection, Mask R-CNN for human keypoint detection, and MobileNetV2 for attribute recognition, achieving an overall accuracy of 81.5%.

Cover Image Selection · MobileNet · keypoint detection

0 likes · 15 min read

Tencent Cloud Developer

Mar 6, 2020 · Artificial Intelligence

WeChat "Scan" Object Detection: Mobile AI Model Design, Optimization, and Deployment

The paper presents a lightweight, anchor‑free CenterNet‑based object‑ness detector for WeChat’s Scan feature, built on a ShuffleNetV2 backbone with enlarged 5×5 depth‑wise convolutions, a streamlined detection head, and a Pyramid Interpolation Module, then quantized, ONNX‑converted and NCNN‑deployed to achieve a 436 KB model running in ~15 ms per frame on an iPhone 8 CPU.

CenterNet · Model Optimization · Real-time inference

0 likes · 12 min read

DataFunTalk

Feb 20, 2020 · Artificial Intelligence

Perception Technology for Autonomous Heavy Trucks: Methods, Challenges, and Production Considerations

This article reviews perception technologies used in autonomous heavy‑truck systems—including lane‑line detection, obstacle detection, and LiDAR sensing—detailing traditional and deep‑learning approaches, practical challenges on high‑speed highways, and the cost, performance, and reliability issues faced when moving these solutions to mass production.

Deep Learning · LiDAR · Perception

0 likes · 16 min read

58 Tech

Jan 15, 2020 · Artificial Intelligence

Mobile AI Vehicle and VIN Recognition: From TensorFlow to TensorFlow Lite Deployment on Android and iOS

This article details how the 58 Used‑Car mobile team built, trained, and optimized TensorFlow‑based object‑detection models for on‑device vehicle and VIN code recognition, covering data preparation, model conversion to TF‑Lite, performance improvements, engineering integration on Android/iOS, and real‑world deployment results.

Android · TensorFlow · TensorFlow Lite

0 likes · 14 min read

Tencent Cloud Developer

Dec 26, 2019 · Artificial Intelligence

WeChat Scan-to-Identify (Scan Object) Feature: Overview, Technical Architecture, Data Construction, and Algorithmic Advances

WeChat’s iOS Scan‑to‑Identify feature lets users point the camera at any product or scene to instantly retrieve related e‑commerce, encyclopedia, or news content. It relies on a four‑pipeline architecture that builds massive annotated and deduplicated databases and combines RetinaNet‑based detection, multi‑task metric learning, and scalable training, deployment, and scheduling platforms, with plans to extend into domains such as facial, vehicle, and plant recognition.

AI · WeChat · computer vision

0 likes · 34 min read

Alibaba Cloud Developer

Dec 20, 2019 · Artificial Intelligence

How AI-Powered Hand Gesture Detection Drove a Double‑11 Celebrity Rock‑Paper‑Scissors Game

This article details how Alibaba leveraged AI-driven hand‑gesture detection and a lightweight SSD‑based object detection model to create an interactive rock‑paper‑scissors game for Double‑11, addressing challenges of undefined gestures, real‑time mobile performance, and data collection, and achieving over 16 million page views and high accuracy.

Real-time inference · SSD · feature pyramid network

0 likes · 22 min read

Alibaba Cloud Developer

Dec 5, 2019 · Artificial Intelligence

Mastering Object Detection: From R-CNN to YOLO and Real-World AI Applications

This article introduces the fundamentals of object detection, explains key models such as R-CNN, Fast R-CNN, Faster R-CNN, YOLO, and SSD, and showcases how Alibaba's AI technology is applied to photovoltaic quality inspection to boost efficiency and accuracy in industry.

AI · Fast R-CNN · Faster R-CNN

0 likes · 8 min read

DataFunTalk

Nov 27, 2019 · Artificial Intelligence

Front‑Fusion Based Recognition Pipeline for High‑Precision Map Static Obstacle Detection

This article presents a comprehensive front‑fusion recognition pipeline for high‑definition map static obstacle detection, detailing depth‑aware mapping, precise multi‑sensor calibration, point‑cloud registration, and semi‑supervised learning techniques that improve detection accuracy over traditional image‑only methods.

AI · HD map · Semi-supervised Learning

0 likes · 11 min read

DataFunTalk

Nov 14, 2019 · Artificial Intelligence

Sample Imbalance and Importance in Object Detection: IoU‑Balanced Sampling and Prime Sample Attention

The talk analyzes sample imbalance and importance in object detection, proposes IoU‑balanced negative sampling and instance‑balanced positive sampling, introduces the Prime Sample concept with Hierarchical Local Rank, and presents Importance‑based Sample Reweighting and Classification‑Aware Regression Loss, achieving consistent mAP gains without extra overhead.

IoU-balanced sampling · computer vision · hard mining

0 likes · 22 min read

Amap Tech

Nov 14, 2019 · Artificial Intelligence

Technical Evolution of Ground Marking Recognition for High‑Precision Maps

AMap’s ground‑marking recognition has progressed from simple threshold methods to advanced deep‑learning pipelines—including two‑stage R‑FCN, cascade detectors with local regression, corner‑point and segmentation hybrids, and LiDAR‑based 3‑D PointRCNN—achieving over 99 % recall and sub‑5 cm positional accuracy for high‑precision map production.

Deep Learning · computer vision · ground marking

0 likes · 15 min read

Xianyu Technology

Sep 12, 2019 · Artificial Intelligence

Deep Learning for Automated Module Detection in Taobao 99 Promotion Pages

This study presents a deep‑learning pipeline that employs Cascade R‑CNN with a Feature Pyramid Network to automatically detect and refine modules and their internal elements on Taobao’s 99‑promotion pages, achieving roughly 98 % precision and recall on a thousand‑image validation set and paving the way for broader e‑commerce event applications.

Cascade R-CNN · Deep Learning · Taobao

0 likes · 7 min read

Huawei Cloud Developer Alliance

Aug 30, 2019 · Artificial Intelligence

Distinguish Leopards vs Jaguars Using ModelArts AutoML

This guide walks you through configuring Huawei Cloud ModelArts, creating an AutoML image‑classification project, labeling cat and leopard photos, training a model, deploying it as an online service, and testing predictions to accurately differentiate between leopards and jaguars.

AI · Huawei Cloud · ModelArts

0 likes · 5 min read

Xianyu Technology

Jul 9, 2019 · Artificial Intelligence

Complex Background Content Extraction Using Detection and GAN Networks

The proposed UI2CODE pipeline first recalls UI elements with an object detector, then uses gradient cues to separate simple from complex regions and applies an SRGAN to restore foreground details in challenging backgrounds, achieving higher precision, recall, and localization than GrabCut and DeepLab, though it demands extensive multi‑scale training data.

AI · GAN · Image processing

0 likes · 4 min read

DataFunTalk

Jun 26, 2019 · Artificial Intelligence

Pony.ai Perception System: Combining Traditional and Deep Learning Methods for 2D and 3D Object Detection

This article outlines Pony.ai's perception pipeline, comparing traditional and deep‑learning approaches for 2D and 3D object detection, detailing sensor fusion, detection methods, challenges such as occlusion and distance estimation, and how hybrid techniques improve accuracy for autonomous driving.

3D detection · Perception · autonomous driving

0 likes · 11 min read

Alibaba Cloud Developer

Jun 12, 2019 · Artificial Intelligence

How YOLOv3 Boosts Video Content Advertising on Youku: A Real‑World Case Study

By integrating YOLOv3 video object detection into Youku’s ad platform, the team replaced traditional subtitle‑based and scene‑based placements with precise object‑level targeting, achieving higher relevance, expanded inventory, and a 20% click‑through increase despite 3.5× higher exposure.

Deep Learning · YOLOv3 · computer vision

0 likes · 14 min read

NetEase Media Technology Team

Apr 26, 2019 · Artificial Intelligence

Intelligent Cover Image Selection System for News Articles: Image Quality Assessment and Smart Cropping

The article describes an intelligent cover‑image selection system for NetEase News that automatically filters unsuitable illustrations, assesses image quality with a pairwise‑trained deep model across clarity, color and composition, and smartly crops images using aspect‑ratio‑aware object detection, dramatically cutting manual editing and enabling confidence‑based automatic publishing.

Image Cropping · Neural Network · computer vision

0 likes · 11 min read

HomeTech

Apr 18, 2019 · Artificial Intelligence

An Overview of Image Processing Techniques and Common Tools for Beginners

This article provides a concise introduction to image processing, covering its hierarchical structure, fundamental techniques such as classification, detection, segmentation, geometric transformation, and the most widely used libraries and deep‑learning frameworks for newcomers.

Image processing · computer vision · image classification

0 likes · 9 min read

Tencent Cloud Developer

Apr 16, 2019 · Artificial Intelligence

Building Image Recognition Systems: From Basics to Advanced AI Techniques

This article summarizes a computer‑vision salon where Dr. Ji Yongnan explains imaging pipelines, traditional feature‑based methods, deep‑learning breakthroughs, Tencent Cloud AI services, real‑world case studies, and answers audience questions about machine‑vision versus computer‑vision and data‑scarcity challenges.

AI Applications · Deep Learning · Segmentation

0 likes · 18 min read

Hulu Beijing

Apr 2, 2019 · Artificial Intelligence

From Object Detection to Language Models: A Deep Dive into AI Advances

This article surveys the evolution of object detection models—comparing one‑stage and two‑stage approaches, their performance trade‑offs, and recent state‑of‑the‑art methods—while also outlining key concepts and breakthroughs in natural language processing, highlighting the impact of deep‑learning models such as BERT.

AI research · BERT · Deep Learning

0 likes · 14 min read

DataFunTalk

Mar 15, 2019 · Artificial Intelligence

A Comprehensive Overview of Deep Learning Applications in Computer Vision

This article provides an extensive review of deep learning techniques applied to computer vision, covering the evolution of CNN architectures, image and video processing tasks, 2.5‑D and 3‑D reconstruction, object detection, segmentation, tracking, SLAM, and various practical applications such as AR, content retrieval, and autonomous driving.

CNN · Image processing · SLAM

0 likes · 22 min read
