Tagged articles

650 articles

Jul 8, 2021 · Artificial Intelligence

How AI Powers Smart Vending Cabinets: From RFID to Deep Learning Detection

This article details the evolution of intelligent vending cabinets, comparing RFID, gravity, dynamic and static vision solutions, and explains how deep‑learning models, data pipelines, and system architectures enable high‑accuracy, low‑loss product detection and automated operations in modern unmanned retail.

AI · Computer Vision · Neural Networks

0 likes · 36 min read

New Oriental Technology

Jul 8, 2021 · Artificial Intelligence

Paper Detection and Perspective Correction Using OpenCV.js

This article introduces OpenCV.js, explains its basic concepts, and demonstrates a complete JavaScript workflow for detecting and correcting paper images in the browser, covering matrix handling, resizing, filtering, edge detection, contour analysis, and perspective transformation. It also discusses challenges such as noise and incomplete edges.

Computer Vision · Image Processing · JavaScript

0 likes · 10 min read

TiPaiPai Technical Team

Jul 2, 2021 · Artificial Intelligence

How Graph Neural Networks Revolutionize Arbitrary‑Shaped Text Detection

This article reviews two recent computer‑vision approaches—DRRG and STKM—that combine CNN backbones with graph‑based relational reasoning and self‑attention to achieve state‑of‑the‑art detection of arbitrarily shaped text in images.

CNN · Computer Vision · Deep Learning

0 likes · 11 min read

TiPaiPai Technical Team

Jul 2, 2021 · Artificial Intelligence

How ContourNet and CenterNet Revolutionize Text Detection

This article explains the challenges of scene text detection and introduces two state‑of‑the‑art models, ContourNet and CenterNet, detailing their architectural innovations, loss functions, and how they overcome issues like extreme aspect ratios and anchor‑based inefficiencies.

CenterNet · Computer Vision · ContourNet

0 likes · 7 min read

TiPaiPai Technical Team

Jun 28, 2021 · Artificial Intelligence

How Deep Learning Unwarps Twisted Document Images: DocUNet & DewarpNet Explained

This article reviews two end‑to‑end deep‑learning approaches—DocUNet (CVPR 2018) and DewarpNet (ICCV 2019)—for correcting warped document images, detailing their network architectures, synthetic data generation, loss functions, experimental results, and the remaining challenges in document dewarping.

Computer Vision · Deep Learning · Image Processing

0 likes · 14 min read

TAL Education Technology

Jun 24, 2021 · Artificial Intelligence

GoodFuture AI Institute Wins Four International Championships at CVPR 2021 Across Multiple Vision Challenges

GoodFuture AI Institute secured four international titles at CVPR 2021—including Person In Context, UG²+, ETH‑XGaze, and ActivityNet—showcasing world‑class computer‑vision algorithms for human‑object interaction, low‑light face detection, gaze estimation, and active speaker detection, and highlighting their deployment in educational AI solutions.

AI competition · Active Speaker Detection · CVPR

0 likes · 9 min read

Alimama Tech

Jun 24, 2021 · Artificial Intelligence

One‑Stage Training for Generative Adversarial Networks (OSGAN): Methodology and Efficiency Analysis

The OSGAN method, introduced by Alibaba’s Alimama team and Prof. Song Ming‑Li, merges generator and discriminator updates into a single stage, speeding up GAN training by roughly 1.5‑1.7× while maintaining performance. It is validated on symmetric and asymmetric DCGANs, with open‑source code.

Computer Vision · Deep Learning · GAN

0 likes · 10 min read

Alibaba Cloud Developer

Jun 22, 2021 · Artificial Intelligence

Turning Parking Cameras into AI‑Powered Safety Guardians

A Qingdao University student used an Intel Xeon SG1 GPU, OpenVINO, and Mask R-CNN to transform existing parking‑lot cameras into an intelligent system that counts vehicles, detects pedestrians in blind spots, and issues real‑time safety alerts, showcasing a practical AI solution for child safety in crowded parking areas.

AI · Computer Vision · Intel Xeon

0 likes · 5 min read

TiPaiPai Technical Team

Jun 17, 2021 · Artificial Intelligence

From Pixels to Words: The Evolution and Challenges of Text Detection

This article traces the origins, unique difficulties, method classifications, and current advancements of scene text detection, highlighting how AI has enabled computers to read images and the ongoing research to improve accuracy, speed, and multilingual support.

AI · Computer Vision · Deep Learning

0 likes · 8 min read

Xianyu Technology

Jun 9, 2021 · Artificial Intelligence

Applying Visual AI Techniques for Image Quality and Duplicate Detection in Xianyu Marketplace

By deploying large‑scale visual AI—including a ResNet‑101 classifier, ArcFace‑trained matching features, clustering‑based sub‑category refinement, and product‑level image indexing—Xianyu’s marketplace dramatically improves image quality, removes duplicates, enhances search relevance and feed diversity, and filters non‑compliant content.

Computer Vision · Deep Learning · Image Classification

0 likes · 16 min read

Amap Tech

Jun 4, 2021 · Artificial Intelligence

Deploying Multiple CNN Models on Low‑End Devices with MNN: Memory Tricks and Error Debugging

This article explains how a high‑traffic map service captures road features using client‑side computer‑vision models, details the deployment of many CNNs with the lightweight MNN engine on memory‑constrained devices, and shares practical memory‑saving techniques, inference scheduling, and error‑analysis methods.

Android · Computer Vision · MNN

0 likes · 12 min read

Meituan Technology Team

Jun 3, 2021 · Artificial Intelligence

LargeFineFoodAI Workshop and Challenge at ICCV 2021

At ICCV 2021 in Montreal, the LargeFineFoodAI workshop—co‑organized by Meituan Vision Intelligence Center, the Chinese Academy of Sciences, Beijing Zhiyuan and the University of Barcelona—will showcase state‑of‑the‑art fine‑grained food image research, feature invited speakers Jain, Aizawa and Radeva, and host a $12,000 prize challenge on Food2K across recognition and retrieval tracks.

Challenge · Computer Vision · Dataset

0 likes · 7 min read

Test Development Learning Exchange

May 31, 2021 · Artificial Intelligence

Real-time Barcode Detection with Python, OpenCV, and Pyzbar

This article demonstrates how to use Python's OpenCV and Pyzbar libraries to capture video from a webcam, decode barcodes in real time, display the results, and save captured frames, providing a practical guide for implementing barcode recognition in a retail checkout scenario.

Computer Vision · OpenCV · Python

0 likes · 4 min read

Meituan Technology Team

May 27, 2021 · Artificial Intelligence

Standardizing Food Delivery Dish Names: Knowledge Graph Construction and Applications

The paper outlines an end‑to‑end pipeline that standardizes highly personalized food‑delivery dish names by combining rule‑based and BERT‑DSSM text synonym detection with EfficientNet image classification, constructing a multi‑level taxonomy that improves aggregation, supply‑demand analysis, recall ranking and merchant tagging.

Computer Vision · NLP · entity extraction

0 likes · 17 min read

Tencent Advertising Technology

May 27, 2021 · Artificial Intelligence

Multimodal Video Ad Second-Level Parsing: Algorithm Design and Baseline Analysis for the 2021 Tencent Advertising Algorithm Competition

This article details the algorithmic framework and baseline models for the 2021 Tencent Advertising Algorithm Competition, focusing on multimodal video ad parsing through temporal localization, scene segmentation, and multi-label classification to enhance advertising effectiveness and creative analysis.

Computer Vision · Temporal Segmentation · advertising technology

0 likes · 22 min read

Cyber Elephant Tech Team

May 26, 2021 · Artificial Intelligence

Can GANs Eliminate Motion Blur? A Deep Learning Approach to Image Deblurring

This article reviews a GAN‑based deep learning method for removing motion blur from images, covering the problem definition, related work, the multi‑scale generator and discriminator architecture, loss functions, the GoPro dataset, and experimental results that demonstrate clear visual improvements.

Computer Vision · Deep Learning · GAN

0 likes · 11 min read

Kuaishou Tech

May 24, 2021 · Artificial Intelligence

BCNet: A Bilayer Instance Segmentation Network for Occlusion‑Aware Object Detection

The paper proposes BCNet, a lightweight bilayer instance segmentation network that explicitly models occluder and occludee relationships by treating each region of interest as two overlapping layers, achieving significant performance gains on COCO, COCOA and KINS datasets under heavy occlusion.

Computer Vision · Deep Learning · bilayer network

0 likes · 10 min read

Alimama Tech

May 20, 2021 · Artificial Intelligence

How Alibaba’s AI Powers Brand Risk Detection: Models, Data, and Results

This article details Alibaba's AliMama brand risk identification system, covering the challenges of counterfeit detection, the construction of large‑scale brand datasets, the design of classification, logo detection, and variation models, their optimization, evaluation metrics, and future directions for AI‑driven brand protection.

AI · Alibaba · Computer Vision

0 likes · 22 min read

Kuaishou Large Model

May 13, 2021 · Artificial Intelligence

How Regressive Domain Adaptation Boosts Unsupervised Keypoint Detection

This article reviews the CVPR 2021 paper on Regressive Domain Adaptation (RegDA) for unsupervised keypoint detection, explaining its motivation, novel adversarial regression framework, sparse output-space modeling, min‑min training strategy, extensive experiments, and the resulting performance gains across multiple datasets.

Computer Vision · Unsupervised Learning · domain adaptation

0 likes · 13 min read

Kuaishou Tech

May 10, 2021 · Artificial Intelligence

Semantic Image Matting: Integrating Alpha Pattern Semantics into the Matting Framework

The article presents Semantic Image Matting, a novel approach that incorporates 20 semantic alpha pattern categories into the matting pipeline via a semantic trimap, region‑based classifiers, multi‑class discriminators, and a learnable gradient loss, achieving state‑of‑the‑art results on multiple benchmarks.

Computer Vision · Deep Learning · alpha patterns

0 likes · 11 min read

JD Cloud Developers

Apr 30, 2021 · Artificial Intelligence

How Face Keypoint Localization Advances Under Masked Conditions: Insights from JD AI’s 3rd Competition

The JD AI Institute and ICME 2021 concluded their third face keypoint localization contest, emphasizing efficient masked‑face detection to aid COVID‑19 contact tracing, attracting top universities and tech firms, expanding data scale, and tightening model efficiency constraints to push the field forward.

AI competition · Computer Vision · Deep Learning

0 likes · 4 min read

Amap Tech

Apr 23, 2021 · Artificial Intelligence

Design Principles and Implementation of Gaode AR Navigation

The article explains Gaode Maps’ AR navigation design, detailing how environmental factors, spatial experience, color hierarchy, safety considerations, and competitor insights shape a six‑point design framework, and describes prototype testing, implementation strategies for overlapping alerts, and future prospects such as virtual road barriers and multimodal travel.

AR navigation · Computer Vision · User experience

0 likes · 8 min read

Kuaishou Tech

Apr 16, 2021 · Artificial Intelligence

Camera-Space Hand Mesh Recovery via Semantic Aggregation and Adaptive 2D-1D Registration

Camera-space hand mesh recovery (CMR) leverages semantic aggregation of 2D cues and adaptive 2D‑1D registration to predict absolute 3D hand pose and shape directly in camera coordinates, improving accuracy on benchmarks such as FreiHAND, RHD, and Human3.6M.

2D-1D registration · 3D reconstruction · Computer Vision

0 likes · 17 min read

360 Quality & Efficiency

Apr 16, 2021 · Artificial Intelligence

Applying YOLOv5 Object Detection for Black, Color, and Blank Screen Classification in Video Frames

This article presents a method that replaces manual visual inspection with an automated YOLOv5‑based object detection pipeline to classify video frames as normal, colorful, or black screens, detailing data annotation, training, loss calculation, and inference code, and reporting 97% accuracy, an improvement over ResNet.

Computer Vision · Deep Learning · Image Classification

0 likes · 11 min read

MaGe Linux Operations

Apr 13, 2021 · Artificial Intelligence

Top 10 Free Python Libraries for Image Processing You Should Try

Discover ten essential, free Python libraries for image processing—from scikit-image and NumPy to OpenCV-Python and Pycairo—each with resources, usage examples, and visual demonstrations, enabling you to manipulate, analyze, and transform images efficiently for computer vision and data science projects.

Computer Vision · Image Processing · OpenCV

0 likes · 12 min read

58UXD

Apr 12, 2021 · Artificial Intelligence

How 58.com Built an AI Designer: From Smart Cutout to Intelligent Creative Platform

This article chronicles 58.com’s journey from a small brainstorming room to a full‑scale AI design platform, detailing the development of smart cutout, the BASNet segmentation model, custom loss functions, template editing, and the measurable business impact of the AI designer.

AI design · BASNet · Computer Vision

0 likes · 15 min read

DataFunTalk

Apr 10, 2021 · Artificial Intelligence

2020 Computer Vision Breakthroughs: Self‑Supervised Learning, Transformer Attention Modeling, and Neural Radiance Fields

The talk reviews three major 2020 advances in computer vision—self‑supervised learning surpassing supervised pre‑training, the successful adoption of Transformer‑based attention models for detection and classification, and the emergence of Neural Radiance Fields for view synthesis—while highlighting related research from Microsoft Research Asia and the broader community.

2020 · AI breakthroughs · Computer Vision

0 likes · 19 min read

Youku Technology

Apr 8, 2021 · Artificial Intelligence

Champion Solution of Media AI Alibaba Entertainment Video Object Segmentation Challenge

The Youku AI team won the Media AI Alibaba Entertainment Video Object Segmentation Challenge by enhancing the STM model with a spatial‑constrained memory reader, ASPP‑HRNet refinement, a ResNeSt‑101 backbone, and a multi‑stage training pipeline. The team also devised an unsupervised framework that combines DetectoRS detection, HRNet mask refinement, STM‑based association, and key‑frame optimization, achieving a 95.5% test score on a large, richly annotated video dataset.

Computer Vision · Deep Learning · Semi-supervised Learning

0 likes · 13 min read

Kuaishou Tech

Apr 6, 2021 · Artificial Intelligence

Frequency-Aware Feature Learning with Single-Center Loss for Face Forgery Detection

Researchers from USTC and Kuaishou propose a frequency‑aware feature learning framework that combines a data‑driven adaptive frequency module with a novel single‑center loss, achieving state‑of‑the‑art performance on deepfake detection while addressing class‑distribution challenges.

AI security · Computer Vision · deepfake detection

0 likes · 7 min read

Kuaishou Large Model

Apr 1, 2021 · Artificial Intelligence

How Kuaishou Y‑Tech Leverages GANs for Real‑Time Face Attribute Editing in Short Videos

This article details Kuaishou Y‑Tech's practical deployment of GAN‑based high‑precision face attribute editing—covering gender, age, hair, and expression transformations—for short‑video effects, discussing background, business applications, technical challenges, and solutions across data preparation, model training, and mobile deployment.

Computer Vision · GAN · Kuaishou

0 likes · 15 min read

iQIYI Technical Product Team

Mar 26, 2021 · Artificial Intelligence

Insights into OCR Technology at iQIYI: Development, Challenges, and Applications

iQIYI’s OCR journey, explained by researcher Harlon, covers the evolution from separate detection and recognition pipelines to end‑to‑end models, key algorithms like CTPN, DB and CRNN, large‑scale simulated training, diverse video‑text applications, and future goals such as mobile deployment and tighter NLP integration.

AI · Computer Vision · Deep Learning

0 likes · 21 min read

58 Tech

Mar 24, 2021 · Artificial Intelligence

Automated Detection of Illegal Watermarks in Images Using Deep Learning at 58.com

This article describes how 58.com built an end‑to‑end deep‑learning watermark detection service, covering business needs, data collection and augmentation, model selection and iterative improvements (Faster‑RCNN, SSD, YOLOv3, anchor‑free methods), deployment results, and future research directions.

Computer Vision · Image Moderation · Model Optimization

0 likes · 14 min read

Huawei Cloud Developer Alliance

Mar 23, 2021 · Artificial Intelligence

How to Recognize Credit Card Numbers with OpenCV: A Step‑by‑Step Tutorial

This tutorial walks through a project‑based OpenCV workflow that reads a digit template, preprocesses both template and credit‑card images, extracts individual numbers, matches them against the template, and finally overlays the recognized digits onto the original image, illustrating core computer‑vision techniques.

Computer Vision · Image Processing · OCR

0 likes · 10 min read

Amap Tech

Mar 22, 2021 · Artificial Intelligence

Visual Technology for Automated POI Name Generation: STR, Text Detection, and Naming Practices

Amap’s visual‑technology pipeline automatically generates and updates POI names by crowdsourcing street‑level images, applying deep‑learning scene‑text recognition, dual‑branch classification of text attributes, and a BERT‑plus‑graph‑attention model that selects and orders recognized text, achieving about 95% naming accuracy.

Computer Vision · Deep Learning · Name Generation

0 likes · 14 min read

JD Cloud Developers

Mar 8, 2021 · Artificial Intelligence

How AI Voice Synthesis Brings ‘Hi, Mom’ to Life: From Film to Real‑World Tech

The article explores how modern AI technologies such as speech synthesis, natural language understanding, and the FastReID computer‑vision library enable realistic voice recreation and cross‑temporal dialogue, turning the emotional premise of the movie “Hi, Mom” into a tangible technical demonstration.

AI · Computer Vision · FastReID

0 likes · 10 min read

Tencent Cloud Developer

Mar 4, 2021 · Artificial Intelligence

WeChat OCR: Implementation of Image Text Extraction Feature

WeChat’s 8.0 update introduced an OCR pipeline that first quickly detects whether an image contains text and classifies the image type, then applies a lightweight MobileNetV3‑based DBNet text detector and a multi‑language recognizer built on a multi‑task CTC/Attention model, and finally merges results via a rule‑based layout analyzer to deliver accurate, well‑formatted extracted text across diverse languages and document types.

Computer Vision · DBNet · Deep Learning

0 likes · 13 min read

Laravel Tech Community

Feb 28, 2021 · Artificial Intelligence

How the “Ant Ya Hey” AI Effect Works and How to Create It

This article explains the popular Douyin AI effect “Ant Ya Hey”, showcases celebrity demos, provides a step‑by‑step guide using Avatarify and video editors, and delves into the underlying First‑Order Motion Model research that powers the realistic facial animation.

AI · Avatarify · Computer Vision

0 likes · 6 min read

Kuaishou Large Model

Feb 25, 2021 · Artificial Intelligence

How Kuaishou’s AI‑Powered Beauty Engine Transforms Real‑Time Video

This article details Kuaishou Y‑Tech’s Gorgeous beauty platform, covering traditional smoothing, advanced skin‑tone effects, AI‑driven blemish removal, clarity enhancement, local facial tuning, and the UNet‑based GorgeousGAN that delivers one‑click high‑definition beauty for live‑stream and short‑video applications.

AI beauty · Computer Vision · Deep Learning

0 likes · 13 min read

360 Tech Engineering

Feb 23, 2021 · Artificial Intelligence

Video Stutter Detection via Frame Difference Analysis Using FFmpeg

This article explains a method for detecting video stutter by converting uploaded videos into frame sequences with ffmpeg, calculating pixel differences between consecutive frames, aggregating motion metrics, removing scene‑change effects, computing a dynamic factor, and outputting a binary result indicating the presence or absence of stutter.

Computer Vision · Video processing · algorithm

0 likes · 5 min read

DataFunTalk

Feb 16, 2021 · Artificial Intelligence

Multimedia Content Understanding in Meitu Community: Video Classification, Fingerprinting, and OCR

This article presents Meitu Community's AI‑driven multimedia content analysis pipeline, covering short‑video classification, video fingerprinting, and OCR, detailing model choices, experimental results, and future directions for improving content audit, quality, tagging, and feature engineering.

AI · Computer Vision · Fingerprinting

0 likes · 18 min read

JD Cloud Developers

Feb 10, 2021 · Artificial Intelligence

How JD Tech’s Breakthrough AI Papers Dominated AAAI 2021

JD Tech showcased a remarkable 21-paper presence at AAAI 2021, covering federated learning, spatio‑temporal AI, recommendation systems, computer vision, and causal learning, highlighting the company’s transition from research to real‑world AI applications across smart cities, retail, and risk management.

AAAI 2021 · Computer Vision · Federated Learning

0 likes · 12 min read

ByteFE

Feb 9, 2021 · Fundamentals

Curated Self‑Study Resources for Emerging Tech Fields (Multimedia, AI, CV, RL, MT, Knowledge Graph, Mobile, Frontend)

This guide compiles recommended books, courses, and open‑source projects across multimedia, artificial intelligence, computer vision, reinforcement learning, machine translation, knowledge graphs, Android, iOS, and frontend development to help newcomers and job seekers systematically deepen their technical expertise.

Artificial Intelligence · Computer Vision · Resources

0 likes · 12 min read

iQIYI Technical Product Team

Feb 5, 2021 · Game Development

AR+AI Powered Video Interactive Mini‑Games on iQIYI: Architecture, Face & Gesture Control, and Lua Game Layer

iQIYI’s AR+AI powered video interactive mini‑games blend a custom VideoAR engine with real‑time AI‑driven face and gesture detection, use lightweight Lua for game logic, and offer rapid hot‑updates, enabling diverse IP integrations that have attracted over a million participants and boosted viewer engagement.

AI · AR · Computer Vision

0 likes · 12 min read

Amap Tech

Feb 1, 2021 · Artificial Intelligence

AMAP-TECH Algorithm Competition: Dynamic Road Condition Analysis Using In-Vehicle Video

The AMAP‑TECH competition challenged participants to infer real‑time road conditions from in‑vehicle video, prompting the authors to combine lane‑wise vehicle detection with LightGBM and later an end‑to‑end DenseNet‑GRU model, augment data, ensemble five networks, and achieve a 0.7237 F1 score while outlining future deployment and research directions.

Computer Vision · Deep Learning · Model Deployment

0 likes · 15 min read

Maoyan Technology Team

Feb 1, 2021 · Artificial Intelligence

Mastering Video Point Tracking: Image Registration & SiamMask for the Marlan Cup

This article details the Marlan Mountain Cup video point‑tracking challenge, describing the dataset, evaluation metrics, and a hybrid solution that combines SIFT‑based image registration with SiamMask tracking, along with extensive analysis and practical tricks for performance improvement.

Computer Vision · MSE evaluation · SiamMask

0 likes · 15 min read

Kuaishou Large Model

Jan 28, 2021 · Artificial Intelligence

How Portrait Deformation Powers Modern Beauty Filters: Algorithms Explained

This article explores the core portrait deformation techniques behind today’s beauty and body‑shaping effects—covering affine transforms, Moving Least Squares, triangulation, liquify, offset, 3D mesh, and deep‑learning approaches—detailing their principles, implementations, and visual results in live‑streaming and short‑video apps.

AI · Computer Vision · Image Processing

0 likes · 13 min read

iQIYI Technical Product Team

Jan 15, 2021 · Artificial Intelligence

How AI is Transforming Video Creation and Consumption at Scale

The article examines how iQIYI leverages AI across the video ecosystem—from intelligent material search, old‑film restoration, and voice cloning to virtual idols, XR production, and AI‑driven advertising—to boost creator efficiency, enhance user experience, and accelerate industry-wide digital transformation.

AI · Computer Vision · Industry Insights

0 likes · 14 min read

Amap Tech

Jan 15, 2021 · Artificial Intelligence

Solution Overview of the AMAP-TECH Algorithm Competition: Dynamic Road Condition Analysis from In‑Vehicle Video Images

To tackle the AMAP‑TECH competition’s dynamic road‑condition classification from scarce, imbalanced vehicle‑video frames, the team combined YOLOv5 object detection, ResNeXt101‑based semantic embeddings, and engineered temporal detection statistics, feeding the fused features into a five‑fold LightGBM model that achieved top weighted‑F1 performance.

Computer Vision · LightGBM · Multimodal Learning

0 likes · 10 min read

Python Crawling & Data Mining

Jan 11, 2021 · Artificial Intelligence

Unlock Text from Images: A Hands‑On Guide to EasyOCR in Python

This article explains what OCR is, introduces the EasyOCR Python library, shows how to install it, walks through step‑by‑step usage with code examples, and summarizes the underlying deep‑learning techniques powering the library.

Computer Vision · Deep Learning · EasyOCR

0 likes · 6 min read

Didi Tech

Dec 29, 2020 · Artificial Intelligence

Evolution and Challenges of Perception in L4 Autonomous Driving

The article traces L4 autonomous-driving perception from early rule-based point-cloud methods through data-driven deep-learning models to emerging self-learning, multi-task systems, and highlights four key hurdles—model generalization and explainability, robust multi-sensor fusion, real-time compute limits, and proper uncertainty handling—calling for integrated AI, engineering, and data solutions.

AI · Computer Vision · Deep Learning

0 likes · 12 min read

Selected Java Interview Questions

Dec 29, 2020 · Artificial Intelligence

Open-Source Video Object Removal Tool Using PyTorch Allows Deleting Elements via Bounding Boxes

An open‑source PyTorch‑based project enables users to remove unwanted objects from videos simply by drawing a bounding box around them, offering a practical demo, step‑by‑step instructions, and a GitHub repository with over 2k stars.

Computer Vision · Object Removal · Open-source

0 likes · 2 min read

Meituan Technology Team

Dec 24, 2020 · Artificial Intelligence

Meituan Unmanned Delivery Technical Salon – AI Research on Instance Segmentation, Visual Localization, Trajectory Prediction, and Depth‑Pose Learning

On January 9, 2021, Meituan hosted an unmanned‑delivery technical salon in Beijing where experts presented cutting‑edge AI research—including the CenterMask instance‑segmentation method, 3D geometry‑aware camera localization, multi‑agent trajectory prediction with attention‑based spatio‑temporal graphs, real‑time stereo visual‑inertial odometry calibration, and self‑supervised depth‑pose learning for dynamic scenes.

AI · Computer Vision · autonomous driving

0 likes · 7 min read

Suning Technology

Dec 17, 2020 · Artificial Intelligence

How AI Powers Suning’s Unmanned Stores: From Face Detection to Smart Retail

This article outlines Suning's unmanned store technology, comparing its data-driven, product selection, and customer experience advantages over traditional shops, and detailing AI-powered applications such as face detection, target tracking, image recognition, and 3D reconstruction that enable 24‑hour service, intelligent merchandising, and precise customer analytics.

AI · Computer Vision · Data Analytics

0 likes · 24 min read


DataFunTalk

Dec 9, 2020 · Artificial Intelligence

WeChat Identify: From Object Detection to Large‑Scale Image Search – Technical Overview

This article details the evolution of WeChat’s Identify product, explaining its end‑to‑end image recognition pipeline—including object detection, multi‑label classification, mobile‑side detection, large‑scale retrieval, unsupervised clustering, and system architecture—while showcasing various application scenarios such as product, plant, and landmark recognition.

Computer Vision · Mobile AI · WeChat

0 likes · 12 min read


Python Crawling & Data Mining

Dec 9, 2020 · Artificial Intelligence

Unlock 3D Human Pose Capture with FrankMocap: A Powerful Open‑Source AI Tool

FrankMocap, an open‑source AI algorithm from Facebook AI Research and HKU, delivers simultaneous 3D full‑body and hand pose estimation from a single monocular video, runs at about 9.5 FPS on an RTX 2080, and includes easy installation steps, code examples, and links to its GitHub repository and paper.

3D pose estimation · Computer Vision · Open-source

0 likes · 6 min read


Top Architect

Dec 4, 2020 · Artificial Intelligence

Java-based ID Card OCR Project Using OpenCV, JavaCPP, and Tess4J

This article introduces a Java OCR project for ID cards that integrates OpenCV, JavaCPP, and Tess4J to perform image preprocessing, region cropping, and character recognition without requiring OpenCV installation, and details its features, encountered issues, system requirements, updates, and source repository.

Computer Vision · ID Card · JavaCPP

0 likes · 4 min read


iQIYI Technical Product Team

Dec 4, 2020 · Artificial Intelligence

AI‑Powered UI Element Recognition for Mobile App Automation Testing

iQIYI’s AI‑driven UI element recognizer uses a YOLOv3 model trained on thousands of mobile and PC screenshots to locate obfuscated controls across diverse devices, integrating predictions into its IDE for reliable automation of React Native, Flutter, H5 and mini‑program interfaces.

AI · Computer Vision · UI automation

0 likes · 12 min read


DataFunSummit

Dec 3, 2020 · Artificial Intelligence

GAN Fundamentals, Variants, and Practical Applications in Image Style Transfer and Handwriting Font Generation

This article provides a comprehensive overview of Generative Adversarial Networks, covering their original formulation, training dynamics, loss functions, major variants such as DCGAN and WGAN, and practical implementations for image‑to‑image translation, style transfer, and handwriting font synthesis at Laiye Technology.

Computer Vision · Deep Learning · GAN

0 likes · 28 min read


Kuaishou Large Model

Dec 3, 2020 · Artificial Intelligence

Kuaishou Y‑Tech’s Real‑Time, High‑Precision Facial & Body Keypoint Detection Explained

Y‑Tech’s in‑house keypoint detection system powers Kuaishou’s beauty and effect filters across live streaming, video creation, and editing by leveraging lightweight deep‑learning models, extensive multi‑scenario data collection, and specialized handling of occlusion, enabling real‑time, robust facial and body landmark tracking on diverse mobile devices.

Computer Vision · Deep Learning · Mobile AI

0 likes · 10 min read


360 Quality & Efficiency

Nov 27, 2020 · Artificial Intelligence

Image Similarity Detection Methods: Hashing, Histograms, Feature Matching, BOW+K‑Means, and CNN‑Based Approaches

This article reviews common image similarity detection techniques—including hash-based methods (aHash, pHash, dHash), histogram comparison, feature matching with ORB and SIFT/SURF, bag‑of‑words with K‑Means, and CNN‑based VGG16 features—detailing their algorithms, Python implementations, performance characteristics, and practical considerations.

Computer Vision · Deep Learning · Hashing

0 likes · 15 min read


Suning Technology

Nov 26, 2020 · Artificial Intelligence

How Low-Cost AI Powers Full-Scale Store Digitalization

Li Yongxiang, technical director at Suning Tech, outlines how AI-driven visual unmanned stores and integrated big‑data, cloud, and edge computing solutions enable low‑cost digital transformation across thousands of retail outlets, improving shopper experience, inventory management, and operational efficiency.

AI · Computer Vision · Edge Computing

0 likes · 18 min read


DataFunTalk

Nov 22, 2020 · Artificial Intelligence

Short Video Analysis in Local Life Scenarios: Techniques and Practices at Meituan

This article presents Meituan's AI-driven short video analysis workflow, covering industry trends, multi‑label video classification, intelligent cover selection, and video generation techniques, while discussing challenges, model building, label expansion, continuous data iteration, and future outlook for video AI in local services.

AI · Computer Vision · Meituan

0 likes · 16 min read


DeWu Technology

Nov 18, 2020 · Artificial Intelligence

AR Fundamentals and Shoe Try‑On Implementation

The presentation explains AR fundamentals, distinguishes it from AI and VR, and details a shoe‑try‑on system that captures 30 fps video, uses AI key‑point detection and pose estimation to overlay 3D shoe models—created via manual, scanning, or photogrammetry methods—rendered with GPU pipelines and PBR, enhanced by green‑screen occlusion and shadow techniques, earning positive audience feedback.

3D modeling · AR · Computer Vision

0 likes · 7 min read


Programmer DD

Nov 11, 2020 · Artificial Intelligence

When AI Cameras Mistake a Referee’s Bald Head for a Football: A Scottish Match Mishap

A Scottish football match turned comedic when an AI camera misidentified a linesman's shiny bald head as the ball, highlighting challenges in automated sports broadcasting and prompting fan backlash over the technology's shortcomings.

AI · Computer Vision · Pixellot

0 likes · 5 min read


DataFunTalk

Nov 10, 2020 · Artificial Intelligence

Low‑Power ADAS on Didi’s JueShi Devices Reduces Traffic Accidents

This article describes how Didi’s vehicle‑vision team built an ultra‑low‑power ADAS solution on the JueShi dash‑cam platform, using lightweight detection models, temporal fusion, camera‑calibration techniques and data‑driven optimization to cut rear‑end collision rates by over 11% and improve overall traffic safety.

ADAS · Computer Vision · Edge Computing

0 likes · 15 min read


Didi Tech

Nov 9, 2020 · Artificial Intelligence

Ultra-Low-Power ADAS on DiDi's JueShi Devices for Reducing Traffic Accidents

DiDi’s ultra‑low‑power JueShi ADAS combines lightweight vision models, temporal‑fusion Kalman filtering, and camera‑calibration techniques to deliver real‑time forward‑collision warnings and brake‑light alerts, cutting rear‑end crashes by over 11% and overall accidents by 9% through continuous edge‑AI learning.

ADAS · Computer Vision · Edge Computing

0 likes · 15 min read


New Oriental Technology

Nov 9, 2020 · Artificial Intelligence

Understanding YOLOv4 and YOLOv5: Core Elements and Innovations in Object Detection

This article introduces the fundamentals of object detection, explains the latest YOLOv4 and YOLOv5 architectures, and details the essential components—including data preparation, regularization, backbone, neck, and prediction innovations—along with label smoothing and advanced loss functions for improved detection performance.

AI · Computer Vision · YOLOv4

0 likes · 9 min read


21CTO

Nov 3, 2020 · Artificial Intelligence

How Does Image Recognition Work? A Simple Guide to Core Principles

This article explains the fundamental principles of image recognition, covering how images are converted to numeric arrays, processed by scanning matrix blocks, and matched against patterns to identify objects such as text, faces, cats, dogs, or mice.

AI basics · Computer Vision · Convolution

0 likes · 4 min read


iQIYI Technical Product Team

Oct 16, 2020 · Artificial Intelligence

Cartoon Face Recognition: Introducing the iCartoonFace Benchmark Dataset

iQIYI’s ACM Multimedia‑accepted paper unveils iCartoonFace, the world’s largest manually annotated cartoon‑face dataset—over 5,000 characters and 400,000 real‑scene images—accompanied by a semi‑automatic collection pipeline and multi‑person training framework, now powering AI services, large‑scale contests and accelerating cartoon‑character recognition research.

Artificial Intelligence · Cartoon Face Recognition · Computer Vision

0 likes · 4 min read


Didi Tech

Oct 16, 2020 · Artificial Intelligence

Mask Detection System and Visual AI Competition Achievements

Didi’s COVID‑19 mask‑detection system, built on a DFS‑based face detector and an attention‑enhanced ResNet‑50 mask classifier achieving over 99.5 % accuracy, has been deployed in vehicles, open‑sourced, and complemented by top‑ranked results in international visual AI contests, including first place in driver‑gaze prediction and podium finishes in emotion recognition and model‑compression challenges.

AI · Computer Vision · Deep Learning

0 likes · 22 min read


Kuaishou Large Model

Oct 15, 2020 · Artificial Intelligence

How Kuaishou’s Y‑Tech Advances Monocular Depth Estimation for Mobile AR

This article reviews Kuaishou Y‑Tech’s ECCV‑2020 paper on monocular depth estimation, detailing its novel GCB‑SAB network, new HC‑Depth dataset, specialized loss functions and edge‑aware training, and demonstrates superior performance on NYUv2, TUM and real‑world mobile AR applications.

Attention Mechanism · Computer Vision · Deep Learning

0 likes · 14 min read


360 Quality & Efficiency

Sep 18, 2020 · Artificial Intelligence

Data Augmentation Techniques for Improving Object Detection Model Robustness

To enhance object detection robustness, the article discusses various data augmentation methods—including rotation, flipping, random cropping, scaling, color jitter, blurring, transparency adjustment, and image partitioning—providing code examples and illustrating their impact on model performance with before‑and‑after results.

Computer Vision · Python · data augmentation

0 likes · 7 min read


Suning Technology

Sep 17, 2020 · Artificial Intelligence

How SuNing’s Fourth‑Gen Digital Visual Unmanned Store Redefines AI‑Powered Retail

SuNing’s fourth‑generation fully digital visual unmanned store combines 3D reconstruction, AI‑driven perception, and modular hardware‑software design to achieve real‑time, all‑scene, all‑time, all‑digital analysis of people, goods, and spaces, enabling precise offline marketing and scalable retail digitization.

AI · Computer Vision · Digital Twin

0 likes · 27 min read


Suning Technology

Sep 3, 2020 · Artificial Intelligence

How Suning’s Fourth‑Gen AI‑Powered Visual Unmanned Stores Transform Retail

Suning’s lecture series details the three‑decade evolution of retail, the company’s 30‑year digital transformation, and the technical architecture of its fourth‑generation fully digital visual unmanned stores that leverage AI, computer vision, and big‑data analytics to revolutionize in‑store operations and customer experience.

AI · Computer Vision · Digital Store

0 likes · 14 min read


Zhengtong Technical Team

Aug 14, 2020 · Artificial Intelligence

ZTFace: A High‑Precision, Fast Face Recognition Algorithm

This article presents ZTFace, an end‑to‑end face recognition solution that integrates face detection, alignment, feature embedding, verification, anti‑spoofing and attribute recognition using deep learning, details its backbone networks, loss functions, training datasets, experimental results on WIDER FACE and LFW, and demonstrates acceleration with TensorRT.

Computer Vision · TensorRT · ZTFace

0 likes · 17 min read


360 Tech Engineering

Aug 7, 2020 · Artificial Intelligence

Guide to Image Matching: Template Matching, Feature Matching with SIFT and FLANN, and Homography

This guide explains image matching techniques, covering template matching with OpenCV, various matching methods, SIFT feature extraction and description, FLANN-based nearest neighbor matching, homography estimation, practical challenges, and a brief overview of YOLO training, providing code examples and visual illustrations.

Computer Vision · FLANN · Feature Matching

0 likes · 15 min read


iQIYI Technical Product Team

Aug 7, 2020 · Artificial Intelligence

Boundary Content Graph Neural Network (BC‑GNN) for Temporal Action Proposal Generation

The Boundary Content Graph Neural Network (BC‑GNN) introduces a bipartite‑graph framework that jointly refines start/end boundary probabilities and segment‑content confidence, enabling more precise temporal action proposals and achieving state‑of‑the‑art results on ActivityNet‑1.3 and THUMOS14.

BC-GNN · Computer Vision · Deep Learning

0 likes · 10 min read


Amap Tech

Jul 30, 2020 · Artificial Intelligence

Evolution and Practice of Scene Text Recognition Technology in Amap Map Data Production

Amap uses advanced scene text recognition combining detection and recognition modules, deep learning, data synthesis, and result fusion to automate map data production, achieving state-of-the-art performance and automating the majority of POI and road updates, significantly reducing labor costs.

Computer Vision · Deep Learning · OCR

0 likes · 18 min read


Alibaba Cloud Developer

Jul 30, 2020 · Artificial Intelligence

How Amap’s Scene Text Recognition Powers Accurate Maps: Evolution and Future Challenges

This article explains how Amap leverages scene text recognition to automate map data production, detailing the evolution from traditional image algorithms to deep‑learning models, the current detection and recognition framework, performance results, and future research directions for handling blur, data scarcity, and semantic understanding.

Amap · Computer Vision · Deep Learning

0 likes · 19 min read


Alibaba Cloud Developer

Jul 29, 2020 · Artificial Intelligence

How Gaode Maps Boosts Accuracy with Advanced Scene Text Recognition

This article explains how Gaode Maps leverages traditional and deep‑learning based scene text recognition techniques—including character detection, sequence models, data synthesis, and multi‑stage frameworks—to automate POI and road data production with high precision and speed.

Computer Vision · Deep Learning · OCR

0 likes · 20 min read


Youku Technology

Jul 29, 2020 · Artificial Intelligence

Core Technology of Video Content Understanding: Technical Practice of Partial Re-ID in Video Inspection

The talk explains how Alibaba’s Entertainment Content Operation Platform applies a Partial‑ReID algorithm to overcome the challenges of person re‑identification in heavily edited video content, enabling accurate cross‑shot character matching, richer appearance data, and metrics such as presence, interaction, and storyline for improved video quality assessment.

AI · Computer Vision · Partial Re-ID

0 likes · 2 min read


NetEase Media Technology Team

Jul 24, 2020 · Artificial Intelligence

Survey of Video Action Recognition Algorithms: 3D and 2D Convolutional Networks and Pre‑training

This survey reviews video action recognition, comparing 3D convolutional networks that jointly model spatial‑temporal cues but are computationally heavy with 2D‑based approaches like TSM and TIN that embed temporal shifts efficiently, and emphasizes how large‑scale pre‑training markedly improves performance despite limited labeled data.

2D convolutional networks · 3D convolutional networks · Computer Vision

0 likes · 13 min read


Sohu Tech Products

Jul 22, 2020 · Artificial Intelligence

Face Detection Using Haar Features and AdaBoost with OpenCV

This article explains the principles and implementation of face detection based on statistical methods, detailing Haar feature types, integral image computation, feature normalization, cascade classifiers, and provides step‑by‑step OpenCV code examples for static images, eye detection, and real‑time webcam detection.

AdaBoost · Computer Vision · Face Detection

0 likes · 19 min read


Alibaba Cloud Developer

Jul 13, 2020 · Artificial Intelligence

Master Dynamic Road Condition Analysis with Car Video – AMAP-TECH Competition Overview

The AMAP-TECH algorithm competition invites participants to develop AI models that analyze in-vehicle video sequences to determine dynamic road conditions, offering detailed dataset specifications, evaluation metrics, expert judges, schedule, and prize information for researchers in computer vision and traffic analytics.

AI · Computer Vision · Dataset

0 likes · 9 min read


Youku Technology

Jul 10, 2020 · Artificial Intelligence

Mastering Video Object Segmentation: Cutting-Edge Models and Design Tricks

This technical talk introduces video object segmentation tasks, reviews leading datasets and state-of-the-art deep learning models, and shares practical network design rules and performance‑boosting techniques, presented by Prof. Wang Xinggang as part of Alibaba's MEDIA AI challenge series.

AI · Computer Vision · Deep Learning

0 likes · 4 min read


Amap Tech

Jul 9, 2020 · Artificial Intelligence

AMAP-TECH Algorithm Competition: Dynamic Road‑Condition Analysis from In‑Vehicle Video Images

Alibaba Amap’s AMAP‑TECH competition invites participants to develop AI computer‑vision models that classify real‑time road conditions—smooth, slow, or congested—from short sequences of dash‑cam images, using a labeled dataset of 1,500 training sequences and a weighted F1‑score evaluation, with cash prizes up to ¥60,000.

AI · Computer Vision · Dataset

0 likes · 8 min read


Alibaba Cloud Developer

Jul 3, 2020 · Artificial Intelligence

Unlocking Visual Object Tracking: Principles, Algorithms, and Evaluation

This comprehensive review explains visual object tracking in computer vision, covering its definition, core sub‑problems of candidate generation, feature extraction, and decision making, system architecture, motion, feature and observation models, algorithm classifications, evaluation metrics, datasets, and recent research trends.

Computer Vision · Deep Learning · evaluation metrics

0 likes · 30 min read


Youku Technology

Jun 19, 2020 · Artificial Intelligence

Video-based Temporal Event Detection Methods

In the fourth Alibaba Digital Media Technology Night Talk, algorithm engineer Liu Xiaolong presents an overview of video‑based temporal event detection, covering its problem background, representative prior works, and the latest research advances within the MEDIA AI Algorithm Challenge series.

Alibaba · Artificial Intelligence · Computer Vision

0 likes · 1 min read


TAL Education Technology

Jun 18, 2020 · Artificial Intelligence

An Overview of Virtual Reality, Augmented Reality, and Vision‑Based Techniques

This article explains the fundamentals of virtual reality and its distinction from augmented reality, describes VR hardware, outlines depth‑estimation and eye‑tracking methods such as projection, Hough transform, AdaBoost and sample matching, discusses Sobel edge detection, and explores the importance of audio, haptic feedback, and immersive VR applications in education.

AR · Computer Vision · Depth estimation

0 likes · 11 min read


360 Quality & Efficiency

May 29, 2020 · Artificial Intelligence

Image Matching Techniques: Template Matching, Feature Matching, SIFT, FLANN, and Homography

This article introduces image matching fundamentals, covering template matching methods, feature-based approaches such as SIFT and FLANN, their implementation details, matching rules, homography transformation, and practical considerations, providing a comprehensive overview for computer vision applications.

Computer Vision · FLANN · Feature Matching

0 likes · 14 min read


JD Retail Technology

May 27, 2020 · Artificial Intelligence

JD ARVR Tech Department Publishes Two Papers on Defocus Blur Detection and Few-Shot Learning in Top Venues

The JD ARVR technology department announced two peer‑reviewed papers—one on a novel defocus blur detection network published in Transactions on Multimedia and another on a transductive relation‑propagation network for few‑shot learning accepted at IJCAI 2020—highlighting their advanced AI research and future AR‑VR ecosystem plans.

ARVR · Computer Vision · Deep Learning

0 likes · 7 min read


Amap Tech

May 25, 2020 · Artificial Intelligence

Automated Production Line for Base Map Data Using Image AI and Data Fusion

Gaode’s automated production line combines deep‑learning image recognition, GPS‑enhanced location services, image differencing with semantic filtering, and standardized data‑fusion to continuously refresh China’s national base map, cutting manual effort and costs while delivering real‑time, high‑quality map updates for road traffic infrastructure.

Computer Vision · Deep Learning · data fusion

0 likes · 11 min read


ITPUB

May 14, 2020 · Artificial Intelligence

Cut & Paste Real Objects into Photoshop with AR in Under 10 Seconds

This article explains the AR Cut & Paste prototype by Cyril Diagne, detailing its three‑module architecture, the underlying BASNet and U²‑Net vision models, and provides a step‑by‑step guide—including code snippets and GitHub links—to set up the mobile app, local server, and Photoshop integration.

AR · BASNet · Computer Vision

0 likes · 8 min read


Python Programming Learning Circle

May 12, 2020 · Artificial Intelligence

Batch Image Segmentation with Python and PaddlePaddle

This tutorial demonstrates how to use Python and the PaddlePaddle deep‑learning platform to automatically remove backgrounds from multiple photos in one step, covering installation, verification, and a concise five‑line code example for batch human segmentation.

Batch Processing · Computer Vision · Deep Learning

0 likes · 6 min read


Programmer DD

May 9, 2020 · Artificial Intelligence

ChineseOCR Lite: Ultra‑Lightweight OCR Engine for Vertical Chinese Text

ChineseOCR Lite is an open‑source, ultra‑lightweight OCR solution that supports vertical Chinese text, runs on Linux/macOS via ncnn inference, and packs detection, recognition, and angle classification models into a total of just 17 MB, offering fast and accurate scene‑text processing.

Chinese OCR · Computer Vision · OCR

0 likes · 4 min read


Didi Tech

Apr 30, 2020 · Artificial Intelligence

DGF-M: Face Recognition Algorithm for Masked Face Scenarios

Didi’s DGF‑M model, a mask‑aware face‑recognition AI, combines multi‑task training and synthetic data to detect masks with under 0.1 % miss rate and verify identities with up to 99.5 % pass rate at a 0.1 % false‑acceptance rate, and is deployed for driver verification, offered through the Didi Cloud API marketplace, and released as an open‑source solution to aid pandemic‑era security.

AI algorithm · Computer Vision · DGF-M

0 likes · 5 min read


Amap Tech

Apr 24, 2020 · Artificial Intelligence

Q&A on Computer Vision Technologies and Their Applications in Mapping, Navigation, and Autonomous Driving

In a live Q&A, Alibaba Amap’s chief scientist Ren Xiaofeng explained how computer‑vision algorithms underpin high‑precision map creation, AR navigation, visual localization and sensor fusion, discussed current hardware limits, deep‑learning bottlenecks, 5G’s role, edge‑cloud cooperation, and offered career advice for transitioning researchers.

AI · AR navigation · Computer Vision

0 likes · 14 min read


Programmer DD

Apr 17, 2020 · Artificial Intelligence

How to Make People Vanish in Real‑Time Using TensorFlow.js and MobileNet

Jason Mayes, a Google web engineer, open‑sourced a TensorFlow.js demo that removes people from live webcam video in real time using a lightweight MobileNet model, with only about 200 lines of code, and provides GitHub and CodePen links for experimentation.

Computer Vision · MobileNet · Real-time Video

0 likes · 9 min read


iQIYI Technical Product Team

Apr 3, 2020 · Artificial Intelligence

iCartoonFace Challenge: Cartoon Face Detection and Recognition Competition

The iCartoonFace Challenge invites participants to develop efficient algorithms for detecting and recognizing cartoon faces using large, meticulously annotated datasets—50,000 images for detection and nearly 390,000 for recognition—while meeting strict model size and latency limits and submitting detailed methods and code.

AI competition · Cartoon Face Recognition · Computer Vision

0 likes · 6 min read


JD Retail Technology

Apr 2, 2020 · Artificial Intelligence

How Deep Learning Powers Text Detection in E‑commerce Posters

This article surveys state‑of‑the‑art deep‑learning techniques for scene text detection and recognition in e‑commerce poster images, detailing models such as CTPN, TextBoxes, SegLink, EAST, and end‑to‑end frameworks, and discusses their architectures, strengths, limitations, and future challenges.

Computer Vision · Deep Learning · e‑commerce

0 likes · 16 min read
