Tagged articles
Miss Fresh Tech Team
Jul 8, 2021 · Artificial Intelligence

How AI Powers Smart Vending Cabinets: From RFID to Deep Learning Detection

This article details the evolution of intelligent vending cabinets, comparing RFID, gravity, dynamic and static vision solutions, and explains how deep‑learning models, data pipelines, and system architectures enable high‑accuracy, low‑loss product detection and automated operations in modern unmanned retail.

AI · Computer Vision · Neural Networks
0 likes · 36 min read

New Oriental Technology
Jul 8, 2021 · Artificial Intelligence

Paper Detection and Perspective Correction Using OpenCV.js

This article introduces OpenCV.js, explains its basic concepts, demonstrates a complete in-browser JavaScript workflow for detecting and correcting paper images (matrix handling, resizing, filtering, edge detection, contour analysis, and perspective transformation), and discusses challenges such as noise and incomplete edges.

Computer Vision · Image Processing · JavaScript
0 likes · 10 min read

TiPaiPai Technical Team
Jul 2, 2021 · Artificial Intelligence

How ContourNet and CenterNet Revolutionize Text Detection

This article explains the challenges of scene text detection and introduces two state‑of‑the‑art models, ContourNet and CenterNet, detailing their architectural innovations, loss functions, and how they overcome issues like extreme aspect ratios and anchor‑based inefficiencies.

CenterNet · Computer Vision · ContourNet
0 likes · 7 min read

TiPaiPai Technical Team
Jun 28, 2021 · Artificial Intelligence

How Deep Learning Unwarps Twisted Document Images: DocUNet & DewarpNet Explained

This article reviews two end‑to‑end deep‑learning approaches—DocUNet (CVPR 2018) and DewarpNet (ICCV 2019)—for correcting warped document images, detailing their network architectures, synthetic data generation, loss functions, experimental results, and the remaining challenges in document dewarping.

Computer Vision · Deep Learning · Image Processing
0 likes · 14 min read

TAL Education Technology
Jun 24, 2021 · Artificial Intelligence

GoodFuture AI Institute Wins Four International Championships at CVPR 2021 Across Multiple Vision Challenges

GoodFuture AI Institute secured four international titles at CVPR 2021—including Person In Context, UG²+, ETH‑XGaze, and ActivityNet—showcasing world‑class computer‑vision algorithms for human‑object interaction, low‑light face detection, gaze estimation, and active speaker detection, and highlighting their deployment in educational AI solutions.

AI competition · Active Speaker Detection · CVPR
0 likes · 9 min read

Alimama Tech
Jun 24, 2021 · Artificial Intelligence

One‑Stage Training for Generative Adversarial Networks (OSGAN): Methodology and Efficiency Analysis

The OSGAN method, introduced by Alibaba’s Alimama team and Prof. Song Ming‑Li, merges generator and discriminator updates into a single stage, speeding up GAN training by roughly 1.5‑1.7× while maintaining performance; it is validated on symmetric and asymmetric DCGANs, with open‑source code.

Computer Vision · Deep Learning · GAN
0 likes · 10 min read

Alibaba Cloud Developer
Jun 22, 2021 · Artificial Intelligence

Turning Parking Cameras into AI‑Powered Safety Guardians

A Qingdao University student leveraged Intel Xeon SG1 GPU, OpenVINO and Mask R-CNN to transform existing parking‑lot cameras into an intelligent system that counts vehicles, detects pedestrians in blind spots, and issues real‑time safety alerts, showcasing a practical AI solution for child safety in crowded parking areas.

AI · Computer Vision · Intel Xeon
0 likes · 5 min read

Xianyu Technology
Jun 9, 2021 · Artificial Intelligence

Applying Visual AI Techniques for Image Quality and Duplicate Detection in Xianyu Marketplace

By deploying large‑scale visual AI—including a ResNet‑101 classifier, ArcFace‑trained matching features, clustering‑based sub‑category refinement, and product‑level image indexing—Xianyu’s marketplace dramatically improves image quality, removes duplicates, enhances search relevance and feed diversity, and filters non‑compliant content.

Computer Vision · Deep Learning · Image Classification
0 likes · 16 min read

Amap Tech
Jun 4, 2021 · Artificial Intelligence

Deploying Multiple CNN Models on Low‑End Devices with MNN: Memory Tricks and Error Debugging

This article explains how a high‑traffic map service captures road features using client‑side computer‑vision models, details the deployment of many CNNs with the lightweight MNN engine on memory‑constrained devices, and shares practical memory‑saving techniques, inference scheduling, and error‑analysis methods.

Android · Computer Vision · MNN
0 likes · 12 min read

Meituan Technology Team
Jun 3, 2021 · Artificial Intelligence

LargeFineFoodAI Workshop and Challenge at ICCV 2021

At ICCV 2021 in Montreal, the LargeFineFoodAI workshop—co‑organized by Meituan Vision Intelligence Center, the Chinese Academy of Sciences, Beijing Zhiyuan and the University of Barcelona—will showcase state‑of‑the‑art fine‑grained food image research, feature invited speakers Jain, Aizawa and Radeva, and host a $12,000 prize challenge on Food2K across recognition and retrieval tracks.

Challenge · Computer Vision · Dataset
0 likes · 7 min read

Meituan Technology Team
May 27, 2021 · Artificial Intelligence

Standardizing Food Delivery Dish Names: Knowledge Graph Construction and Applications

The paper outlines an end‑to‑end pipeline that standardizes highly personalized food‑delivery dish names by combining rule‑based and BERT‑DSSM text synonym detection with EfficientNet image classification, constructing a multi‑level taxonomy that improves aggregation, supply‑demand analysis, recall ranking and merchant tagging.

Computer Vision · NLP · entity extraction
0 likes · 17 min read

Tencent Advertising Technology
May 27, 2021 · Artificial Intelligence

Multimodal Video Ad Second-Level Parsing: Algorithm Design and Baseline Analysis for the 2021 Tencent Advertising Algorithm Competition

This article details the algorithmic framework and baseline models for the 2021 Tencent Advertising Algorithm Competition, focusing on multimodal video ad parsing through temporal localization, scene segmentation, and multi-label classification to enhance advertising effectiveness and creative analysis.

Computer Vision · Temporal Segmentation · advertising technology
0 likes · 22 min read

Kuaishou Tech
May 24, 2021 · Artificial Intelligence

BCNet: A Bilayer Instance Segmentation Network for Occlusion‑Aware Object Detection

The paper proposes BCNet, a lightweight bilayer instance segmentation network that explicitly models occluder and occludee relationships by treating each region of interest as two overlapping layers, achieving significant performance gains on COCO, COCOA and KINS datasets under heavy occlusion.

Computer Vision · Deep Learning · bilayer network
0 likes · 10 min read

Alimama Tech
May 20, 2021 · Artificial Intelligence

How Alibaba’s AI Powers Brand Risk Detection: Models, Data, and Results

This article details Alibaba's AliMama brand risk identification system, covering the challenges of counterfeit detection, the construction of large‑scale brand datasets, the design of classification, logo detection, and variation models, their optimization, evaluation metrics, and future directions for AI‑driven brand protection.

AI · Alibaba · Computer Vision
0 likes · 22 min read

Kuaishou Large Model
May 13, 2021 · Artificial Intelligence

How Regressive Domain Adaptation Boosts Unsupervised Keypoint Detection

This article reviews the CVPR 2021 paper on Regressive Domain Adaptation (RegDA) for unsupervised keypoint detection, explaining its motivation, novel adversarial regression framework, sparse output-space modeling, min‑min training strategy, extensive experiments, and the resulting performance gains across multiple datasets.

Computer Vision · Unsupervised Learning · domain adaptation
0 likes · 13 min read

Kuaishou Tech
May 10, 2021 · Artificial Intelligence

Semantic Image Matting: Integrating Alpha Pattern Semantics into the Matting Framework

The article presents Semantic Image Matting, a novel approach that incorporates 20 semantic alpha pattern categories into the matting pipeline via a semantic trimap, region‑based classifiers, multi‑class discriminators, and a learnable gradient loss, achieving state‑of‑the‑art results on multiple benchmarks.

Computer Vision · Deep Learning · alpha patterns
0 likes · 11 min read

JD Cloud Developers
Apr 30, 2021 · Artificial Intelligence

How Face Keypoint Localization Advances Under Masked Conditions: Insights from JD AI’s 3rd Competition

The JD AI Institute and ICME2021 concluded their third face keypoint localization contest, emphasizing efficient masked‑face detection to aid COVID‑19 contact tracing, attracting top universities and tech firms, expanding data scale, and tightening model efficiency constraints to push the field forward.

AI competition · Computer Vision · Deep Learning
0 likes · 4 min read

Amap Tech
Apr 23, 2021 · Artificial Intelligence

Design Principles and Implementation of Gaode AR Navigation

The article explains Gaode Maps’ AR navigation design, detailing how environmental factors, spatial experience, color hierarchy, safety considerations, and competitor insights shape a six‑point design framework, and describes prototype testing, implementation strategies for overlapping alerts, and future prospects such as virtual road barriers and multimodal travel.

AR navigation · Computer Vision · User experience
0 likes · 8 min read

360 Quality & Efficiency
Apr 16, 2021 · Artificial Intelligence

Applying YOLOv5 Object Detection for Black, Color, and Blank Screen Classification in Video Frames

This article presents a method that replaces manual visual inspection with an automated YOLOv5‑based object detection pipeline to classify video frames as normal, colorful, or black screens, detailing data annotation, training, loss calculation, inference code, and showing a 97% accuracy improvement over ResNet.

Computer Vision · Deep Learning · Image Classification
0 likes · 11 min read

MaGe Linux Operations
Apr 13, 2021 · Artificial Intelligence

Top 10 Free Python Libraries for Image Processing You Should Try

Discover ten essential, free Python libraries for image processing—from scikit-image and NumPy to OpenCV-Python and Pycairo—each with resources, usage examples, and visual demonstrations, enabling you to manipulate, analyze, and transform images efficiently for computer vision and data science projects.

Computer Vision · Image Processing · OpenCV
0 likes · 12 min read

58UXD
Apr 12, 2021 · Artificial Intelligence

How 58.com Built an AI Designer: From Smart Cutout to Intelligent Creative Platform

This article chronicles 58.com’s journey from a small brainstorming room to a full‑scale AI design platform, detailing the development of smart cutout, the BASNet segmentation model, custom loss functions, template editing, and the measurable business impact of the AI designer.

AI design · BASNet · Computer Vision
0 likes · 15 min read

DataFunTalk
Apr 10, 2021 · Artificial Intelligence

2020 Computer Vision Breakthroughs: Self‑Supervised Learning, Transformer Attention Modeling, and Neural Radiance Fields

The talk reviews three major 2020 advances in computer vision—self‑supervised learning surpassing supervised pre‑training, the successful adoption of Transformer‑based attention models for detection and classification, and the emergence of Neural Radiance Fields for view synthesis—while highlighting related research from Microsoft Research Asia and the broader community.

2020 · AI breakthroughs · Computer Vision
0 likes · 19 min read

Youku Technology
Apr 8, 2021 · Artificial Intelligence

Champion Solution of Media AI Alibaba Entertainment Video Object Segmentation Challenge

The Youku AI team won the Media AI Alibaba Entertainment Video Object Segmentation Challenge by enhancing the STM model with a spatial‑constrained memory reader, ASPP‑HRNet refinement, ResNeSt‑101 backbone, and a multi‑stage training pipeline, while also devising an unsupervised framework that combines DetectoRS detection, HRNet mask refinement, STM‑based association, and key‑frame optimization to achieve 95.5% test score on a large, richly annotated video dataset.

Computer Vision · Deep Learning · Semi-supervised Learning
0 likes · 13 min read

Kuaishou Tech
Apr 6, 2021 · Artificial Intelligence

Frequency-Aware Feature Learning with Single-Center Loss for Face Forgery Detection

Researchers from USTC and Kuaishou propose a frequency‑aware feature learning framework that combines a data‑driven adaptive frequency module with a novel single‑center loss, achieving state‑of‑the‑art performance on deepfake detection while addressing class‑distribution challenges.

AI security · Computer Vision · deepfake detection
0 likes · 7 min read

Kuaishou Large Model
Apr 1, 2021 · Artificial Intelligence

How Kuaishou Y‑Tech Leverages GANs for Real‑Time Face Attribute Editing in Short Videos

This article details Kuaishou Y‑Tech's practical deployment of GAN‑based high‑precision face attribute editing—covering gender, age, hair, and expression transformations—for short‑video effects, discussing background, business applications, technical challenges, and solutions across data preparation, model training, and mobile deployment.

Computer Vision · GAN · Kuaishou
0 likes · 15 min read

iQIYI Technical Product Team
Mar 26, 2021 · Artificial Intelligence

Insights into OCR Technology at iQIYI: Development, Challenges, and Applications

iQIYI’s OCR journey, explained by researcher Harlon, covers the evolution from separate detection and recognition pipelines to end‑to‑end models, key algorithms like CTPN, DB and CRNN, large‑scale simulated training, diverse video‑text applications, and future goals such as mobile deployment and tighter NLP integration.

AI · Computer Vision · Deep Learning
0 likes · 21 min read

58 Tech
Mar 24, 2021 · Artificial Intelligence

Automated Detection of Illegal Watermarks in Images Using Deep Learning at 58.com

This article describes how 58.com built an end‑to‑end deep‑learning watermark detection service, covering business needs, data collection and augmentation, model selection and iterative improvements (Faster‑RCNN, SSD, YOLOv3, anchor‑free methods), deployment results, and future research directions.

Computer Vision · Image Moderation · Model Optimization
0 likes · 14 min read

Huawei Cloud Developer Alliance
Mar 23, 2021 · Artificial Intelligence

How to Recognize Credit Card Numbers with OpenCV: A Step‑by‑Step Tutorial

This tutorial walks through a project‑based OpenCV workflow that reads a digit template, preprocesses both template and credit‑card images, extracts individual numbers, matches them against the template, and finally overlays the recognized digits onto the original image, illustrating core computer‑vision techniques.

Computer Vision · Image Processing · OCR
0 likes · 10 min read

Amap Tech
Mar 22, 2021 · Artificial Intelligence

Visual Technology for Automated POI Name Generation: STR, Text Detection, and Naming Practices

Amap’s visual‑technology pipeline automatically generates and updates POI names by crowdsourcing street‑level images, applying deep‑learning scene‑text recognition, dual‑branch classification of text attributes, and a BERT‑plus‑graph‑attention model that selects and orders recognized text, achieving about 95 % naming accuracy.

Computer Vision · Deep Learning · Name Generation
0 likes · 14 min read

Tencent Cloud Developer
Mar 4, 2021 · Artificial Intelligence

WeChat OCR: Implementation of Image Text Extraction Feature

WeChat’s 8.0 update introduced an OCR pipeline that first quickly detects text in images, classifies the image type, applies a lightweight multi‑language detection network and a MobileNetV3‑based DBNet recognizer with a multi‑task CTC/Attention model, then merges results via a rule‑based layout analyzer to deliver accurate, well‑formatted extracted text across diverse languages and document types.

Computer Vision · DBNet · Deep Learning
0 likes · 13 min read

Laravel Tech Community
Feb 28, 2021 · Artificial Intelligence

How the “Ant Ya Hey” AI Effect Works and How to Create It

This article explains the popular Douyin AI effect “Ant Ya Hey”, showcases celebrity demos, provides a step‑by‑step guide using Avatarify and video editors, and delves into the underlying First‑Order Motion Model research that powers the realistic facial animation.

AI · Avatarify · Computer Vision
0 likes · 6 min read

Kuaishou Large Model
Feb 25, 2021 · Artificial Intelligence

How Kuaishou’s AI‑Powered Beauty Engine Transforms Real‑Time Video

This article details Kuaishou Y‑Tech’s Gorgeous beauty platform, covering traditional smoothing, advanced skin‑tone effects, AI‑driven blemish removal, clarity enhancement, local facial tuning, and the UNet‑based GorgeousGAN that delivers one‑click high‑definition beauty for live‑stream and short‑video applications.

AI beauty · Computer Vision · Deep Learning
0 likes · 13 min read

360 Tech Engineering
Feb 23, 2021 · Artificial Intelligence

Video Stutter Detection via Frame Difference Analysis Using FFmpeg

This article explains a method for detecting video stutter by converting uploaded videos into frame sequences with ffmpeg, calculating pixel differences between consecutive frames, aggregating motion metrics, removing scene‑change effects, computing a dynamic factor, and outputting a binary result indicating the presence or absence of stutter.

Computer Vision · Video processing · algorithm
0 likes · 5 min read

DataFunTalk
Feb 16, 2021 · Artificial Intelligence

Multimedia Content Understanding in Meitu Community: Video Classification, Fingerprinting, and OCR

This article presents Meitu Community's AI‑driven multimedia content analysis pipeline, covering short‑video classification, video fingerprinting, and OCR, detailing model choices, experimental results, and future directions for improving content audit, quality, tagging, and feature engineering.

AI · Computer Vision · Fingerprinting
0 likes · 18 min read

JD Cloud Developers
Feb 10, 2021 · Artificial Intelligence

How JD Tech’s Breakthrough AI Papers Dominated AAAI 2021

JD Tech showcased a remarkable 21-paper presence at AAAI 2021, covering federated learning, spatio‑temporal AI, recommendation systems, computer vision, and causal learning, highlighting the company’s transition from research to real‑world AI applications across smart cities, retail, and risk management.

AAAI 2021 · Computer Vision · Federated Learning
0 likes · 12 min read

ByteFE
Feb 9, 2021 · Fundamentals

Curated Self‑Study Resources for Emerging Tech Fields (Multimedia, AI, CV, RL, MT, Knowledge Graph, Mobile, Frontend)

This guide compiles recommended books, courses, and open‑source projects across multimedia, artificial intelligence, computer vision, reinforcement learning, machine translation, knowledge graphs, Android, iOS, and frontend development to help newcomers and job seekers systematically deepen their technical expertise.

Artificial Intelligence · Computer Vision · Resources
0 likes · 12 min read

iQIYI Technical Product Team
Feb 5, 2021 · Game Development

AR+AI Powered Video Interactive Mini‑Games on iQIYI: Architecture, Face & Gesture Control, and Lua Game Layer

iQIYI’s AR+AI powered video interactive mini‑games blend a custom VideoAR engine with real‑time AI‑driven face and gesture detection, use lightweight Lua for game logic, and offer rapid hot‑updates, enabling diverse IP integrations that have attracted over a million participants and boosted viewer engagement.

AI · AR · Computer Vision
0 likes · 12 min read

Amap Tech
Feb 1, 2021 · Artificial Intelligence

AMAP-TECH Algorithm Competition: Dynamic Road Condition Analysis Using In-Vehicle Video

The AMAP‑TECH competition challenged participants to infer real‑time road conditions from in‑vehicle video, prompting the authors to combine lane‑wise vehicle detection with LightGBM and later an end‑to‑end DenseNet‑GRU model, augment data, ensemble five networks, and achieve a 0.7237 F1 score while outlining future deployment and research directions.

Computer Vision · Deep Learning · Model Deployment
0 likes · 15 min read

Kuaishou Large Model
Jan 28, 2021 · Artificial Intelligence

How Portrait Deformation Powers Modern Beauty Filters: Algorithms Explained

This article explores the core portrait deformation techniques behind today’s beauty and body‑shaping effects—covering affine transforms, Moving Least Squares, triangulation, liquify, offset, 3D mesh, and deep‑learning approaches—detailing their principles, implementations, and visual results in live‑streaming and short‑video apps.

AI · Computer Vision · Image Processing
0 likes · 13 min read

iQIYI Technical Product Team
Jan 15, 2021 · Artificial Intelligence

How AI is Transforming Video Creation and Consumption at Scale

The article examines how iQIYI leverages AI across the video ecosystem—from intelligent material search, old‑film restoration, and voice cloning to virtual idols, XR production, and AI‑driven advertising—to boost creator efficiency, enhance user experience, and accelerate industry-wide digital transformation.

AI · Computer Vision · Industry Insights
0 likes · 14 min read

Amap Tech
Jan 15, 2021 · Artificial Intelligence

Solution Overview of the AMAP-TECH Algorithm Competition: Dynamic Road Condition Analysis from In‑Vehicle Video Images

To tackle the AMAP‑TECH competition’s dynamic road‑condition classification from scarce, imbalanced vehicle‑video frames, the team combined YOLOv5 object detection, ResNeXt101‑based semantic embeddings, and engineered temporal detection statistics, feeding the fused features into a five‑fold LightGBM model that achieved top weighted‑F1 performance.

Computer Vision · LightGBM · Multimodal Learning
0 likes · 10 min read

Didi Tech
Dec 29, 2020 · Artificial Intelligence

Evolution and Challenges of Perception in L4 Autonomous Driving

The article traces L4 autonomous-driving perception from early rule-based point-cloud methods through data-driven deep-learning models to emerging self-learning, multi-task systems, and highlights four key hurdles—model generalization and explainability, robust multi-sensor fusion, real-time compute limits, and proper uncertainty handling—calling for integrated AI, engineering, and data solutions.

AI · Computer Vision · Deep Learning
0 likes · 12 min read

Meituan Technology Team
Dec 24, 2020 · Artificial Intelligence

Meituan Unmanned Delivery Technical Salon – AI Research on Instance Segmentation, Visual Localization, Trajectory Prediction, and Depth‑Pose Learning

On January 9, 2021, Meituan hosted an unmanned‑delivery technical salon in Beijing where experts presented cutting‑edge AI research—including the CenterMask instance‑segmentation method, 3D geometry‑aware camera localization, multi‑agent trajectory prediction with attention‑based spatio‑temporal graphs, real‑time stereo visual‑inertial odometry calibration, and self‑supervised depth‑pose learning for dynamic scenes.

AI · Computer Vision · autonomous driving
0 likes · 7 min read

Suning Technology
Dec 17, 2020 · Artificial Intelligence

How AI Powers SuNing’s Unmanned Stores: From Face Detection to Smart Retail

This article outlines SuNing's unmanned store technology, comparing its data-driven, product selection, and customer experience advantages over traditional shops, and detailing AI-powered applications such as face detection, target tracking, image recognition, and 3D reconstruction that enable 24‑hour service, intelligent merchandising, and precise customer analytics.

AI · Computer Vision · Data Analytics
0 likes · 24 min read

DataFunTalk
Dec 9, 2020 · Artificial Intelligence

WeChat Identify: From Object Detection to Large‑Scale Image Search – Technical Overview

This article details the evolution of WeChat’s Identify product, explaining its end‑to‑end image recognition pipeline—including object detection, multi‑label classification, mobile‑side detection, large‑scale retrieval, unsupervised clustering, and system architecture—while showcasing various application scenarios such as product, plant, and landmark recognition.

Computer Vision · Mobile AI · WeChat
0 likes · 12 min read

Python Crawling & Data Mining
Dec 9, 2020 · Artificial Intelligence

Unlock 3D Human Pose Capture with FrankMocap: A Powerful Open‑Source AI Tool

FrankMocap, an open‑source AI algorithm from Facebook AI Research and HKU, delivers simultaneous 3D full‑body and hand pose estimation from a single monocular video, runs at about 9.5 FPS on an RTX 2080, and includes easy installation steps, code examples, and links to its GitHub repository and paper.

3D pose estimation · Computer Vision · Open-source
0 likes · 6 min read

Top Architect
Dec 4, 2020 · Artificial Intelligence

Java-based ID Card OCR Project Using OpenCV, JavaCPP, and Tess4J

This article introduces a Java OCR project for ID cards that integrates OpenCV, JavaCPP, and Tess4J to perform image preprocessing, region cropping, and character recognition without requiring OpenCV installation, and details its features, encountered issues, system requirements, updates, and source repository.

Computer Vision · ID Card · JavaCPP
0 likes · 4 min read

DataFunSummit
Dec 3, 2020 · Artificial Intelligence

GAN Fundamentals, Variants, and Practical Applications in Image Style Transfer and Handwriting Font Generation

This article provides a comprehensive overview of Generative Adversarial Networks, covering their original formulation, training dynamics, loss functions, major variants such as DCGAN and WGAN, and practical implementations for image‑to‑image translation, style transfer, and handwriting font synthesis at Laiye Technology.

Computer Vision · Deep Learning · GAN
0 likes · 28 min read

Kuaishou Large Model
Dec 3, 2020 · Artificial Intelligence

Kuaishou Y‑Tech’s Real‑Time, High‑Precision Facial & Body Keypoint Detection Explained

Y‑Tech’s in‑house keypoint detection system powers Kuaishou’s beauty and effect filters across live streaming, video creation, and editing by leveraging lightweight deep‑learning models, extensive multi‑scenario data collection, and specialized handling of occlusion, enabling real‑time, robust facial and body landmark tracking on diverse mobile devices.

Computer Vision · Deep Learning · Mobile AI
0 likes · 10 min read

360 Quality & Efficiency
Nov 27, 2020 · Artificial Intelligence

Image Similarity Detection Methods: Hashing, Histograms, Feature Matching, BOW+K‑Means, and CNN‑Based Approaches

This article reviews common image similarity detection techniques—including hash-based methods (aHash, pHash, dHash), histogram comparison, feature matching with ORB and SIFT/SURF, bag‑of‑words with K‑Means, and CNN‑based VGG16 features—detailing their algorithms, Python implementations, performance characteristics, and practical considerations.

Computer Vision · Deep Learning · Hashing
0 likes · 15 min read

Suning Technology
Nov 26, 2020 · Artificial Intelligence

How Low-Cost AI Powers Full-Scale Store Digitalization

Li Yongxiang, technical director at Suning Tech, outlines how AI-driven visual unmanned stores and integrated big‑data, cloud, and edge computing solutions enable low‑cost digital transformation across thousands of retail outlets, improving shopper experience, inventory management, and operational efficiency.

AI · Computer Vision · Edge Computing
0 likes · 18 min read

DataFunTalk
Nov 22, 2020 · Artificial Intelligence

Short Video Analysis in Local Life Scenarios: Techniques and Practices at Meituan

This article presents Meituan's AI-driven short video analysis workflow, covering industry trends, multi‑label video classification, intelligent cover selection, and video generation techniques, while discussing challenges, model building, label expansion, continuous data iteration, and future outlook for video AI in local services.

AI · Computer Vision · Meituan
0 likes · 16 min read

DeWu Technology
Nov 18, 2020 · Artificial Intelligence

AR Fundamentals and Shoe Try‑On Implementation

The presentation explains AR fundamentals, distinguishes AR from AI and VR, and details a shoe try‑on system that captures 30 fps video and uses AI key‑point detection and pose estimation to overlay 3D shoe models created by manual modeling, scanning, or photogrammetry. The models are rendered with GPU pipelines and PBR and enhanced with green‑screen occlusion and shadow techniques; the talk earned positive audience feedback.

3D modeling · AR · Computer Vision
0 likes · 7 min read

DataFunTalk
Nov 10, 2020 · Artificial Intelligence

Low‑Power ADAS on Didi’s JueShi Devices Reduces Traffic Accidents

This article describes how Didi’s vehicle‑vision team built an ultra‑low‑power ADAS solution on the JueShi dash‑cam platform, using lightweight detection models, temporal fusion, camera‑calibration techniques and data‑driven optimization to cut rear‑end collision rates by over 11% and improve overall traffic safety.

ADAS · Computer Vision · Edge Computing
0 likes · 15 min read

Didi Tech
Nov 9, 2020 · Artificial Intelligence

Ultra-Low-Power ADAS on DiDi's JueShi Devices for Reducing Traffic Accidents

DiDi’s ultra‑low‑power JueShi ADAS combines lightweight vision models, temporal‑fusion Kalman filtering, and camera‑calibration techniques to deliver real‑time forward‑collision warnings and brake‑light alerts, cutting rear‑end crashes by over 11% and overall accidents by 9% through continuous edge‑AI learning.

ADAS · Computer Vision · Edge Computing
0 likes · 15 min read

New Oriental Technology
Nov 9, 2020 · Artificial Intelligence

Understanding YOLOv4 and YOLOv5: Core Elements and Innovations in Object Detection

This article introduces the fundamentals of object detection, explains the latest YOLOv4 and YOLOv5 architectures, and details the essential components—including data preparation, regularization, backbone, neck, and prediction innovations—along with label smoothing and advanced loss functions for improved detection performance.

AI · Computer Vision · YOLOv4
0 likes · 9 min read

21CTO
Nov 3, 2020 · Artificial Intelligence

How Does Image Recognition Work? A Simple Guide to Core Principles

This article explains the fundamental principles of image recognition, covering how images are converted to numeric arrays, processed by scanning matrix blocks, and matched against patterns to identify objects such as text, faces, cats, dogs, or mice.

AI basics · Computer Vision · Convolution
0 likes · 4 min read

iQIYI Technical Product Team
Oct 16, 2020 · Artificial Intelligence

Cartoon Face Recognition: Introducing the iCartoonFace Benchmark Dataset

iQIYI’s ACM Multimedia‑accepted paper unveils iCartoonFace, the world’s largest manually annotated cartoon‑face dataset—over 5,000 characters and 400,000 real‑scene images—accompanied by a semi‑automatic collection pipeline and multi‑person training framework, now powering AI services, large‑scale contests and accelerating cartoon‑character recognition research.

Artificial Intelligence · Cartoon Face Recognition · Computer Vision
0 likes · 4 min read

Didi Tech
Oct 16, 2020 · Artificial Intelligence

Mask Detection System and Visual AI Competition Achievements

Didi’s COVID‑19 mask‑detection system, built on a DFS‑based face detector and an attention‑enhanced ResNet‑50 mask classifier achieving over 99.5 % accuracy, has been deployed in vehicles, open‑sourced, and complemented by top‑ranked results in international visual AI contests, including first place in driver‑gaze prediction and podium finishes in emotion recognition and model‑compression challenges.

AI · Computer Vision · Deep Learning
0 likes · 22 min read

Kuaishou Large Model
Oct 15, 2020 · Artificial Intelligence

How Kuaishou’s Y‑Tech Advances Monocular Depth Estimation for Mobile AR

This article reviews Kuaishou Y‑Tech’s ECCV‑2020 paper on monocular depth estimation, detailing its novel GCB‑SAB network, new HC‑Depth dataset, specialized loss functions and edge‑aware training, and demonstrates superior performance on NYUv2, TUM and real‑world mobile AR applications.

Attention Mechanism · Computer Vision · Deep Learning
0 likes · 14 min read

360 Quality & Efficiency
Sep 18, 2020 · Artificial Intelligence

Data Augmentation Techniques for Improving Object Detection Model Robustness

To enhance object detection robustness, the article discusses various data augmentation methods—including rotation, flipping, random cropping, scaling, color jitter, blurring, transparency adjustment, and image partitioning—providing code examples and illustrating their impact on model performance with before‑and‑after results.

Computer Vision · Python · data augmentation
0 likes · 7 min read

Suning Technology
Sep 17, 2020 · Artificial Intelligence

How SuNing’s Fourth‑Gen Digital Visual Unmanned Store Redefines AI‑Powered Retail

SuNing’s fourth‑generation fully digital visual unmanned store combines 3D reconstruction, AI‑driven perception, and modular hardware‑software design to achieve real‑time, all‑scene, all‑time, all‑digital analysis of people, goods, and spaces, enabling precise offline marketing and scalable retail digitization.

AI · Computer Vision · Digital Twin
0 likes · 27 min read

Suning Technology
Sep 3, 2020 · Artificial Intelligence

How Suning’s Fourth‑Gen AI‑Powered Visual Unmanned Stores Transform Retail

Suning’s lecture series details the three‑decade evolution of retail, the company’s 30‑year digital transformation, and the technical architecture of its fourth‑generation fully digital visual unmanned stores that leverage AI, computer vision, and big‑data analytics to revolutionize in‑store operations and customer experience.

AI · Computer Vision · Digital Store
0 likes · 14 min read

Zhengtong Technical Team
Aug 14, 2020 · Artificial Intelligence

ZTFace: A High‑Precision, Fast Face Recognition Algorithm

This article presents ZTFace, an end‑to‑end face recognition solution that integrates face detection, alignment, feature embedding, verification, anti‑spoofing and attribute recognition using deep learning, details its backbone networks, loss functions, training datasets, experimental results on WIDER FACE and LFW, and demonstrates acceleration with TensorRT.

Computer Vision · TensorRT · ZTFace
0 likes · 17 min read

360 Tech Engineering
Aug 7, 2020 · Artificial Intelligence

Guide to Image Matching: Template Matching, Feature Matching with SIFT and FLANN, and Homography

This guide explains image matching techniques, covering template matching with OpenCV, various matching methods, SIFT feature extraction and description, FLANN-based nearest neighbor matching, homography estimation, practical challenges, and a brief overview of YOLO training, providing code examples and visual illustrations.

Computer Vision · FLANN · Feature Matching
0 likes · 15 min read

iQIYI Technical Product Team
Aug 7, 2020 · Artificial Intelligence

Boundary Content Graph Neural Network (BC‑GNN) for Temporal Action Proposal Generation

The Boundary Content Graph Neural Network (BC‑GNN) introduces a bipartite‑graph framework that jointly refines start/end boundary probabilities and segment‑content confidence, enabling more precise temporal action proposals and achieving state‑of‑the‑art results on ActivityNet‑1.3 and THUMOS14.

BC-GNN · Computer Vision · Deep Learning
0 likes · 10 min read

Amap Tech
Jul 30, 2020 · Artificial Intelligence

Evolution and Practice of Scene Text Recognition Technology in Amap Map Data Production

Amap uses advanced scene text recognition combining detection and recognition modules, deep learning, data synthesis, and result fusion to automate map data production, achieving state-of-the-art performance and automating the majority of POI and road updates, significantly reducing labor costs.

Computer Vision · Deep Learning · OCR
0 likes · 18 min read

Alibaba Cloud Developer
Jul 30, 2020 · Artificial Intelligence

How Amap’s Scene Text Recognition Powers Accurate Maps: Evolution and Future Challenges

This article explains how Amap leverages scene text recognition to automate map data production, detailing the evolution from traditional image algorithms to deep‑learning models, the current detection and recognition framework, performance results, and future research directions for handling blur, data scarcity, and semantic understanding.

Amap · Computer Vision · Deep Learning
0 likes · 19 min read

Alibaba Cloud Developer
Jul 29, 2020 · Artificial Intelligence

How Gaode Maps Boosts Accuracy with Advanced Scene Text Recognition

This article explains how Gaode Maps leverages traditional and deep‑learning based scene text recognition techniques—including character detection, sequence models, data synthesis, and multi‑stage frameworks—to automate POI and road data production with high precision and speed.

Computer Vision · Deep Learning · OCR
0 likes · 20 min read

Youku Technology
Jul 29, 2020 · Artificial Intelligence

Core Technology of Video Content Understanding: Technical Practice of Partial Re-ID in Video Inspection

The talk explains how Alibaba’s Entertainment Content Operation Platform applies a Partial‑ReID algorithm to overcome the challenges of person re‑identification in heavily edited video content, enabling accurate cross‑shot character matching, richer appearance data, and metrics such as presence, interaction, and storyline for improved video quality assessment.

AI · Computer Vision · Partial Re-ID
0 likes · 2 min read

NetEase Media Technology Team
Jul 24, 2020 · Artificial Intelligence

Survey of Video Action Recognition Algorithms: 3D and 2D Convolutional Networks and Pre‑training

This survey reviews video action recognition, comparing 3D convolutional networks that jointly model spatial‑temporal cues but are computationally heavy with 2D‑based approaches like TSM and TIN that embed temporal shifts efficiently, and emphasizes how large‑scale pre‑training markedly improves performance despite limited labeled data.

2D convolutional networks · 3D convolutional networks · Computer Vision
0 likes · 13 min read

Sohu Tech Products
Jul 22, 2020 · Artificial Intelligence

Face Detection Using Haar Features and AdaBoost with OpenCV

This article explains the principles and implementation of face detection based on statistical methods, detailing Haar feature types, integral image computation, feature normalization, cascade classifiers, and provides step‑by‑step OpenCV code examples for static images, eye detection, and real‑time webcam detection.

AdaBoost · Computer Vision · Face Detection
0 likes · 19 min read

Alibaba Cloud Developer
Jul 13, 2020 · Artificial Intelligence

Master Dynamic Road Condition Analysis with Car Video – AMAP-TECH Competition Overview

The AMAP-TECH algorithm competition invites participants to develop AI models that analyze in-vehicle video sequences to determine dynamic road conditions, offering detailed dataset specifications, evaluation metrics, expert judges, schedule, and prize information for researchers in computer vision and traffic analytics.

AI · Computer Vision · Dataset
0 likes · 9 min read

Youku Technology
Jul 10, 2020 · Artificial Intelligence

Mastering Video Object Segmentation: Cutting-Edge Models and Design Tricks

This technical talk introduces video object segmentation tasks, reviews leading datasets and state-of-the-art deep learning models, and shares practical network design rules and performance‑boosting techniques, presented by Prof. Wang Xinggang as part of Alibaba's MEDIA AI challenge series.

AI · Computer Vision · Deep Learning
0 likes · 4 min read

Amap Tech
Jul 9, 2020 · Artificial Intelligence

AMAP-TECH Algorithm Competition: Dynamic Road‑Condition Analysis from In‑Vehicle Video Images

Alibaba Amap’s AMAP‑TECH competition invites participants to develop AI computer‑vision models that classify real‑time road conditions—smooth, slow, or congested—from short sequences of dash‑cam images, using a labeled dataset of 1,500 training sequences and a weighted F1‑score evaluation, with cash prizes up to ¥60,000.

AI · Computer Vision · Dataset
0 likes · 8 min read

Alibaba Cloud Developer
Jul 3, 2020 · Artificial Intelligence

Unlocking Visual Object Tracking: Principles, Algorithms, and Evaluation

This comprehensive review explains visual object tracking in computer vision, covering its definition, core sub‑problems of candidate generation, feature extraction, and decision making, system architecture, motion, feature and observation models, algorithm classifications, evaluation metrics, datasets, and recent research trends.

Computer Vision · Deep Learning · evaluation metrics
0 likes · 30 min read

Youku Technology
Jun 19, 2020 · Artificial Intelligence

Video-based Temporal Event Detection Methods

In the fourth Alibaba Digital Media Technology Night Talk, algorithm engineer Liu Xiaolong presents an overview of video‑based temporal event detection, covering its problem background, representative prior works, and the latest research advances within the MEDIA AI Algorithm Challenge series.

Alibaba · Artificial Intelligence · Computer Vision
0 likes · 1 min read

TAL Education Technology
Jun 18, 2020 · Artificial Intelligence

An Overview of Virtual Reality, Augmented Reality, and Vision‑Based Techniques

This article explains the fundamentals of virtual reality and its distinction from augmented reality, describes VR hardware, outlines depth‑estimation and eye‑tracking methods such as projection, Hough transform, AdaBoost and sample matching, discusses Sobel edge detection, and explores the importance of audio, haptic feedback, and immersive VR applications in education.

AR · Computer Vision · Depth estimation
0 likes · 11 min read

360 Quality & Efficiency
May 29, 2020 · Artificial Intelligence

Image Matching Techniques: Template Matching, Feature Matching, SIFT, FLANN, and Homography

This article introduces image matching fundamentals, covering template matching methods, feature-based approaches such as SIFT and FLANN, their implementation details, matching rules, homography transformation, and practical considerations, providing a comprehensive overview for computer vision applications.

Computer Vision · FLANN · Feature Matching
0 likes · 14 min read

JD Retail Technology
May 27, 2020 · Artificial Intelligence

JD ARVR Tech Department Publishes Two Papers on Defocus Blur Detection and Few-Shot Learning in Top Venues

The JD ARVR technology department announced two peer‑reviewed papers—one on a novel defocus blur detection network published in IEEE Transactions on Multimedia and another on a transductive relation‑propagation network for few‑shot learning accepted at IJCAI 2020—highlighting their advanced AI research and future AR‑VR ecosystem plans.

ARVR · Computer Vision · Deep Learning
0 likes · 7 min read

Amap Tech
May 25, 2020 · Artificial Intelligence

Automated Production Line for Base Map Data Using Image AI and Data Fusion

Gaode’s automated production line combines deep‑learning image recognition, GPS‑enhanced location services, image differencing with semantic filtering, and standardized data‑fusion to continuously refresh China’s national base map, cutting manual effort and costs while delivering real‑time, high‑quality map updates for road traffic infrastructure.

Computer Vision · Deep Learning · data fusion
0 likes · 11 min read

ITPUB
May 14, 2020 · Artificial Intelligence

Cut & Paste Real Objects into Photoshop with AR in Under 10 Seconds

This article explains the AR Cut & Paste prototype by Cyril Diagne, detailing its three‑module architecture, the underlying BASNet and U²‑Net vision models, and provides a step‑by‑step guide—including code snippets and GitHub links—to set up the mobile app, local server, and Photoshop integration.

AR · BASNet · Computer Vision
0 likes · 8 min read

Python Programming Learning Circle
May 12, 2020 · Artificial Intelligence

Batch Image Segmentation with Python and PaddlePaddle

This tutorial demonstrates how to use Python and the PaddlePaddle deep‑learning platform to automatically remove backgrounds from multiple photos in one step, covering installation, verification, and a concise five‑line code example for batch human segmentation.

Batch Processing · Computer Vision · Deep Learning
0 likes · 6 min read

Programmer DD
May 9, 2020 · Artificial Intelligence

ChineseOCR Lite: Ultra‑Lightweight OCR Engine for Vertical Chinese Text

ChineseOCR Lite is an open‑source, ultra‑lightweight OCR solution that supports vertical Chinese text, runs on Linux/macOS via ncnn inference, and packs detection, recognition, and angle classification models into a total of just 17 MB, offering fast and accurate scene‑text processing.

Chinese OCR · Computer Vision · OCR
0 likes · 4 min read

Didi Tech
Apr 30, 2020 · Artificial Intelligence

DGF-M: Face Recognition Algorithm for Masked Face Scenarios

Didi’s DGF‑M model, a mask‑aware face‑recognition AI, combines multi‑task training and synthetic data to detect masks with under 0.1 % miss rate and verify identities with up to 99.5 % pass rate at a 0.1 % false‑acceptance rate, and is deployed for driver verification, offered through the Didi Cloud API marketplace, and released as an open‑source solution to aid pandemic‑era security.

AI algorithm · Computer Vision · DGF-M
0 likes · 5 min read

Amap Tech
Apr 24, 2020 · Artificial Intelligence

Q&A on Computer Vision Technologies and Their Applications in Mapping, Navigation, and Autonomous Driving

In a live Q&A, Alibaba Amap’s chief scientist Ren Xiaofeng explained how computer‑vision algorithms underpin high‑precision map creation, AR navigation, visual localization and sensor fusion, discussed current hardware limits, deep‑learning bottlenecks, 5G’s role, edge‑cloud cooperation, and offered career advice for transitioning researchers.

AI · AR navigation · Computer Vision
0 likes · 14 min read

Programmer DD
Apr 17, 2020 · Artificial Intelligence

How to Make People Vanish in Real‑Time Using TensorFlow.js and MobileNet

Jason Mayes, a Google web engineer, open‑sourced a TensorFlow.js demo that removes people from live webcam video in real time using a lightweight MobileNet model, with only about 200 lines of code, and provides GitHub and CodePen links for experimentation.

Computer Vision · MobileNet · Real-time Video
0 likes · 9 min read

iQIYI Technical Product Team
Apr 3, 2020 · Artificial Intelligence

iCartoonFace Challenge: Cartoon Face Detection and Recognition Competition

The iCartoonFace Challenge invites participants to develop efficient algorithms for detecting and recognizing cartoon faces using large, meticulously annotated datasets—50,000 images for detection and nearly 390,000 for recognition—while meeting strict model size and latency limits and submitting detailed methods and code.

AI competition · Cartoon Face Recognition · Computer Vision
0 likes · 6 min read

JD Retail Technology
Apr 2, 2020 · Artificial Intelligence

How Deep Learning Powers Text Detection in E‑commerce Posters

This article surveys state‑of‑the‑art deep‑learning techniques for scene text detection and recognition in e‑commerce poster images, detailing models such as CTPN, TextBoxes, SegLink, EAST, and end‑to‑end frameworks, and discusses their architectures, strengths, limitations, and future challenges.

Computer Vision · Deep Learning · e‑commerce
0 likes · 16 min read