Tagged articles
650 articles
Alibaba Cloud Developer
Mar 25, 2020 · Artificial Intelligence

How 3D Synthetic Data Supercharges AI Vision for Smart Vending Machines

This article explains how Alibaba's Alipay visual vending cabinet leverages 3D synthetic data generation—covering full‑material 3D reconstruction, parametric scene modeling, and photo‑realistic rendering—to rapidly produce high‑quality training images, dramatically cutting cost and accelerating AI model deployment.

3D synthesis · AI training data · Computer Vision
0 likes · 10 min read
Amap Tech
Mar 23, 2020 · Artificial Intelligence

Satellite Imagery for Map Data Updating: Key Elements, Semantic Segmentation Techniques, and Future Challenges

Gaode leverages high‑resolution satellite imagery as an active discovery tool for map updates, extracting road, region and building elements through advanced semantic segmentation networks (U‑Net, ASPP, attention, non‑local) and instance‑segmentation pipelines, to accelerate accurate road‑network and building‑block data refreshes while addressing future scalability challenges.

Computer Vision · Satellite Imagery · U-Net
0 likes · 11 min read
Alibaba Cloud Developer
Mar 10, 2020 · Artificial Intelligence

Can Frequency‑Domain Learning Boost Image Inference Efficiency?

This article presents a system‑level approach that performs deep‑learning inference directly on JPEG frequency components, uses a gating mechanism to select important DCT coefficients, and demonstrates higher accuracy with far lower bandwidth for image classification and instance segmentation tasks.

Bandwidth Reduction · Computer Vision · Deep Learning
0 likes · 22 min read
Alibaba Cloud Developer
Feb 25, 2020 · Artificial Intelligence

How Attribute‑Specific Embedding Networks Revolutionize Fashion Copyright Protection

A new AI algorithm jointly developed by Alibaba Security and Zhejiang University learns fine‑grained, attribute‑aware similarity embeddings for fashion images, enabling accurate detection of local design plagiarism and improving retrieval performance across multiple benchmark datasets.

Computer Vision · Deep Learning · attribute embedding
0 likes · 14 min read
UCloud Tech
Feb 20, 2020 · Artificial Intelligence

How UCloud’s AI Mask Detection Service Reaches 99% Accuracy in One Week

This article explains how UCloud’s AI team leveraged the UAI‑Train and UAI‑Inference platforms to develop, train, and deploy a high‑accuracy face‑mask detection service within a week, detailing the algorithmic approach, challenges, deployment pipeline, and real‑world applications.

AI · Cloud AI · Computer Vision
0 likes · 10 min read
DataFunTalk
Feb 13, 2020 · Artificial Intelligence

Deep Learning Techniques and Challenges in Autonomous Driving

This article reviews the rapid development of deep learning, its pivotal role in autonomous driving, outlines end‑to‑end perception‑to‑control pipelines, discusses the strengths and limitations of deep models, and proposes practical strategies such as task decomposition, multi‑method fusion, and sensor integration to improve safety and interpretability.

Computer Vision · Deep Learning · End-to-End
0 likes · 8 min read
ITPUB
Jan 14, 2020 · Artificial Intelligence

Top 2019 AI Papers Loved by Reddit Users: Key Insights and Links

A curated collection of Reddit‑highlighted 2019 AI research papers, covering theoretical advances, computer‑vision breakthroughs, unsupervised learning methods, and time‑series forecasting, with summaries, key contributions, and direct links to each paper.

AI · Computer Vision · Meta Learning
0 likes · 6 min read
Alibaba Cloud Developer
Jan 10, 2020 · Artificial Intelligence

How AI Powers Ground Marker Recognition for High‑Precision Maps

This article details the evolution of ground‑marker recognition technology in high‑precision maps, covering challenges of diverse and worn markings, traditional segmentation methods, deep‑learning breakthroughs such as R‑FCN, cascade detectors, corner‑point detection, semantic segmentation, PANet, and 3‑D point‑cloud approaches, and their impact on accuracy and production efficiency.

Computer Vision · Deep Learning · ground marker recognition
0 likes · 17 min read
iQIYI Technical Product Team
Jan 9, 2020 · Artificial Intelligence

Results and Winning Solutions of the 2019 CCF Big Data & Computing Intelligence Contest – Video Copyright Detection Track

The 2019 CCF Big Data & Computing Intelligence Contest’s Video Copyright Detection track, judged by iQIYI, saw 705 teams from 25 countries compete, with Hengyang Data’s VGG‑16‑based solution winning, followed by Boyun Vision, Xiao Jia’s Lao Liang, Hulu Brothers and Beihang University, showcasing diverse deep‑learning and unsupervised approaches for robust video copyright detection.

CCF Contest · Computer Vision · Deep Learning
0 likes · 9 min read
Alibaba Cloud Developer
Jan 3, 2020 · Artificial Intelligence

How Alibaba’s DAMO Lab Revolutionizes Image Cutout with AI‑Powered Matting

Alibaba's DAMO Academy details its AI‑driven image cutout system, describing why automated matting is needed, the four‑module pipeline (filtering, classification, detection, segmentation), architectural innovations such as dual decoders and fusion networks, and how these advances enable product‑level batch background removal.

AI · Alibaba · Computer Vision
0 likes · 9 min read
Tencent Cloud Developer
Dec 26, 2019 · Artificial Intelligence

WeChat Scan-to-Identify (Scan Object) Feature: Overview, Technical Architecture, Data Construction, and Algorithmic Advances

WeChat’s iOS Scan‑to‑Identify feature lets users point a camera at any product or scene to instantly retrieve related e‑commerce, encyclopedia or news content, using a four‑pipeline architecture that builds massive annotated and deduplicated databases, advanced RetinaNet‑based detection, multi‑task metric learning, and scalable training, deployment and scheduling platforms, with plans to extend into domains like facial, vehicle and plant recognition.

AI · Computer Vision · WeChat
0 likes · 34 min read
Tencent Cloud Developer
Dec 19, 2019 · Artificial Intelligence

AI-Powered Content Moderation: How Platforms Combat Harmful Content with AI

AI-powered moderation tools now scan text, images, live streams, and short videos, using techniques like TextCNN, Word2Vec, attention‑based classifiers, multi‑label sampling, and real‑time audio analysis to detect pornographic and harmful content, while emphasizing continual model updates and sample collection for both small and large platforms.

AI detection · Computer Vision · Tencent Security
0 likes · 12 min read
Amap Tech
Dec 13, 2019 · Artificial Intelligence

Image Segmentation for High-Definition Mapping: Evolution and Practices at Gaode Maps

Gaode Maps has progressed image segmentation from early heuristic region splitting to modern deep‑learning pipelines—leveraging FCNs, multi‑task networks, Mask R‑CNN, and specialized losses—to achieve centimeter‑level, instance‑aware mapping of roads, signs, and small objects while pursuing lighter, real‑time models.

AI · Computer Vision · Deep Learning
0 likes · 14 min read
Xianyu Technology
Dec 11, 2019 · Artificial Intelligence

Improving Small Object Detection for UI2CODE via Data Augmentation and Model Optimization

The study enhances UI2CODE’s ability to detect tiny UI components by augmenting training data with copied small objects, upgrading the detector from Faster R‑CNN to FPN and Cascade FPN, and refining box positions with smoothing and projection, achieving superior small‑object mAP/mAR and enabling broader UI parsing applications.

Computer Vision · FPN · Model Optimization
0 likes · 9 min read
Qunar Tech Salon
Dec 10, 2019 · Artificial Intelligence

Comprehensive Overview of Face Detection Methods and Techniques

This article provides an in‑depth review of face detection, covering traditional knowledge‑, model‑, feature‑ and appearance‑based approaches, modern deep‑learning methods such as Cascade CNN, MTCNN and FaceBoxes, strategies for handling multi‑scale faces, anchor‑box densification, and practical training considerations.

CNN · Cascade CNN · Computer Vision
0 likes · 10 min read
iQIYI Technical Product Team
Nov 22, 2019 · Artificial Intelligence

Analysis of ICCV 2019 Lightweight Face Recognition Challenge Champion Solutions

The ICCV 2019 Lightweight Face Recognition Challenge attracted 292 teams and defined four strict FLOP‑ and size‑limited protocols for image and video recognition, with champions employing near‑30 GFLOP EfficientNet‑style backbones, novel loss functions, frame‑fusion, and knowledge‑distilled VarGNet models to balance accuracy and computational budget.

Computer Vision · Deep Learning · ICCV Challenge
0 likes · 8 min read
Alibaba Cloud Developer
Nov 19, 2019 · Artificial Intelligence

How Visual AI Powers Real-World Mapping and AR Navigation at Amap

This article explains how Amap leverages computer vision to collect, process, and enhance map data and to deliver low‑cost, real‑time AR navigation, detailing the technical challenges, algorithmic solutions, and the broader mission of connecting the physical world.

AI · AR navigation · Computer Vision
0 likes · 12 min read
MaGe Linux Operations
Nov 15, 2019 · Artificial Intelligence

How AI Video Walls Are Transforming Indian Prisons: Inside the JARVIS Surveillance System

India’s prisons are adopting AI-powered video walls and facial‑recognition systems, such as Staqu’s JARVIS platform, to monitor inmate activity, improve security, and generate revenue, while confronting overcrowding, staffing shortages, and violent incidents, illustrating a global shift toward smart‑prison technology.

AI surveillance · Computer Vision · India
0 likes · 9 min read
DataFunTalk
Nov 14, 2019 · Artificial Intelligence

Sample Imbalance and Importance in Object Detection: IoU‑Balanced Sampling and Prime Sample Attention

The talk analyzes sample imbalance and importance in object detection, proposes IoU‑balanced negative sampling and instance‑balanced positive sampling, introduces the Prime Sample concept with Hierarchical Local Rank, and presents Importance‑based Sample Reweighting and Classification‑Aware Regression Loss, achieving consistent mAP gains without extra overhead.

Computer Vision · IoU-balanced sampling · MAP
0 likes · 22 min read
Amap Tech
Nov 14, 2019 · Artificial Intelligence

Technical Evolution of Ground Marking Recognition for High‑Precision Maps

Amap’s ground‑marking recognition has progressed from simple threshold methods to advanced deep‑learning pipelines—including two‑stage R‑FCN, cascade detectors with local regression, corner‑point and segmentation hybrids, and LiDAR‑based 3‑D PointRCNN—achieving over 99% recall and sub‑5 cm positional accuracy for high‑precision map production.

Computer Vision · Deep Learning · ground marking
0 likes · 15 min read
Baidu App Technology
Oct 30, 2019 · Artificial Intelligence

Applying Deep Learning and AI on Mobile: Baidu App Cases and Technical Insights

The Baidu App team showcases how deep‑learning and AI can be deployed on mobile through on‑device and server‑side inference—illustrated by plant‑identification, stylized filters, video subject detection, and AR real‑time translation—while addressing model compression, cross‑platform optimization, and offering a practical guide for engineers.

AR Translation · Computer Vision · Deep Learning
0 likes · 11 min read
Amap Tech
Oct 23, 2019 · Artificial Intelligence

AR Navigation Lane Detection: Methods, Challenges, and Practical Solutions

The article reviews AR navigation lane‑detection, comparing traditional handcrafted visual pipelines with modern deep‑learning segmentation approaches, proposes an efficient multitask network with weight‑allocation and vanishing‑point anchoring, and demonstrates quantized models achieving real‑time, stable performance on low‑power automotive chips while outlining remaining weather, lighting, and road‑condition challenges.

ADAS · AR navigation · Computer Vision
0 likes · 16 min read
Huawei Cloud Developer Alliance
Oct 15, 2019 · Artificial Intelligence

How ModelArts Powers AI Development and Seamless Edge‑Cloud Deployment

This article reviews Huawei's ModelArts platform, detailing its data processing, algorithm development, high‑performance training, edge‑cloud model deployment, auto‑learning capabilities, and real‑world use cases such as invisible payment and intelligent waste classification, while outlining future ecosystem prospects.

AI Platform · AutoML · Computer Vision
0 likes · 14 min read
DataFunTalk
Sep 29, 2019 · Artificial Intelligence

UC Information Flow Video Tag Recognition: System Architecture and Multi‑Modal Algorithms

This article presents a comprehensive overview of UC's information‑flow video tag recognition technology, detailing tag usage scenarios, the end‑to‑end system architecture, multi‑modal feature extraction, advanced deep‑learning models such as NeXtVLAD, behavior and person tagging methods, and future research directions.

Computer Vision · Deep Learning · Multimodal Learning
0 likes · 14 min read
Meituan Technology Team
Sep 26, 2019 · Artificial Intelligence

Efficient Scene Text Detection Framework with Feature Pyramid and Expanded High-Level Feature Maps

The paper presents an efficient scene‑text detector that expands high‑level SSD feature maps and integrates a feature‑pyramid network, using direction‑aware segment‑and‑link predictions to reconstruct arbitrarily long, rotated text, achieving higher recall and precision with real‑time speed and outperforming recent methods on ICDAR benchmarks and a menu‑recognition test.

Computer Vision · Deep Learning · ICDAR
0 likes · 12 min read
Didi Tech
Sep 20, 2019 · Mobile Development

How Didi Maps Engineered Scalable AR Navigation for Airports and Malls

Didi Maps' chief engineer explains how the team tackled weak GPS signals in large indoor venues by building a 60,000‑square‑meter 3D map, achieving sub‑0.5 m monocular visual localization, and fusing inertial data with Google ARCore to deliver real‑time AR navigation on Android devices.

AR navigation · Computer Vision · Didi Maps
0 likes · 5 min read
Tencent Cloud Developer
Sep 19, 2019 · Artificial Intelligence

Inside Tencent Cloud OCR: Architecture, Performance, and Integration Guide

The article provides a comprehensive overview of Tencent Cloud’s OCR platform, detailing its service architecture, product capabilities, integration methods, performance metrics, engineering improvements, testing automation, and operational considerations, offering developers practical insights into building and deploying OCR solutions on the cloud.

Cloud AI · Computer Vision · OCR
0 likes · 10 min read
Alibaba Cloud Developer
Sep 18, 2019 · Artificial Intelligence

Mastering Video Object Segmentation: 3 Research Paths & Alibaba’s Latest Advances

This article explains video object segmentation, outlines the three main research directions—semi‑supervised, interactive, and unsupervised—describes Alibaba’s Moku Lab breakthroughs and competition results, and discusses future plans to improve segmentation in complex scenes.

Alibaba Research · Computer Vision · interactive segmentation
0 likes · 12 min read
Xianyu Technology
Sep 12, 2019 · Artificial Intelligence

Deep Learning for Automated Module Detection in Taobao 99 Promotion Pages

This study presents a deep‑learning pipeline that employs a Cascade R‑CNN with Feature Pyramid Network to automatically detect and refine modules and their internal elements on Taobao’s 99‑promotion pages, achieving roughly 98% precision and recall on a thousand‑image validation set and paving the way for broader e‑commerce event applications.

Cascade R-CNN · Computer Vision · Deep Learning
0 likes · 7 min read
Youku Technology
Aug 19, 2019 · Artificial Intelligence

Alibaba Showcases AI Innovations in Entertainment and Security at IJCAI 2019

At IJCAI 2019, Alibaba’s MoKu Lab unveiled the Beidou Star platform and an intelligent conversational video search system for end‑to‑end content creation, while its Turing Lab demonstrated security AI such as Green Net, IP Brain, facial‑recognition and Tianyan, complemented by multiple research papers, academic collaborations and new hiring drives.

AI · Alibaba · Computer Vision
0 likes · 11 min read
Youku Technology
Aug 14, 2019 · Artificial Intelligence

Technical Analysis of “Chang'an” – The Beidou Star System for Reducing Content Uncertainty and Boosting Hit Potential

The talk details how Youku’s Beidou Star AI platform deconstructs the drama “Chang’an Twelve Hours” with NLP, computer‑vision, knowledge graphs and multi‑task deep models to quantify script, character and emotion uncertainty, enabling predictive scoring that lifted the series’ daily index above one million and outlines future hybrid decision‑engine research.

AI · Computer Vision · Content Analytics
0 likes · 12 min read
Alibaba Cloud Developer
Aug 8, 2019 · Artificial Intelligence

Alibaba VOS Innovations: Semi-supervised, Interactive & Unsupervised Segmentation

Video Object Segmentation (VOS) is essential for content creation, and Alibaba’s research outlines three main approaches—semi-supervised, interactive, and unsupervised—detailing their algorithms, challenges, evaluation metrics, recent breakthroughs, and future plans to improve accuracy in complex scenes.

AI · Computer Vision · interactive
0 likes · 12 min read
Youku Technology
Jul 31, 2019 · Artificial Intelligence

Exploring the Three Key Research Directions in Video Object Segmentation

The article outlines video object segmentation (VOS), its importance for content creation, and details the three primary research avenues—semi‑supervised, interactive, and unsupervised—while reviewing benchmark metrics, algorithm categories, challenges, and recent advances from Alibaba’s MoKu Lab, including their competition results and future plans.

AI · Computer Vision · interactive
0 likes · 14 min read
iQIYI Technical Product Team
Jul 26, 2019 · Artificial Intelligence

Preface

In the 2019 iQIYI Celebrity Video Identification Challenge, our team secured fifth place by accurately recognizing video identities using mAP scoring, and this article shares the strategies, insights, and experiences of the top‑five teams, emphasizing a straightforward, pragmatic approach championed by iQIYI’s technology product team.

AI · Computer Vision · Technical Report
0 likes · 5 min read
Amap Tech
Jul 23, 2019 · Artificial Intelligence

Traffic Sign Detection in Gaode Maps: Machine Learning Techniques and System Architecture

Gaode Maps uses a two-stage machine‑learning pipeline (Faster R‑CNN with shape‑based region proposal networks and fine‑grained classifiers) to detect hundreds of traffic‑sign types in billions of street‑view images, achieving high recall and precision, scalable updates, and near‑real‑time map data refresh.

AI · Computer Vision · Deep Learning
0 likes · 11 min read
iQIYI Technical Product Team
Jul 5, 2019 · Artificial Intelligence

iQIYI Multimodal Person Recognition Competition: 91.14% Accuracy Achieved by BUPT Team

After a three‑month contest co‑hosted by iQIYI and ACM MM, 255 teams competed on the challenging iQIYI‑VID‑2019 multimodal dataset, and the BUPT Automation School team won with a 91.14% person‑recognition accuracy, advancing the field and enhancing iQIYI’s video recommendation and AI services.

AI competition · Computer Vision · Dataset
0 likes · 6 min read
Alibaba Cloud Developer
Jun 28, 2019 · Artificial Intelligence

Alibaba AI Wins Visual Dialogue Challenge with New Recursive Model

In the second Visual Dialogue Challenge, Alibaba’s AI outperformed ten teams—including Microsoft and Seoul University—achieving a 74.57% accuracy, surpassing the previous record by 16.82% and exceeding human performance, thanks to its novel recursive exploration dialogue model that integrates image recognition, relational reasoning, and natural language understanding.

AI · Computer Vision · natural language processing
0 likes · 4 min read
iQIYI Technical Product Team
May 30, 2019 · Mobile Development

SmileAR: iQIYI’s Mobile AR Solution Powered by TensorFlow Lite

SmileAR, iQIYI’s self‑developed mobile AR platform powered by TensorFlow Lite, delivers real‑time face, body and gesture recognition across iQIYI’s apps through MobileNet‑based models, quantization‑aware training, multi‑task learning and encrypted SDKs, achieving fast, lightweight, cross‑platform AR experiences for millions of users.

AR · Computer Vision · Mobile AI
0 likes · 10 min read
Youku Technology
May 29, 2019 · Artificial Intelligence

Youku Video Enhancement and Super-Resolution Competition Announcement

The Youku Video Enhancement and Super‑Resolution Challenge invites teams to develop models that restore low‑resolution, noisy video to high‑definition quality using a 10,000‑pair industry dataset, offering up to RMB 100,000 in prizes and a recruitment pathway, with registration open through June 16 and competition phases spanning May to September.

AI competition · Computer Vision · Deep Learning
0 likes · 10 min read
Youku Technology
May 20, 2019 · Artificial Intelligence

Youku Video Enhancement and Super‑Resolution Competition Overview

The Youku Video Enhancement and Super‑Resolution Competition challenges teams of up to five to develop 4× upscaling models that also remove noise and compression artifacts, using a 10,000‑pair dataset, with prizes up to ¥100,000 and recruitment opportunities, running from May to September 2019.

AI competition · Computer Vision · Deep Learning
0 likes · 9 min read
DataFunTalk
May 14, 2019 · Artificial Intelligence

A Comprehensive Overview of Image Search Technology: Frameworks, Evolution, and System Architecture

This article provides a thorough introduction to image‑search technology, covering its general framework, offline and online components, feature‑extraction evolution, retrieval engine structures, and architectural challenges such as dynamic indexing, feature synchronization, and high‑throughput low‑latency serving.

Computer Vision · feature extraction · image search
0 likes · 12 min read
Youku Technology
May 13, 2019 · Artificial Intelligence

How Youku Tackles Multimodal Video Understanding and Quality Control

This article outlines Youku's multimodal video content understanding pipeline, covering business needs, problem decomposition, data construction, model selection, OCR subtitle extraction, scene and action recognition, sample augmentation, noise handling, and multimodal fusion strategies for robust content moderation.

AI · Computer Vision · OCR
0 likes · 11 min read
DataFunTalk
May 8, 2019 · Artificial Intelligence

Perception System Overview: Sensors, Fusion, Onboard Architecture, and Technical Challenges in Autonomous Driving

This article presents a comprehensive overview of autonomous driving perception, covering system fundamentals, sensor setups and fusion techniques, onboard processing architecture, and the key technical challenges such as precision‑recall balance, adverse weather, and small‑object detection.

Computer Vision · Sensor Fusion · autonomous driving
0 likes · 12 min read
Youku Technology
May 6, 2019 · Artificial Intelligence

Exploring Intelligent Production at Youku: AI‑Driven Video Analysis and Automation

The talk describes Youku’s intelligent production platform, which uses AI and cloud computing to automatically analyze video frames, extract fine‑grained metadata such as scenes, persons, actions and scores, and then generate highlights, vertical clips, annotations and feedback for editors and upstream producers, while addressing challenges like pose‑tracking, graph‑based action classification and future plans for deeper video understanding and open competitions.

AI · Computer Vision · image search
0 likes · 14 min read
Youku Technology
Apr 29, 2019 · Artificial Intelligence

Precise and Fast Object Segmentation Algorithms – Talk by Ren Haibing (Youku Cognitive Lab)

Ren Haibing’s Youku Cognitive Lab talk reviews object segmentation’s motivation, explains semantic and instance concepts, presents UNet‑based and category‑agnostic methods—including fast video segmentation with motion cues—and reports high IoU results while outlining future edge‑aware, label‑free, and non‑online video segmentation research directions.

AI · Computer Vision · Deep Learning
0 likes · 19 min read
NetEase Media Technology Team
Apr 26, 2019 · Artificial Intelligence

Intelligent Cover Image Selection System for News Articles: Image Quality Assessment and Smart Cropping

The article describes an intelligent cover‑image selection system for NetEase News that automatically filters unsuitable illustrations, assesses image quality with a pairwise‑trained deep model across clarity, color and composition, and smartly crops images using aspect‑ratio‑aware object detection, dramatically cutting manual editing and enabling confidence‑based automatic publishing.

Computer Vision · Image Cropping · Neural Network
0 likes · 11 min read
Tencent Cloud Developer
Apr 19, 2019 · Artificial Intelligence

Tencent Cloud Face Recognition Technology: Products, Architecture, and Industry Applications

The article outlines Tencent Cloud’s face‑recognition technology—from its deep‑learning‑based algorithm training and multi‑layer system architecture, through the YouTu Lab‑powered product suite for detection, analysis, comparison, liveness and search, to real‑world deployments in security, metro transportation and retail, highlighting integration challenges and performance optimizations.

AI products · Computer Vision · Smart Retail
0 likes · 18 min read
HomeTech
Apr 18, 2019 · Artificial Intelligence

An Overview of Image Processing Techniques and Common Tools for Beginners

This article provides a concise introduction to image processing, covering its hierarchical structure, fundamental techniques such as classification, detection, segmentation, geometric transformation, and the most widely used libraries and deep‑learning frameworks for newcomers.

Computer Vision · Image Classification · Image Processing
0 likes · 9 min read
Tencent Cloud Developer
Apr 16, 2019 · Artificial Intelligence

Building Image Recognition Systems: From Basics to Advanced AI Techniques

This article summarizes a computer‑vision salon where Dr. Ji Yongnan explains imaging pipelines, traditional feature‑based methods, deep‑learning breakthroughs, Tencent Cloud AI services, real‑world case studies, and answers audience questions about machine‑vision versus computer‑vision and data‑scarcity challenges.

AI applications · Computer Vision · Deep Learning
0 likes · 18 min read
Youku Technology
Apr 11, 2019 · Artificial Intelligence

YOUKU-VSRE 2019 Video Enhancement and Super-Resolution Challenge Announcement

The YOUKU‑VSRE 2019 challenge invites researchers to develop state‑of‑the‑art video enhancement and super‑resolution models using the largest, most diverse simulated‑noise dataset, with three competition stages (preliminary, semi‑final, final), cash prizes up to ¥100,000, certificates, and fast‑track recruitment opportunities at Alibaba (Youku).

AI challenge · Computer Vision · Dataset
0 likes · 3 min read
Didi Tech
Mar 28, 2019 · Artificial Intelligence

Overview of the CVPR 2019 WAD Autonomous Driving Challenge and Participation Details

The CVPR 2019 WAD Autonomous Driving Challenge, hosted in Long Beach, introduces four new tasks—including object‑detection and tracking transfer‑learning tracks using Didi’s massive D²‑City and Berkeley’s BDD100K datasets, plus a large‑scale detection interpolation track—aimed at advancing vision algorithms under diverse, difficult driving conditions, with global teams invited to register by May 31 and winners announced at the workshop on June 17.

AI · Challenge · Computer Vision
0 likes · 6 min read
Overview of the CVPR 2019 WAD Autonomous Driving Challenge and Participation Details
Beike Product & Technology
Beike Product & Technology
Mar 21, 2019 · Artificial Intelligence

Optimization Foundations and Applications in Machine Learning and Computer Vision

This article introduces how machine learning problems are formulated as optimization tasks, explains the construction of objective functions with examples such as linear regression, robust fitting, and regularization, and demonstrates applications ranging from K‑means clustering to image inpainting and 3D reconstruction.

Computer Vision · Regularization · linear regression
0 likes · 9 min read
Optimization Foundations and Applications in Machine Learning and Computer Vision
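
To make the "objective function plus regularization" idea in the summary above concrete, here is a minimal sketch of one-variable linear regression with an L2 (ridge) penalty on the slope. The function name `fit_ridge_1d`, the penalty weight `lam`, and the sample data are illustrative assumptions and are not taken from the article. The objective minimized is sum((w*x_i + b - y_i)^2) + lam * w^2, which has a closed-form solution.

```python
def fit_ridge_1d(xs, ys, lam=0.0):
    """Minimize sum((w*x + b - y)^2) + lam * w^2 over w and b.

    Hypothetical helper for illustration. The intercept b is not penalized,
    so the optimum is w = Sxy / (Sxx + lam) and b = mean(y) - w * mean(x).
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and the same length")
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Centered sums of squares and cross-products.
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    w = sxy / (sxx + lam)
    b = y_mean - w * x_mean
    return w, b


# Points that lie exactly on y = 2x + 1 (made-up data).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w_plain, b_plain = fit_ridge_1d(xs, ys)           # ordinary least squares
w_ridge, b_ridge = fit_ridge_1d(xs, ys, lam=5.0)  # slope shrunk toward zero
```

With `lam=0` the fit recovers the generating line; a positive `lam` trades some fit quality for a smaller slope, which is the usual motivation for adding a regularization term to the objective.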
DataFunTalk
DataFunTalk
Mar 15, 2019 · Artificial Intelligence

A Comprehensive Overview of Deep Learning Applications in Computer Vision

This article provides an extensive review of deep learning techniques applied to computer vision, covering the evolution of CNN architectures, image and video processing tasks, 2.5‑D and 3‑D reconstruction, object detection, segmentation, tracking, SLAM, and various practical applications such as AR, content retrieval, and autonomous driving.

CNN · Computer Vision · Image Processing
0 likes · 22 min read
A Comprehensive Overview of Deep Learning Applications in Computer Vision
System Architect Go
System Architect Go
Mar 14, 2019 · Artificial Intelligence

Understanding Image Similarity: Image Hashing and Feature-Based Methods

This article explains why simple MD5 checks cannot assess image similarity and introduces two major approaches—image hashing and image feature extraction—detailing their algorithms, practical performance, and how to compare images efficiently using Hamming distance and indexing techniques.

Computer Vision · Hamming distance · feature extraction
0 likes · 7 min read
Understanding Image Similarity: Image Hashing and Feature-Based Methods
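
As a rough illustration of the hashing approach summarized above, the sketch below computes an average hash (one common image-hashing scheme, chosen here as an assumption; the article's exact algorithm is not reproduced) and compares hashes by Hamming distance. Images are represented as small nested lists of grayscale values so the example needs no libraries; `average_hash`, `hamming_distance`, and the pixel data are hypothetical.

```python
def average_hash(pixels):
    """Return a list of bits: 1 where a pixel is >= the mean brightness.

    `pixels` is a 2D list of grayscale values. A real pipeline would first
    downscale the image (commonly to 8x8) before hashing.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p >= mean else 0 for p in flat]


def hamming_distance(hash_a, hash_b):
    """Count the bit positions where two equal-length hashes differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


original = [[10, 200], [220, 30]]
brighter = [[30, 220], [240, 50]]  # same picture, every pixel +20
inverted = [[245, 55], [35, 225]]  # 255 - original: structurally opposite

d_similar = hamming_distance(average_hash(original), average_hash(brighter))
d_different = hamming_distance(average_hash(original), average_hash(inverted))
```

Unlike MD5, where any pixel change yields an unrelated digest, a uniform brightness shift leaves this hash unchanged, so a small Hamming distance suggests visually similar images.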
Alibaba Cloud Developer
Alibaba Cloud Developer
Mar 12, 2019 · Artificial Intelligence

How AI and RFID Combine to Track Customer‑Product Interactions in Retail

This article presents a comprehensive AI‑driven framework that fuses video‑based customer action detection, RFID‑based product flip detection, and bipartite graph matching to accurately determine when, where, and which customer interacts with which SKU in a retail environment, discussing algorithms, optimizations, and experimental results.

AI · Computer Vision · Customer Behavior
0 likes · 22 min read
How AI and RFID Combine to Track Customer‑Product Interactions in Retail
JD Tech
JD Tech
Mar 8, 2019 · Artificial Intelligence

Integrated Engineering & Algorithm Platform for AI Visual Applications

This article describes a comprehensive, end‑to‑end AI visual algorithm platform that unifies data collection, annotation, model training, deployment, testing, quality evaluation, and service gateways, illustrating how such integration improves transparency, efficiency, and quality across use cases like background removal, face swapping, and clothing recommendation.

AI · Algorithm Platform · Clothing Recommendation
0 likes · 13 min read
Integrated Engineering & Algorithm Platform for AI Visual Applications
Hulu Beijing
Hulu Beijing
Mar 7, 2019 · Artificial Intelligence

From AlexNet to ResNeXt: Key Milestones in CNN Evolution

This article traces the evolution of convolutional neural networks from the pioneering AlexNet through VGG, Inception, ResNet, Inception‑v4, Inception‑ResNet and ResNeXt, highlighting architectural innovations, performance gains, and the underlying biological inspirations that shaped modern deep learning models.

AlexNet · CNN · Computer Vision
0 likes · 13 min read
From AlexNet to ResNeXt: Key Milestones in CNN Evolution
21CTO
21CTO
Mar 4, 2019 · Artificial Intelligence

How to Spot AI‑Generated Fake Faces: Tips, Tricks, and the Tech Behind StyleGAN

This article explains why AI‑generated faces from StyleGAN are hard to distinguish, introduces an online game for testing realism, and provides practical visual cues—such as water spots, background errors, asymmetric glasses, hair artifacts, and teeth anomalies—to reliably identify fake images.

AI-generated images · Computer Vision · Face Detection
0 likes · 8 min read
How to Spot AI‑Generated Fake Faces: Tips, Tricks, and the Tech Behind StyleGAN
Ctrip Technology
Ctrip Technology
Feb 28, 2019 · Artificial Intelligence

OCR Techniques and Solutions for Ctrip Business: Deep Learning Based Text Detection and Recognition

This article presents an overview of computer‑vision based OCR in Ctrip's operations, detailing deep‑learning text detection methods for controlled and uncontrolled scenarios, sequence‑based recognition models, training strategies with synthetic data, and performance results, while discussing current challenges and future improvements.

AI · Computer Vision · Ctrip
0 likes · 11 min read
OCR Techniques and Solutions for Ctrip Business: Deep Learning Based Text Detection and Recognition
Xianyu Technology
Xianyu Technology
Feb 27, 2019 · Artificial Intelligence

UI2CODE: Layout Analysis and Background/Foreground Extraction for UI Images

The UI2CODE system tackles UI layout analysis by first extracting backgrounds with Sobel, Laplacian and Canny edge detection plus a flood‑fill algorithm, then isolating foreground components through connected‑component analysis and a Faster R‑CNN classifier, and finally fusing both pipelines to achieve superior precision, recall and IoU on Xianyu app screenshots.

Computer Vision · Deep Learning · Faster R-CNN
0 likes · 16 min read
UI2CODE: Layout Analysis and Background/Foreground Extraction for UI Images
System Architect Go
System Architect Go
Feb 26, 2019 · Fundamentals

Master the Basics of Image Processing with OpenCV and NumPy

This article introduces core image processing concepts—pixel fundamentals, binary, grayscale, and RGB images, matrix representation—and demonstrates practical implementations of cropping, canvas creation, watermarking, translation, rotation, and scaling using Python's OpenCV and NumPy libraries, including algorithm choices for resizing.

Computer Vision · Image Processing · NumPy
0 likes · 5 min read
Master the Basics of Image Processing with OpenCV and NumPy
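
The matrix view of images described above can be sketched with NumPy alone: cropping is array slicing, and translation copies a window of pixels to a shifted position. The `translate` helper and the tiny test image are assumptions made for illustration; the article itself uses OpenCV routines, which are not reproduced here.

```python
import numpy as np

# A 5x5 single-channel "image" whose pixel values are 0..24 (made-up data).
image = np.arange(25, dtype=np.uint8).reshape(5, 5)

# Cropping is slicing: rows 1..3 and columns 2..4.
crop = image[1:4, 2:5]


def translate(img, dx, dy):
    """Shift img right by dx and down by dy pixels, filling gaps with zeros.

    Hypothetical helper; negative values shift left or up.
    """
    h, w = img.shape[:2]
    out = np.zeros_like(img)
    # Source window that remains inside the frame after the shift.
    x0, x1 = max(0, -dx), min(w, w - dx)
    y0, y1 = max(0, -dy), min(h, h - dy)
    if x0 < x1 and y0 < y1:
        out[y0 + dy:y1 + dy, x0 + dx:x1 + dx] = img[y0:y1, x0:x1]
    return out


shifted = translate(image, dx=1, dy=2)
```

Note that NumPy indexes arrays as `[row, column]`, that is `[y, x]`, which is the opposite of the usual `(x, y)` order used when describing pixel coordinates.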
ITPUB
ITPUB
Feb 23, 2019 · Artificial Intelligence

Explore a 1.59 Million Image NSFW Dataset with 159 Fine-Grained Categories

A data scientist from Besedo has open‑sourced a massive NSFW image dataset containing 1.589 million pictures, organized into 159 primary categories and further sub‑categories, with download scripts and GitHub links, requiring about 500 GB of storage and cautioning against viewing in the office.

AI research · Computer Vision · GitHub
0 likes · 3 min read
Explore a 1.59 Million Image NSFW Dataset with 159 Fine-Grained Categories
21CTO
21CTO
Feb 22, 2019 · Fundamentals

Why the Iconic “Lenna” Photo Became the Face of Image‑Processing Research

The article recounts how a 1972 Playboy portrait of Lena Söderberg was adopted by image‑processing researchers as a standard test image, explains the technical and cultural reasons for its lasting popularity, and follows her unexpected rise to fame within the scientific community.

Benchmark · Computer Vision · Image Processing
0 likes · 7 min read
Why the Iconic “Lenna” Photo Became the Face of Image‑Processing Research
ITPUB
ITPUB
Feb 16, 2019 · Artificial Intelligence

A 1.59 Million‑Image NSFW Dataset Released for Advanced Content Filtering

Data scientist Evgeny Bazarov has open‑sourced a 1.589 million‑image NSFW dataset organized into 159 fine‑grained categories, providing GitHub links, download scripts, and a 500 GB storage requirement, enabling researchers to build more precise adult‑content detection models.

Computer Vision · GitHub · Image Classification
0 likes · 3 min read
A 1.59 Million‑Image NSFW Dataset Released for Advanced Content Filtering
Alibaba Cloud Developer
Alibaba Cloud Developer
Feb 12, 2019 · Artificial Intelligence

Essential AI Research Highlights to Jump‑Start Your Post‑Holiday Learning

After the Chinese New Year break, this curated collection of key AI articles—spanning computer vision, speech recognition, natural language processing, recommendation systems, and more—helps technical readers quickly regain momentum in work and study by revisiting core technologies with real‑world case studies.

AI · Computer Vision · speech recognition
0 likes · 6 min read
Essential AI Research Highlights to Jump‑Start Your Post‑Holiday Learning
21CTO
21CTO
Feb 7, 2019 · Artificial Intelligence

How to Build a Real‑Time Parking Spot Detector with Mask R‑CNN and Python

This tutorial walks through using a webcam, Mask R‑CNN, and Python to automatically detect available parking spaces, track stationary vehicles, compute Intersection‑over‑Union to confirm emptiness, and send SMS alerts via Twilio, providing full code snippets and practical tips.

Computer Vision · IoU · Mask R-CNN
0 likes · 16 min read
How to Build a Real‑Time Parking Spot Detector with Mask R‑CNN and Python
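
The Intersection‑over‑Union check mentioned in the summary above can be sketched as follows. Boxes are assumed to be `(x1, y1, x2, y2)` tuples, and `spot_is_free` with its `threshold` value is a hypothetical helper for illustration, not the tutorial's own code.

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero when boxes are disjoint.
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def spot_is_free(spot_box, car_boxes, threshold=0.15):
    """Treat a spot as free when no detected car overlaps it much.

    The threshold is an assumed value chosen for this example.
    """
    return all(iou(spot_box, car) < threshold for car in car_boxes)


spot = (0, 0, 10, 10)
cars = [(5, 0, 15, 10), (100, 100, 110, 110)]
occupied_overlap = iou(spot, cars[0])
free_now = spot_is_free(spot, cars)
free_later = spot_is_free(spot, cars[1:])
```

Here the first car covers half of the spot, so the spot counts as occupied until that detection disappears from the frame.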
JD Tech
JD Tech
Jan 30, 2019 · Artificial Intelligence

JD AI Presents Eight Papers at AAAI 2019 Showcasing Advances in Machine Learning, NLP, and Computer Vision

At AAAI 2019 in Hawaii, JD AI Research Institute had eight papers accepted covering machine learning, natural language processing, computer vision, and multimodal AI, highlighting innovations such as AutoZOOM black‑box attacks, SACN for knowledge base completion, and temporally aware video captioning models.

Computer Vision · Multimodal Learning · artificial intelligence
0 likes · 11 min read
JD AI Presents Eight Papers at AAAI 2019 Showcasing Advances in Machine Learning, NLP, and Computer Vision
Alibaba Cloud Developer
Alibaba Cloud Developer
Jan 29, 2019 · Artificial Intelligence

Alibaba's AI-Driven In-Store Foot Traffic Digitization

Alibaba’s search division showcases how AI transforms traditional retail by digitizing in‑store foot traffic, employing camera‑based person detection, re‑identification, RFID‑enhanced product interaction, and edge‑optimized models to generate real‑time customer insights, heatmaps, and personalized recommendations that bridge offline and online shopping experiences.

AI · Computer Vision · RFID
0 likes · 25 min read
Alibaba's AI-Driven In-Store Foot Traffic Digitization
ITPUB
ITPUB
Jan 27, 2019 · Artificial Intelligence

Achieve 99% Accurate Face Recognition with Python’s face_recognition Library

This guide introduces the open‑source Python library face_recognition, explains its face detection, landmark, and recognition capabilities (with a reported accuracy of 99.38% on the Labeled Faces in the Wild benchmark), provides step‑by‑step code examples for locating faces, extracting landmarks, and comparing identities, and lists practical use‑case scenarios and the GitHub repository.

Computer Vision · GitHub · Python
0 likes · 6 min read
Achieve 99% Accurate Face Recognition with Python’s face_recognition Library
DataFunTalk
DataFunTalk
Jan 14, 2019 · Artificial Intelligence

Computer Vision Fundamentals, Traditional Methods, Deep Learning Advances, and Cloud AI Deployment

This article provides a comprehensive overview of computer vision, covering its basic concepts, traditional image processing techniques, modern deep‑learning approaches, real‑world AI application cases, and the cloud infrastructure needed to support large‑scale deployment, while also offering skill‑advancement guidance.

AI applications · Cloud AI · Computer Vision
0 likes · 20 min read
Computer Vision Fundamentals, Traditional Methods, Deep Learning Advances, and Cloud AI Deployment
Alibaba Cloud Developer
Alibaba Cloud Developer
Jan 8, 2019 · Artificial Intelligence

How Alibaba Digitizes In‑Store Foot Traffic with AI and RFID Fusion

This article details Alibaba's end‑to‑end solution for digitizing offline retail foot traffic, combining existing surveillance cameras, RFID tags, and advanced AI techniques such as lightweight YOLO detection, knowledge distillation, and multi‑level pedestrian re‑identification to capture, analyze, and act on shopper behavior for both business operations and personalized in‑store experiences.

AI · Computer Vision · Edge Computing
0 likes · 27 min read
How Alibaba Digitizes In‑Store Foot Traffic with AI and RFID Fusion
Alibaba Cloud Developer
Alibaba Cloud Developer
Jan 7, 2019 · Artificial Intelligence

What Are Alibaba DAMO Academy’s 2019 Top 10 Tech Trends and Their Real-World Impact?

This week’s Alibaba tech roundup highlights the DAMO Academy’s 2019 top‑10 technology trends—from smart cities and AI chips to blockchain and 5G—plus breakthrough AI liver‑tumor segmentation results, the open‑source Fusion design system, a Flink Forward China recap, a new computer‑vision paper collection, and an upcoming Apache Dubbo live session.

Computer Vision · open source
0 likes · 9 min read
What Are Alibaba DAMO Academy’s 2019 Top 10 Tech Trends and Their Real-World Impact?
DataFunTalk
DataFunTalk
Dec 20, 2018 · Artificial Intelligence

How to Build World-Class Visual AI Technology

This presentation outlines the fundamentals of computer vision, discusses key factors such as algorithm research, large‑scale training platforms, intelligent data processing, and hardware optimization, and shares practical experiences from DeepGlint on building a world‑class visual AI system and its real‑world applications.

Computer Vision · Hardware Optimization · data pipeline
0 likes · 23 min read
How to Build World-Class Visual AI Technology
Tencent Cloud Developer
Tencent Cloud Developer
Dec 17, 2018 · Artificial Intelligence

An Overview of Computer Vision: Fundamentals, Traditional Techniques, and Deep Learning Applications

The talk provides a comprehensive overview of computer vision, defining its scope, detailing low‑, mid‑, and high‑level processing pipelines, reviewing classic filters and feature extractors, explaining deep‑learning breakthroughs such as CNNs and YOLO, and showcasing Tencent Cloud AI services, career paths, and learning resources.

AI · Computer Vision · machine learning
0 likes · 43 min read
An Overview of Computer Vision: Fundamentals, Traditional Techniques, and Deep Learning Applications
360 Quality & Efficiency
360 Quality & Efficiency
Dec 7, 2018 · Artificial Intelligence

Image Feature Extraction and Clustering for Key Frame Selection in Mobile App Installation Screenshots

This article presents a technical solution for extracting representative key frames from time‑series screenshots of a mobile app installation process, covering pixel sampling, dimensionality reduction, classic feature extractors (SIFT, HOG, ORB), autoencoder‑based deep learning, and clustering methods such as K‑means and DBSCAN, along with practical results and performance analysis.

Autoencoder · Computer Vision · HOG
0 likes · 5 min read
Image Feature Extraction and Clustering for Key Frame Selection in Mobile App Installation Screenshots
Tencent Cloud Developer
Tencent Cloud Developer
Dec 5, 2018 · Artificial Intelligence

19 AI Technologies That Are Currently Dominating

The article surveys the nineteen leading AI technologies, from natural language generation and speech recognition to digital twins and marketing automation, detailing their core functions, common use cases such as customer service, security, and content creation, and the key vendors delivering each solution.

AI Technologies · Computer Vision · Deep Learning
0 likes · 17 min read
19 AI Technologies That Are Currently Dominating
21CTO
21CTO
Nov 21, 2018 · Artificial Intelligence

What’s Driving the Rapid Evolution of Face Recognition Technology?

This comprehensive overview examines the fundamentals, historical milestones, key algorithms, major datasets, policy support, industry applications, and future trends of face recognition technology, highlighting its rapid growth within computer vision and artificial intelligence.

AI · Biometrics · Computer Vision
0 likes · 45 min read
What’s Driving the Rapid Evolution of Face Recognition Technology?
Xianyu Technology
Xianyu Technology
Nov 20, 2018 · Artificial Intelligence

How to Separate Complex Image Foreground from Background Using AI and Classic CV Techniques

This article presents a step‑by‑step solution that combines computer‑vision preprocessing, OCR, CNN classification, shape matching, and inpainting to isolate meaningful foreground elements from images with intricate backgrounds, discussing practical results, limitations, and code implementations.

Computer Vision · Deep Learning · OpenCV
0 likes · 15 min read
How to Separate Complex Image Foreground from Background Using AI and Classic CV Techniques
MaGe Linux Operations
MaGe Linux Operations
Nov 16, 2018 · Artificial Intelligence

Real-Time Object Detection with OpenCV, Python, and Deep Learning

This tutorial walks through extending a deep‑learning object detector to process live video streams using OpenCV and Python, covering setup, command‑line arguments, model loading, frame‑by‑frame detection, drawing bounding boxes, FPS measurement, and performance tips.

Computer Vision · Video Stream · object detection
0 likes · 9 min read
Real-Time Object Detection with OpenCV, Python, and Deep Learning
Alibaba Cloud Developer
Alibaba Cloud Developer
Oct 19, 2018 · Artificial Intelligence

How Alibaba’s AI‑Powered “Future Store” Redefines Unmanned Retail

Alibaba’s senior tech expert explains the concept, architecture, core AI capabilities, real‑world case studies, and future roadmap of the Tmall “Future Store”, a vision‑driven, sensor‑rich unmanned retail experience that merges computer‑vision, edge computing, and data‑driven operations.

AI · Alibaba · Computer Vision
0 likes · 17 min read
How Alibaba’s AI‑Powered “Future Store” Redefines Unmanned Retail
Tencent Cloud Developer
Tencent Cloud Developer
Oct 12, 2018 · Artificial Intelligence

Understanding Convolutional Neural Networks (CNN) with Keras

The article introduces convolutional neural networks, explains core concepts such as convolution, padding, stride, and pooling, demonstrates how to calculate output dimensions, and provides a step‑by‑step Keras example that builds, compiles, and trains a multi‑layer CNN for image classification.

CNN · Computer Vision · Deep Learning
0 likes · 8 min read
Understanding Convolutional Neural Networks (CNN) with Keras
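
The output-dimension calculation mentioned in the summary above follows the standard formula out = floor((n + 2p - k) / s) + 1 for input size n, kernel size k, padding p, and stride s. The sketch below applies it to a small layer stack; `conv_output_size` and the 28x28 input are illustrative assumptions, not code from the article.

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer along one axis.

    Implements floor((n + 2*padding - kernel) / stride) + 1.
    """
    if n + 2 * padding < kernel:
        raise ValueError("kernel is larger than the padded input")
    return (n + 2 * padding - kernel) // stride + 1


# Assumed example: a 28x28 input passed through two typical layers.
after_conv = conv_output_size(28, kernel=3)                 # no padding
after_pool = conv_output_size(after_conv, kernel=2, stride=2)  # 2x2 pooling
same_padded = conv_output_size(28, kernel=3, padding=1)     # size preserved
```

A 3x3 kernel without padding trims one pixel from each border, while a padding of 1 keeps the spatial size unchanged, which is what "same" padding achieves for stride 1.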
Architects Research Society
Architects Research Society
Oct 7, 2018 · Artificial Intelligence

The Rise of Deep Neural Networks: From Research Breakthroughs to Industry Adoption

Deep neural networks, propelled by breakthroughs such as AlexNet and advances in GPU and TPU hardware, are rapidly moving from academic research into diverse applications—including earthquake prediction, medical imaging, and autonomous driving—driving massive industry investment, new semiconductor designs, and intense competition among tech giants and startups.

AI hardware · Computer Vision · GPU
0 likes · 9 min read
The Rise of Deep Neural Networks: From Research Breakthroughs to Industry Adoption
21CTO
21CTO
Sep 14, 2018 · Artificial Intelligence

From Stanford to Google: How Fei‑Fei Li Built ImageNet and Shaped AI

Fei‑Fei Li, the pioneering AI researcher and former Google Cloud AI lead, rose from humble beginnings in China to create the ImageNet dataset, drive breakthroughs in computer vision, and now returns to Stanford, illustrating how curiosity and perseverance can transform both academia and industry.

Computer Vision · Fei-Fei Li · Google AI
0 likes · 12 min read
From Stanford to Google: How Fei‑Fei Li Built ImageNet and Shaped AI
Qunar Tech Salon
Qunar Tech Salon
Sep 11, 2018 · Artificial Intelligence

Overview of Deep Learning Object Detection Methods and Detailed Implementation of Faster R‑CNN

This article reviews major deep‑learning object detection approaches, including the one‑stage YOLO and SSD and the two‑stage R‑CNN, Fast R‑CNN, and Faster R‑CNN, then provides a step‑by‑step explanation of Faster R‑CNN’s architecture, region‑proposal network, RoI pooling, loss functions, and sample PyTorch code.

Computer Vision · Faster R-CNN · PyTorch
0 likes · 20 min read
Overview of Deep Learning Object Detection Methods and Detailed Implementation of Faster R‑CNN
JavaScript
JavaScript
Sep 8, 2018 · Artificial Intelligence

How Sketch2Code Turns Hand‑Drawn UI Designs into Ready‑to‑Use HTML with AI

Sketch2Code leverages Microsoft’s custom vision model, OCR, and Azure services to automatically convert hand‑drawn UI mockups into functional HTML code, detailing its workflow—from image upload and element prediction to layout generation and final HTML output—plus links to the repository and demo site.

AI · Azure · Computer Vision
0 likes · 3 min read
How Sketch2Code Turns Hand‑Drawn UI Designs into Ready‑to‑Use HTML with AI
MaGe Linux Operations
MaGe Linux Operations
Aug 21, 2018 · Artificial Intelligence

How Deep Learning Transformed Face Recognition: From Images to Real‑Time Video

This article surveys the evolution of face recognition from early statistical methods to modern deep‑learning approaches, outlines key researchers, open‑source projects, popular APIs, core processing steps, the DeepFace architecture, datasets, and experimental results, providing a comprehensive guide for practitioners and researchers.

CNN · Computer Vision · Datasets
0 likes · 22 min read
How Deep Learning Transformed Face Recognition: From Images to Real‑Time Video
iQIYI Technical Product Team
iQIYI Technical Product Team
Aug 10, 2018 · Artificial Intelligence

iQIYI Releases World's First Multimodal, Multi-angle Celebrity Video Dataset (iQIYI-VID) and Announces AI Competition

iQIYI released iQIYI-VID, the world’s first multimodal, multi-angle celebrity video dataset (1,000 hours, 500,000 clips, 5,000 celebrities) for a new AI competition focusing on multimodal video person recognition, which has attracted global university teams and top computer‑vision judges to advance AI understanding in entertainment.

AI dataset · Computer Vision · competition
0 likes · 7 min read
iQIYI Releases World's First Multimodal, Multi-angle Celebrity Video Dataset (iQIYI-VID) and Announces AI Competition
HomeTech
HomeTech
Aug 7, 2018 · Artificial Intelligence

Overview of Object Detection Algorithms: Two‑Stage and One‑Stage Methods

This article reviews the evolution of visual object detection, explaining traditional region‑based approaches, the rise of deep‑learning two‑stage frameworks such as R‑CNN, Fast R‑CNN and Faster R‑CNN, and the faster one‑stage models like OverFeat, YOLO, SSD and RetinaNet, together with their design choices, training strategies and loss functions.

Computer Vision · R-CNN · SSD
0 likes · 17 min read
Overview of Object Detection Algorithms: Two‑Stage and One‑Stage Methods
Tencent Cloud Developer
Tencent Cloud Developer
Aug 6, 2018 · Artificial Intelligence

Tencent's AI Breast Cancer Screening System: Technical Architecture and Implementation

Tencent's AI Breast System combines mammography, pathology, MRI and ultrasound analysis using a multi‑scale, progressive TMuNet model that processes four views, learns from physician feedback, and delivers lesion localization, malignancy scoring and automated reports, achieving up to 92% sensitivity and reducing annotation time.

AI Medical Imaging · Breast Cancer Detection · Computer Vision
0 likes · 13 min read
Tencent's AI Breast Cancer Screening System: Technical Architecture and Implementation