Tagged articles

650 articles

Jul 18, 2022 · Artificial Intelligence

Trusted AI Research at Ant Group: Advances in Computer Vision, Watermark Defense, Robust Machine Learning, and Explainable NLG

Ant Group’s security labs present a series of cutting‑edge AI research achievements—including hierarchical multi‑granular classification for computer vision, watermark‑vaccine defenses, multi‑modal document understanding, robust and explainable machine learning, and logic‑driven data‑to‑text generation—highlighting their commitment to trustworthy and secure AI applications.

AI Safety · Computer Vision · Data2Text

0 likes · 12 min read

JD Tech

Jul 18, 2022 · Artificial Intelligence

AI-Powered Visual Defect Detection for Mobile App UI Testing: Methodology, Data Construction, Model Training, and Evaluation

This article presents an end‑to‑end AI‑driven visual testing solution for mobile applications, detailing the business pain points, data set construction, CNN‑based model design, training procedures, performance evaluation with ROC and confusion matrices, and future directions for improving defect detection accuracy.

Computer Vision · Deep Learning · Image Classification

0 likes · 14 min read

MaGe Linux Operations

Jul 14, 2022 · Artificial Intelligence

How to Detect Nude Images with Python and Pillow: A Complete Guide

This article walks through building a Python3 program that uses the Pillow library to identify skin regions in images, applies color‑space heuristics to classify pixels, merges connected skin areas, and decides whether an image is pornographic based on configurable rules, complete with code samples and testing results.

Computer Vision · Image Processing · Python

0 likes · 22 min read

Youku Technology

Jul 14, 2022 · Artificial Intelligence

Predicting Visual Saliency in Augmented Reality: The SARD Dataset and VQSal‑AR Model

This article introduces the SARD dataset of background and AR images, describes a large‑scale eye‑tracking study with 60 participants, and presents the VQSal‑AR vector‑quantization model that outperforms baseline methods in predicting visual saliency for augmented reality scenes.

Computer Vision · Dataset · VQSal-AR

0 likes · 4 min read

58 Tech

Jul 14, 2022 · Artificial Intelligence

Image Quality Assessment Techniques and Their Application in 58.com Recruitment Image Filtering

This article reviews image quality assessment (IQA) methods—including full‑reference, reduced‑reference, and no‑reference approaches—covers typical datasets and evaluation metrics, describes CNN‑based models such as WaDIQaM, DBCNN and hyperIQA, and details a customized IQA solution deployed at 58.com to filter and rank recruitment images, which reduced the bad‑image rate from 9% to 0%.

CNN · Computer Vision · IQA

0 likes · 17 min read

Alimama Tech

Jul 13, 2022 · Artificial Intelligence

Fully Automatic Template‑Free Image‑Text Creative Generation System

Alibaba Alimama’s fully automatic, template‑free image‑text creative generation system uses deep‑learning models across material mining, layout synthesis, on‑image copy generation, and visual attribute rendering to produce personalized ad creatives directly from product images and metadata, achieving roughly 19 % CTR lift over prior template‑based methods.

AI · Computer Vision · Generative Models

0 likes · 19 min read

DataFunTalk

Jul 12, 2022 · Artificial Intelligence

Applying Computer Vision for Content Safety in Live Streaming: Practices and Future Directions

This presentation details how Huya leverages computer‑vision algorithms to detect and mitigate risky content such as political, pornographic, and violent material in live‑streaming and short‑video platforms, describing system architecture, labeling strategies, algorithmic pipelines, real‑time moderation techniques, and future research directions.

AI Safety · Computer Vision · Risk Detection

0 likes · 11 min read

DaTaobao Tech

Jul 1, 2022 · Artificial Intelligence

Deep Generative Projection for High‑Fidelity Virtual Try‑On

The paper presents Deep Generative Projection (DGP), a virtual‑try‑on system that learns a realistic dressing distribution from unpaired images with StyleGAN, projects coarse garment‑human alignments into its latent space, refines details, and achieves higher fidelity and robustness than supervised SOTA methods without needing paired data.

Computer Vision · Unsupervised Learning · generative adversarial network

0 likes · 13 min read

DataFunTalk

Jun 30, 2022 · Artificial Intelligence

Self‑Augmented Unpaired Image Dehazing via Density and Depth Decomposition (D4)

The paper introduces D4, a self‑augmented unpaired image dehazing framework that decomposes the transmission map into fog density and scene depth, enabling realistic fog synthesis for data augmentation and achieving superior dehazing performance with fewer parameters and FLOPs on multiple benchmarks.

CVPR2022 · Computer Vision · Depth estimation

0 likes · 14 min read

AntTech

Jun 24, 2022 · Artificial Intelligence

Hierarchical Residual Network for Multi‑Granularity Classification (HRN) – CVPR 2022 Paper Overview

This article presents a CVPR 2022 paper by Zhejiang University and Ant Group that introduces a label‑relation‑tree‑based Hierarchical Residual Network (HRN) for improving multi‑granularity image classification, detailing its motivation, architecture, composite loss design, extensive experiments on fine‑grained datasets, and practical impact on content‑security applications.

CVPR2022 · Computer Vision · Deep Learning

0 likes · 12 min read

Meituan Technology Team

Jun 23, 2022 · Artificial Intelligence

Highlights of Six Meituan Papers Accepted at CVPR 2022

Meituan’s six CVPR 2022 papers advance computer vision by introducing a few‑sample model compression method, a language‑bridged video object segmentation approach, a single‑stage 3D visual grounding technique, a dynamic early‑exit image captioning system, a boosted black‑box adversarial attack, and a semi‑supervised video paragraph grounding framework.

3D grounding · CVPR 2022 · Computer Vision

0 likes · 15 min read

Xiaohongshu Tech REDtech

Jun 20, 2022 · Artificial Intelligence

Action Sequence Verification in Videos with CosAlignment Transformer (CAT)

The paper introduces Action Sequence Verification (ASV), a task that determines whether two videos follow the same ordered actions, provides the Chemical Sequence Verification dataset and re‑annotated COIN‑SV and Diving48‑SV sets, and proposes the CosAlignment Transformer (CAT) with intra‑step feature extraction, a Transformer‑based inter‑step encoder, and a sequence‑alignment loss that outperforms prior baselines and serves as a pre‑training model for video retrieval and classification.

Action Verification · Computer Vision · Dataset

0 likes · 7 min read

Xiaohongshu Tech REDtech

Jun 13, 2022 · Artificial Intelligence

Neighbor Transformer (NFormer): Robust Person Re-identification via Interactive Multi‑image Modeling

Neighbor Transformer (NFormer) introduces interactive multi‑image modeling for person re‑identification, using Landmark Agent Attention and Reciprocal Neighbor Softmax to efficiently fuse features across images, achieving state‑of‑the‑art accuracy and tighter embedding clusters on multiple benchmark datasets.

Computer Vision · Deep Learning · landmark agent attention

0 likes · 8 min read

Python Programming Learning Circle

Jun 10, 2022 · Artificial Intelligence

Hand Gesture Recognition Using OpenCV and Python: Video Capture, Skin Detection, and Contour Processing

This article demonstrates how to build a hand‑gesture detection system in Python using OpenCV, covering video capture, YCrCb‑based skin detection, contour extraction, and provides the complete source code for reproducing the results.

Computer Vision · Contour Detection · Hand Gesture

0 likes · 6 min read
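The YCrCb‑based skin detection step mentioned in the summary above can be illustrated with a minimal pure‑Python sketch. It is not the article's code: the function names are hypothetical, the conversion uses the standard BT.601‑style RGB to YCrCb formulas, and the Cr/Cb thresholds (133 to 173 and 77 to 127) are commonly quoted values assumed here for illustration only.

```python
def rgb_to_ycrcb(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, Cr, Cb) using BT.601-style weights."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cr = (r - y) * 0.713 + 128
    cb = (b - y) * 0.564 + 128
    return y, cr, cb


def is_skin(r, g, b, cr_range=(133, 173), cb_range=(77, 127)):
    """Return True if the pixel's chroma falls inside the assumed skin ranges.

    The default ranges are illustrative assumptions, not values from the article.
    """
    _, cr, cb = rgb_to_ycrcb(r, g, b)
    return cr_range[0] <= cr <= cr_range[1] and cb_range[0] <= cb <= cb_range[1]


def skin_mask(pixels):
    """Build a 0/1 mask from rows of (r, g, b) tuples."""
    return [[1 if is_skin(*px) else 0 for px in row] for row in pixels]


if __name__ == "__main__":
    row = [(224, 172, 105), (0, 255, 0), (0, 0, 255)]
    print(skin_mask([row]))
```

Luma (Y) is ignored on purpose so the decision depends less on brightness; a real pipeline would typically follow the mask with contour extraction, as the article describes.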

DaTaobao Tech

Jun 10, 2022 · Artificial Intelligence

NeRF-Editing: Geometry Editing of Neural Radiance Fields

NeRF‑Editing introduces an interactive framework that lets users freely deform the geometry of neural radiance fields. It couples an explicit mesh with the implicit NeRF representation and propagates mesh vertex changes through tetrahedral ARAP optimization to bend rays during rendering, enabling realistic edits and animations on synthetic and real‑world scenes. The work was reported at CVPR 2022 and is described as a first.

3D reconstruction · ARAP deformation · Computer Vision

0 likes · 6 min read

ITPUB

Jun 9, 2022 · Artificial Intelligence

How 58’s Multi‑Label Image Recognition Boosts Semantic Search and Recommendations

This article details the design, data pipeline, model architecture, loss functions, and evaluation metrics of a large‑scale multi‑label image classification system built for 58.com, showing how it improves semantic similarity detection, recommendation, and content moderation across diverse business domains.

Computer Vision · Deep Learning · asymmetric loss

0 likes · 18 min read

Python Programming Learning Circle

Jun 9, 2022 · Artificial Intelligence

Python Nude Image Detection Using Pillow: Algorithm, Implementation, and Visualization

This tutorial explains how to build a Python program that detects nude images by analyzing skin-colored regions with Pillow, covering project setup, image preprocessing, pixel classification using RGB/HSV/YCrCb formulas, region merging, decision rules, and command‑line usage with optional visualization.

Computer Vision · Image Processing · Nude Detection

0 likes · 23 min read

58 Tech

Jun 9, 2022 · Artificial Intelligence

Multi‑Label Image Recognition for 58.com: Algorithm Design, Data Construction, and Model Optimization

This article presents a comprehensive study of multi‑label image recognition applied to 58.com’s business scenarios, covering problem motivation, dataset construction, evaluation metrics, mainstream deep‑learning methods, an asymmetric‑loss‑based optimization pipeline, and practical output schemes for recommendation and retrieval.

Computer Vision · asymmetric loss · data annotation

0 likes · 17 min read

Python Programming Learning Circle

Jun 8, 2022 · Artificial Intelligence

Creating a Pencil Sketch from an Image Using OpenCV in Python

This tutorial walks through installing OpenCV, selecting an image, converting it to grayscale, inverting it, applying Gaussian blur, and finally combining the results to generate a pencil‑sketch effect, with complete Python code and display commands.

Computer Vision · Image Processing · OpenCV

0 likes · 3 min read
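The grayscale, invert, blur, and combine steps listed in the summary above can be sketched without OpenCV to show what each stage does. This is a rough illustration under assumptions: the helper names are hypothetical, a 3x3 binomial kernel stands in for the Gaussian blur, and the final "combine" step is assumed to be a colour‑dodge style division, which the summary does not spell out.

```python
def to_gray(rgb):
    """Convert rows of (r, g, b) tuples to 0..255 luma values."""
    return [[int(round(0.299 * r + 0.587 * g + 0.114 * b)) for r, g, b in row]
            for row in rgb]


def invert(img):
    """Invert an 8-bit grayscale image."""
    return [[255 - v for v in row] for row in img]


def blur3x3(img):
    """3x3 binomial blur (a small Gaussian stand-in) with edge replication."""
    h, w = len(img), len(img[0])
    kernel = ((1, 2, 1), (2, 4, 2), (1, 2, 1))
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += kernel[dy + 1][dx + 1] * img[yy][xx]
            row.append(acc // 16)
        out.append(row)
    return out


def dodge(gray, blurred_inv):
    """Colour-dodge blend: divide the gray image by the inverse of the blurred layer."""
    return [[255 if b == 255 else min(255, g * 255 // (255 - b))
             for g, b in zip(g_row, b_row)]
            for g_row, b_row in zip(gray, blurred_inv)]


def pencil_sketch(rgb):
    """Grayscale -> invert -> blur -> dodge, following the steps in the summary."""
    gray = to_gray(rgb)
    return dodge(gray, blur3x3(invert(gray)))


if __name__ == "__main__":
    dark, bright = (50, 50, 50), (200, 200, 200)
    image = [[dark, dark, bright, bright] for _ in range(3)]
    print(pencil_sketch(image)[0])
```

Flat regions come out white because the blurred inverse exactly cancels the gray value there, while pixels on the dark side of an edge are darkened, which is what produces the pencil‑line look.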

DaTaobao Tech

Jun 8, 2022 · Artificial Intelligence

Modeling Indirect Illumination for Inverse Rendering

The CVPR‑2022 paper by Alibaba’s Taobao Tech and Zhejiang University introduces a neural‑radiance‑field‑based method that directly models indirect illumination via a signed‑distance‑field geometry and spherical‑Gaussian visibility, avoiding costly path tracing and enabling more accurate recovery of geometry, material and lighting for realistic free‑viewpoint relighting.

BRDF · Computer Vision · indirect illumination

0 likes · 9 min read

Youku Technology

Jun 7, 2022 · Artificial Intelligence

Mobile Real-Time Portrait Segmentation for Youku Bullet Comment Passthrough

To enable real‑time bullet‑comment passthrough on Youku’s mobile app, the team built a million‑scale portrait dataset and designed the AirSegNet series—CPU, GPU, and server variants—using VGG‑style nets, edge‑aware losses, and hybrid CPU‑GPU inference, achieving 0.98 IoU and sub‑15 ms latency on most devices.

Computer Vision · Edge Computing · MNN Framework

0 likes · 13 min read

NetEase Smart Enterprise Tech+

Jun 2, 2022 · Artificial Intelligence

How Knowledge Distillation Shrinks Deep Neural Networks Without Losing Accuracy

Knowledge Distillation, a teacher‑student model compression technique, enables large, high‑performing deep neural networks to transfer their learned representations to smaller models, achieving comparable accuracy with faster inference, reduced resource consumption, and broader applicability in computer‑vision tasks.

AI · Computer Vision · FitNet

0 likes · 14 min read
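The teacher‑student transfer described in the summary above is often implemented as a loss between temperature‑softened output distributions. The sketch below shows that common soft‑target formulation in plain Python; it is an assumption about one typical variant, not the article's method, and the function names and the default temperature of 4.0 are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over logits divided by a temperature; T > 1 flattens the output."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions, assuming q > 0 everywhere."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distillation_loss(teacher_logits, student_logits, temperature=4.0):
    """KL divergence between softened teacher and student outputs.

    The T**2 factor is a common convention that keeps the gradient scale
    comparable across temperatures.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return temperature ** 2 * kl_divergence(p_teacher, p_student)


if __name__ == "__main__":
    teacher = [5.0, 1.0, 0.0]
    student = [2.0, 1.5, 0.5]
    print(distillation_loss(teacher, student))
```

In practice this term is usually mixed with the ordinary cross‑entropy on ground‑truth labels; feature‑level variants such as FitNet instead match intermediate activations.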

Java Backend Technology

May 28, 2022 · Artificial Intelligence

5 Mind-Blowing Open-Source Projects That Let You Control Faces, Erase Spiders, and Hack Wi-Fi

This article showcases five cutting‑edge open‑source projects—from a ROS‑based system that lets a gamepad animate facial muscles, to AI‑driven video inpainting, text‑to‑image generation, eye‑gaze computer control, and a comprehensive Wi‑Fi cracking toolkit—each pushing the boundaries of modern tech.

AI · Computer Vision · Robotics

0 likes · 6 min read

Youku Technology

May 18, 2022 · Artificial Intelligence

Subjective and Objective Quality of Experience of Free Viewpoint Videos – Paper Overview

This IEEE TIP paper presents a large‑scale subjective‑objective study of Free Viewpoint Video quality, introducing a cost‑saving two‑stage labeling workflow, a sparse‑frame benchmark model, and publicly releasing the dataset and code, with contributions from Alibaba’s Moku Lab and Jiangxi University researchers.

Computer Vision · Dataset · Free Viewpoint Video

0 likes · 5 min read

Code DAO

May 18, 2022 · Artificial Intelligence

A Practical Guide to PyTorch Visualization Tools for Deep Learning

This article walks through the core PyTorch visualization utilities—making image grids, drawing bounding boxes, segmentation masks, and keypoints—explaining why they are needed, how to set up the pipeline, and providing complete code examples for each computer‑vision task.

Bounding Boxes · Computer Vision · Keypoints

0 likes · 18 min read

DaTaobao Tech

May 11, 2022 · Artificial Intelligence

AdaInt: Learning Adaptive Intervals for 3D Lookup Tables in Real‑time Image Enhancement

AdaInt introduces a lightweight convolutional network that predicts non‑uniform sampling coordinates and basis 3D LUTs, using a differentiable binary‑search AiLUT‑Transform to enable end‑to‑end training, thereby delivering superior PSNR, negligible extra parameters, and real‑time color enhancement on ultra‑high‑resolution images, outperforming prior state‑of‑the‑art methods.

3D LUT · Computer Vision · Real-Time

0 likes · 11 min read

Bilibili Tech

May 10, 2022 · Artificial Intelligence

Glance Supervised Video Moment Retrieval via the ViGA Framework

The paper presents a glance‑supervised video moment retrieval approach that records a single annotator‑seen frame, introduces the ViGA contrastive learning framework to leverage this weak temporal cue, and demonstrates on three benchmarks performance rivaling fully supervised methods while keeping annotation cost minimal.

Computer Vision · Glance Supervision · ViGA

0 likes · 8 min read

Code DAO

May 10, 2022 · Artificial Intelligence

How Geometric Deep Learning Enables Spherical CNNs for Rotationally Equivariant Vision

The article explains why traditional planar CNNs fail on spherical data, describes how encoding rotational symmetry through continuous spherical representations and spherical harmonics leads to spherical convolutions that are rotation‑equivariant, and outlines the practical computation using harmonic coefficients.

Computer Vision · geometric-deep-learning · rotational equivariance

0 likes · 9 min read

Tencent Cloud Developer

Apr 27, 2022 · Artificial Intelligence

Alignment-Uniformity Representation Learning for Zero-shot Video Classification (AURL)

The AURL framework, presented by Pu Shi, introduces alignment‑uniformity aware representation learning for zero‑shot video classification, achieving up to 28 % top‑1 accuracy gains on UCF101 and HMDB51, and has already boosted business metrics in Tencent’s advertising, search, and video‑channel recommendation systems.

Alignment · Computer Vision · Deep Learning

0 likes · 19 min read

Python Programming Learning Circle

Apr 26, 2022 · Artificial Intelligence

Python Script for Adding Face Masks to CelebA Images Using the face_recognition Library

This article demonstrates how to use Python, the face_recognition library, and OpenCV/Pillow to automatically detect facial landmarks in CelebA images, generate and align mask overlays, and save both masked and binary mask versions for computer‑vision research and dataset augmentation.

Computer Vision · Image Processing · Python

0 likes · 11 min read

MaGe Linux Operations

Apr 24, 2022 · Artificial Intelligence

How to Automatically Add Face Masks to Images with Python and Face Recognition

This article demonstrates how to use the Python face_recognition library and Pillow to detect facial landmarks, generate realistic mask overlays, and produce both masked and binary mask images from the open‑source FaceMask_CelebA dataset, complete with full source code.

Computer Vision · Image Processing · face recognition

0 likes · 12 min read

Meituan Technology Team

Apr 14, 2022 · Artificial Intelligence

Short Video Content Understanding and Generation Practices at Meituan

Meituan leverages computer‑vision techniques to tag, analyze, and automatically generate short videos across consumer and merchant scenarios, detailing hierarchical tag design, self‑supervised representation learning, fine‑grained food recognition, intelligent cover creation, and pixel‑level editing to enhance content discovery and presentation.

AI content generation · Computer Vision · fine-grained recognition

0 likes · 20 min read

Kuaishou Tech

Apr 11, 2022 · Artificial Intelligence

Kuaishou's Custom Video Matting Solution: Interactive Object Segmentation for Mobile Creators

Kuaishou's audio‑video technology team presents a self‑developed custom video matting system that combines foreground, interactive, and video object segmentation to let creators extract arbitrary subjects without green screens, featuring adaptive cropping, multi‑stage training, and deployment across Android and iOS devices.

Computer Vision · Deep Learning · Kuaishou

0 likes · 15 min read

Python Programming Learning Circle

Apr 9, 2022 · Artificial Intelligence

Image Resizing with OpenCV and PyTorch

This article explains how to resize images using OpenCV's cv2.resize function and how to scale multi‑dimensional tensors in PyTorch with torch.nn.functional.interpolate, providing detailed parameter descriptions and practical code examples for both single images and batch processing.

Computer Vision · Image Processing · PyTorch

0 likes · 6 min read
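Both `cv2.resize` (with `interpolation=cv2.INTER_NEAREST`) and `torch.nn.functional.interpolate` (with `mode="nearest"`) offer nearest‑neighbour scaling. The library‑free sketch below shows the idea on a nested list; `resize_nearest` is a hypothetical helper, and its floor‑based index mapping is an assumption that may differ slightly from the pixel‑centre conventions either library uses.

```python
def resize_nearest(img, new_h, new_w):
    """Resize a 2D list by copying, for each output pixel, the source pixel it maps to."""
    src_h, src_w = len(img), len(img[0])
    out = []
    for y in range(new_h):
        # Map the output row back to a source row (floor of the scaled index).
        src_y = min(src_h - 1, y * src_h // new_h)
        out.append([img[src_y][min(src_w - 1, x * src_w // new_w)]
                    for x in range(new_w)])
    return out


if __name__ == "__main__":
    small = [[1, 2],
             [3, 4]]
    for row in resize_nearest(small, 4, 4):
        print(row)
```

Nearest‑neighbour keeps original values intact, which suits label masks; for photographs, the bilinear or bicubic modes that both libraries also provide usually look smoother.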

Meituan Technology Team

Apr 7, 2022 · Mobile Development

Zero‑Code Scripted Guidance for Mobile Apps Using CV and AI

The ASG system delivers stack‑agnostic, zero‑code in‑app guidance by combining traditional computer‑vision matching with deep‑learning detectors. Product teams author scripts visually, which cuts development time to under half a person‑day, raises task completion from 18 % to 35.7 %, and reduces costs by more than 90 %.

Computer Vision · Mobile Development · image matching

0 likes · 31 min read

Kuaishou Large Model

Apr 6, 2022 · Artificial Intelligence

How Transformers Revolutionize Image Style Transfer: Introducing StyTr²

This article reviews the limitations of traditional CNN‑based image stylization, explains how Transformer architectures overcome these issues with global context and self‑attention, and presents the novel StyTr² method with content‑aware positional encoding that achieves superior, detail‑preserving style transfer results.

Computer Vision · Deep Learning · Transformer

0 likes · 8 min read

Tencent Architect

Apr 6, 2022 · Artificial Intelligence

Award-Winning AIoT Projects from the 2021 TencentOS Tiny AIoT Innovation Competition

The 2021 TencentOS Tiny AIoT Innovation Competition showcased over 50 original projects, including award‑winning multi‑functional pedestrian detection devices, AI‑enhanced smart wheelchairs, and endangered‑animal recognition systems, each demonstrating low‑power embedded AI, edge computing, and cloud integration for diverse real‑world applications.

AIoT · Computer Vision · Edge Computing

0 likes · 8 min read

Kuaishou Tech

Apr 6, 2022 · Artificial Intelligence

StyTr²: A Transformer‑Based Approach for Image Style Transfer

The paper proposes StyTr², a Transformer‑based image style transfer method that uses content‑aware positional encoding to preserve details and improve feature representation, achieving high‑quality stylization with better content structure and style patterns.

Computer Vision · Deep Learning · content-aware positional encoding

0 likes · 7 min read

Kuaishou Large Model

Mar 25, 2022 · Artificial Intelligence

How Kuaishou Y‑Tech Achieves Real‑Time High‑Fidelity Wrinkle Removal with AI

This article explains how Kuaishou Y‑Tech combines image inpainting, semantic editing, and advanced GAN techniques to accurately locate and remove facial wrinkles—especially neck wrinkles—while preserving realistic skin texture and achieving high‑quality results suitable for production deployment.

Computer Vision · Deep Learning · GAN

0 likes · 17 min read

Laiye Technology Team

Mar 25, 2022 · Artificial Intelligence

Laiye OCR Error‑Correction Model: Architecture, Implementation, and Evaluation

This article describes Laiye's OCR error‑correction system, detailing the background challenges of Chinese character recognition, the analysis of three possible solutions, the chosen post‑processing approach, model architecture, training data, loss design, online inference, and experimental results showing a measurable performance boost.

Chinese text · Computer Vision · Deep Learning

0 likes · 13 min read

JD Cloud Developers

Mar 21, 2022 · Artificial Intelligence

ViTAEv2 Breaks ImageNet Real Record with 91.2% Accuracy – How a 600M‑Parameter Model Redefines Few‑Shot Learning

JD Research Institute and the University of Sydney introduced ViTAEv2, a 600‑million‑parameter deep learning model that achieved a world‑leading 91.2% top‑1 accuracy on ImageNet Real without external data, demonstrating strong few‑shot learning, reducing labeling costs, and promising advances across many computer‑vision tasks.

AI model · Computer Vision · Deep Learning

0 likes · 4 min read

Python Programming Learning Circle

Mar 16, 2022 · Artificial Intelligence

Comprehensive Overview of Face Recognition Techniques and DeepFace Implementation

This article provides a detailed survey of face recognition evolution, key researchers, open‑source projects, datasets, the four‑stage processing pipeline, DeepFace architecture, training methods, experimental results, and practical Python installation and usage instructions.

Computer Vision · DeepFace · Python

0 likes · 18 min read

JD Retail Technology

Mar 7, 2022 · Artificial Intelligence

AI-Driven UI Testing: Data Collection, Model Development, and Deployment for Mobile App Anomaly Detection

This article presents a comprehensive study on applying AI and deep‑learning techniques to mobile UI testing, covering background challenges, feasibility research, abnormal sample construction, model design, training, evaluation, and future directions for intelligent test automation.

AI testing · Computer Vision · Model Training

0 likes · 13 min read

Kuaishou Large Model

Mar 4, 2022 · Artificial Intelligence

How Adaptive 3D Face Cutout Transforms Kuaishou’s AR Effects

This article explains the adaptive 3D face cutout technology behind Kuaishou's "3D Zoom Face" effect, detailing its problem‑solving approach, implementation workflow, camera‑control optimizations, and how it expands creative possibilities while lowering production costs for both creators and users.

3D rendering · AR effects · Computer Vision

0 likes · 16 min read

JD Cloud Developers

Mar 3, 2022 · Artificial Intelligence

How JD Explore’s Silver‑Bullet‑3D Dominated the SAPIEN ManiSkill Challenge

JD Explore Research Institute’s Visual and Multimedia Lab team “Silver‑Bullet‑3D” secured top positions in the 2021 SAPIEN ManiSkill Challenge by excelling in both imitation‑learning and rule‑based tracks, showcasing cutting‑edge computer‑vision and robotic‑arm control technologies that earned them international recognition.

AI competition · Computer Vision · Robotics

0 likes · 5 min read

Python Crawling & Data Mining

Feb 22, 2022 · Artificial Intelligence

Create a Dancing Word‑Cloud Video with Python and AI

This tutorial walks through downloading a dance video, extracting frames, using Baidu AI for person segmentation, generating word‑cloud masks, and stitching the results into a dancing word‑cloud video with Python, OpenCV and the WordCloud library.

Baidu AI · Computer Vision · OpenCV

0 likes · 8 min read

MaGe Linux Operations

Feb 16, 2022 · Artificial Intelligence

Recreate Wuhan University’s Cherry Blossom Bloom with Python and OpenCV

This tutorial shows how to use Python, OpenCV, and Pillow to capture, process, and animate Wuhan University’s cherry blossom scenes, turning pixel data into a time‑lapse video with custom text overlays and frame‑by‑frame control.

Computer Vision · Image Processing · OpenCV

0 likes · 5 min read

Kuaishou Tech

Feb 9, 2022 · Mobile Development

Kuaishou Mobile Mixed Reality System: Architecture, Algorithms, and Applications

This article presents Kuaishou's mobile mixed reality (MR) system, detailing its integration of deep learning, SLAM, and scene reconstruction for real‑time spatial computing, the design of a monocular depth‑estimation model, a lightweight 3D rendering engine, and its deployment across iOS and Android devices with various user‑facing effects.

Computer Vision · Depth estimation · Kuaishou

0 likes · 16 min read

Baobao Algorithm Notes

Jan 28, 2022 · Artificial Intelligence

How Masked Autoencoders Revolutionize Vision Pre‑Training: A Deep Dive

This article provides a detailed technical walkthrough of Masked Autoencoders (MAE) for computer vision, covering its BERT‑inspired masking strategy, asymmetric encoder‑decoder design, implementation specifics, experimental findings on mask ratios and decoder depth, and the resulting performance gains over supervised ViT models.

Computer Vision · MAE · Masked Modeling

0 likes · 11 min read

Kuaishou Tech

Jan 27, 2022 · Artificial Intelligence

Kuaishou’s Self‑Developed Green‑Screen Matting Algorithm and Its Deployment in Kuaiying, Live Companion, and Cloud Editing

This article explains the principles, challenges, and implementation details of Kuaishou’s proprietary green‑screen matting algorithm, covering fine‑detail handling, color‑spill reduction, green‑reflection removal, and its real‑time deployment across mobile video‑editing and live‑streaming products.

Computer Vision · Kuaishou · Real-time Processing

0 likes · 13 min read

Kuaishou Tech

Jan 26, 2022 · Artificial Intelligence

Technical Overview of Kuaishou Y‑Tech Body‑Shaping Effects and Underlying Algorithms

This article explains how Kuaishou's Y‑Tech leverages human detection, keypoint localization, and image‑deformation algorithms such as stretching, triangulation and liquify, together with background‑distortion correction, to deliver seven stable, natural body‑shaping effects for short‑video applications.

AI · Computer Vision · body shaping

0 likes · 13 min read

Kuaishou Large Model

Jan 22, 2022 · Artificial Intelligence

How Kuaishou Achieves Realistic Body Beautification with AI‑Driven Pose Detection and Image Warping

This article explains Kuaishou’s Y‑tech body‑beautification pipeline, detailing how proprietary human pose detection, key‑point localization, and image‑warping techniques such as stretching, triangulation, and liquify are combined to create stable, natural effects like long‑leg, slim‑waist, and swan‑neck, while minimizing background distortion.

AI · Computer Vision · body beautification

0 likes · 15 min read

Baidu Geek Talk

Jan 17, 2022 · Artificial Intelligence

Unlocking Video AI: PaddleVideo’s Open‑Source Solutions for Sports, Media, and Safety

This article surveys PaddleVideo, Baidu's open‑source video AI toolkit, detailing its industry‑focused models for sports action recognition, multimodal tagging, intelligent production, interactive segmentation, drone detection, and medical imaging, while providing performance metrics and GitHub resources for each solution.

Computer Vision · Multimodal Learning · PaddleVideo

0 likes · 14 min read

DataFunSummit

Jan 5, 2022 · Artificial Intelligence

Improving Financial Micro‑Business Efficiency with OCR: Challenges, Applications, and an Intelligent Platform

This article explores how optical character recognition (OCR) technology can address the financing pain points of micro‑enterprises by automating document verification, enhancing risk assessment, and enabling an end‑to‑end intelligent OCR platform built on deep‑learning models, data pipelines, and deployment automation.

Computer Vision · Document Automation · Micro Business

0 likes · 15 min read

Code DAO

Dec 31, 2021 · Artificial Intelligence

Why RegNet Is the Most Flexible Architecture for Computer Vision

RegNet introduces a scalable design space defined by quantized linear functions, enabling flexible trade‑offs between accuracy, efficiency, and mobile deployment, and demonstrates superior performance compared with ResNet, EfficientNet, and other mobile‑optimized networks.

Computer Vision · Deep Learning · Design Space

0 likes · 7 min read

Laiye Technology Team

Dec 31, 2021 · Artificial Intelligence

Overview of Table Recognition Techniques and Practical Implementation

This article reviews the challenges of extracting structured table data from images, compares two‑stage and end‑to‑end OCR approaches, evaluates four state‑of‑the‑art table‑recognition models (SPLERGE, CascadeTabNet, TableMASTER, UnetTable), and presents a practical deployment workflow with performance metrics.

AI · Computer Vision · Deep Learning

0 likes · 14 min read

Code DAO

Dec 29, 2021 · Artificial Intelligence

Understanding Stand-Alone Axial-Attention for Panoptic Segmentation

The paper proposes a stand‑alone axial‑attention mechanism that converts 2‑D attention into 1‑D to lower computational cost while preserving global context, introduces position‑sensitive self‑attention, integrates it into Axial‑ResNet and Axial‑DeepLab, and demonstrates strong results on four large segmentation datasets.

Axial Attention · Computer Vision · DeepLab

0 likes · 7 min read

Laravel Tech Community

Dec 27, 2021 · Artificial Intelligence

OpenCV 4.5.5 Release Highlights and New Features

OpenCV 4.5.5 introduces audio support in VideoCapture, updates SOVERSION handling, adds OpenVINO 2021.4.2 LTS compatibility, expands ONNX test coverage, upgrades protobuf, optimizes for RISC‑V, and enhances the G‑API module with numerous vectorized kernels, SIMD scheduling, and various bug fixes.

AI · Computer Vision · G-API

0 likes · 3 min read

Code DAO

Dec 22, 2021 · Artificial Intelligence

Understanding SimCLR: A Simple Contrastive Learning Framework for Visual Representations

This article explains SimCLR, the 2020 Google Research framework that advances self‑supervised visual pre‑training by using extensive data augmentations, a ResNet encoder, a projection‑head MLP, and the NT‑Xent loss to learn robust image representations that outperform many prior methods on ImageNet and other benchmarks.

Computer Vision · NT-Xent loss · ResNet

0 likes · 7 min read

ITPUB

Dec 13, 2021 · Artificial Intelligence

How Data Augmentation Boosts Machine Learning When Data Is Scarce

This article explains how data augmentation can alleviate overfitting by artificially expanding limited training sets, outlines common transformation techniques for images, text, and audio, and discusses the method's benefits, practical applications, and inherent limitations for machine‑learning practitioners.

Computer Vision · Deep Learning · data augmentation

0 likes · 6 min read

Code DAO

Dec 12, 2021 · Artificial Intelligence

Lightning Flash 0.3 Introduces New Tasks, Visualization Tools, Data Pipelines, and Registry API

Lightning Flash 0.3 expands the PyTorch Lightning ecosystem with eight new computer‑vision and NLP tasks, modular API design, integrated model hubs, visualisation callbacks, customizable data‑source hooks, and a central registry for model backbones, all illustrated with concrete code examples.

Computer Vision · Deep Learning · Lightning Flash

0 likes · 7 min read

Kuaishou Large Model

Dec 10, 2021 · Artificial Intelligence

How AI Restores Blurry Faces: Inside Kuaishou’s Y‑Tech High‑Definition Portrait Project

Image clarity affects daily life, from personal memories to security. Kuaishou’s Y‑Tech team tackles image degradation by constructing paired low‑ and high‑quality datasets and training a style‑based AI model that uses facial masks to restore high‑definition portraits, preserving identity while enhancing detail.

AI · Computer Vision · Deep Learning

0 likes · 10 min read

Code DAO

Dec 5, 2021 · Artificial Intelligence

Why DropBlock Outperforms Dropout as an Image Regularizer

This article demonstrates how to implement DropBlock in PyTorch, explains why Dropout fails on image data, details the gamma calculation and mask generation, and shows visual comparisons that illustrate the superiority of contiguous region dropping over random pixel dropout.

Computer Vision · Deep Learning · DropBlock

0 likes · 11 min read

Java Captain

Dec 4, 2021 · Artificial Intelligence

Java Spring Boot License Plate Recognition and Training System (Open‑Source)

This open‑source project implements a Spring Boot and Maven based license‑plate detection and training system in Java, leveraging OpenCV and JavaCPP, supporting multiple plate colors, SVM and ANN algorithms, and providing a B/S architecture with SQLite, Swagger documentation, and extensible image‑recognition features.

Computer Vision · Deep Learning · Image Processing

0 likes · 4 min read

Kuaishou Large Model

Dec 3, 2021 · Artificial Intelligence

How Can Your Face Reveal Heart Rate? Exploring rPPG Technology

This article explains the principles of remote photoplethysmography (rPPG), how facial skin color changes caused by heartbeats can be captured by a camera to measure heart rate, respiration, SpO₂ and other physiological signals, and reviews traditional and data‑driven algorithms for robust signal extraction.

AI · Computer Vision · heart rate detection

0 likes · 7 min read

Kuaishou Tech

Dec 1, 2021 · Industry Insights

Turning Sketches into Live AR Characters: Kuaishou’s All‑Things‑AR Technical Journey

This article details how Kuaishou transformed a user‑drawn sketch concept into the All‑Things‑AR feature, covering background inspiration, the end‑to‑end pipeline, data collection, mobile‑friendly segmentation model design, model optimizations, engineering integration, SLAM‑based camera localization, and final production results.

AR · Computer Vision · Mobile Development

0 likes · 15 min read

21CTO

Nov 27, 2021 · Artificial Intelligence

How Huawei’s “Genius Teen” Scaled AutoML to Millions of Phones

Huawei’s 2.01‑million‑yuan “genius teen” Zhong Zhao leveraged AutoML to deploy high‑precision image‑pixel processing algorithms across tens of millions of Mate and P series smartphones, pioneering large‑scale commercial use of AutoML and advancing mobile visual models with dynamic convolution kernels and adversarial data augmentation.

AutoML · Computer Vision · Deep Learning

0 likes · 9 min read

Kuaishou Large Model

Nov 26, 2021 · Artificial Intelligence

How Kuaishou’s ‘All‑Things AR’ Turns Real Objects into Interactive 3D Characters

‘All‑Things AR’ (万物AR) is a Kuaishou Y‑Tech solution that lets users capture any real‑world object with a phone, automatically segments it using a custom AI model, and renders an animated 3D avatar via a lightweight SLAM‑based pipeline, enabling low‑cost, high‑quality AR experiences.

AR · Computer Vision · Mobile AI

0 likes · 16 min read

DeWu Technology

Nov 18, 2021 · Artificial Intelligence

Background Complexity Detection for Sneaker Images Using MobileNet, FPN, and Modified SAM

The project presents a lightweight MobileNet‑FPN architecture enhanced with a modified spatial‑attention module that evaluates corner‑based self‑similarity to classify sneaker photo backgrounds, achieving 96% test accuracy—exceeding baseline CNN performance—and meeting business targets of over 80% hint accuracy and 90% mandatory enforcement.

CNN · Computer Vision · Image Processing

0 likes · 12 min read

DataFunTalk

Nov 16, 2021 · Artificial Intelligence

InsightFace: Open‑Source 2D/3D Deep Face Analysis Toolbox with PaddlePaddle Support

InsightFace is an open‑source 2D/3D deep face analysis toolbox that implements a variety of detection, alignment and recognition algorithms, now supports PaddlePaddle with out‑of‑the‑box models, high‑throughput distributed training up to 60 million classes, and provides a one‑line demo script for quick testing.

ArcFace · Computer Vision · Deep Learning

0 likes · 3 min read

Alibaba Terminal Technology

Nov 15, 2021 · Artificial Intelligence

How AI Powers Smart Home Workouts on Mobile: Alibaba Sports’ Pose‑Tracking

Alibaba Sports’ AI-powered smart workout system transforms a simple smartphone and a few square meters of space into an interactive home fitness solution, using MNN‑based pose estimation to recognize and correct dozens of exercises, while addressing challenges like accuracy, performance, and automated testing.

AI · Automated Testing · Computer Vision

0 likes · 11 min read

Python Programming Learning Circle

Nov 13, 2021 · Artificial Intelligence

Python Panorama Stitching Using OpenCV and SIFT

This article explains how to create a panoramic image by detecting SIFT keypoints, matching them with KNN, estimating a homography using RANSAC, and warping and blending two overlapping photos with OpenCV in Python.

Computer Vision · Python · SIFT

0 likes · 8 min read

Amap Tech

Nov 4, 2021 · Artificial Intelligence

POI Signboard Image Retrieval: Technical Solution, Model Design, and Future Directions

To efficiently filter unchanged POI signboards, the authors propose a multimodal image‑retrieval system that combines enhanced global and local visual features with BERT‑encoded OCR text, using metric learning and alignment techniques to achieve over 95% accuracy while handling occlusion, viewpoint variation, and subtle text changes.

Computer Vision · Deep Learning · Multimodal Learning

0 likes · 17 min read

Cyber Elephant Tech Team

Oct 14, 2021 · Artificial Intelligence

Mastering OCR: From Traditional Techniques to Deep Learning Solutions

This article provides a comprehensive overview of Optical Character Recognition, covering its traditional applications, the evolution to deep learning methods, key datasets, popular tools, and practical strategies for tackling diverse OCR challenges in real-world scenarios.

CRNN · Computer Vision · Datasets

0 likes · 18 min read

DataFunTalk

Sep 29, 2021 · Artificial Intelligence

Self‑Supervised Learning and Contrastive Learning for Computer Vision and OCR Applications

This article reviews self‑supervised learning techniques, common computer‑vision pretext tasks, contrastive loss functions, popular frameworks such as SimCLR, MoCo and SimSiam, and demonstrates their application to OCR captcha recognition with detailed implementation and experimental results.

Computer Vision · Deep Learning · OCR

0 likes · 22 min read

Laiye Technology Team

Sep 24, 2021 · Artificial Intelligence

Self‑Supervised Learning and Contrastive Methods for Computer Vision and OCR Applications

This article surveys self‑supervised learning techniques for computer‑vision tasks, explains common pretext tasks and contrastive loss designs, reviews representative models such as SimCLR, MoCo, SwAV and SimSiam, and demonstrates their practical impact on a captcha‑OCR system with measurable accuracy gains.

Computer Vision · Deep Learning · OCR

0 likes · 23 min read

Kuaishou Tech

Sep 17, 2021 · Artificial Intelligence

SnowflakeNet: Point Cloud Completion by Snowflake Point Deconvolution with Skip-Transformer

SnowflakeNet introduces a novel Snowflake Point Deconvolution architecture combined with a Skip-Transformer to progressively split seed points, enabling high‑quality point‑cloud completion that preserves fine‑grained geometric details such as smooth surfaces, sharp edges, and corners across dense and sparse datasets.

3D reconstruction · Computer Vision · Deep Learning

0 likes · 10 min read

JD Retail Technology

Sep 8, 2021 · Artificial Intelligence

ARShoe: Real-Time Augmented Reality Shoe Try-On System on Smartphones

The paper presents ARShoe, the first practical real‑time augmented reality shoe try‑on system for smartphones, detailing its multi‑branch neural network, foot pose estimation, rendering pipeline, a newly built foot dataset, and extensive experiments demonstrating high accuracy and over 30 FPS performance on multiple devices.

AR · Computer Vision · Mobile

0 likes · 6 min read

Baidu Geek Talk

Sep 8, 2021 · Artificial Intelligence

How PP‑OCRv2 Boosts OCR Speed and Accuracy with Five Key Innovations

The article provides a comprehensive technical overview of PaddleOCR's PP‑OCRv2, detailing its five major algorithmic enhancements, performance improvements over previous versions, historical milestones, core capabilities, and links to the open‑source repositories for developers interested in state‑of‑the‑art OCR solutions.

Computer Vision · Model Optimization · OCR

0 likes · 10 min read

NetEase Smart Enterprise Tech+

Sep 2, 2021 · Artificial Intelligence

How AI Detects Video Deepfakes: Techniques, Challenges, and Real-World Solutions

This article explores the rapid rise of AI‑generated video deepfakes, examines the four main manipulation techniques, discusses the inherent security risks, and presents NetEase Yidun’s comprehensive detection framework—including face‑detection‑based classification, semi‑supervised learning, feature fusion, and model distillation—to combat content‑security threats.

AI security · Computer Vision · Semi-supervised Learning

0 likes · 12 min read

Kuaishou Large Model

Aug 30, 2021 · Artificial Intelligence

How Kuaishou’s Y‑Tech Fixes Background Distortion in Portrait Beautification

This article explains the challenges of background distortion caused by portrait beautification effects, describes Kuaishou Y‑Tech’s line‑segment‑based optimization framework that preserves line slopes and triangle shapes, and demonstrates the algorithm’s effectiveness through before‑and‑after visual results.

Computer Vision · Image Processing · background correction

0 likes · 11 min read

Python Crawling & Data Mining

Aug 25, 2021 · Artificial Intelligence

Quickly Solve Captchas with the Lightweight ddddocr Python Library

This article introduces the ddddocr Python library for fast, code‑light captcha recognition, compares it with pillow + pytesseract and Baidu API, provides installation steps, usage examples, performance tips, and discusses its accuracy limits.

Captcha · Computer Vision · OCR

0 likes · 4 min read

Cyber Elephant Tech Team

Aug 18, 2021 · Artificial Intelligence

How GANs Turn Sketches into Realistic Landscapes: Inside the “TuYa” Algorithm

This article explains the GAN‑based “TuYa” sketch‑to‑landscape algorithm presented at the Yidian News Hackathon, detailing its semantic image synthesis approach, the encoder, generator with SPADE, and PatchGAN discriminator, and discusses potential applications for designers and architects.

AI · Computer Vision · GAN

0 likes · 9 min read

Beike Product & Technology

Aug 13, 2021 · Artificial Intelligence

AI-Powered Intelligent Testing Platform for Frontend UI Quality Assurance

The article describes how an AI-driven testing platform combines computer‑vision, OCR, and machine‑learning techniques to automatically detect frontend UI and backend‑related quality issues in mobile apps, outlines its architecture, core capabilities, deployment workflow, and reports successful real‑world deployments and future plans.

AI testing · Computer Vision · frontend quality

0 likes · 11 min read

Alimama Tech

Aug 11, 2021 · Artificial Intelligence

Dynamic Descriptive Model: A Scalable Paradigm for High‑Quality Native Creative Generation

The Dynamic Descriptive Model (DDM) introduces a scalable pipeline that automatically harvests product assets, perceives their visual attributes, encodes designers’ expertise in an extended SVG‑based descriptive language, and generates high‑quality, native‑looking ad creatives at massive scale, delivering 5‑80% CTR gains and tens of millions of daily outputs.

AI · Advertising · Computer Vision

0 likes · 13 min read

Test Development Learning Exchange

Aug 11, 2021 · Artificial Intelligence

Face Detection with OpenCV: Data Preparation, Cascade Classifiers, and Python Implementation

This guide explains how to prepare Haar and LBP data, use OpenCV's CascadeClassifier and detectMultiScale functions, and run a complete Python script that captures video, detects faces, draws bounding boxes, displays results, and saves detected frames.

Computer Vision · Haar cascade · LBP

0 likes · 6 min read

MaGe Linux Operations

Aug 9, 2021 · Artificial Intelligence

Top Python Libraries for Image Processing: A Practical Guide with Code

This article introduces the most popular Python image‑processing libraries, explains their core features, and provides ready‑to‑run code examples for tasks such as filtering, segmentation, and computer‑vision applications, helping readers quickly start working with images in Python.

Computer Vision · Image Processing · NumPy

0 likes · 9 min read

Test Development Learning Exchange

Aug 9, 2021 · Artificial Intelligence

Face Detection and OpenCV Haar Cascade Classifier

This article guides readers through downloading Haar cascade files for face detection using OpenCV, including code examples and step-by-step instructions.

Computer Vision · Haar Cascades · OpenCV

0 likes · 4 min read

iQIYI Technical Product Team

Aug 6, 2021 · Artificial Intelligence

I2UV-HandNet: High‑Fidelity 3D Hand Mesh Reconstruction from Monocular RGB Images

I2UV-HandNet reconstructs high-fidelity 3D hand meshes from a single RGB image using an AffineNet encoder‑decoder to predict coarse UV maps and an SRNet super‑resolution module, trained on the SuperHandScan dataset, achieving real‑time performance and state‑of‑the‑art benchmark results, and targeting integration into next‑generation VR headsets without external controllers.

3D mesh · Computer Vision · Deep Learning

0 likes · 11 min read

TiPaiPai Technical Team

Aug 2, 2021 · Artificial Intelligence

How Attention Boosts Text Recognition: From CNN‑Seq2Seq to Multi‑Scale Models

This article explains how attention mechanisms are applied to text recognition, covering the basic CNN‑Seq2Seq‑Attention architecture, multi‑scale attention extensions, and a 2D attentional irregular scene text recognizer with detailed network components, training loss, and experimental results.

CNN · Computer Vision · Deep Learning

0 likes · 8 min read

MaGe Linux Operations

Jul 29, 2021 · Artificial Intelligence

Unlock Powerful Face Recognition with Python’s face_recognition Library

This article introduces the open‑source Python library face_recognition, explains how to install it, locate and extract faces, generate 128‑dimensional embeddings, compare faces, detect facial landmarks, apply virtual makeup, and build a simple custom face‑recognition application with complete code examples and visual results.

Computer Vision · Image Processing · face recognition

0 likes · 11 min read

Python Programming Learning Circle

Jul 27, 2021 · Artificial Intelligence

Common Python Libraries for Image Processing: Overview and Code Examples

This article introduces the most widely used Python image‑processing libraries—including scikit‑image, NumPy, SciPy, Pillow, OpenCV‑Python, SimpleCV, Mahotas, SimpleITK, pgmagick, and Pycairo—explaining their key features and providing concise code snippets that demonstrate filtering, segmentation, enhancement, and computer‑vision tasks.

Computer Vision · Image Processing · NumPy

0 likes · 8 min read

Test Development Learning Exchange

Jul 21, 2021 · Artificial Intelligence

Drawing Shapes on Images with OpenCV in Python

This tutorial demonstrates how to use OpenCV in Python to read an image and draw basic shapes such as rectangles and circles by specifying coordinates, dimensions, colors, and line thickness, then display the edited image in a window.

Computer Vision · Drawing Shapes · Image Processing

0 likes · 2 min read

Test Development Learning Exchange

Jul 20, 2021 · Artificial Intelligence

Resizing Images with Python and OpenCV

This article demonstrates how to use Python's OpenCV library to read an image, display its original dimensions, resize it to a specified size, save the resized image, and handle user input to close the display windows.

Computer Vision · Image Processing · OpenCV

0 likes · 2 min read

Test Development Learning Exchange

Jul 17, 2021 · Artificial Intelligence

Face Recognition with OpenCV and Python

This tutorial explains the concept of facial recognition, describes how it works, and provides step‑by‑step instructions and code examples for implementing face detection and identification using OpenCV and Python, including installation, basic image handling, and a complete sample script.

Computer Vision · OpenCV · Python

0 likes · 4 min read

Kuaishou Large Model

Jul 15, 2021 · Artificial Intelligence

How Kuaishou’s YKit AI SDK Powers Mass‑Production of Viral Effects

The article details Kuaishou Y‑Tech's YKit AI SDK architecture, its unified interface, modular design, performance optimizations, and three real‑world case studies that illustrate how the SDK enables large‑scale, high‑quality short‑video effects across diverse devices while addressing challenges of effect variety, performance, and cost.

AI SDK · AR · Computer Vision

0 likes · 14 min read

21CTO

Jul 14, 2021 · Artificial Intelligence

How a Chinese PhD’s Vision Research Earned a 2‑Million‑Yuan Huawei Offer

The article profiles Liao Minghui, a recent PhD graduate from Huazhong University of Science and Technology whose groundbreaking work in computer‑vision text detection earned him top honors, multiple patents, and a record‑breaking 2.01 million‑yuan annual salary offer from Huawei’s “Genius Youth” program.

Academic Achievement · Computer Vision · Huawei Recruitment

0 likes · 7 min read

TiPaiPai Technical Team

Jul 9, 2021 · Artificial Intelligence

How Multi‑Scale Attention and DenseNet Boost Handwritten Math Expression Recognition

This article reviews a CVPR 2018 paper that introduces a dense‑connected encoder and multi‑scale attention mechanism to improve handwritten mathematical expression recognition, detailing the background, network architecture, GRU decoder, loss function, and experimental gains over previous methods.

Computer Vision · DenseNet · GRU

0 likes · 8 min read

Beike Product & Technology

Jul 8, 2021 · Artificial Intelligence

Raster‑to‑Vector Floorplan Reconstruction (R2V) for Standardized Housing Layouts

This article presents the motivation, definitions, related work, and a detailed R2V (Raster‑to‑Vector) modeling pipeline—including DNN segmentation, integer programming, and vector standardization—used by Beike to standardize diverse floor‑plan images, discusses challenges, and outlines future directions, while also noting recruitment opportunities.

Computer Vision · floorplan · integer optimization

0 likes · 20 min read

Youku Technology

Jul 8, 2021 · Artificial Intelligence

Key Findings from Alibaba Moku Lab at ACM MM 2021

At ACM MM 2021, Alibaba’s Moku Lab presented four cutting‑edge studies: an interactive video inpainting system using user doodles, a decoupled IoU regression model for object detection, a spatio‑temporal distortion‑aware video quality assessment framework, and a multimodal emotional relationship recognition dataset and benchmark.

Computer Vision · Video Inpainting · multimodal emotion recognition

0 likes · 8 min read
