Tagged articles

1236 articles

Dec 5, 2021 · Artificial Intelligence

Why Neural Networks Need Batch Normalization: Principles and Mechanics

The article explains the principle behind Batch Normalization, why it is essential for training deep neural networks, how it standardizes activations, the role of learnable scale and shift parameters, the computation steps during training and inference, and discusses placement strategies within a model.

Batch Normalization · Deep Learning · Neural Networks

9 min read
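
The computation this summary describes (standardize, then apply a learnable scale and shift, with separate behavior at training and inference) can be sketched roughly as follows. This is a minimal plain-Python illustration for a single feature, not code from the article; the function names, the `running` dictionary, and the `momentum` and `eps` defaults are assumptions chosen for the example.

```python
import math

def batch_norm_train(batch, gamma, beta, running, momentum=0.1, eps=1e-5):
    """Normalize one feature over a mini-batch and update running statistics.

    batch   -- list of floats, one activation per sample
    running -- dict with 'mean' and 'var', updated in place for inference
    """
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n  # biased batch variance
    # Exponential moving averages, reused later at inference time.
    running["mean"] = (1 - momentum) * running["mean"] + momentum * mean
    running["var"] = (1 - momentum) * running["var"] + momentum * var
    # Standardize, then apply the learnable scale (gamma) and shift (beta).
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]

def batch_norm_infer(x, gamma, beta, running, eps=1e-5):
    """Normalize a single activation with the stored running statistics."""
    return gamma * (x - running["mean"]) / math.sqrt(running["var"] + eps) + beta

running = {"mean": 0.0, "var": 1.0}
out = batch_norm_train([1.0, 2.0, 3.0, 4.0], gamma=1.0, beta=0.0, running=running)
```

With `gamma=1.0` and `beta=0.0` the training output has approximately zero mean and unit variance; at inference the stored running statistics replace the batch statistics, so a single sample can be normalized on its own.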

Code DAO

Dec 5, 2021 · Artificial Intelligence

Understanding DeepMind’s PonderNet: A Thinkable Network for MNIST

This article explains DeepMind’s PonderNet framework, which lets any neural network allocate computation adaptively, demonstrates its implementation with PyTorch Lightning on the MNIST dataset, details the underlying theory, loss functions, training procedure, and evaluates its pondering behavior on rotated digit experiments.

Adaptive Computation · Deep Learning · MNIST

27 min read

DataFunTalk

Dec 4, 2021 · Artificial Intelligence

Practical Deep Learning Training Tricks: Cyclic LR, Flooding, Warmup, RAdam, Adversarial Training, Focal Loss, Dropout, Normalization and More

This article compiles essential deep learning training techniques—including cyclic learning rates, flooding, warmup, RAdam optimizer, adversarial training, focal loss, dropout, batch/group/weight normalization, label smoothing, Wasserstein GAN, skip connections, and weight initialization—providing concise explanations and code snippets for each method.

Deep Learning · Neural Networks · Regularization

11 min read

Java Captain

Dec 4, 2021 · Artificial Intelligence

Java Spring Boot License Plate Recognition and Training System (Open‑Source)

This open‑source project implements a Spring Boot and Maven based license‑plate detection and training system in Java, leveraging OpenCV and JavaCPP, supporting multiple plate colors, SVM and ANN algorithms, and providing a B/S architecture with SQLite, Swagger documentation, and extensible image‑recognition features.

Computer Vision · Deep Learning · Image Processing

4 min read

Code DAO

Dec 1, 2021 · Artificial Intelligence

Building a Satellite Image Classifier with PyTorch ResNet34

This article walks through creating a satellite image classification pipeline using PyTorch and a pretrained ResNet34 model, covering dataset preparation, project structure, data loading, model definition, training, validation, loss/accuracy plotting, and inference on new images with detailed code examples and results.

Deep Learning · Image Classification · PyTorch

17 min read

DataFunSummit

Nov 29, 2021 · Artificial Intelligence

Horovod Distributed Training Plugin: Design, Usage, and Deadlock Prevention

This article reviews Horovod, a popular third‑party distributed deep‑learning training plugin, explaining its simple three‑line integration, the challenges of deadlocks in all‑reduce operations, and the architectural components—including background threads, coordinators, and MPI/Gloo controllers—that enable scalable and efficient data‑parallel training.

Data Parallel · Deep Learning · Distributed Training

8 min read

21CTO

Nov 27, 2021 · Artificial Intelligence

How Huawei’s “Genius Teen” Scaled AutoML to Millions of Phones

Zhong Zhao, recruited through Huawei’s “genius teen” program at an annual salary of 2.01 million yuan, leveraged AutoML to deploy high‑precision image‑pixel processing algorithms across tens of millions of Mate and P series smartphones, pioneering large‑scale commercial use of AutoML and advancing mobile visual models with dynamic convolution kernels and adversarial data augmentation.

AutoML · Computer Vision · Deep Learning

9 min read

Youzan Coder

Nov 23, 2021 · Mobile Development

Optimizing Mobile Barcode Scanning Performance: From ZXing Tuning to Deep Learning‑Based Barcode Region Detection

By profiling the Youzan app’s ZXing pipeline, eliminating costly image rotation and format conversions, restricting decoding to the two most common 1‑D types, and adding a lightweight deep‑learning barcode‑region detector, scan latency fell from 4.1 s to 1.5 s and success rose from 91 % to 97 %.

Barcode Scanning · Deep Learning · Mobile Optimization

15 min read

Meituan Technology Team

Nov 18, 2021 · Artificial Intelligence

Multi‑Business Product Ranking in Meituan Search: Challenges, Modeling Approaches, and Practical Results

Meituan Search tackles the difficulty of ranking items from diverse business lines by introducing a five‑tower mixed architecture, group‑lasso and feature‑gate selection, a probabilistic graph model, and a joint block‑order/size predictor, achieving notable offline NDCG gains and online CTR and purchase‑rate improvements.

Deep Learning · e‑commerce · feature selection

19 min read

DataFunTalk

Nov 16, 2021 · Artificial Intelligence

InsightFace: Open‑Source 2D/3D Deep Face Analysis Toolbox with PaddlePaddle Support

InsightFace is an open‑source 2D/3D deep face analysis toolbox that implements a variety of detection, alignment and recognition algorithms, now supports PaddlePaddle with out‑of‑the‑box models, high‑throughput distributed training up to 60 million classes, and provides a one‑line demo script for quick testing.

ArcFace · Computer Vision · Deep Learning

3 min read

iQIYI Technical Product Team

Nov 5, 2021 · Artificial Intelligence

iQIYI’s QAV1 Encoder Achieves High Compression and Bandwidth Savings Using AV1 and Deep Learning

iQIYI’s QAV1 encoder, which combines the next‑generation AV1 codec with deep‑learning techniques, delivers 20‑42% bandwidth savings and up to 36% higher compression efficiency than x265 while maintaining ultrafast 60 fps encoding speeds, enabling high‑quality 4K/8K streaming and live broadcast across devices.

AV1 · Bandwidth Reduction · Deep Learning

6 min read

Amap Tech

Nov 4, 2021 · Artificial Intelligence

POI Signboard Image Retrieval: Technical Solution, Model Design, and Future Directions

To efficiently filter unchanged POI signboards, the authors propose a multimodal image‑retrieval system that combines enhanced global and local visual features with BERT‑encoded OCR text, using metric learning and alignment techniques to achieve over 95 % accuracy while handling occlusion, viewpoint variation, and subtle text changes.

Computer Vision · Deep Learning · Multimodal Learning

17 min read

Alibaba Cloud Developer

Nov 4, 2021 · Artificial Intelligence

How AI Powers POI Signboard Image Retrieval for Map Services

This article explains the challenges of POI signboard image retrieval, describes a multimodal deep‑learning solution that combines visual and OCR‑based text features, details data generation, model architecture, loss functions, and presents impressive accuracy improvements and future research directions.

Deep Learning · Multimodal Learning · POI mapping

17 min read

Alimama Tech

Nov 3, 2021 · Artificial Intelligence

Curvature Learning Framework (CurvLearn): A TensorFlow‑Based Library for Non‑Euclidean Deep Learning

CurvLearn is a TensorFlow-based open-source library enabling deep learning on curved manifolds (hyperbolic, spherical, mixed) with manifold implementations, Riemannian operations, optimizers, and distributed training, and it has been applied to recommendation, graph, and NLP tasks while providing custom ANN tools and practical training tips.

Curvature Learning · Deep Learning · Manifold Optimization

13 min read

Baidu Maps Tech Team

Nov 3, 2021 · Artificial Intelligence

How AI Predicts Real-Time Parking Availability Without Sensors

This article explains how Baidu Maps leverages AI and spatio‑temporal big‑data to predict real‑time parking space availability for sensor‑less lots, detailing the overall approach, feature engineering, grid computation, real‑time feature calculation, and a multi‑branch deep learning model validated at KDD.

AI · Deep Learning · Parking Prediction

13 min read

DataFunTalk

Nov 3, 2021 · Artificial Intelligence

Deep Learning for Time‑Series Modeling in Financial Risk Management

This article describes how a financial company leveraged deep‑learning sequence models to automatically extract features from massive time‑series data, improving risk‑assessment models and operational efficiency through a unified framework that includes data preprocessing, embedding, field and item aggregation, and end‑to‑end deployment.

AI · Deep Learning · Modeling

10 min read

AntTech

Oct 29, 2021 · Artificial Intelligence

Ant Insurance Technology and CASIA Win Two Tracks at MuSe2021 Multimodal Sentiment Challenge (ACM MM 2021)

The Ant Insurance Technology team, together with the Institute of Automation of the Chinese Academy of Sciences, secured first place in both the MuSe‑Wilder and MuSe‑Sent tracks of the MuSe2021 Multimodal Sentiment Challenge held at the 29th ACM International Conference on Multimedia in Chengdu, showcasing advanced multimodal AI techniques.

BiLSTM · Deep Learning · MuSe2021

4 min read

YunZhu Net Technology Team

Oct 22, 2021 · Artificial Intelligence

Deep Learning Overview and Introduction to the Lightweight Distributed Inference Engine Avior

This article reviews deep learning and AI frameworks, highlights challenges of online model serving, and presents Avior—a lightweight, distributed inference engine designed for high‑performance AI services, detailing its architecture, layer design, benchmark results, and future development plans.

AI frameworks · Avior · Deep Learning

8 min read

Volcano Engine Developer Services

Oct 20, 2021 · Artificial Intelligence

How ByteDance’s AI Transforms Music Creation and Discovery on TikTok

ByteDance leverages advanced AI models such as SpecTNT, semi‑supervised music tagging transformers, language identification, chord recognition, contrastive representation learning, and source separation to power TikTok’s massive music library, enabling seamless music‑video interaction, smarter recommendations, and new creative tools for creators worldwide.

Audio Processing · Deep Learning · language identification

10 min read

Douyu Streaming

Oct 20, 2021 · Artificial Intelligence

How DeepXi and MHANet Revolutionize Speech Enhancement with Multi‑Head Attention

DeepXi introduces a two‑stage deep learning framework for speech enhancement, using prior SNR estimation and MMSE gain, while the MHANet extension leverages multi‑head attention to model long‑range dependencies, with detailed training strategies, model compression to GRU, deployment via TFLite, and impressive low‑latency results.

Deep Learning · GRU · TFLite

8 min read

DataFunTalk

Oct 16, 2021 · Artificial Intelligence

Feature Extraction and Modeling of Voice and Text Data for Post‑Loan Management

This article presents practical experiences in post‑loan management, detailing how to extract descriptive and deep‑learning features from voice recordings and textual transcripts, apply traditional signal processing, keyword and TF‑IDF methods, and build CRNN and transformer models to predict repayment behavior.

AI · Deep Learning · machine learning

19 min read

Douyu Streaming

Oct 15, 2021 · Artificial Intelligence

How End-to-End Deep Learning Boosts Real-Time Speech Enhancement

An end‑to‑end deep‑learning framework for speech enhancement is presented, detailing dataset creation, time‑domain feature extraction, a convolutional separation network, decoding, and training strategies using SI‑SIR loss with PIT, achieving a final SI‑SIR of 13 dB.

Deep Learning · PIT · SI-SIR

9 min read

Meituan Technology Team

Oct 14, 2021 · Artificial Intelligence

Deep Learning Advances for Click‑Through Rate Prediction in Meituan's Location‑Based Advertising

Meituan's ad team uses deep learning to handle LBS distance constraints and long‑term periodic behavior, introducing DPIN for position/context bias, an ultra‑long sequence encoder with spatiotemporal activator, dynamic candidate generation, and memory‑augmented continual learning, boosting RPM 2‑20% and enabling sub‑millisecond inference.

Advertising · CTR prediction · Deep Learning

29 min read

Cyber Elephant Tech Team

Oct 14, 2021 · Artificial Intelligence

Mastering OCR: From Traditional Techniques to Deep Learning Solutions

This article provides a comprehensive overview of Optical Character Recognition, covering its traditional applications, the evolution to deep learning methods, key datasets, popular tools, and practical strategies for tackling diverse OCR challenges in real-world scenarios.

CRNN · Computer Vision · Datasets

18 min read

DataFunTalk

Oct 11, 2021 · Artificial Intelligence

Full-Chain Linkage Techniques for Alibaba Display Advertising: From Deep Learning to Set Selection

Facing diminishing deep‑learning and compute gains in Alibaba’s display‑ad pipeline, the speaker proposes a full‑chain linkage approach that combines vector‑based recall (PDM), entire‑space pre‑ranking (ESDM), and set‑selection learning‑to‑rank models (LDM, LBDM) to align upstream modules with downstream objectives, yielding 8‑10% revenue growth.

Deep Learning · full-chain optimization · machine learning

28 min read

DataFunTalk

Oct 4, 2021 · Artificial Intelligence

Exploring Multi-Objective Recommendation Algorithms for 58 Community: Cross-Domain Embedding and Online Optimization

This article details how 58 Community improved content value share, click‑through, and user retention by designing a generalized multi‑objective recommendation algorithm that leverages cross‑domain embeddings, DeepFM‑DIN models, EGES‑inspired pre‑training, and online CEM‑based parameter optimization.

CEM · Deep Learning · User Retention

16 min read

21CTO

Oct 2, 2021 · Artificial Intelligence

How PyTorch Lightning Can Make Your Deep Learning Pipeline 10× Faster

This article explains six practical techniques—parallel data loading, distributed multi‑GPU training, mixed precision, early stopping, sharded training, and inference optimizations—using PyTorch Lightning to dramatically accelerate deep‑learning pipelines, turning days‑long experiments into minute‑scale runs.

Deep Learning · GPU · PyTorch Lightning

7 min read

360 Smart Cloud

Sep 30, 2021 · Artificial Intelligence

Understanding Computational Graphs and Automatic Differentiation for Neural Networks

This article explains how computational graphs can represent arbitrary neural networks, describes forward and reverse propagation, details the implementation of automatic differentiation with Python and NumPy, and demonstrates building and training a multilayer fully‑connected network on the MNIST dataset using custom graph nodes and optimizers.

Computational Graph · Deep Learning · Neural Networks

29 min read
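
The idea summarized above, a graph of nodes evaluated forward and then differentiated in reverse, can be illustrated with a very small scalar sketch. This is not the article's NumPy implementation; the `Node` class and the `backward` function are hypothetical names introduced only for this example, and only addition and multiplication are supported.

```python
class Node:
    """A scalar value in a computational graph."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # tuple of (parent_node, local_gradient) pairs
        self.grad = 0.0

    def __add__(self, other):
        # d(a + b)/da = 1 and d(a + b)/db = 1
        return Node(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        # d(a * b)/da = b and d(a * b)/db = a
        return Node(self.value * other.value,
                    ((self, other.value), (other, self.value)))


def backward(output):
    """Reverse-mode pass: accumulate d(output)/d(node) into every node.grad."""
    order, seen = [], set()

    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)  # parents are appended before their children

    visit(output)
    output.grad = 1.0
    for node in reversed(order):  # children are processed before parents
        for parent, local in node.parents:
            parent.grad += local * node.grad  # chain rule


x, y = Node(2.0), Node(3.0)
z = x * y + x  # forward pass
backward(z)    # x.grad is y + 1 = 4.0, y.grad is x = 2.0
```

Processing nodes in reverse topological order ensures each node's gradient is complete before it is propagated to its parents, which is what lets `x` correctly receive contributions from both paths.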

Kuaishou Large Model

Sep 30, 2021 · Artificial Intelligence

How SnowflakeNet Revolutionizes Point Cloud Completion with Skip‑Transformer

SnowflakeNet introduces a novel Snowflake Point Deconvolution architecture combined with a Skip‑Transformer to explicitly split and refine points, enabling high‑quality reconstruction of fine local geometry in incomplete point clouds and outperforming prior methods on both dense and sparse benchmarks.

3D vision · Deep Learning · Skip-Transformer

11 min read

DataFunTalk

Sep 29, 2021 · Artificial Intelligence

Self‑Supervised Learning and Contrastive Learning for Computer Vision and OCR Applications

This article reviews self‑supervised learning techniques, common computer‑vision pretext tasks, contrastive loss functions, popular frameworks such as SimCLR, MoCo and SimSiam, and demonstrates their application to OCR captcha recognition with detailed implementation and experimental results.

Computer Vision · Deep Learning · OCR

22 min read

DataFunTalk

Sep 27, 2021 · Artificial Intelligence

Transfer Learning for Financial Risk Control: Theory, Methods, and Empirical Evaluation

This article introduces the fundamentals of transfer learning, explains its theoretical foundations and formulas, and demonstrates how multi‑task learning and domain‑adaptation techniques are applied to financial risk‑control scenarios to overcome label scarcity, distribution shift, and model complexity challenges, presenting detailed experimental results and analysis.

Deep Learning · Model Evaluation · domain adaptation

17 min read

Laiye Technology Team

Sep 24, 2021 · Artificial Intelligence

Self‑Supervised Learning and Contrastive Methods for Computer Vision and OCR Applications

This article surveys self‑supervised learning techniques for computer‑vision tasks, explains common pretext tasks and contrastive loss designs, reviews representative models such as SimCLR, MoCo, SwAV and SimSiam, and demonstrates their practical impact on a captcha‑OCR system with measurable accuracy gains.

Computer Vision · Deep Learning · OCR

23 min read

Python Crawling & Data Mining

Sep 24, 2021 · Artificial Intelligence

How to Build a 3D CNN for CT Scan Classification with TensorFlow

This tutorial walks through constructing, training, and evaluating a 3D convolutional neural network in TensorFlow to classify CT scans for viral pneumonia, covering data preprocessing, dynamic learning rates, early stopping, and single‑scan prediction with full code examples.

3D CNN · CT scan classification · Deep Learning

15 min read

Kuaishou Tech

Sep 17, 2021 · Artificial Intelligence

SnowflakeNet: Point Cloud Completion by Snowflake Point Deconvolution with Skip-Transformer

SnowflakeNet introduces a novel Snowflake Point Deconvolution architecture combined with a Skip-Transformer to progressively split seed points, enabling high‑quality point‑cloud completion that preserves fine‑grained geometric details such as smooth surfaces, sharp edges, and corners across dense and sparse datasets.

3D reconstruction · Computer Vision · Deep Learning

10 min read

Meituan Technology Team

Sep 9, 2021 · Artificial Intelligence

GPU Optimization Practices for CTR Models at Meituan

Meituan accelerates CTR model inference by fusing operators with TVM, optimizing CPU‑GPU data transfers, manually tuning high‑frequency subgraphs, and dynamically offloading workloads. On Tesla T4 GPUs this yields up to ten‑fold throughput gains, with latency staying stable and rising only modestly beyond 128 QPS; compilation remains slow, and large‑model support still needs improvement.

CTR · Deep Learning · GPU

16 min read

Architects Research Society

Sep 6, 2021 · Artificial Intelligence

Comparison of Deep Learning Software Frameworks

This article provides an overview of deep learning as a branch of artificial intelligence and presents detailed tables comparing numerous deep‑learning software frameworks and libraries, covering their creators, release dates, licenses, platforms, languages, APIs, and support for parallelism and hardware acceleration.

Artificial Intelligence · Deep Learning · frameworks

8 min read

360 Quality & Efficiency

Sep 3, 2021 · Artificial Intelligence

Model‑Based Audio Denoising Using Deep Learning for Device Quality Evaluation

This article presents a deep‑learning approach that transforms recorded audio into spectrograms, trains a noise‑prediction network (e.g., ResNet, U‑Net, LSTM) to estimate environmental noise, subtracts it in the frequency domain, and reconstructs a cleaner signal for more accurate audio‑device quality assessment.

Deep Learning · Model Training · STFT

11 min read

360 Smart Cloud

Aug 31, 2021 · Artificial Intelligence

Understanding Convolution, Convolutional Neural Networks, and Their Implementation in Image Processing

This article explains the mathematical concept of 2‑D convolution, demonstrates its use for image filtering with examples such as blurring and Sobel edge detection, introduces artificial neural networks and back‑propagation, and details the design, training, and performance of convolutional neural networks for tasks like Sobel filter learning and MNIST digit recognition, including full Python code examples.

CNN · Convolution · Deep Learning

25 min read
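
As a small illustration of the 2‑D convolution and Sobel edge detection mentioned in this summary, the sketch below applies a horizontal Sobel kernel to a tiny image in plain Python. It is not the article's code; the `conv2d` helper, the "valid" (no padding) output size, and the 4×4 test image are assumptions made for the example.

```python
def conv2d(image, kernel):
    """'Valid' 2-D convolution of a list-of-lists image with a small kernel."""
    kh, kw = len(kernel), len(kernel[0])
    # True convolution flips the kernel in both axes before sliding it.
    flipped = [row[::-1] for row in kernel[::-1]]
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * flipped[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

# Horizontal-gradient Sobel kernel: responds to vertical edges.
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

# Dark left half, bright right half: one vertical edge in the middle.
image = [[0, 0, 1, 1] for _ in range(4)]
edges = conv2d(image, SOBEL_X)  # [[-4, -4], [-4, -4]]

# A constant image has no edges, so every response is zero.
flat = conv2d([[1] * 4 for _ in range(4)], SOBEL_X)  # [[0, 0], [0, 0]]
```

Because the kernel is flipped, the response to a dark‑to‑bright edge is negative here; many deep‑learning libraries skip the flip (cross‑correlation), which would give the opposite sign.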

DataFunSummit

Aug 28, 2021 · Artificial Intelligence

Evolution of Alibaba’s Advertising Prediction Models: From Linear Regression to Deep Interest Evolution Networks

This article reviews the characteristics of e‑commerce personalized prediction, traces Alibaba’s advertising CTR model evolution from large‑scale logistic regression through deep learning architectures such as DIN and CrossMedia, and discusses future research directions like representation learning and white‑box modeling.

CTR prediction · Deep Learning · E‑commerce

13 min read

dbaplus Community

Aug 28, 2021 · Artificial Intelligence

Is AI Really Intelligent? Exploring Machine Learning, Neural Networks & Deep Learning

The article demystifies AI by explaining that current artificial intelligence is merely automated computation, then walks through fundamental machine‑learning concepts such as exhaustive search, linear regression, neural‑network neurons, activation functions, network structures, training calculations, and concludes with a Python implementation of a three‑layer neural network.

AI · Deep Learning · Neural Networks

15 min read

Python Programming Learning Circle

Aug 27, 2021 · Artificial Intelligence

An Introduction to JAX: Features, Installation, and Comparison with TensorFlow and PyTorch

This article introduces Google’s JAX library, covering its origins, core features such as automatic differentiation, JIT compilation, parallel and vectorized mapping, installation steps, code examples, and a comparative overview with TensorFlow and PyTorch for deep‑learning practitioners.

Deep Learning · GPU · JAX

11 min read

Python Programming Learning Circle

Aug 23, 2021 · Artificial Intelligence

Efficient PyTorch Training Pipeline: Tips, Profiling, and Multi‑GPU Strategies

This article presents practical strategies for building high‑performance PyTorch training pipelines, covering bottleneck identification, efficient data loading, RAM‑based datasets, profiling tools, multi‑GPU training with DataParallel and DistributedDataParallel, custom loss implementation, and hardware‑vs‑software trade‑offs to accelerate deep‑learning workloads.

Custom Loss · DataLoader · Deep Learning

13 min read

Liangxu Linux

Aug 17, 2021 · Cloud Native

How to Enable GPU Acceleration in Docker on Linux

This guide walks you through installing NVIDIA drivers, CUDA, and nvidia-docker2 on a Linux host, configuring Docker to access the GPU, and verifying the setup with commands and sample TensorFlow/PyTorch code, enabling deep‑learning workloads inside containers.

CUDA · Deep Learning · Docker

7 min read

Alimama Tech

Aug 11, 2021 · Artificial Intelligence

Dynamic Descriptive Model: A Scalable Paradigm for High‑Quality Native Creative Generation

The Dynamic Descriptive Model (DDM) introduces a scalable pipeline that automatically harvests product assets, perceives their visual attributes, encodes designers’ expertise in an extended SVG‑based descriptive language, and generates high‑quality, native‑looking ad creatives at massive scale, delivering 5‑80 % CTR gains and tens of millions of daily outputs.

AI · Advertising · Computer Vision

13 min read

DataFunTalk

Aug 10, 2021 · Artificial Intelligence

Practical Deep Learning Tricks: Cyclic LR, Flooding, Warmup, RAdam, Adversarial Training, Focal Loss, Dropout, Normalization, ReLU, Group Normalization, Label Smoothing, Wasserstein GAN, Skip Connections, Weight Initialization

This article presents a concise collection of practical deep‑learning techniques—including cyclic learning‑rate, flooding, warmup, RAdam, adversarial training, focal loss, dropout, various normalization methods, ReLU, group normalization, label smoothing, Wasserstein GAN, skip connections, and weight initialization—along with code snippets and references for implementation.

Deep Learning · GAN · Regularization

8 min read

iQIYI Technical Product Team

Aug 6, 2021 · Artificial Intelligence

I2UV-HandNet: High‑Fidelity 3D Hand Mesh Reconstruction from Monocular RGB Images

I2UV-HandNet reconstructs high-fidelity 3D hand meshes from a single RGB image using an AffineNet encoder‑decoder to predict coarse UV maps and an SRNet super‑resolution module, trained on the SuperHandScan dataset, achieving real‑time performance and state‑of‑the‑art benchmark results, and targeting integration into next‑generation VR headsets without external controllers.

3D mesh · Computer Vision · Deep Learning

11 min read

DataFunTalk

Aug 4, 2021 · Artificial Intelligence

Deep Learning Practices for Personalized Recommendation in a Cultural Artifact Auction Platform

This article presents a comprehensive case study of applying deep learning techniques—including item and user embedding, cross‑domain keyword intent modeling, and multi‑interest representation—to improve the recall stage of personalized recommendation for a cultural‑artifact auction platform, addressing unique data sparsity and diversity challenges.

Deep Learning · Embedding · cross-domain learning

16 min read

Baidu Geek Talk

Aug 4, 2021 · Artificial Intelligence

PaddleOCR v2.2 Release: PP-Structure for Document Layout Analysis and Table Recognition

PaddleOCR v2.2 launches PP‑Structure, a Python‑installable toolkit that combines PP‑YOLO v2 layout analysis (classifying text, title, table, image, list) with RARE‑based table recognition to extract structured content and export editable Excel files, while supporting custom training and simple command‑line use.

AI · Deep Learning · PP-Structure

8 min read

TiPaiPai Technical Team

Aug 2, 2021 · Artificial Intelligence

How Attention Boosts Text Recognition: From CNN‑Seq2Seq to Multi‑Scale Models

This article explains how attention mechanisms are applied to text recognition, covering the basic CNN‑Seq2Seq‑Attention architecture, multi‑scale attention extensions, and a 2D attentional irregular scene text recognizer with detailed network components, training loss, and experimental results.

CNN · Computer Vision · Deep Learning

8 min read

ByteFE

Aug 2, 2021 · Artificial Intelligence

An Overview of Artificial Intelligence, Machine Learning, and Neural Networks

This article provides a beginner‑friendly overview of artificial intelligence, its relationship with machine learning, the four major learning paradigms—supervised, unsupervised, semi‑supervised and reinforcement learning—along with a historical sketch of neural networks, their training workflow, loss functions, back‑propagation, and parameter‑update mechanisms, while also containing a brief recruitment notice.

Artificial Intelligence · Deep Learning · Neural Networks

18 min read

Ctrip Technology

Jul 29, 2021 · Artificial Intelligence

NLP Techniques for Classifying Ctrip Ticket Customer Service Conversations

This article presents the background, problem analysis, data preprocessing, modeling approaches and optimization results of applying various NLP methods—including statistical models, word embeddings, attention mechanisms and pretrained language models such as BERT—to improve the accuracy of classifying Ctrip ticket customer service dialogues.

BERT · Deep Learning · NLP

13 min read

DataFunTalk

Jul 24, 2021 · Artificial Intelligence

Instant Interest Reinforcement and Extension for Taobao Detail Page Distribution

This article presents the mechanisms of Taobao’s detail‑page full‑network distribution, introducing background, scenario description, and a series of algorithmic explorations—including CIDM, DTIN, and Tri‑tower models—that leverage the main product (trigger) to reinforce users’ instant interests, improve recall, coarse‑ranking, and fine‑ranking performance, and achieve notable online metric gains.

CTR · Deep Learning · Modeling

0 likes · 17 min read

Python Programming Learning Circle

Jul 22, 2021 · Artificial Intelligence

Understanding PyTorch’s Dynamic Autograd System: Variable, Function, and Engine

This article explains how PyTorch implements its dynamic autograd graph using the core C++ classes Variable, Function, and Engine, detailing their initialization, inheritance hierarchy, code structures, and the execution flow of backward propagation.

Artificial Intelligence · Autograd · Deep Learning

0 likes · 13 min read

DeWu Technology

Jul 18, 2021 · Artificial Intelligence

Deep Learning Techniques for Sentiment Analysis

The article explains how deep‑learning models, particularly convolutional neural networks with token‑level padding, kernel size three, and max‑pooling, can automatically classify e‑commerce product reviews into eight sentiment categories, offering scalable insight for decision‑making and paving the way for recommendation, QA, and risk‑assessment applications.

Deep Learning · Sentiment Analysis · convolutional neural network

0 likes · 9 min read

DataFunTalk

Jul 17, 2021 · Artificial Intelligence

Multi-Objective Modeling for CRM Opportunity Smart Allocation: Iterative Deep Learning Solutions

This article describes the evolution of a multi‑objective deep‑learning framework for automatically assigning CRM opportunities to salespeople, detailing five model versions—from an XGBoost baseline with sample weighting to advanced PLE‑based architectures—while reporting offline and online performance gains in both call‑out and connection‑out conversion rates.

A/B testing · CRM · Deep Learning

0 likes · 33 min read

Architects' Tech Alliance

Jul 16, 2021 · Artificial Intelligence

AI Chip Landscape: GPUs, FPGAs, and ASICs for Deep Learning

The article explains how artificial intelligence relies on algorithms, compute and data, compares engineering and simulation methods, and details the roles, architectures, performance and energy characteristics of GPUs, FPGAs, and ASICs as the primary hardware accelerators for modern deep‑learning applications.

ASIC · Artificial Intelligence · Chip Design

0 likes · 14 min read

Kuaishou Tech

Jul 16, 2021 · Artificial Intelligence

Bagua: An Open‑Source Distributed Training Framework for Deep Learning

Bagua is a distributed training framework co‑developed by Kuaishou and ETH Zürich that combines algorithmic and system‑level optimizations—such as decentralized, asynchronous, and compressed communication—to achieve up to 60% higher performance than existing frameworks like PyTorch‑DDP, Horovod, and BytePS across various AI workloads.

Bagua · Deep Learning · Distributed Training

0 likes · 15 min read

DataFunTalk

Jul 10, 2021 · Artificial Intelligence

Multi‑Business Ranking Modeling and Optimization in Meituan Search

This article presents Meituan's multi‑business search ranking system, describing the challenges of mixed‑business queries, the layered architecture, the evolution of multi‑business quota models (MQM‑V1/V2) and multi‑business ranking networks (MBN‑V1‑V4), experimental results, and future research directions.

Deep Learning · Meituan · multi‑business modeling

0 likes · 16 min read

MaGe Linux Operations

Jul 8, 2021 · Artificial Intelligence

TensorFlow 2.x vs PyTorch 1.8: Which AI Framework Wins in 2021?

An in‑depth comparison of TensorFlow 2.x and PyTorch 1.8 highlights new features, deployment options like TensorFlow Lite and PyTorch Mobile, coding style differences, and practical guidance on choosing the right deep‑learning library for various projects and skill levels.

Deep Learning · PyTorch · TensorFlow

0 likes · 6 min read

Python Programming Learning Circle

Jul 6, 2021 · Artificial Intelligence

Understanding ResNet and Building It from Scratch with PyTorch

This article explains the motivation behind residual networks, describes the architecture of ResNet including residual blocks and skip connections, lists available Keras implementations, and provides a step‑by‑step PyTorch tutorial with complete code to construct and test ResNet‑50/101/152 models.

CNN · Deep Learning · PyTorch

0 likes · 10 min read

Python Programming Learning Circle

Jul 3, 2021 · Artificial Intelligence

Automatic PDF Slide Transcription Using Deep Learning OCR

This article demonstrates how to automatically convert PDF slide decks into editable Markdown text by first converting each page to an image, then applying a deep‑learning OCR pipeline (CTPN for detection and CRNN for recognition) with Python code examples, achieving high transcription accuracy.

Deep Learning · Image Processing · OCR

0 likes · 6 min read

TiPaiPai Technical Team

Jul 2, 2021 · Artificial Intelligence

How Graph Neural Networks Revolutionize Arbitrary‑Shaped Text Detection

This article reviews two recent computer‑vision approaches—DRRG and STKM—that combine CNN backbones with graph‑based relational reasoning and self‑attention to achieve state‑of‑the‑art detection of arbitrarily shaped text in images.

CNN · Computer Vision · Deep Learning

0 likes · 11 min read

TiPaiPai Technical Team

Jul 2, 2021 · Artificial Intelligence

How ContourNet and CenterNet Revolutionize Text Detection

This article explains the challenges of scene text detection and introduces two state‑of‑the‑art models, ContourNet and CenterNet, detailing their architectural innovations, loss functions, and how they overcome issues like extreme aspect ratios and anchor‑based inefficiencies.

CenterNet · Computer Vision · ContourNet

0 likes · 7 min read

21CTO

Jun 28, 2021 · Artificial Intelligence

How Multimodal AI Detects Pornographic Videos: Image & Audio Fusion Explained

This article outlines a multimodal AI framework for detecting pornographic video content by combining image and audio analysis. It covers the challenges of visual and speech-based recognition, describes the DCNet and RANet model architectures and fusion strategies, and reports an accuracy of 93.4% on a 3,000-sample test set.

AI · Audio Classification · Deep Learning

0 likes · 5 min read

TiPaiPai Technical Team

Jun 28, 2021 · Artificial Intelligence

How Deep Learning Unwarps Twisted Document Images: DocUNet & DewarpNet Explained

This article reviews two end‑to‑end deep‑learning approaches—DocUNet (CVPR 2018) and DewarpNet (ICCV 2019)—for correcting warped document images, detailing their network architectures, synthetic data generation, loss functions, experimental results, and the remaining challenges in document dewarping.

Computer Vision · Deep Learning · Image Processing

0 likes · 14 min read

Python Crawling & Data Mining

Jun 27, 2021 · Artificial Intelligence

How SQLFlow Turns Simple SQL Queries into Powerful AI Models

SQLFlow is an open‑source platform that lets users build and run machine‑learning and deep‑learning models directly from SQL statements, lowering the barrier for business analysts to apply AI by abstracting complex pipelines into familiar database queries.

Artificial Intelligence · Deep Learning · SQLFlow

0 likes · 8 min read

Alimama Tech

Jun 24, 2021 · Artificial Intelligence

One‑Stage Training for Generative Adversarial Networks (OSGAN): Methodology and Efficiency Analysis

The OSGAN method, introduced by Alibaba’s Alimama team and Prof. Song Ming‑Li, merges generator and discriminator updates into a single stage, speeding up GAN training by roughly 1.5‑1.7× while maintaining performance. It is validated on symmetric and asymmetric DCGANs, and the code is open source.

Computer Vision · Deep Learning · GAN

0 likes · 10 min read

Tencent Advertising Technology

Jun 22, 2021 · Artificial Intelligence

Technical Insights and Solution Strategies from the Tencent Advertising Algorithm Competition – Video Ad Track

The article outlines the Tencent Advertising Algorithm Competition’s video ad challenge, details the paper submission guidelines, and shares a participant’s step‑by‑step technical approach—including baseline experiments, model re‑implementation with Paddle, multimodal feature extraction, optimizer choices, and future improvement directions—providing practical AI insights for multimedia video classification.

Deep Learning · Multimodal Learning · Tencent competition

0 likes · 7 min read

Baidu Geek Talk

Jun 21, 2021 · Artificial Intelligence

Detecting Pornographic Videos with Dual‑Modal AI: Images + Audio

This article presents a technical overview of a multimodal AI framework that combines image and audio analysis to identify pornographic video content, detailing model architectures, feature extraction methods, and experimental results achieving 93.4% accuracy on a 3,000‑sample test set.

Audio Analysis · Deep Learning · image recognition

0 likes · 6 min read

TiPaiPai Technical Team

Jun 17, 2021 · Artificial Intelligence

From Pixels to Words: The Evolution and Challenges of Text Detection

This article traces the origins, unique difficulties, method classifications, and current advancements of scene text detection, highlighting how AI has enabled computers to read images and the ongoing research to improve accuracy, speed, and multilingual support.

AI · Computer Vision · Deep Learning

0 likes · 8 min read

JD Tech

Jun 17, 2021 · Artificial Intelligence

MTrajRec: Map-Constrained Trajectory Recovery via Seq2Seq Multi‑Task Learning

The paper introduces MTrajRec, a Seq2Seq multi‑task learning framework that simultaneously restores low‑sampling‑rate GPS trajectories to high‑sampling‑rate and aligns them to the road network, achieving more accurate and efficient trajectory recovery for downstream applications such as navigation and travel‑time estimation.

Deep Learning · KDD 2021 · Seq2Seq

0 likes · 9 min read

DataFunTalk

Jun 12, 2021 · Artificial Intelligence

An Introduction to Machine Learning: Concepts, Learning Path, and Knowledge System

This article provides a comprehensive overview of machine learning, explaining core AI terminology, distinguishing statistics, statistical learning, and machine learning, outlining a three‑part learning roadmap covering mathematical foundations, algorithms, and Python programming practice, and offering curated resources for building a solid knowledge system.

AI fundamentals · Deep Learning · learning roadmap

0 likes · 8 min read

JD Tech

Jun 12, 2021 · Artificial Intelligence

DeepDualMapper: A Gated Fusion Network for Automatic Map Extraction Using Aerial Images and Trajectories

The paper presents DeepDualMapper, a gated‑fusion deep network that combines aerial imagery and vehicle trajectory data to automatically generate high‑precision maps, detailing its architecture, gated and refinement modules, and experimental validation on three city datasets.

Deep Learning · Trajectory Data · U-Net

0 likes · 7 min read

Meituan Technology Team

Jun 10, 2021 · Artificial Intelligence

Deep Position-wise Interaction Network for CTR Prediction

The Meituan team introduces DPIN, a three‑module deep network that jointly models ads and their positions to mitigate position bias in CTR prediction, achieving up to 2.98% AUC improvement, 2.25% higher CTR and 2.15% RPM gains while keeping latency modest, and is applicable to broader ranking tasks.

Advertising · CTR prediction · DPIN

0 likes · 24 min read

Xianyu Technology

Jun 9, 2021 · Artificial Intelligence

Applying Visual AI Techniques for Image Quality and Duplicate Detection in Xianyu Marketplace

By deploying large‑scale visual AI—including a ResNet‑101 classifier, ArcFace‑trained matching features, clustering‑based sub‑category refinement, and product‑level image indexing—Xianyu’s marketplace dramatically improves image quality, removes duplicates, enhances search relevance and feed diversity, and filters non‑compliant content.

Computer Vision · Deep Learning · Image Classification

0 likes · 16 min read

WeChat Backend Team

Jun 7, 2021 · Artificial Intelligence

How WeChat’s TFCC Boosts Deep Learning Inference Performance Across Platforms

The TFCC framework, developed by WeChat's backend team, delivers high‑performance, easy‑to‑use, and universal deep‑learning inference by supporting numerous ONNX and TensorFlow operations, optimizing model structures, constants, and operators, and providing a versatile runtime and math library for both CPU and GPU platforms.

Deep Learning · Framework · Inference

0 likes · 8 min read

58 Tech

Jun 4, 2021 · Artificial Intelligence

Architecture and Evolution of the 58 Intelligent Q&A Chatbot System

This article details the design, iterative development, and performance optimizations of 58's AI‑driven intelligent Q&A chatbot, covering its overall three‑layer architecture, the QABot, TaskBot, and answer‑recommendation modules, as well as dynamic strategy adjustment, caching mechanisms, and real‑world deployment results.

AI · Chatbot · Deep Learning

0 likes · 16 min read

Alibaba Cloud Native

Jun 3, 2021 · Artificial Intelligence

How Weibo Boosted Deep Learning Training Speed 18× with Fluid and JindoRuntime

Weibo’s deep learning platform suffered severe latency and stability issues when reading massive small‑file datasets through a compute‑storage‑separated architecture. The team adopted the CNCF Fluid project with JindoRuntime to build a distributed cache exposed through POSIX interfaces, which improved data locality, reduced HDFS load, and delivered up to 18‑fold training speedups while raising success rates from 37% to 98%.

Data Caching · Deep Learning · Distributed Training

0 likes · 15 min read

Tencent Music Tech Team

Jun 1, 2021 · Artificial Intelligence

TDQA: A No-Reference Deep Learning Based Video Quality Assessment Algorithm for Live Streaming

TDQA is a no‑reference, deep‑learning video quality assessment algorithm designed for live‑streaming, built on a large subjectively annotated dataset and an end‑to‑end architecture with fine‑tuned backbones, achieving state‑of‑the‑art accuracy and sub‑second inference for real‑time quality monitoring and pipeline optimization.

Deep Learning · Model Training · No-Reference

0 likes · 15 min read

DataFunTalk

May 31, 2021 · Artificial Intelligence

Intelligent Transportation Search Ranking: From Business Rules to Personalized Ranking Models

This article presents the challenges of travel‑related product search, explains why traditional rule‑based sorting is insufficient, and describes how Alibaba’s Fliggy team built a deep‑learning based personalized ranking system—including architecture, model variants, experimental results, and future optimization directions—to improve conversion rates for flight and ticket searches.

AI · Deep Learning · Ranking Models

0 likes · 9 min read

Architects Research Society

May 30, 2021 · Artificial Intelligence

Artificial Intelligence vs. Machine Learning: Definitions, History, and Key Differences

This article explains the origins, definitions, and evolving relationship between artificial intelligence and machine learning, highlighting their historical milestones, core concepts, and how modern applications like deep learning, neural networks, and recommendation systems illustrate their intertwined development.

AI · Deep Learning · Definitions

0 likes · 8 min read

Python Programming Learning Circle

May 29, 2021 · Artificial Intelligence

Comparing PyTorch 1.8 and TensorFlow 2.5: New Features, Use Cases, and Choosing the Right Framework

This article reviews the latest releases of PyTorch 1.8 and TensorFlow 2.5, outlining their new functionalities, ecosystem tools such as TensorFlow.js, TensorFlow Lite, and TFX, as well as PyTorch Mobile and Lightning, and provides guidance on selecting the most suitable framework for different deep‑learning projects.

Artificial Intelligence · Deep Learning · PyTorch

0 likes · 7 min read

Kuaishou Tech

May 29, 2021 · Artificial Intelligence

Speaker-Aware Module for Single-Sample Voice Conversion (SAVC)

The paper presents a speaker‑aware module (SAM) that enables high‑quality voice conversion using only a single utterance of the target speaker, addressing the small‑data challenge in speech timbre transfer and achieving state‑of‑the‑art performance on the Aishell‑1 benchmark.

Deep Learning · LPCNet · Speech synthesis

0 likes · 12 min read

Cyber Elephant Tech Team

May 26, 2021 · Artificial Intelligence

Can GANs Eliminate Motion Blur? A Deep Learning Approach to Image Deblurring

This article reviews a GAN‑based deep learning method for removing motion blur from images, covering the problem definition, related work, the multi‑scale generator and discriminator architecture, loss functions, the GoPro dataset, and experimental results that demonstrate clear visual improvements.

Computer Vision · Deep Learning · GAN

0 likes · 11 min read

Kuaishou Tech

May 24, 2021 · Artificial Intelligence

BCNet: A Bilayer Instance Segmentation Network for Occlusion‑Aware Object Detection

The paper proposes BCNet, a lightweight bilayer instance segmentation network that explicitly models occluder and occludee relationships by treating each region of interest as two overlapping layers, achieving significant performance gains on COCO, COCOA and KINS datasets under heavy occlusion.

Computer Vision · Deep Learning · bilayer network

0 likes · 10 min read

Alimama Tech

May 20, 2021 · Artificial Intelligence

How Alibaba’s AI Powers Brand Risk Detection: Models, Data, and Results

This article details Alibaba's Alimama brand risk identification system, covering the challenges of counterfeit detection, the construction of large‑scale brand datasets, the design of classification, logo detection, and variation models, their optimization, evaluation metrics, and future directions for AI‑driven brand protection.

AI · Alibaba · Computer Vision

0 likes · 22 min read

Kuaishou Tech

May 17, 2021 · Industry Insights

How Kuaishou Delivered Real‑Time Deep‑Learning Voice Conversion on PC

Kuaishou becomes the first company to deploy a deep‑learning‑based real‑time voice‑conversion system on PC clients, delivering stable, natural‑sounding transformed speech with sub‑200 ms latency, and the article analyzes industry methods, technical challenges, and the four‑module architecture that made it possible.

Audio Processing · Deep Learning · Industry Insight

0 likes · 10 min read

Tencent Tech

May 13, 2021 · Artificial Intelligence

Seeing Inside the Black Box: Visualizing Neural Network Training and Adversarial Threats

This article explains how neural networks work, walks through the step‑by‑step training process of a convolutional model, showcases vivid visualizations of each layer, and demonstrates how tiny adversarial perturbations can dramatically alter predictions, highlighting the importance of AI security.

AI security · CNN visualization · Deep Learning

0 likes · 6 min read

Kuaishou Tech

May 10, 2021 · Artificial Intelligence

Semantic Image Matting: Integrating Alpha Pattern Semantics into the Matting Framework

The article presents Semantic Image Matting, a novel approach that incorporates 20 semantic Alpha pattern categories into the matting pipeline via semantic Trimap, region‑based classifiers, multi‑class discriminators, and learnable gradient loss, achieving state‑of‑the‑art results on multiple benchmarks.

Computer Vision · Deep Learning · alpha patterns

0 likes · 11 min read

DeWu Technology

Apr 30, 2021 · Artificial Intelligence

Deep Learning Based Image Aesthetic Quality Assessment

The paper presents a deep‑learning approach that uses an ImageNet‑pretrained CNN to predict full human rating distributions for images via an Earth Mover’s Distance loss, trained on the AVA dataset, and demonstrates accurate assessment of aesthetic factors such as tone, contrast, and composition.

AVA dataset · CNN · Deep Learning

0 likes · 8 min read

JD Cloud Developers

Apr 30, 2021 · Artificial Intelligence

How Face Keypoint Localization Advances Under Masked Conditions: Insights from JD AI’s 3rd Competition

The JD AI Institute and ICME 2021 concluded their third face keypoint localization contest, emphasizing efficient masked‑face detection to aid COVID‑19 contact tracing, attracting top universities and tech firms, expanding data scale, and tightening model efficiency constraints to push the field forward.

AI competition · Computer Vision · Deep Learning

0 likes · 4 min read

DataFunTalk

Apr 29, 2021 · Artificial Intelligence

Path‑based Deep Network (PDN) for E‑commerce Recommendation Recall

This paper proposes a Path‑based Deep Network (PDN) that combines similarity‑index and embedding‑based retrieval paradigms to model user‑item interactions via Trigger Net and Similarity Net, achieving significant improvements in click‑through rate, GMV, and diversity on Taobao’s homepage feed.

Deep Learning · Embedding · PDN

0 likes · 21 min read

Tencent Music Tech Team

Apr 26, 2021 · Artificial Intelligence

Tencent Music Multimedia R&D Center Announces Acceptance of Papers on Large-Scale Singer Recognition and Audio Embeddings at IJCNN and ICASSP 2021

Tencent Music’s Multimedia R&D Center celebrated its first appearances at IJCNN and ICASSP 2021 by having two papers accepted—one presenting large‑scale singer recognition via deep metric learning and the other describing user‑driven audio embeddings for content‑based music recommendation—highlighting the team’s expanding expertise across diverse music‑recognition technologies and future research directions.

Audio Embedding · Deep Learning · ICASSP

0 likes · 8 min read

360 Quality & Efficiency

Apr 23, 2021 · Artificial Intelligence

Deep Learning for Code Analysis: Workflow, Program Representation, Code2vec Architecture, and Limitations

This guide examines how deep learning techniques are applied to large‑scale code analysis, covering the technical workflow, program representations such as token sequences and AST paths, the code2vec architecture, its advantages, current limitations, and potential applications like code summarization and similarity detection.

AI for software engineering · Deep Learning · code analysis

0 likes · 9 min read

360 Quality & Efficiency

Apr 16, 2021 · Artificial Intelligence

Applying YOLOv5 Object Detection for Black, Color, and Blank Screen Classification in Video Frames

This article presents a method that replaces manual visual inspection with an automated YOLOv5‑based object detection pipeline to classify video frames as normal, colorful, or black screens, detailing data annotation, training, loss calculation, inference code, and showing a 97% accuracy improvement over ResNet.

Computer Vision · Deep Learning · Image Classification

0 likes · 11 min read

58UXD

Apr 12, 2021 · Artificial Intelligence

How 58.com Built an AI Designer: From Smart Cutout to Intelligent Creative Platform

This article chronicles 58.com’s journey from a small brainstorming room to a full‑scale AI design platform, detailing the development of smart cutout, the BASNet segmentation model, custom loss functions, template editing, and the measurable business impact of the AI designer.

AI design · BASNet · Computer Vision

0 likes · 15 min read

DataFunTalk

Apr 12, 2021 · Artificial Intelligence

Comprehensive Survey of Graph Neural Networks: 15 Key Review Papers and Resources

This article compiles and summarizes fifteen influential survey papers on Graph Neural Networks, covering their models, applications, datasets, benchmarks, challenges, and future directions, while providing links to the original PDFs and highlighting distinctions between small and large-scale graph learning.

Deep Learning · graph learning · machine learning

0 likes · 20 min read

Youku Technology

Apr 8, 2021 · Artificial Intelligence

Champion Solution of Media AI Alibaba Entertainment Video Object Segmentation Challenge

The Youku AI team won the Media AI Alibaba Entertainment Video Object Segmentation Challenge by enhancing the STM model with a spatial‑constrained memory reader, ASPP‑HRNet refinement, a ResNeSt‑101 backbone, and a multi‑stage training pipeline. The team also devised an unsupervised framework that combines DetectoRS detection, HRNet mask refinement, STM‑based association, and key‑frame optimization, achieving a 95.5% test score on a large, richly annotated video dataset.

Computer Vision · Deep Learning · Semi-supervised Learning

0 likes · 13 min read

DataFunTalk

Apr 3, 2021 · Artificial Intelligence

A Survey of User Behavior Sequence Modeling for Search and Recommendation Advertising

User behavior sequence modeling, crucial for search and recommendation advertising ranking, has evolved from simple pooling to attention, RNN, capsule, and Transformer architectures, with industrial applications across e‑commerce, social, video, and music platforms, and future directions include time‑aware, multi‑dimensional, and self‑supervised approaches.

Deep Learning · Sequence Modeling · Transformer

0 likes · 24 min read

Huawei Cloud Developer Alliance

Apr 3, 2021 · Artificial Intelligence

Can AI Bring Loved Ones Back? Exploring Digital Immortality

Amid the convergence of Qingming and Easter, this article examines how AI technologies—from voice synthesis to digital avatars—are being used to preserve and “resurrect” deceased loved ones, exploring real-world examples, technical methods, ethical dilemmas, and the future potential of digital immortality.

AI · Deep Learning · digital avatars

0 likes · 9 min read
