Tagged articles

1235 articles

Feb 18, 2024 · Artificial Intelligence

How OpenAI’s Sora Turns Text into Realistic 60‑Second Videos

OpenAI’s newly unveiled Sora system generates high‑quality videos of up to 60 seconds from plain text prompts. The article describes it as a data‑driven physics engine, reportedly trained in part on synthetic data from Unreal Engine 5, and credits researchers including Tim Brooks and Bill Peebles.

Deep Learning · OpenAI · generative AI

0 likes · 6 min read

DataFunSummit

Feb 9, 2024 · Artificial Intelligence

STAN: A User‑Lifecycle‑Based Multi‑Task Recommendation Model for Shopee

The article introduces STAN, a multi‑task recommendation framework that leverages user lifecycle segmentation to jointly optimize CTR, stay‑time, and CVR, detailing the business context, key challenges, solution architecture, offline and online evaluations, and future research directions.

CTR · CVR · Deep Learning

0 likes · 8 min read

Baidu Geek Talk

Feb 5, 2024 · Artificial Intelligence

Why Static Graphs Outperform Dynamic Graphs in AutoDiff: A Deep Dive

This article explains the fundamental differences between static and dynamic computation graphs, compares their memory and performance characteristics, shows how automatic differentiation works in each paradigm, and provides a step‑by‑step implementation of a toy static‑graph AutoDiff engine with Python code examples.

AutoDiff · Deep Learning · Dynamic Graph

0 likes · 18 min read
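
To make the static-graph idea in the summary above concrete, here is a minimal sketch of a define-then-run reverse-mode AutoDiff engine. It is not the article's implementation; the `Node`, `topo_order`, and `run` names are hypothetical, and only addition and multiplication are supported.

```python
class Node:
    """A node in a static computation graph: built once, evaluated later."""

    def __init__(self, op, inputs=(), name=None):
        self.op = op          # "input", "add", or "mul"
        self.inputs = inputs  # parent nodes
        self.name = name      # only set for input nodes

    def __add__(self, other):
        return Node("add", (self, other))

    def __mul__(self, other):
        return Node("mul", (self, other))


def topo_order(output):
    """Return nodes so that every node appears after its inputs."""
    order, seen = [], set()

    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent in node.inputs:
            visit(parent)
        order.append(node)

    visit(output)
    return order


def run(output, feed):
    """Forward pass, then reverse-mode backward pass over the fixed graph."""
    order = topo_order(output)
    value = {}
    for node in order:
        if node.op == "input":
            value[id(node)] = feed[node.name]
        else:
            a, b = (value[id(p)] for p in node.inputs)
            value[id(node)] = a + b if node.op == "add" else a * b

    grad = {id(node): 0.0 for node in order}
    grad[id(output)] = 1.0
    for node in reversed(order):  # consumers are processed before producers
        if node.op == "input":
            continue
        a, b = node.inputs
        g = grad[id(node)]
        if node.op == "add":
            grad[id(a)] += g
            grad[id(b)] += g
        else:  # mul: d(a*b)/da = b, d(a*b)/db = a
            grad[id(a)] += g * value[id(b)]
            grad[id(b)] += g * value[id(a)]

    grads = {n.name: grad[id(n)] for n in order if n.op == "input"}
    return value[id(output)], grads


x = Node("input", name="x")
y = Node("input", name="y")
z = x * y + x                          # the graph is declared once ...
print(run(z, {"x": 3.0, "y": 4.0}))    # ... then executed: (15.0, {'x': 5.0, 'y': 3.0})
```

Because the whole graph is known before execution, a real engine could reuse the topological order and preallocate buffers across calls; this sketch recomputes both on every `run` for brevity.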

Rare Earth Juejin Tech Community

Feb 4, 2024 · Artificial Intelligence

Understanding Stable Diffusion Architecture and Implementing It with the Diffusers Library

This article reviews the evolution from GANs to diffusion models, explains the components of Stable Diffusion—including the CLIP text encoder, VAE, and UNet—and provides step‑by‑step Python code using HuggingFace's Diffusers library to generate images from text prompts.

AI painting · Deep Learning · Python

0 likes · 12 min read

360 Smart Cloud

Jan 26, 2024 · Artificial Intelligence

Parallel Strategies for Distributed Deep Learning Training

This article reviews distributed training techniques for large deep‑learning models, covering data parallelism, model parallelism (including pipeline and tensor parallelism), gradient bucketing and accumulation, 3D parallelism, and practical implementations such as Megatron‑LM and 360AI platform optimizations.

AI · Data Parallelism · Deep Learning

0 likes · 22 min read

Rare Earth Juejin Tech Community

Jan 14, 2024 · Artificial Intelligence

Understanding and Implementing LoRA (Low‑Rank Adaptation) for Model Training with PyTorch

This article explains the principle of LoRA (Low‑Rank Adaptation) for large language models, demonstrates how to decompose weight updates into low‑rank matrices, and provides a complete PyTorch implementation that fine‑tunes a small VGG‑19 network on a custom goldfish dataset.

Deep Learning · LoRA · Neural Networks

0 likes · 11 min read
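
The low-rank decomposition mentioned in the LoRA summary above can be sketched in plain Python. This is an illustration under stated assumptions, not the article's PyTorch code: the helper names (`transpose`, `matmul`, `lora_forward`, `trainable_params`) and the `alpha / r` scaling convention are assumptions made here.

```python
def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_forward(x, w, a, b, alpha):
    """Compute x @ W^T + (alpha / r) * x @ A^T @ B^T.

    Shapes: x (n, d_in), W (d_out, d_in), A (r, d_in), B (d_out, r).
    W stays frozen; only A and B would be trained.
    """
    r = len(a)
    base = matmul(x, transpose(w))                              # frozen path
    low_rank = matmul(matmul(x, transpose(a)), transpose(b))    # trainable path
    scale = alpha / r
    return [[p + scale * q for p, q in zip(row_p, row_q)]
            for row_p, row_q in zip(base, low_rank)]


def trainable_params(d_in, d_out, r):
    """Return (full fine-tuning, LoRA) trainable parameter counts."""
    return d_in * d_out, r * (d_in + d_out)


x = [[1.0, 2.0, 3.0]]                      # one input row, d_in = 3
w = [[0.5, 0.0, 1.0], [1.0, 1.0, 0.0]]     # frozen weight, d_out = 2
a = [[0.1, 0.2, 0.3]]                      # rank r = 1
b_zero = [[0.0], [0.0]]                    # B initialized to zero

# With B at zero the adapted layer reproduces the frozen layer exactly.
print(lora_forward(x, w, a, b_zero, alpha=1.0))   # [[3.5, 3.0]]
print(trainable_params(4096, 4096, 8))            # (16777216, 65536)
```

The second print shows why the technique is attractive: for a 4096 × 4096 layer at rank 8, the two factors hold 256 times fewer trainable values than the full matrix.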

DataFunSummit

Jan 12, 2024 · Artificial Intelligence

Application of Graph Neural Networks in Recommendation Systems: OPPO Business Scenario Practice

This article explains the fundamentals of graph neural networks and graph representation learning, outlines how graphs enhance recommendation systems, and details OPPO's practical implementation of a hybrid dual‑tower and graph sub‑network model to improve recall and ranking performance.

CTR prediction · Deep Learning · OPPO

0 likes · 19 min read

DataFunSummit

Jan 1, 2024 · Artificial Intelligence

Advances in Image and Video Enhancement, Quality Assessment, and Multimodal AI Techniques

This article reviews the latest research from Alibaba DAMO Academy on real-world image quality problems, covering spatial, temporal, and color enhancement methods, advanced quality assessment metrics, multimodal diffusion models, and future directions toward large‑model integration and lightweight deployment.

Deep Learning · MOS regression · Multimodal AI

0 likes · 24 min read

Sohu Tech Products

Dec 27, 2023 · Artificial Intelligence

Analysis of LLaMA Model Architecture in the Transformers Library

This article walks through the core LLaMA implementation in HuggingFace’s Transformers library, detailing the inheritance hierarchy, configuration defaults, model initialization, embedding and stacked decoder layers, the RMSNorm‑based attention and MLP modules, and the forward pass that produces normalized hidden states.

Deep Learning · Model architecture · PyTorch

0 likes · 14 min read
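
As a rough illustration of the RMSNorm step named in the LLaMA summary above, the following sketch normalizes one hidden vector by its root mean square and applies a learned per-feature gain. It is written in plain Python rather than against the Transformers source; the function name `rms_norm` and the default `eps` are assumptions.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Scale x to unit root-mean-square, then apply a per-feature gain.

    Unlike LayerNorm, no mean is subtracted and no bias is added.
    """
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [w * v * inv_rms for v, w in zip(x, weight)]


hidden = [3.0, 4.0]
gain = [1.0, 1.0]
normed = rms_norm(hidden, gain)
# After normalization the mean of squares is approximately 1.
print(sum(v * v for v in normed) / len(normed))
```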

Huolala Tech

Dec 26, 2023 · Artificial Intelligence

How AI Powers Scalable Multilingual and Timezone Testing for Global Apps

This article explains how a deep‑learning‑driven AI platform tackles the complex challenges of multilingual and multi‑timezone testing for a rapidly expanding international app, detailing the architecture, data pipelines, model training, and the resulting efficiency, accuracy, and coverage gains.

AI · Deep Learning · Multilingual Testing

0 likes · 14 min read

Bilibili Tech

Dec 15, 2023 · Artificial Intelligence

Bilibili's AI-Powered Video Frame Interpolation: Techniques, Challenges, and Deployment

Bilibili’s AI‑driven frame‑interpolation pipeline upgrades low‑frame-rate videos to smooth high‑frame-rate 1080p playback by optimizing optical‑flow models for large motion, texture and text artifacts, pruning for speed, and deploying via the BVT SDK across on‑demand and live streams.

AI · Deep Learning · Multimedia

0 likes · 14 min read

DataFunTalk

Dec 10, 2023 · Artificial Intelligence

PyTorch Model Training Performance Tuning Guide

This guide collects techniques for optimizing PyTorch training performance and efficiency. It applies to model types such as CNNs, RNNs, GANs, and transformers, across domains like computer vision and natural language processing, and is aimed at AI/ML platform engineers, data engineers, backend developers, MLOps engineers, SREs, architects, and machine learning engineers.

AI · Deep Learning · PyTorch

0 likes · 2 min read

DataFunSummit

Dec 9, 2023 · Artificial Intelligence

Causal Learning Paradigms: From Prior Causal Structure to Causal Discovery

This article reviews the growing interest in causal learning within machine learning, explaining what causal learning is, its advantages over purely correlational methods, and detailing two main paradigms—learning with known causal structures and learning via causal discovery—along with examples, challenges, and future directions.

Deep Learning · causal discovery · causal inference

0 likes · 12 min read

Airbnb Technology Team

Dec 8, 2023 · Artificial Intelligence

Leveraging Image Aesthetics and Photo Sorting Algorithms to Enhance Airbnb Listings

Airbnb’s new computer‑vision pipeline trains a deep‑learning aesthetic model with an EMD loss to rank photos, automatically sorts new‑listing images by design and room type, and scales real‑time similarity search via HNSW‑based ANN on AWS OpenSearch, boosting click‑through and bookings and enabling unsupervised visual recommendations.

Airbnb · Computer Vision · Deep Learning

0 likes · 9 min read

Rare Earth Juejin Tech Community

Dec 8, 2023 · Artificial Intelligence

Simplifying Transformer Blocks: Removing Residual Connections, LayerNorm, and Other Components without Losing Performance

A recent ETH Zurich paper shows that standard Transformer blocks can be drastically simplified by removing residual connections, LayerNorm, projection and value parameters, and even MLP sub‑block components, achieving up to 16% fewer parameters and comparable training speed and downstream performance on both GPT‑style decoders and BERT models.

AI · Deep Learning · LLM

0 likes · 11 min read

IT Services Circle

Dec 6, 2023 · Artificial Intelligence

AI Image Outpainting: Unexpected Transformations and How It Works

The article showcases a series of humorous and surprising AI‑generated image expansions from Douyin, explains the underlying outpainting technology, and discusses why such tools are both entertaining and useful despite occasional odd results.

AI · Computer Vision · Deep Learning

0 likes · 6 min read

Huolala Tech

Nov 28, 2023 · Mobile Development

How HuoLala Built a Low‑Cost, High‑Reliability Mobile UI Automation Platform

This article details HuoLala's journey from a weekly release cycle to a cloud‑based record‑and‑replay mobile UI automation platform. It covers the background challenges, industry analysis, and technical design (deep‑learning‑based control detection, SIFT image matching, script generation, playback handling, and platform features), and reports significant testing‑efficiency gains along with planned AI‑driven enhancements.

Deep Learning · SIFT · UI automation

0 likes · 21 min read

dbaplus Community

Nov 27, 2023 · Artificial Intelligence

Build an Image‑Search Engine with Elasticsearch 8.x and CLIP

This guide explains how to implement reverse image search by extracting visual features with a multilingual CLIP model, storing the vectors in Elasticsearch 8.x, and using its k‑NN search to retrieve similar images, covering architecture, tools, code snippets, and results.

CLIP · Deep Learning · image search

0 likes · 9 min read
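
The retrieval step in the image-search summary above boils down to nearest-neighbor lookup over embedding vectors. The sketch below shows an exact, brute-force version with cosine similarity; it does not call Elasticsearch or CLIP, the tiny two-dimensional vectors stand in for real embeddings, and the names `cosine_similarity` and `top_k` are hypothetical. A vector database would typically replace the linear scan with an approximate index.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def top_k(query, index, k):
    """Return the names of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in index.items()]
    scored.sort(reverse=True)          # highest similarity first
    return [name for _, name in scored[:k]]


# Placeholder "image embeddings"; real CLIP vectors have hundreds of dimensions.
index = {
    "cat.jpg": [1.0, 0.0],
    "dog.jpg": [0.9, 0.1],
    "car.jpg": [0.0, 1.0],
}
print(top_k([1.0, 0.05], index, k=2))  # ['cat.jpg', 'dog.jpg']
```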

JD Retail Technology

Nov 23, 2023 · Artificial Intelligence

Recent Advances in Advertising Recommendation Algorithms and Their Applications

This article reviews recent progress in advertising recommendation technologies, covering deep learning‑based ranking, sequence modeling, self‑supervised learning, online and reinforcement learning, multimodal recommendation, and fairness. It then details four key breakthroughs: data‑driven incremental learning, dynamic group parameter modeling, bilateral interactive graph convolution, and a relation‑aware diffusion model for poster layout generation, along with experimental results and future challenges.

Deep Learning · Incremental Learning · advertising recommendation

0 likes · 25 min read

MaGe Linux Operations

Nov 19, 2023 · Artificial Intelligence

Build and Train a Python CNN for Image & Face Recognition with TensorFlow

Learn step-by-step how to create, compile, train, evaluate, and deploy convolutional neural networks in Python using TensorFlow and Keras for general image classification and a practical face‑recognition example, complete with code snippets and data‑preprocessing techniques.

CNN · Deep Learning · TensorFlow

0 likes · 7 min read

Rare Earth Juejin Tech Community

Nov 15, 2023 · Artificial Intelligence

Understanding the Transformer Architecture: Encoder, Decoder, and Attention Mechanisms

This article explains the Transformer model, comparing it with RNNs, detailing its encoder‑decoder structure, multi‑head and scaled dot‑product attention, embedding layers, feed‑forward networks, and the final linear‑softmax output, supplemented with diagrams and code examples.

Deep Learning · Encoder-Decoder · Neural Networks

0 likes · 10 min read
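
The scaled dot‑product attention named in the Transformer summary above can be sketched without any framework. This is a single-head, unmasked version for illustration; the function names are chosen here and are not taken from the article.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Compute softmax(Q K^T / sqrt(d_k)) V for lists of row vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


queries = [[1.0, 0.0]]
keys = [[1.0, 1.0], [1.0, 1.0]]      # identical keys give uniform weights
values = [[0.0, 2.0], [4.0, 6.0]]
print(scaled_dot_product_attention(queries, keys, values))  # [[2.0, 4.0]]
```

Multi‑head attention would run several such computations in parallel on learned projections of the inputs and concatenate the results.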

Network Intelligence Research Center (NIRC)

Nov 9, 2023 · Artificial Intelligence

How Wav2Lip Achieves Accurate Speech‑Driven Lip Sync with Expert Discriminators

The article analyzes the limitations of traditional speech‑driven lip‑sync methods and explains how Wav2Lip introduces a pretrained multi‑frame expert sync discriminator, a two‑stage GAN training pipeline, and a specialized generator architecture to produce high‑quality, audio‑aligned facial videos.

Computer Vision · Deep Learning · GAN

0 likes · 7 min read

NetEase Media Technology Team

Nov 6, 2023 · Artificial Intelligence

Overview of Sequential Recommendation Models

The article surveys sequential recommendation models from early non-deep approaches like FPMC, through RNN-based GRU4Rec and CNN-based Caser, to Transformer-based methods such as SASRec, BERT4Rec, TiSASRec, and recent contrastive-learning techniques, recommending SASRec or its variants for production use.

Deep Learning · Transformer · contrastive learning

0 likes · 17 min read

21CTO

Oct 30, 2023 · Artificial Intelligence

Geoffrey Hinton Warns AI Could Take Over Earth Within Five Years – What You Need to Know

Renowned AI pioneer Geoffrey Hinton cautions that rapidly advancing artificial intelligence may surpass human control in as little as five years, highlighting self‑modifying code, the "black‑box" problem, and the urgent need for robust safety regulations.

AI Safety · AI risk · Deep Learning

0 likes · 8 min read

Python Programming Learning Circle

Oct 26, 2023 · Artificial Intelligence

Animal Recognition Techniques Using Deep Learning and Image Processing

This article reviews animal recognition technology, covering its background, basic principles, image‑processing, feature extraction, machine‑learning and deep‑learning methods, dataset construction, preprocessing, and feature‑selection techniques, and provides Python code examples for implementing CNNs and traditional classifiers.

Computer Vision · Deep Learning · Image Processing

0 likes · 18 min read

Rare Earth Juejin Tech Community

Oct 21, 2023 · Artificial Intelligence

Understanding LSTM, ELMo, and Transformer Models for Natural Language Processing

This article explains the principles and structures of LSTM networks, introduces the ELMo contextual embedding model with its two‑stage pre‑training and downstream usage, and provides an overview of the Transformer architecture, highlighting their roles in modern NLP tasks.

Deep Learning · ELMo · LSTM

0 likes · 12 min read

Model Perspective

Oct 18, 2023 · Fundamentals

Unlock the Power of Convolution: From Signal Smoothing to Deep Learning

This article explains the mathematical definition of convolution, walks through discrete and continuous examples, demonstrates its use in signal smoothing with moving averages, and surveys its wide-ranging applications in signal processing, communications, computer vision, seismology, medical imaging, and statistics.

Convolution · Deep Learning · Image Processing

0 likes · 7 min read
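
The link between discrete convolution and moving-average smoothing described in the summary above fits in a few lines. This is a minimal sketch; the function names and the choice to return only the fully overlapping ("valid") part of the result are assumptions made here.

```python
def convolve(signal, kernel):
    """Full discrete convolution: y[n] = sum over k of signal[k] * kernel[n - k]."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def moving_average(signal, window):
    """Smooth a signal by convolving it with a uniform kernel of the given width."""
    kernel = [1.0 / window] * window
    full = convolve(signal, kernel)
    # Keep only positions where the kernel fully overlaps the signal.
    return full[window - 1:len(signal)]


print(convolve([1, 2, 3], [1, 1]))        # [1.0, 3.0, 5.0, 3.0]
print(moving_average([2, 4, 6, 8], 2))    # [3.0, 5.0, 7.0]
```

A convolutional layer in a neural network applies the same sliding weighted sum, except that the kernel values are learned instead of fixed at `1 / window`.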

DataFunSummit

Oct 17, 2023 · Artificial Intelligence

DataFunSummit2023: Deep Learning‑Driven Multi‑Experiment Causal Inference and Distributed Causal Tools

The DataFunSummit2023 online conference brings together experts from Tencent and Kuaishou to present cutting‑edge research on causal inference for large‑scale A/B testing, including deep‑learning‑based multi‑experiment effect estimation, a distributed causal inference framework (Fast‑Causal‑Inference), and strategies for evaluating long‑term policy impacts.

A/B testing · Data Science · Deep Learning

0 likes · 7 min read

Kuaishou Tech

Oct 17, 2023 · Artificial Intelligence

QIN: A Query‑Dominated User Interest Network for Personalized Search

The paper introduces QIN, a query‑driven user interest network that combines a Relevance Search Unit and a Fused Attention Unit to effectively leverage full‑history user behavior for personalized search, demonstrating significant performance gains in offline benchmarks and online A/B tests.

Deep Learning · fused attention · personalized search

0 likes · 9 min read

Meituan Technology Team

Oct 11, 2023 · Artificial Intelligence

Meituan Vision AI Research Highlights and Open‑Source Releases

This article compiles Meituan's cutting‑edge computer‑vision research and engineering achievements—including CVPR award‑winning segmentation, YOLOv6 releases, GPU inference optimizations, the Food2K dataset, and numerous paper digests—to provide practical insights for visual AI practitioners.

CVPR · Computer Vision · Deep Learning

0 likes · 11 min read

Architect

Oct 4, 2023 · Artificial Intelligence

How AI-Driven Digital Watermarks Achieve Robust, Invisible Protection for Video

This article examines the challenges of video copyright protection, critiques traditional visible and invisible watermark methods, and presents a deep‑learning based AI digital watermark solution that balances invisibility and robustness, detailing its network architecture, degradation layer, loss functions, block encoding, anchor calibration, and large‑scale experimental results.

AI video protection · Deep Learning · Robustness

0 likes · 22 min read

DataFunSummit

Oct 3, 2023 · Artificial Intelligence

Time Series Forecasting for NIO Power Swap Stations: Business Background, Challenges, Algorithm Practice, and Future Outlook

This article presents a comprehensive case study of NIO's Power swap‑station ecosystem, detailing the business context, key forecasting challenges, the evolution from classical statistical models to deep‑learning architectures with specialized embeddings, and the practical outcomes and future plans for improving prediction accuracy.

Deep Learning · Electric Vehicle · Embedding

0 likes · 16 min read

DataFunSummit

Sep 29, 2023 · Artificial Intelligence

Social4Rec: Enhancing Video Recommendation with Social Interest Networks

This article introduces Social4Rec, a video recommendation algorithm that tackles user cold‑start problems by extracting and integrating social interest information through coarse‑ and fine‑grained interest extractors, attention‑based fusion, and extensive offline and online experiments demonstrating significant CTR improvements.

Deep Learning · attention · cold-start

0 likes · 14 min read

Bilibili Tech

Sep 29, 2023 · Artificial Intelligence

BILIVQA: Bilibili's No-Reference Video Quality Assessment System

BILIVQA is Bilibili’s deep‑learning, no‑reference video quality assessment system that trains on a proprietary 5,000‑video UGC dataset, extracts spatial and temporal features via MobileNet‑V2 and X3D, uses mixed‑dataset regression for strong generalization, and deploys a GPU‑optimized TensorRT pipeline with percentile‑based scoring for reliable quality monitoring and downstream applications.

BILIVQA · Deep Learning · model engineering

0 likes · 27 min read

Zhuanzhuan Tech

Sep 28, 2023 · Artificial Intelligence

Evolution of Language Models and an Overview of the GPT Series

This article surveys the development of natural language processing from early rule‑based systems through statistical n‑gram models, neural language models, RNNs, LSTMs, ELMo, Transformers and BERT, and then details the architecture, training methods, advantages and limitations of the GPT‑1, GPT‑2, GPT‑3, ChatGPT and GPT‑4 models, concluding with a discussion of future challenges and references.

Deep Learning · GPT · NLP

0 likes · 30 min read

Kuaishou Large Model

Sep 27, 2023 · Artificial Intelligence

DVIS: Decoupled Framework that Sets New SOTA in Video Instance Segmentation

DVIS introduces a decoupled video instance segmentation framework that splits the task into segmentation, tracking, and refinement modules, achieving state-of-the-art performance across VIS, VPS, and VSS benchmarks while maintaining low computational overhead, and demonstrates robustness in both online and offline settings.

Computer Vision · Deep Learning · Transformer

0 likes · 12 min read

DaTaobao Tech

Sep 27, 2023 · Artificial Intelligence

FlashAttention-2: Efficient Attention Algorithm for Transformer Acceleration and AIGC Applications

FlashAttention‑2 is an IO‑aware exact attention algorithm that cuts GPU HBM traffic through tiling and recomputation, reduces non‑matmul FLOPs, and improves parallelism along the sequence dimension and warp‑level work distribution. It delivers up to 2× speedup over FlashAttention with near‑GEMM efficiency, enabling longer‑context Transformer training and inference; the article also covers its use in AIGC workloads with fastunet, with negligible accuracy loss.

AIGC · Attention optimization · Deep Learning

0 likes · 20 min read

Kuaishou Tech

Sep 26, 2023 · Artificial Intelligence

Cross-Domain Product Representation (COPE): A Large-Scale Dataset and Baseline Model for Rich‑Content E‑Commerce

The paper introduces ROPE, the first large‑scale cross‑domain product recognition dataset covering detail pages, short videos and live streams, and proposes COPE, a dual‑tower multimodal model that learns unified product embeddings using contrastive and classification losses, achieving superior retrieval and few‑shot classification performance across domains.

Dataset · Deep Learning · contrastive learning

0 likes · 13 min read

Kuaishou Tech

Sep 25, 2023 · Artificial Intelligence

LPR4M: A Large-Scale Multimodal Livestreaming Product Recognition Dataset and the RICE Cross‑View Semantic Alignment Model

This paper introduces LPR4M, a 4‑million‑pair multimodal dataset for livestreaming product recognition, and proposes the RICE model that combines instance‑level contrastive learning with patch‑level cross‑view semantic alignment, demonstrating state‑of‑the‑art performance on both LPR4M and MovingFashion benchmarks.

Deep Learning · cross-view alignment · livestreaming

0 likes · 19 min read

Bilibili Tech

Sep 22, 2023 · Artificial Intelligence

AI-Based Digital Watermarking for Video: Design, Training Strategies, and Engineering Deployment

The paper presents an AI‑driven invisible video watermarking system that combines a convolutional encoder/decoder with SE blocks, a simulated‑JPEG degradation layer, multi‑term loss, block‑wise processing, anchor‑based alignment and redundancy voting, achieving high visual fidelity and robust recovery after double‑compression in large‑scale platforms like Bilibili.

AI · Deep Learning · anchor calibration

0 likes · 21 min read

HomeTech

Sep 21, 2023 · Artificial Intelligence

Homepage Pop‑up Recommendation System for Car Purchase Intent: Background, Feature Engineering, Model and Strategy Optimization, and Results

This article details how AutoHome's homepage pop‑up leverages precise targeting, extensive feature engineering, and multi‑stage DeepFM‑based models with attention and LHUC modules to accurately identify car‑buying users, improve vehicle‑series recommendations, and achieve a 355% conversion rate increase.

AI · Deep Learning · car buying

0 likes · 7 min read

Ant R&D Efficiency

Sep 19, 2023 · Artificial Intelligence

From the Turing Test to GPT‑4: A Historical Overview of Chatbots and Deep Learning

From Turing’s 1950 imitation game to GPT‑4’s multimodal vision‑language capabilities, the field has evolved from simple rule‑based programs like ELIZA and PARRY, through statistical learning and the 2017 Transformer breakthrough, to large-scale generative models that achieve fluent conversation yet still grapple with hallucination and true understanding.

Chatbot History · Deep Learning · GPT-4

0 likes · 25 min read

Alibaba Cloud Infrastructure

Sep 13, 2023 · Artificial Intelligence

Pai‑Megatron‑Patch: Design Principles, Key Features, and End‑to‑End Usage for Large Language Model Training

This article introduces the open‑source Pai‑Megatron‑Patch tool from Alibaba Cloud, explains its non‑intrusive patch architecture, enumerates supported models and features such as weight conversion, Flash‑Attention 2.0, and FP8 training with Transformer Engine, and provides detailed command‑line examples for model conversion, pre‑training, supervised fine‑tuning, inference, and RLHF pipelines.

Deep Learning · FP8 · LLM

0 likes · 19 min read

NetEase Cloud Music Tech Team

Sep 6, 2023 · Artificial Intelligence

Timbre‑Guided TG‑Critic and Transformer‑Based TrOMR: AI Advances in Music Evaluation

This article reviews two recent AI research papers from NetEase Cloud Music Lab: TG‑Critic, a timbre‑guided, reference‑free singing evaluation model that classifies vocal performance using only audio, and TrOMR, a Transformer‑based end‑to‑end polyphonic optical music recognition system that improves note‑sequence prediction and dataset realism.

Audio Analysis · Deep Learning · Music Evaluation

0 likes · 6 min read

Alibaba Cloud Developer

Sep 4, 2023 · Artificial Intelligence

Hands‑On Building a Transformer from Scratch with PyTorch

This tutorial walks you through implementing a full Transformer model in PyTorch, starting from basic linear‑regression code, adding attention mechanisms, multi‑head attention, encoder‑decoder architecture, training loops, and inference, all reinforced with practical debugging tips.

Deep Learning · NLP · PyTorch

0 likes · 17 min read

TAL Education Technology

Aug 31, 2023 · Artificial Intelligence

Research on Content-Based Image Retrieval Techniques

This article reviews the fundamentals, feature extraction methods, evaluation metrics, and common datasets of content‑based image retrieval (CBIR), discussing traditional low‑level features, local descriptors, unsupervised and supervised learning approaches, and recent deep‑learning models for improving retrieval performance.

CBIR · Datasets · Deep Learning

0 likes · 13 min read

Network Intelligence Research Center (NIRC)

Aug 30, 2023 · Artificial Intelligence

DeepQueueNet: Scalable Network Performance Estimation with Packet‑Level Visibility

DeepQueueNet combines discrete‑event and continuous simulation with deep neural networks to deliver highly accurate, generalizable, and GPU‑scalable network performance estimates at packet‑level granularity, outperforming existing DNN‑based estimators across diverse topologies and traffic scenarios.

DES · DNN · Deep Learning

0 likes · 5 min read

DataFunSummit

Aug 24, 2023 · Artificial Intelligence

Panoramic Indoor Layout Estimation with Vision Transformer (PanoViT)

This article introduces the PanoViT model, a vision‑transformer‑based approach for indoor layout estimation from panoramic images, covering its research background, architectural components, experimental results on public datasets, and step‑by‑step usage within ModelScope.

3D reconstruction · Computer Vision · Deep Learning

0 likes · 8 min read

Rare Earth Juejin Tech Community

Aug 24, 2023 · Artificial Intelligence

Neural Style Transfer with PyTorch: Theory and Implementation

This article introduces neural style transfer, explains its underlying principles using VGG19 feature extraction, content and style loss definitions, and provides a complete PyTorch implementation with code for loading images, extracting features, computing Gram matrices, and optimizing the output image.

Computer Vision · Deep Learning · PyTorch

0 likes · 14 min read
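
The Gram-matrix style representation mentioned in the style-transfer summary above can be illustrated without PyTorch. In this sketch each feature map is a flat list of activations; dividing by the number of spatial positions is one common normalization and is an assumption here, as are the function names.

```python
def gram_matrix(features):
    """Channel-by-channel correlations of a feature map.

    features: list of C channels, each a flat list of H*W activations.
    Returns a C x C matrix whose (i, j) entry is the normalized dot
    product of channels i and j.
    """
    n = len(features[0])
    return [[sum(a * b for a, b in zip(fi, fj)) / n for fj in features]
            for fi in features]


def style_loss(gram_a, gram_b):
    """Mean squared difference between two Gram matrices."""
    diffs = [(x - y) ** 2
             for row_a, row_b in zip(gram_a, gram_b)
             for x, y in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


style_features = [[1.0, 2.0], [3.0, 4.0]]    # 2 channels, 2 spatial positions
print(gram_matrix(style_features))           # [[2.5, 5.5], [5.5, 12.5]]
```

Because the Gram matrix sums over spatial positions, it discards where a texture occurs and keeps only which channels fire together, which is why it serves as a style descriptor rather than a content descriptor.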

Top Architect

Aug 22, 2023 · Artificial Intelligence

Face Recognition Search: Principles, Implementation Steps, and Applications

This article explains the background, core principles, preprocessing, feature extraction, matching algorithms, and practical application scenarios of face recognition search, and provides detailed reference implementations with Java and OpenCV code examples for building a complete system.

Computer Vision · Deep Learning · Image Processing

0 likes · 15 min read

Ele.me Technology

Aug 22, 2023 · Artificial Intelligence

Multi-Granularity Attention Model for Group Recommendation (MGAM)

The Multi‑Granularity Attention Model (MGAM) improves group recommendation by extracting subset, group, and superset preferences through hierarchical attention and graph neural networks, fusing them via self‑attention, and achieves state‑of‑the‑art offline results and a 1.2% online CTR lift in Alibaba’s local‑life services.

AI · Deep Learning · Recommendation Systems

0 likes · 18 min read

HelloTech

Aug 22, 2023 · Artificial Intelligence

AI Platform Architecture and Automation in Machine Learning

An end‑to‑end AI platform integrates feature processing, model training, deployment, and decision orchestration across offline and online layers, leveraging automated pipelines such as AutoML (feature engineering, hyper‑parameter optimization, neural architecture search) built on Ray Tune and NNI, which have already boosted CTR in real‑world advertising and aim to make every user an algorithm engineer.

AI Platform · AutoML · Deep Learning

0 likes · 8 min read

DaTaobao Tech

Aug 21, 2023 · Artificial Intelligence

Action Sensitivity Learning for Temporal Action Localization

The paper presents Action Sensitivity Learning (ASL), a framework that models frame‑wise importance at both class‑level (via learnable Gaussian distributions) and instance‑level (using quality scores), integrates these weights into classification and regression losses, adds a contrastive InfoNCE term, and achieves state‑of‑the‑art temporal action localization performance across six benchmark datasets.

Action Sensitivity Learning · Computer Vision · Deep Learning

0 likes · 8 min read

58UXD

Aug 21, 2023 · Artificial Intelligence

Unlocking AI Painting: How Machines Master Color, Mimic Masters, and Face Creative Limits

AI painting leverages deep‑learning models to analyze and recreate color palettes, enabling machines to mimic famous artists, generate novel hues, and express emotions, while also facing challenges such as limited creativity, human‑AI interaction, and fine‑detail rendering.

AI painting · Deep Learning · artistic AI

0 likes · 7 min read

Ele.me Technology

Aug 16, 2023 · Artificial Intelligence

Spatiotemporal-Enhanced Network for Click-Through Rate Prediction in Location‑Based Services

The paper introduces StEN, a spatiotemporal-enhanced network for CTR prediction in location-based services, combining static spatiotemporal feature activation, dynamic preference activation, and target attention, achieving state-of-the-art offline results and a 1.6% CTR lift in online tests.

Deep Learning · Recommendation Systems · click-through rate

0 likes · 19 min read

Rare Earth Juejin Tech Community

Aug 16, 2023 · Artificial Intelligence

Deep Dive into OCR – Chapter 2: Development and Classification of OCR Technology

This article provides a comprehensive overview of OCR technology, detailing the evolution from traditional hand‑crafted methods to modern deep‑learning approaches, describing image preprocessing, text detection and recognition pipelines, summarizing classic machine‑learning algorithms, and presenting a practical OpenCV implementation with Python code.

Computer Vision · Deep Learning · OCR

0 likes · 23 min read

Rare Earth Juejin Tech Community

Aug 12, 2023 · Artificial Intelligence

An Introduction to OCR: Concepts, History, Applications, Datasets, and Technical Workflow

This article provides a comprehensive overview of Optical Character Recognition (OCR), covering its definition, historical development, classification, real‑world applications, technical pipeline, common challenges, mitigation strategies, popular datasets, model performance comparisons, and leading open‑source platforms.

Computer Vision · Datasets · Deep Learning

0 likes · 16 min read

Kuaishou Tech

Aug 11, 2023 · Artificial Intelligence

PEPNet: Parameter and Embedding Personalized Network for Multi‑Task Multi‑Domain Recommendation

The paper introduces PEPNet, a plug‑and‑play network that tackles the domain‑seesaw and task‑seesaw problems in multi‑scenario recommendation by using a gated personalization module (GateNU) together with embedding‑level (EPNet) and parameter‑level (PPNet) personalization, and demonstrates its superiority through extensive offline and online experiments on Kuaishou data.

Deep Learning · Embedding · gate network

0 likes · 11 min read

Model Perspective

Aug 1, 2023 · Artificial Intelligence

Mastering LSTM: How Long Short-Term Memory Networks Capture Long-Term Dependencies

This article explains the challenges of processing sequential data, introduces LSTM as a solution to long‑term dependency problems in RNNs, details its cell state and gate mechanisms, showcases its architecture, and provides Python code examples for time‑series forecasting using Keras.

Deep Learning · Keras · LSTM

0 likes · 9 min read

Rare Earth Juejin Tech Community

Jul 31, 2023 · Artificial Intelligence

Overview of Deep Neural Network Architectures

This article provides a comprehensive overview of deep neural network families, introducing twelve major architectures—including Feedforward, CNN, RNN, LSTM, DBN, GAN, Autoencoder, Residual, Capsule, Transformer, Attention, and Deep Reinforcement Learning—explaining their principles, structures, training methods, and offering Python/TensorFlow/PyTorch code examples.

CNN · Deep Learning · GAN

0 likes · 29 min read

Rare Earth Juejin Tech Community

Jul 26, 2023 · Artificial Intelligence

Building and Training a Fully Connected Neural Network for Fashion-MNIST Classification with PyTorch

This tutorial demonstrates how to download the Fashion‑MNIST dataset, build a four‑layer fully connected neural network with PyTorch, and train it using loss functions, Adam optimizer, learning‑rate strategies, and Dropout to achieve high‑accuracy multi‑class image classification.

Adam · Deep Learning · Dropout

0 likes · 17 min read

Rare Earth Juejin Tech Community

Jul 24, 2023 · Artificial Intelligence

Understanding Slide-Transformer: An Efficient Local Attention Module for Vision Transformers

This article explains the Slide-Transformer paper, describing how the proposed Slide Attention replaces inefficient Im2Col‑based local attention with depthwise convolutions and a deformable shift module, achieving high efficiency, flexibility, and hardware‑agnostic performance for Vision Transformers.

Computer Vision · Deep Learning · Deformable Shift

0 likes · 13 min read

Nightwalker Tech

Jul 19, 2023 · Artificial Intelligence

Step‑by‑Step Implementation of Transformer Blocks, Attention, Normalization, Feed‑Forward, Encoder and Decoder in PyTorch

This article provides a comprehensive tutorial on building the core components of a Transformer model—including multi‑head attention, layer normalization, feed‑forward networks, encoder and decoder layers—and assembles them into a complete PyTorch implementation, supplemented with explanatory diagrams and runnable code.

Decoder · Deep Learning · Encoder

0 likes · 13 min read

Test Development Learning Exchange

Jul 12, 2023 · Fundamentals

Common Python Libraries and Practical Projects: NumPy, Pandas, Matplotlib, Scikit‑learn, Requests, Beautiful Soup, Selenium, Pygame, Flask, PyTorch

This article introduces ten widely used Python libraries—NumPy, Pandas, Matplotlib, Scikit‑learn, Requests, Beautiful Soup, Selenium, Pygame, Flask, and PyTorch—each accompanied by a concise real‑world project and complete code examples to help readers understand and apply them effectively.

Data Science · Deep Learning · Game Development

0 likes · 18 min read

Rare Earth Juejin Tech Community

Jul 12, 2023 · Artificial Intelligence

Comprehensive Guide to Vision Transformer (ViT): Architecture, Patch Tokenization, Embedding, Fine‑tuning, and Performance

This article provides an in‑depth overview of Vision Transformer (ViT), covering its Transformer‑based architecture, patch‑to‑token conversion, token and position embeddings, fine‑tuning strategies such as 2‑D interpolation, experimental results versus CNNs, and the model’s broader significance for multimodal AI research.

Computer Vision · Deep Learning · Fine‑tuning

0 likes · 25 min read

Kuaishou Large Model

Jul 7, 2023 · Artificial Intelligence

How HairStep Revolutionizes Single-View 3D Hair Reconstruction

This paper introduces HairStep, a novel intermediate representation combining Strand Maps and Depth Maps, and demonstrates how it reduces domain gap and improves single‑view 3D hair reconstruction accuracy across multiple algorithms, supported by new annotated datasets (HiSa, HiDa) and fair evaluation metrics.

3D hair reconstruction · Computer Vision · Dataset

0 likes · 11 min read

Amap Tech

Jul 6, 2023 · Artificial Intelligence

How Gaode’s ETA System Predicts Arrival Times with Hybrid Spatio‑Temporal GCN

This article explains the architecture, data layers, prediction modules, and deep‑learning framework behind Gaode’s driving ETA service, detailing how static speed profiles, linear models, and the H‑STGCN model combine to forecast travel times and evaluate their accuracy.

Deep Learning · ETA · Gaode

0 likes · 12 min read

DataFunSummit

Jul 1, 2023 · Artificial Intelligence

Alibaba Cloud Native Deep Learning Platform PAI‑DLC: Architecture, Features, and Future Outlook

This article introduces Alibaba Cloud's PAI‑DLC, a cloud‑native deep learning platform that integrates machine‑learning capabilities, containerized services, AI‑aware scheduling, GPU virtualization, elastic training with EasyScale, data access, and observability, and discusses its architecture, key features, and future directions.

AI Platform · Cloud Native · Deep Learning

0 likes · 16 min read

Architecture & Thinking

Jun 30, 2023 · Artificial Intelligence

How INT8 Quantization Supercharges Baidu's Search Models: Techniques and Insights

This article explores the rapid evolution of Baidu's semantic search models, the large GPU consumption they entail, and how extensive INT8 quantization, sensitivity analysis, calibration data augmentation, hyper‑parameter auto‑tuning, and advanced methods like Quantization‑Aware Training and SmoothQuant dramatically improve inference performance while preserving business metrics.

Deep Learning · Ernie · INT8 Quantization

0 likes · 17 min read

OPPO Kernel Craftsman

Jun 28, 2023 · Artificial Intelligence

ShaderNN 2.0: A Lightweight Mobile Deep Learning Inference Engine with OpenGL and Vulkan Support

ShaderNN 2.0 is a lightweight mobile deep learning inference engine supporting OpenGL and Vulkan, offering texture‑based zero‑copy I/O, hybrid shader implementation, and achieving significant latency and power reductions versus TensorFlow Lite and MNN, thereby enabling real‑time graphics‑AI tasks such as style transfer, denoising, super‑sampling, and Stable Diffusion on smartphones.

Deep Learning · GPU shader · OpenGL

0 likes · 16 min read

Bilibili Tech

Jun 27, 2023 · Artificial Intelligence

Design and Implementation of a Real-Time Advertising Feature Platform for CTR Prediction at Bilibili

To eliminate data fragmentation, feature inconsistencies, and multi‑language implementation challenges, Bilibili built a unified real‑time advertising feature platform that aligns offline, hourly, and online pipelines via a shared C++ library and JNI, boosting CTR prediction accuracy, cutting training costs, and increasing ad revenue by over 1 %.

Advertising · CTR prediction · Deep Learning

0 likes · 11 min read

Efficient Ops

Jun 26, 2023 · Artificial Intelligence

How Multimodal AI Is Revolutionizing Credit Card Fraud Detection

Amid tightening financial regulations, ICBC's software team proposes a multimodal AI anti‑fraud framework that combines image, video, and structured data to detect deep‑fake, mask, and forged‑document attacks, enriches verification with cross‑modal cues, and outlines future expansion to text and speech modalities.

AI · Computer Vision · Deep Learning

0 likes · 7 min read

Programmer DD

Jun 25, 2023 · Artificial Intelligence

How to Build Image Search with Elasticsearch 8.x and CLIP Multilingual Model

This article explains the concept of image‑based search, why it matters, and provides a step‑by‑step guide to implement image search using Elasticsearch 8.x, feature‑extraction libraries, and the multilingual CLIP‑ViT‑B‑32 model, including code snippets and architecture overview.

Deep Learning · clip model · feature extraction

0 likes · 10 min read

Alibaba Cloud Big Data AI Platform

Jun 21, 2023 · Artificial Intelligence

How GoldMiner Boosts Deep Learning Training by Up to 12× with Elastic Data Pre‑Processing

GoldMiner, a new system from Alibaba Cloud’s PAI platform, elastically scales deep learning data pre‑processing pipelines, improving training performance by up to 12.1× and GPU cluster utilization by 2.5×; the underlying research was accepted at SIGMOD 2023.

Deep Learning · GPU utilization · SIGMOD

0 likes · 5 min read

Kuaishou Audio & Video Technology

Jun 20, 2023 · Artificial Intelligence

How a Low‑Latency Hierarchical Fusion Network Beats Echoes in Real‑Time Calls

At ICASSP 2023, Kuaishou’s audio team presented a low‑latency hierarchical fusion network for full‑band acoustic echo cancellation, detailing its multi‑stage design, asymmetric windowing, loss functions, training strategy, and achieving second place in the non‑personalized AEC Challenge, with real‑world deployment results.

Acoustic Echo Cancellation · Deep Learning · Hierarchical Fusion Network

0 likes · 13 min read

DataFunSummit

Jun 15, 2023 · Artificial Intelligence

Paraformer: An Industrial Non‑Autoregressive End‑to‑End Speech Recognition Model

This article introduces the Paraformer model released by Alibaba DAMO Academy on ModelScope, detailing its non‑autoregressive architecture, training strategies, performance on benchmark datasets, and step‑by‑step guidance for fine‑tuning and deploying the model using FunASR and ModelScope pipelines.

ASR · Deep Learning · ModelScope

0 likes · 13 min read

21CTO

Jun 10, 2023 · Artificial Intelligence

How Huang Xuedong’s Team Achieved Human-Level Speech Recognition at Microsoft

The article chronicles the career of Chinese AI pioneer Huang Xuedong, detailing his education, rise at Microsoft, leadership of Azure AI, groundbreaking human‑level speech recognition breakthroughs, the engineering feats behind them—including a ten‑network model and the CNTK framework—and his recent move to Zoom.

CNTK · Deep Learning · Microsoft

0 likes · 14 min read

Network Intelligence Research Center (NIRC)

Jun 9, 2023 · Artificial Intelligence

2023 NIRC PhD Graduates Reveal Cutting-Edge AI and Network Intelligence Research

In 2023 the Network Intelligence Research Center celebrated its largest PhD graduating class—seven scholars whose dissertations span deep‑vision hand‑gesture estimation, multi‑scenario network transmission, graph alignment, interactive streaming, knowledge‑defined networking, wireless body‑area networking, and more—showcasing significant AI‑driven advances and high‑impact publications.

Computer Vision · Deep Learning · Graph Alignment

0 likes · 30 min read

Alimama Tech

May 31, 2023 · Artificial Intelligence

CF-Font: Content Fusion for Few-shot Font Generation

CF‑Font introduces a content‑fusion module that linearly mixes base‑font content features using a font‑level distance metric, combined with iterative style refinement and a projection character loss, achieving state‑of‑the‑art few‑shot Chinese font generation that outperforms prior methods by over 5% on L1 and FID and is already used to create proprietary Alimama fonts.

Deep Learning · content fusion · few-shot font generation

0 likes · 10 min read

DataFunSummit

May 31, 2023 · Artificial Intelligence

Evolution of Face Detection Techniques: Datasets, Research Directions, and Future Work

This article reviews the evolution of face detection, covering the WIDER FACE dataset and major research directions such as feature fusion, label assignment, auxiliary supervision, anchor‑free methods, and NAS‑based designs; it summarizes key papers from S3FD to MogFace, introduces ModelScope implementations, and outlines future challenges and opportunities.

AI research · Computer Vision · Datasets

0 likes · 13 min read

Architects' Tech Alliance

May 29, 2023 · Artificial Intelligence

Overview of Huawei Ascend AI Full‑Stack Architecture, CANN, and AscendCL

This article introduces Huawei's Domain‑Specific Ascend AI architecture, detailing its four‑layer full‑stack design, the five‑layer abstract and three‑layer logical structures of the CANN heterogeneous computing framework, and the AscendCL programming interface with its advantages and application scenarios.

AI · Ascend · CANN

0 likes · 12 min read

JD Retail Technology

May 16, 2023 · Artificial Intelligence

Deploying and Fine‑Tuning the Alpaca‑LoRA Large Language Model on a Multi‑GPU Server

This guide details the end‑to‑end process of installing GPU drivers, setting up a Python environment, deploying the open‑source Alpaca‑LoRA model, fine‑tuning it with Chinese data on a multi‑GPU server, and performing inference, while highlighting practical challenges and performance observations.

Alpaca-LoRA · Deep Learning · Fine-tuning

0 likes · 11 min read

Architects' Tech Alliance

May 15, 2023 · Artificial Intelligence

How Transformer Powers ChatGPT: A Deep Dive into Attention and Architecture

This article provides a comprehensive analysis of the Transformer model behind ChatGPT, covering its origin, core mechanisms such as embedding, positional encoding, self‑attention, multi‑head attention, a step‑by‑step translation example, and the broader implications for AI research and industry.

AI Architecture · Attention Mechanism · ChatGPT

0 likes · 19 min read

Full-Stack Trendsetter

May 15, 2023 · Artificial Intelligence

Do You Really Understand ChatGPT, the Era‑Defining AI?

This article explains what ChatGPT is, how it builds on natural-language-processing and the Transformer-based GPT series, details its model-size growth, architectural enhancements, multilingual support, and walks through the tokenization-to-generation pipeline that enables coherent AI-driven conversations.

ChatGPT · Deep Learning · GPT-3

0 likes · 8 min read

DataFunTalk

May 13, 2023 · Artificial Intelligence

Multimedia Content Understanding at Weibo: Video Summarization, Quality Assessment, OCR, Embedding, and CV‑CUDA Optimization

This article presents Weibo's comprehensive multimedia content understanding pipeline, covering video summarization techniques, quality assessment models, OCR advancements, video embedding strategies, and the performance benefits of CV‑CUDA acceleration, while highlighting real‑world applications and engineering trade‑offs.

CV-CUDA · Computer Vision · Deep Learning

0 likes · 32 min read

Alimama Tech

May 10, 2023 · Artificial Intelligence

How AdaSparse Boosts Multi‑Scenario CTR Prediction with Adaptive Sparse Networks

AdaSparse introduces an adaptive sparse network that learns a dedicated sub‑network for each advertising scenario, balancing shared and specific knowledge while keeping computational cost low, and achieves +4.63% CTR and -3.82% CPC improvements in Alibaba’s external ad system, as validated on both public and massive production datasets.

Advertising · CTR prediction · Deep Learning

0 likes · 20 min read

DaTaobao Tech

Apr 28, 2023 · Artificial Intelligence

Multi-Scenario Recommendation Model

The paper introduces SASS, a scenario-adaptive self-supervised recommendation model that uses contrastive pre-training and multi-layer gating to expand global samples and transfer scene-aware parameters, enabling a single model to deliver personalized recommendations across diverse Taobao ‘SuoSuo’ scenarios while mitigating data sparsity and cross-domain challenges.

AI · Deep Learning · Recommendation Systems

0 likes · 23 min read

21CTO

Apr 27, 2023 · Artificial Intelligence

Demystifying Transformers: A Step‑by‑Step Guide to Self‑Attention and Architecture

This article explains the Transformer model—from its encoder‑decoder structure and self‑attention mechanism to multi‑head attention, positional encoding, residual connections, training loss, and inference strategies—providing a clear, visual walkthrough for readers new to modern NLP architectures.

Deep Learning · Self-Attention · Transformer

0 likes · 21 min read

High Availability Architecture

Apr 27, 2023 · Artificial Intelligence

Design and Optimization of Bilibili's Large‑Scale Video Duplicate Detection System

This article describes the design, algorithmic improvements, and engineering performance optimizations of Bilibili's massive video duplicate detection (collision) system, covering challenges of low‑edit‑degree reposts, two‑stage retrieval, self‑supervised feature extraction, GPU‑accelerated preprocessing, and the resulting gains in accuracy and throughput.

Bilibili · Deep Learning · feature extraction

0 likes · 17 min read

DevOps

Apr 25, 2023 · Artificial Intelligence

The Bitter Lesson: Why Brute‑Force Computation Outperforms Hand‑Crafted Knowledge in AI

Richard Sutton’s “The Bitter Lesson” argues that over the past seven decades the most powerful driver of AI progress has been general‑purpose compute and large‑scale search, which consistently surpasses methods that rely on human‑engineered knowledge across domains such as chess, Go, speech recognition, and computer vision.

AI · Deep Learning · brute force

0 likes · 7 min read

DataFunSummit

Apr 21, 2023 · Artificial Intelligence

Fine‑Tuning a ViT Image Classification Model on a Small Flower Dataset Using ModelScope

This tutorial walks through the complete process of fine‑tuning a Vision Transformer (ViT) model for 14‑class flower image classification on ModelScope, covering dataset preparation, model loading, training configuration, evaluation, and inference with practical code examples.

Deep Learning · Fine-tuning · Image Classification

0 likes · 14 min read

Bilibili Tech

Apr 21, 2023 · Artificial Intelligence

Design and Optimization of Bilibili's Large-Scale Video Duplicate Detection System

Bilibili built a massive video‑duplicate detection platform that trains a self‑supervised ResNet‑50 feature extractor, removes black borders, and uses a two‑stage ANN‑plus‑segment‑level matching pipeline accelerated by custom GPU decoding and inference, boosting duplicate rejection 7.5×, recall 3.75×, and cutting manual misses from 65 to 5 per day.

Deep Learning · GPU Acceleration · feature extraction

0 likes · 19 min read

DataFunSummit

Apr 11, 2023 · Artificial Intelligence

OneFlow Coop: Joint Optimization of Dynamic‑Graph Recomputation and Memory Allocation

This article introduces OneFlow Coop, a memory‑optimization technique that jointly optimizes dynamic‑graph recomputation strategies and GPU memory allocation by analyzing existing DTR limitations, proposing recomputable in‑place, op‑guided tensor allocation, and layout‑aware eviction modules, and demonstrating superior experimental results.

Deep Learning · Dynamic Graph · GPU Memory

0 likes · 18 min read

Programmer DD

Apr 10, 2023 · Artificial Intelligence

Why ChatGPT Sparks Panic and What Its Real Technical Foundations Are

In this talk, AI expert Wu Jun explains why ChatGPT has caused widespread fear, traces the historical development of language models from the 1970s to today, clarifies the massive computational and data requirements, and discusses the real impact and opportunities of large‑scale AI systems.

AI hype · ChatGPT · Deep Learning

0 likes · 20 min read

Python Crawling & Data Mining

Apr 5, 2023 · Artificial Intelligence

Why ChatGPT Works: Inside Transformers, RLHF, and AI’s Latest Breakthroughs

This article explores how ChatGPT’s remarkable abilities stem from the Transformer architecture, reinforcement learning from human feedback, and the insights presented in the fourth edition of "Artificial Intelligence: A Modern Approach," highlighting key AI milestones and technical foundations.

ChatGPT · Deep Learning · RLHF

0 likes · 9 min read

DataFunTalk

Apr 3, 2023 · Artificial Intelligence

Implementing RNN, LSTM, and GRU with PyTorch

This article introduces the basic architectures of recurrent neural networks (RNN), LSTM, and GRU, explains PyTorch APIs such as nn.RNN, nn.LSTM, nn.GRU, details their parameters, demonstrates code examples for building and testing these models, and provides practical insights for deep learning practitioners.

Deep Learning · GRU · LSTM

0 likes · 9 min read

DataFunTalk

Apr 1, 2023 · Artificial Intelligence

Nvidia Meets OpenAI: Highlights from the GTC Fireside Chat on GPT‑4, Deep Learning History, and the Future of AI

In a GTC fireside chat, Nvidia CEO Jensen Huang and OpenAI co‑founder Ilya Sutskever discuss GPT‑4's multimodal advances, the evolution of deep learning from early neural networks to large‑scale models, the pivotal role of GPUs and datasets like ImageNet, and their vision for more reliable, scalable artificial intelligence.

Deep Learning · GPT-4 · Neural Networks

0 likes · 10 min read

21CTO

Mar 31, 2023 · Artificial Intelligence

From Student to AI Pioneer: Ilya Sutskever’s Journey Behind ChatGPT

This article chronicles Ilya Sutskever’s two‑decade rise from a young researcher to a leading figure in artificial intelligence, highlighting his early mentorship, breakthroughs in image recognition, language translation, the founding of OpenAI, and the development of GPT and DALL‑E models.

AI research · Deep Learning · GPT

0 likes · 13 min read

DataFunTalk

Mar 24, 2023 · Artificial Intelligence

Deep UPLIFT Modeling: Techniques, Challenges, and FinTech Applications

This article provides a comprehensive overview of deep UPLIFT models, covering their fundamentals, key technical challenges such as confounding bias and inductive bias, the evolution of meta‑learner and deep architectures, and practical case studies in financial technology marketing.

Deep Learning · FinTech · Marketing Optimization

0 likes · 14 min read

Python Programming Learning Circle

Mar 22, 2023 · Artificial Intelligence

Overview of PyTorch 2.0 Features and New APIs

The article provides a detailed overview of PyTorch 2.0, highlighting its stable and beta features such as torch.compile, accelerated transformers, MPS backend, new quantization support, and prototype parallelism tools, while emphasizing performance improvements for dynamic shapes, distributed training, and CPU/GPU inference.

AI · Accelerated Transformers · Deep Learning

0 likes · 6 min read
