Tagged articles
1235 articles
21CTO
Feb 18, 2024 · Artificial Intelligence

How OpenAI’s Sora Turns Text into Realistic 60‑Second Videos

OpenAI’s newly unveiled Sora system generates high‑quality videos of up to 60 seconds from plain text prompts. The article describes it as a data‑driven physics engine reportedly trained on synthetic data from Unreal Engine 5, credits researchers including Tim Brooks and Bill Peebles, and presents it as a major breakthrough in AI video generation.

Deep Learning · OpenAI · generative AI
0 likes · 6 min read
DataFunSummit
Feb 9, 2024 · Artificial Intelligence

STAN: A User‑Lifecycle‑Based Multi‑Task Recommendation Model for Shopee

The article introduces STAN, a multi‑task recommendation framework that leverages user lifecycle segmentation to jointly optimize CTR, stay‑time, and CVR, detailing the business context, key challenges, solution architecture, offline and online evaluations, and future research directions.

CTR · CVR · Deep Learning
0 likes · 8 min read
Baidu Geek Talk
Feb 5, 2024 · Artificial Intelligence

Why Static Graphs Outperform Dynamic Graphs in AutoDiff: A Deep Dive

This article explains the fundamental differences between static and dynamic computation graphs, compares their memory and performance characteristics, shows how automatic differentiation works in each paradigm, and provides a step‑by‑step implementation of a toy static‑graph AutoDiff engine with Python code examples.

AutoDiff · Deep Learning · Dynamic Graph
0 likes · 18 min read
360 Smart Cloud
Jan 26, 2024 · Artificial Intelligence

Parallel Strategies for Distributed Deep Learning Training

This article reviews distributed training techniques for large deep‑learning models, covering data parallelism, model parallelism (including pipeline and tensor parallelism), gradient bucketing and accumulation, 3D parallelism, and practical implementations such as Megatron‑LM and 360AI platform optimizations.

AI · Data Parallelism · Deep Learning
0 likes · 22 min read
Rare Earth Juejin Tech Community
Jan 14, 2024 · Artificial Intelligence

Understanding and Implementing LoRA (Low‑Rank Adaptation) for Model Training with PyTorch

This article explains the principle of LoRA (Low‑Rank Adaptation) for large language models, demonstrates how to decompose weight updates into low‑rank matrices, and provides a complete PyTorch implementation that fine‑tunes a small VGG‑19 network on a custom goldfish dataset.

Deep Learning · LoRA · Neural Networks
0 likes · 11 min read
DataFunSummit
Jan 1, 2024 · Artificial Intelligence

Advances in Image and Video Enhancement, Quality Assessment, and Multimodal AI Techniques

This article reviews the latest research from Alibaba DAMO Academy on real-world image quality problems, covering spatial, temporal, and color enhancement methods, advanced quality assessment metrics, multimodal diffusion models, and future directions toward large‑model integration and lightweight deployment.

Deep Learning · MOS regression · Multimodal AI
0 likes · 24 min read
Sohu Tech Products
Dec 27, 2023 · Artificial Intelligence

Analysis of LLaMA Model Architecture in the Transformers Library

This article walks through the core LLaMA implementation in HuggingFace’s Transformers library, detailing the inheritance hierarchy, configuration defaults, model initialization, embedding and stacked decoder layers, the RMSNorm‑based attention and MLP modules, and the forward pass that produces normalized hidden states.

Deep Learning · Model architecture · PyTorch
0 likes · 14 min read
Huolala Tech
Dec 26, 2023 · Artificial Intelligence

How AI Powers Scalable Multilingual and Timezone Testing for Global Apps

This article explains how a deep‑learning‑driven AI platform tackles the complex challenges of multilingual and multi‑timezone testing for a rapidly expanding international app, detailing the architecture, data pipelines, model training, and the resulting efficiency, accuracy, and coverage gains.

AI · Deep Learning · Multilingual Testing
0 likes · 14 min read
DataFunTalk
Dec 10, 2023 · Artificial Intelligence

PyTorch Model Training Performance Tuning Guide

This guide provides comprehensive techniques for optimizing PyTorch training performance and efficiency, covering all model types such as CNNs, RNNs, GANs, and transformers, and applicable across domains like computer vision and natural language processing, targeting AI/ML platform engineers, data engineers, backend developers, MLOps, SREs, architects, and machine learning engineers.

AI · Deep Learning · PyTorch
0 likes · 2 min read
DataFunSummit
Dec 9, 2023 · Artificial Intelligence

Causal Learning Paradigms: From Prior Causal Structure to Causal Discovery

This article reviews the growing interest in causal learning within machine learning, explaining what causal learning is, its advantages over purely correlational methods, and detailing two main paradigms—learning with known causal structures and learning via causal discovery—along with examples, challenges, and future directions.

Deep Learning · causal discovery · causal inference
0 likes · 12 min read
Airbnb Technology Team
Dec 8, 2023 · Artificial Intelligence

Leveraging Image Aesthetics and Photo Sorting Algorithms to Enhance Airbnb Listings

Airbnb’s new computer‑vision pipeline trains a deep‑learning aesthetic model with an EMD loss to rank photos, automatically sorts new‑listing images by design and room type, and scales real‑time similarity search via HNSW‑based ANN on AWS OpenSearch, boosting click‑through, bookings, and enabling unsupervised visual recommendations.

Airbnb · Computer Vision · Deep Learning
0 likes · 9 min read
Rare Earth Juejin Tech Community
Dec 8, 2023 · Artificial Intelligence

Simplifying Transformer Blocks: Removing Residual Connections, LayerNorm, and Other Components without Losing Performance

A recent ETH Zurich paper shows that standard Transformer blocks can be drastically simplified by removing residual connections, LayerNorm, projection and value parameters, and even MLP sub‑block components, achieving up to 16% fewer parameters and comparable training speed and downstream performance on both GPT‑style decoders and BERT models.

AI · Deep Learning · LLM
0 likes · 11 min read
IT Services Circle
Dec 6, 2023 · Artificial Intelligence

AI Image Outpainting: Unexpected Transformations and How It Works

The article showcases a series of humorous and surprising AI‑generated image expansions from Douyin, explains the underlying outpainting technology, and discusses why such tools are both entertaining and useful despite occasional odd results.

AI · Computer Vision · Deep Learning
0 likes · 6 min read
Huolala Tech
Nov 28, 2023 · Mobile Development

How HuoLala Built a Low‑Cost, High‑Reliability Mobile UI Automation Platform

This article details HuoLala's journey from a weekly release cycle to a cloud‑based record‑and‑replay mobile UI automation platform, covering background challenges, industry analysis, technical design—including deep‑learning based control detection, SIFT image matching, script generation, playback handling, and platform features—while demonstrating significant testing efficiency gains and future AI‑driven enhancements.

Deep Learning · SIFT · UI automation
0 likes · 21 min read
dbaplus Community
Nov 27, 2023 · Artificial Intelligence

Build an Image‑Search Engine with Elasticsearch 8.x and CLIP

This guide explains how to implement reverse image search by extracting visual features with a multilingual CLIP model, storing the vectors in Elasticsearch 8.x, and using its k‑NN plugin to retrieve similar images, covering architecture, tools, code snippets, and results.

CLIP · Deep Learning · image search
0 likes · 9 min read
JD Retail Technology
Nov 23, 2023 · Artificial Intelligence

Recent Advances in Advertising Recommendation Algorithms and Their Applications

This article reviews recent progress in advertising recommendation technologies, covering deep learning‑based ranking, sequence modeling, self‑supervised learning, online and reinforcement learning, multimodal recommendation, and fairness. It details four key breakthroughs (data‑driven incremental learning, dynamic group parameter modeling, bilateral interactive graph convolution, and a relation‑aware diffusion model for poster layout generation), along with experimental results and future challenges.

Deep Learning · Incremental Learning · advertising recommendation
0 likes · 25 min read
Rare Earth Juejin Tech Community
Nov 15, 2023 · Artificial Intelligence

Understanding the Transformer Architecture: Encoder, Decoder, and Attention Mechanisms

This article explains the Transformer model, comparing it with RNNs, detailing its encoder‑decoder structure, multi‑head and scaled dot‑product attention, embedding layers, feed‑forward networks, and the final linear‑softmax output, supplemented with diagrams and code examples.

Deep Learning · Encoder-Decoder · Neural Networks
0 likes · 10 min read
Network Intelligence Research Center (NIRC)
Nov 9, 2023 · Artificial Intelligence

How Wav2Lip Achieves Accurate Speech‑Driven Lip Sync with Expert Discriminators

The article analyzes the limitations of traditional speech‑driven lip‑sync methods and explains how Wav2Lip introduces a pretrained multi‑frame expert sync discriminator, a two‑stage GAN training pipeline, and a specialized generator architecture to produce high‑quality, audio‑aligned facial videos.

Computer Vision · Deep Learning · GAN
0 likes · 7 min read
NetEase Media Technology Team
Nov 6, 2023 · Artificial Intelligence

Overview of Sequential Recommendation Models

The article surveys sequential recommendation models from early non-deep approaches like FPMC, through RNN-based GRU4Rec and CNN-based Caser, to Transformer-based methods such as SASRec, BERT4Rec, TiSASRec, and recent contrastive-learning techniques, recommending SASRec or its variants for production use.

Deep Learning · Transformer · contrastive learning
0 likes · 17 min read
Python Programming Learning Circle
Oct 26, 2023 · Artificial Intelligence

Animal Recognition Techniques Using Deep Learning and Image Processing

This article reviews animal recognition technology, covering its background, basic principles, image‑processing, feature extraction, machine‑learning and deep‑learning methods, dataset construction, preprocessing, and feature‑selection techniques, and provides Python code examples for implementing CNNs and traditional classifiers.

Computer Vision · Deep Learning · Image Processing
0 likes · 18 min read
Model Perspective
Oct 18, 2023 · Fundamentals

Unlock the Power of Convolution: From Signal Smoothing to Deep Learning

This article explains the mathematical definition of convolution, walks through discrete and continuous examples, demonstrates its use in signal smoothing with moving averages, and surveys its wide-ranging applications in signal processing, communications, computer vision, seismology, medical imaging, and statistics.

Convolution · Deep Learning · Image Processing
0 likes · 7 min read
DataFunSummit
Oct 17, 2023 · Artificial Intelligence

DataFunSummit2023: Deep Learning‑Driven Multi‑Experiment Causal Inference and Distributed Causal Tools

The DataFunSummit2023 online conference brings together experts from Tencent and Kuaishou to present cutting‑edge research on causal inference for large‑scale A/B testing, including deep‑learning‑based multi‑experiment effect estimation, a distributed causal inference framework (Fast‑Causal‑Inference), and strategies for evaluating long‑term policy impacts.

A/B testing · Data Science · Deep Learning
0 likes · 7 min read
Kuaishou Tech
Oct 17, 2023 · Artificial Intelligence

QIN: A Query‑Dominated User Interest Network for Personalized Search

The paper introduces QIN, a query‑driven user interest network that combines a Relevance Search Unit and a Fused Attention Unit to effectively leverage full‑history user behavior for personalized search, demonstrating significant performance gains in offline benchmarks and online A/B tests.

Deep Learning · fused attention · personalized search
0 likes · 9 min read
Meituan Technology Team
Oct 11, 2023 · Artificial Intelligence

Meituan Vision AI Research Highlights and Open‑Source Releases

This article compiles Meituan's cutting‑edge computer‑vision research and engineering achievements—including CVPR award‑winning segmentation, YOLOv6 releases, GPU inference optimizations, the Food2K dataset, and numerous paper digests—to provide practical insights for visual AI practitioners.

CVPR · Computer Vision · Deep Learning
0 likes · 11 min read
Architect
Oct 4, 2023 · Artificial Intelligence

How AI-Driven Digital Watermarks Achieve Robust, Invisible Protection for Video

This article examines the challenges of video copyright protection, critiques traditional visible and invisible watermark methods, and presents a deep‑learning based AI digital watermark solution that balances invisibility and robustness, detailing its network architecture, degradation layer, loss functions, block encoding, anchor calibration, and large‑scale experimental results.

AI video protection · Deep Learning · Robustness
0 likes · 22 min read
DataFunSummit
Oct 3, 2023 · Artificial Intelligence

Time Series Forecasting for NIO Power Swap Stations: Business Background, Challenges, Algorithm Practice, and Future Outlook

This article presents a comprehensive case study of NIO's Power swap‑station ecosystem, detailing the business context, key forecasting challenges, the evolution from classical statistical models to deep‑learning architectures with specialized embeddings, and the practical outcomes and future plans for improving prediction accuracy.

Deep Learning · Electric Vehicle · Embedding
0 likes · 16 min read
DataFunSummit
Sep 29, 2023 · Artificial Intelligence

Social4Rec: Enhancing Video Recommendation with Social Interest Networks

This article introduces Social4Rec, a video recommendation algorithm that tackles user cold‑start problems by extracting and integrating social interest information through coarse‑ and fine‑grained interest extractors, attention‑based fusion, and extensive offline and online experiments demonstrating significant CTR improvements.

Deep Learning · attention · cold-start
0 likes · 14 min read
Bilibili Tech
Sep 29, 2023 · Artificial Intelligence

BILIVQA: Bilibili's No-Reference Video Quality Assessment System

BILIVQA is Bilibili’s deep‑learning, no‑reference video quality assessment system that trains on a proprietary 5,000‑video UGC dataset, extracts spatial and temporal features via MobileNet‑V2 and X3D, uses mixed‑dataset regression for strong generalization, and deploys a GPU‑optimized TensorRT pipeline with percentile‑based scoring for reliable quality monitoring and downstream applications.

BILIVQA · Deep Learning · model engineering
0 likes · 27 min read
Zhuanzhuan Tech
Sep 28, 2023 · Artificial Intelligence

Evolution of Language Models and an Overview of the GPT Series

This article surveys the development of natural language processing from early rule‑based systems through statistical n‑gram models, neural language models, RNNs, LSTMs, ELMo, Transformers and BERT, and then details the architecture, training methods, advantages and limitations of the GPT‑1, GPT‑2, GPT‑3, ChatGPT and GPT‑4 models, concluding with a discussion of future challenges and references.

Deep Learning · GPT · NLP
0 likes · 30 min read
Kuaishou Large Model
Sep 27, 2023 · Artificial Intelligence

DVIS: Decoupled Framework that Sets New SOTA in Video Instance Segmentation

DVIS introduces a decoupled video instance segmentation framework that splits the task into segmentation, tracking, and refinement modules, achieving state-of-the-art performance across VIS, VPS, and VSS benchmarks while maintaining low computational overhead, and demonstrates robustness in both online and offline settings.

Computer Vision · Deep Learning · Transformer
0 likes · 12 min read
DaTaobao Tech
Sep 27, 2023 · Artificial Intelligence

FlashAttention-2: Efficient Attention Algorithm for Transformer Acceleration and AIGC Applications

FlashAttention‑2 is an IO‑aware exact attention algorithm that cuts GPU HBM traffic through tiling and recomputation, optimizes non‑matmul FLOPs, expands sequence‑parallelism and warp‑level work distribution, delivering up to 2× speedup over FlashAttention, near‑GEMM efficiency, and enabling longer‑context Transformer training and inference for AIGC with fastunet and negligible accuracy loss.

AIGC · Attention optimization · Deep Learning
0 likes · 20 min read
Kuaishou Tech
Sep 26, 2023 · Artificial Intelligence

Cross-Domain Product Representation (COPE): A Large-Scale Dataset and Baseline Model for Rich‑Content E‑Commerce

The paper introduces ROPE, the first large‑scale cross‑domain product recognition dataset covering detail pages, short videos and live streams, and proposes COPE, a dual‑tower multimodal model that learns unified product embeddings using contrastive and classification losses, achieving superior retrieval and few‑shot classification performance across domains.

Dataset · Deep Learning · contrastive learning
0 likes · 13 min read
Kuaishou Tech
Sep 25, 2023 · Artificial Intelligence

LPR4M: A Large-Scale Multimodal Livestreaming Product Recognition Dataset and the RICE Cross‑View Semantic Alignment Model

This paper introduces LPR4M, a 4‑million‑pair multimodal dataset for livestreaming product recognition, and proposes the RICE model that combines instance‑level contrastive learning with patch‑level cross‑view semantic alignment, demonstrating state‑of‑the‑art performance on both LPR4M and MovingFashion benchmarks.

Deep Learning · cross-view alignment · livestreaming
0 likes · 19 min read
Bilibili Tech
Sep 22, 2023 · Artificial Intelligence

AI-Based Digital Watermarking for Video: Design, Training Strategies, and Engineering Deployment

The paper presents an AI‑driven invisible video watermarking system that combines a convolutional encoder/decoder with SE blocks, a simulated‑JPEG degradation layer, multi‑term loss, block‑wise processing, anchor‑based alignment and redundancy voting, achieving high visual fidelity and robust recovery after double‑compression in large‑scale platforms like Bilibili.

AI · Deep Learning · anchor calibration
0 likes · 21 min read
HomeTech
Sep 21, 2023 · Artificial Intelligence

Homepage Pop‑up Recommendation System for Car Purchase Intent: Background, Feature Engineering, Model and Strategy Optimization, and Results

This article details how AutoHome's homepage pop‑up leverages precise targeting, extensive feature engineering, and multi‑stage DeepFM‑based models with attention and LHUC modules to accurately identify car‑buying users, improve vehicle‑series recommendations, and achieve a 355% conversion rate increase.

AI · Deep Learning · car buying
0 likes · 7 min read
Ant R&D Efficiency
Sep 19, 2023 · Artificial Intelligence

From the Turing Test to GPT‑4: A Historical Overview of Chatbots and Deep Learning

From Turing’s 1950 imitation game to GPT‑4’s multimodal vision‑language capabilities, the field has evolved from simple rule‑based programs like ELIZA and PARRY, through statistical learning and the 2017 Transformer breakthrough, to large-scale generative models that achieve fluent conversation yet still grapple with hallucination and true understanding.

Chatbot History · Deep Learning · GPT-4
0 likes · 25 min read
Alibaba Cloud Infrastructure
Sep 13, 2023 · Artificial Intelligence

Pai‑Megatron‑Patch: Design Principles, Key Features, and End‑to‑End Usage for Large Language Model Training

This article introduces the open‑source Pai‑Megatron‑Patch tool from Alibaba Cloud, explains its non‑intrusive patch architecture, enumerates supported models and features such as weight conversion, Flash‑Attention 2.0, FP8 training with Transformer Engine, and provides detailed command‑line examples for model conversion, pre‑training, supervised fine‑tuning, inference, and RLHF reinforcement learning pipelines.

Deep Learning · FP8 · LLM
0 likes · 19 min read
NetEase Cloud Music Tech Team
Sep 6, 2023 · Artificial Intelligence

Timbre‑Guided TG‑Critic and Transformer‑Based TrOMR: AI Advances in Music Evaluation

This article reviews two recent AI research papers from NetEase Cloud Music Lab: TG‑Critic, a timbre‑guided, reference‑free singing evaluation model that classifies vocal performance using only audio, and TrOMR, a Transformer‑based end‑to‑end polyphonic optical music recognition system that improves note‑sequence prediction and dataset realism.

Audio Analysis · Deep Learning · Music Evaluation
0 likes · 6 min read
Alibaba Cloud Developer
Sep 4, 2023 · Artificial Intelligence

Hands‑On Building a Transformer from Scratch with PyTorch

This tutorial walks you through implementing a full Transformer model in PyTorch, starting from basic linear‑regression code, adding attention mechanisms, multi‑head attention, encoder‑decoder architecture, training loops, and inference, all reinforced with practical debugging tips.

Deep Learning · NLP · PyTorch
0 likes · 17 min read
TAL Education Technology
Aug 31, 2023 · Artificial Intelligence

Research on Content-Based Image Retrieval Techniques

This article reviews the fundamentals, feature extraction methods, evaluation metrics, and common datasets of content‑based image retrieval (CBIR), discussing traditional low‑level features, local descriptors, unsupervised and supervised learning approaches, and recent deep‑learning models for improving retrieval performance.

CBIR · Datasets · Deep Learning
0 likes · 13 min read
Network Intelligence Research Center (NIRC)
Aug 30, 2023 · Artificial Intelligence

DeepQueueNet: Scalable Network Performance Estimation with Packet‑Level Visibility

DeepQueueNet combines discrete‑event and continuous simulation with deep neural networks to deliver highly accurate, generalizable, and GPU‑scalable network performance estimates at packet‑level granularity, outperforming existing DNN‑based estimators across diverse topologies and traffic scenarios.

DES · DNN · Deep Learning
0 likes · 5 min read
DataFunSummit
Aug 24, 2023 · Artificial Intelligence

Panoramic Indoor Layout Estimation with Vision Transformer (PanoViT)

This article introduces the PanoViT model, a vision‑transformer‑based approach for indoor layout estimation from panoramic images, covering its research background, architectural components, experimental results on public datasets, and step‑by‑step usage within ModelScope.

3D reconstruction · Computer Vision · Deep Learning
0 likes · 8 min read
Rare Earth Juejin Tech Community
Aug 24, 2023 · Artificial Intelligence

Neural Style Transfer with PyTorch: Theory and Implementation

This article introduces neural style transfer, explains its underlying principles using VGG19 feature extraction, content and style loss definitions, and provides a complete PyTorch implementation with code for loading images, extracting features, computing Gram matrices, and optimizing the output image.

Computer Vision · Deep Learning · PyTorch
0 likes · 14 min read
Top Architect
Aug 22, 2023 · Artificial Intelligence

Face Recognition Search: Principles, Implementation Steps, and Applications

This article explains the background, core principles, preprocessing, feature extraction, matching algorithms, and practical application scenarios of face recognition search, and provides detailed reference implementations with Java and OpenCV code examples for building a complete system.

Computer Vision · Deep Learning · Image Processing
0 likes · 15 min read
Ele.me Technology
Aug 22, 2023 · Artificial Intelligence

Multi-Granularity Attention Model for Group Recommendation (MGAM)

The Multi‑Granularity Attention Model (MGAM) improves group recommendation by extracting subset, group, and superset preferences through hierarchical attention and graph neural networks, fusing them via self‑attention, and achieves state‑of‑the‑art offline results and a 1.2% online CTR lift in Alibaba’s local‑life services.

AI · Deep Learning · Recommendation Systems
0 likes · 18 min read
HelloTech
Aug 22, 2023 · Artificial Intelligence

AI Platform Architecture and Automation in Machine Learning

An end‑to‑end AI platform integrates feature processing, model training, deployment, and decision orchestration across offline and online layers, leveraging automated pipelines such as AutoML (feature engineering, hyper‑parameter optimization, neural architecture search) built on Ray Tune and NNI, which have already boosted CTR in real‑world advertising and aim to make every user an algorithm engineer.

AI Platform · AutoML · Deep Learning
0 likes · 8 min read
DaTaobao Tech
Aug 21, 2023 · Artificial Intelligence

Action Sensitivity Learning for Temporal Action Localization

The paper presents Action Sensitivity Learning (ASL), a framework that models frame‑wise importance at both class‑level (via learnable Gaussian distributions) and instance‑level (using quality scores), integrates these weights into classification and regression losses, adds a contrastive InfoNCE term, and achieves state‑of‑the‑art temporal action localization performance across six benchmark datasets.

Action Sensitivity Learning · Computer Vision · Deep Learning
0 likes · 8 min read
Ele.me Technology
Aug 16, 2023 · Artificial Intelligence

Spatiotemporal-Enhanced Network for Click-Through Rate Prediction in Location‑Based Services

The paper introduces StEN, a spatiotemporal-enhanced network for CTR prediction in location-based services, combining static spatiotemporal feature activation, dynamic preference activation, and target attention, achieving state-of-the-art offline results and a 1.6% CTR lift in online tests.

Deep Learning · Recommendation Systems · click-through rate
0 likes · 19 min read
Rare Earth Juejin Tech Community
Aug 16, 2023 · Artificial Intelligence

Deep Dive into OCR – Chapter 2: Development and Classification of OCR Technology

This article provides a comprehensive overview of OCR technology, detailing the evolution from traditional hand‑crafted methods to modern deep‑learning approaches, describing image preprocessing, text detection and recognition pipelines, summarizing classic machine‑learning algorithms, and presenting a practical OpenCV implementation with Python code.

Computer Vision · Deep Learning · OCR
0 likes · 23 min read
Rare Earth Juejin Tech Community
Aug 12, 2023 · Artificial Intelligence

An Introduction to OCR: Concepts, History, Applications, Datasets, and Technical Workflow

This article provides a comprehensive overview of Optical Character Recognition (OCR), covering its definition, historical development, classification, real‑world applications, technical pipeline, common challenges, mitigation strategies, popular datasets, model performance comparisons, and leading open‑source platforms.

Computer Vision · Datasets · Deep Learning
0 likes · 16 min read
Kuaishou Tech
Aug 11, 2023 · Artificial Intelligence

PEPNet: Parameter and Embedding Personalized Network for Multi‑Task Multi‑Domain Recommendation

The paper introduces PEPNet, a plug‑and‑play network that tackles the domain‑seesaw and task‑seesaw problems in multi‑scenario recommendation by using a gated personalization module (GateNU) together with embedding‑level (EPNet) and parameter‑level (PPNet) personalization, and demonstrates its superiority through extensive offline and online experiments on Kuaishou data.

Deep Learning · Embedding · gate network
0 likes · 11 min read
Rare Earth Juejin Tech Community
Jul 31, 2023 · Artificial Intelligence

Overview of Deep Neural Network Architectures

This article provides a comprehensive overview of deep neural network families, introducing twelve major architectures—including Feedforward, CNN, RNN, LSTM, DBN, GAN, Autoencoder, Residual, Capsule, Transformer, Attention, and Deep Reinforcement Learning—explaining their principles, structures, training methods, and offering Python/TensorFlow/PyTorch code examples.

CNN · Deep Learning · GAN
0 likes · 29 min read
Rare Earth Juejin Tech Community
Jul 26, 2023 · Artificial Intelligence

Building and Training a Fully Connected Neural Network for Fashion-MNIST Classification with PyTorch

This tutorial demonstrates how to download the Fashion‑MNIST dataset, build a four‑layer fully connected neural network with PyTorch, and train it with a loss function, the Adam optimizer, learning‑rate strategies, and Dropout to achieve high‑accuracy multi‑class image classification.

Adam · Deep Learning · Dropout
0 likes · 17 min read
Rare Earth Juejin Tech Community
Jul 24, 2023 · Artificial Intelligence

Understanding Slide-Transformer: An Efficient Local Attention Module for Vision Transformers

This article explains the Slide-Transformer paper, describing how the proposed Slide Attention replaces inefficient Im2Col‑based local attention with depthwise convolutions and a deformable shift module, achieving high efficiency, flexibility, and hardware‑agnostic performance for Vision Transformers.

Computer Vision · Deep Learning · Deformable Shift
0 likes · 13 min read
Nightwalker Tech
Jul 19, 2023 · Artificial Intelligence

Step‑by‑Step Implementation of Transformer Blocks, Attention, Normalization, Feed‑Forward, Encoder and Decoder in PyTorch

This article provides a comprehensive tutorial on building the core components of a Transformer model—including multi‑head attention, layer normalization, feed‑forward networks, encoder and decoder layers—and assembles them into a complete PyTorch implementation, supplemented with explanatory diagrams and runnable code.

Decoder · Deep Learning · Encoder
0 likes · 13 min read
Test Development Learning Exchange
Jul 12, 2023 · Fundamentals

Common Python Libraries and Practical Projects: NumPy, Pandas, Matplotlib, Scikit‑learn, Requests, Beautiful Soup, Selenium, Pygame, Flask, PyTorch

This article introduces ten widely used Python libraries—NumPy, Pandas, Matplotlib, Scikit‑learn, Requests, Beautiful Soup, Selenium, Pygame, Flask, and PyTorch—each accompanied by a concise real‑world project and complete code examples to help readers understand and apply them effectively.

Data Science · Deep Learning · Game Development
0 likes · 18 min read
Rare Earth Juejin Tech Community
Jul 12, 2023 · Artificial Intelligence

Comprehensive Guide to Vision Transformer (ViT): Architecture, Patch Tokenization, Embedding, Fine‑tuning, and Performance

This article provides an in‑depth, English‑language overview of Vision Transformer (ViT), covering its Transformer‑based architecture, patch‑to‑token conversion, token and position embeddings, fine‑tuning strategies such as 2‑D interpolation, experimental results versus CNNs, and the model’s broader significance for multimodal AI research.

Computer Vision · Deep Learning · Fine‑tuning
0 likes · 25 min read
Kuaishou Large Model
Jul 7, 2023 · Artificial Intelligence

How HairStep Revolutionizes Single-View 3D Hair Reconstruction

This paper introduces HairStep, a novel intermediate representation combining Strand Maps and Depth Maps, and demonstrates how it reduces domain gap and improves single‑view 3D hair reconstruction accuracy across multiple algorithms, supported by new annotated datasets (HiSa, HiDa) and fair evaluation metrics.

3D hair reconstruction · Computer Vision · Dataset
0 likes · 11 min read
DataFunSummit
Jul 1, 2023 · Artificial Intelligence

Alibaba Cloud Native Deep Learning Platform PAI‑DLC: Architecture, Features, and Future Outlook

This article introduces Alibaba Cloud's PAI‑DLC, a cloud‑native deep learning platform that integrates machine‑learning capabilities, containerized services, AI‑aware scheduling, GPU virtualization, elastic training with EasyScale, data access, and observability, and discusses its architecture, key features, and future directions.

AI Platform · Cloud Native · Deep Learning
0 likes · 16 min read
Architecture & Thinking
Jun 30, 2023 · Artificial Intelligence

How INT8 Quantization Supercharges Baidu's Search Models: Techniques and Insights

This article explores the rapid evolution of Baidu's semantic search models, the large GPU consumption they entail, and how extensive INT8 quantization, sensitivity analysis, calibration data augmentation, hyper‑parameter auto‑tuning, and advanced methods like Quantization‑Aware Training and SmoothQuant dramatically improve inference performance while preserving business metrics.

Deep Learning · Ernie · INT8 Quantization
0 likes · 17 min read
OPPO Kernel Craftsman
Jun 28, 2023 · Artificial Intelligence

ShaderNN 2.0: A Lightweight Mobile Deep Learning Inference Engine with OpenGL and Vulkan Support

ShaderNN 2.0 is a lightweight mobile deep learning inference engine supporting OpenGL and Vulkan, offering texture‑based zero‑copy I/O, hybrid shader implementation, and achieving significant latency and power reductions versus TensorFlow Lite and MNN, thereby enabling real‑time graphics‑AI tasks such as style transfer, denoising, super‑sampling, and Stable Diffusion on smartphones.

Deep Learning · GPU shader · OpenGL
0 likes · 16 min read
Bilibili Tech
Jun 27, 2023 · Artificial Intelligence

Design and Implementation of a Real-Time Advertising Feature Platform for CTR Prediction at Bilibili

To eliminate data fragmentation, feature inconsistencies, and multi‑language implementation challenges, Bilibili built a unified real‑time advertising feature platform that aligns offline, hourly, and online pipelines via a shared C++ library and JNI, boosting CTR prediction accuracy, cutting training costs, and increasing ad revenue by over 1%.

Advertising · CTR prediction · Deep Learning
0 likes · 11 min read
Efficient Ops
Jun 26, 2023 · Artificial Intelligence

How Multimodal AI Is Revolutionizing Credit Card Fraud Detection

Amid tightening financial regulations, ICBC's software team proposes a multimodal AI anti‑fraud framework that combines image, video, and structured data to detect deep‑fake, mask, and forged‑document attacks, enriches verification with cross‑modal cues, and outlines future expansion to text and speech modalities.

AI · Computer Vision · Deep Learning
0 likes · 7 min read
Programmer DD
Jun 25, 2023 · Artificial Intelligence

How to Build Image Search with Elasticsearch 8.x and CLIP Multilingual Model

This article explains the concept of image‑based search, why it matters, and provides a step‑by‑step guide to implement image search using Elasticsearch 8.x, feature‑extraction libraries, and the multilingual CLIP‑ViT‑B‑32 model, including code snippets and architecture overview.

Deep Learning · clip model · feature extraction
0 likes · 10 min read
Kuaishou Audio & Video Technology
Jun 20, 2023 · Artificial Intelligence

How a Low‑Latency Hierarchical Fusion Network Beats Echoes in Real‑Time Calls

At ICASSP 2023, Kuaishou’s audio team presented a low‑latency hierarchical fusion network for full‑band acoustic echo cancellation, detailing its multi‑stage design, asymmetric windowing, loss functions, training strategy, and achieving second place in the non‑personalized AEC Challenge, with real‑world deployment results.

Acoustic Echo Cancellation · Deep Learning · Hierarchical Fusion Network
0 likes · 13 min read
21CTO
Jun 10, 2023 · Artificial Intelligence

How Huang Xuedong’s Team Achieved Human-Level Speech Recognition at Microsoft

The article chronicles the career of Chinese AI pioneer Huang Xuedong, detailing his education, rise at Microsoft, leadership of Azure AI, groundbreaking human‑level speech recognition breakthroughs, the engineering feats behind them—including a ten‑network model and the CNTK framework—and his recent move to Zoom.

CNTK · Deep Learning · Microsoft
0 likes · 14 min read
Network Intelligence Research Center (NIRC)
Jun 9, 2023 · Artificial Intelligence

2023 NIRC PhD Graduates Reveal Cutting-Edge AI and Network Intelligence Research

In 2023 the Network Intelligence Research Center celebrated its largest PhD graduating class: seven scholars whose dissertations span deep‑vision hand‑gesture estimation, multi‑scenario network transmission, graph alignment, interactive streaming, knowledge‑defined networking, wireless body‑area networking, and more, showcasing significant AI‑driven advances and high‑impact publications.

Computer Vision · Deep Learning · Graph Alignment
0 likes · 30 min read
Alimama Tech
May 31, 2023 · Artificial Intelligence

CF-Font: Content Fusion for Few-shot Font Generation

CF‑Font introduces a content‑fusion module that linearly mixes base‑font content features using a font‑level distance metric, combined with iterative style refinement and a projection character loss, achieving state‑of‑the‑art few‑shot Chinese font generation that outperforms prior methods by over 5% on L1 and FID and is already used to create proprietary Alimama fonts.

Deep Learning · content fusion · few-shot font generation
0 likes · 10 min read
DataFunSummit
May 31, 2023 · Artificial Intelligence

Evolution of Face Detection Techniques: Datasets, Research Directions, and Future Work

This article reviews the evolution of face detection, covering the WIDER FACE dataset; major research directions such as feature fusion, label assignment, auxiliary supervision, anchor‑free methods, and NAS‑based designs; key papers from S3FD to MogFace; ModelScope implementations; and future challenges and opportunities.

AI research · Computer Vision · Datasets
0 likes · 13 min read
JD Retail Technology
May 16, 2023 · Artificial Intelligence

Deploying and Fine‑Tuning the Alpaca‑LoRA Large Language Model on a Multi‑GPU Server

This guide details the end‑to‑end process of installing GPU drivers, setting up a Python environment, deploying the open‑source Alpaca‑LoRA model, fine‑tuning it with Chinese data on a multi‑GPU server, and performing inference, while highlighting practical challenges and performance observations.

Alpaca-LoRA · Deep Learning · Fine-tuning
0 likes · 11 min read
Architects' Tech Alliance
May 15, 2023 · Artificial Intelligence

How Transformer Powers ChatGPT: A Deep Dive into Attention and Architecture

This article provides a comprehensive analysis of the Transformer model behind ChatGPT, covering its origin, core mechanisms such as embedding, positional encoding, self‑attention, multi‑head attention, a step‑by‑step translation example, and the broader implications for AI research and industry.

AI Architecture · Attention Mechanism · ChatGPT
0 likes · 19 min read
Full-Stack Trendsetter
May 15, 2023 · Artificial Intelligence

Do You Really Understand ChatGPT, the Era‑Defining AI?

This article explains what ChatGPT is, how it builds on natural language processing and the Transformer-based GPT series, details its model-size growth, architectural enhancements, and multilingual support, and walks through the tokenization-to-generation pipeline that enables coherent AI-driven conversations.

ChatGPT · Deep Learning · GPT-3
0 likes · 8 min read
DataFunTalk
May 13, 2023 · Artificial Intelligence

Multimedia Content Understanding at Weibo: Video Summarization, Quality Assessment, OCR, Embedding, and CV‑CUDA Optimization

This article presents Weibo's comprehensive multimedia content understanding pipeline, covering video summarization techniques, quality assessment models, OCR advancements, video embedding strategies, and the performance benefits of CV‑CUDA acceleration, while highlighting real‑world applications and engineering trade‑offs.

CV-CUDA · Computer Vision · Deep Learning
0 likes · 32 min read
Alimama Tech
May 10, 2023 · Artificial Intelligence

How AdaSparse Boosts Multi‑Scenario CTR Prediction with Adaptive Sparse Networks

AdaSparse introduces an adaptive sparse network that learns a dedicated sub‑network for each advertising scenario, balancing shared and specific knowledge while keeping computational cost low, and achieves +4.63% CTR and -3.82% CPC improvements in Alibaba’s external ad system, as validated on both public and massive production datasets.

Advertising · CTR prediction · Deep Learning
0 likes · 20 min read
DaTaobao Tech
Apr 28, 2023 · Artificial Intelligence

Multi-Scenario Recommendation Model

The paper introduces SASS, a scenario-adaptive self-supervised recommendation model that uses contrastive pre-training and multi-layer gating to expand global samples and transfer scene-aware parameters, enabling a single model to deliver personalized recommendations across diverse Taobao ‘SuoSuo’ scenarios while mitigating data sparsity and cross-domain challenges.

AI · Deep Learning · Recommendation Systems
0 likes · 23 min read
21CTO
Apr 27, 2023 · Artificial Intelligence

Demystifying Transformers: A Step‑by‑Step Guide to Self‑Attention and Architecture

This article explains the Transformer model—from its encoder‑decoder structure and self‑attention mechanism to multi‑head attention, positional encoding, residual connections, training loss, and inference strategies—providing a clear, visual walkthrough for readers new to modern NLP architectures.

Deep Learning · Self-Attention · Transformer
0 likes · 21 min read
High Availability Architecture
Apr 27, 2023 · Artificial Intelligence

Design and Optimization of Bilibili's Large‑Scale Video Duplicate Detection System

This article describes the design, algorithmic improvements, and engineering performance optimizations of Bilibili's massive video duplicate detection (collision) system, covering challenges of low‑edit‑degree reposts, two‑stage retrieval, self‑supervised feature extraction, GPU‑accelerated preprocessing, and the resulting gains in accuracy and throughput.

Bilibili · Deep Learning · feature extraction
0 likes · 17 min read
DevOps
Apr 25, 2023 · Artificial Intelligence

The Bitter Lesson: Why Brute‑Force Computation Outperforms Hand‑Crafted Knowledge in AI

Richard Sutton’s “The Bitter Lesson” argues that over the past seven decades the most powerful driver of AI progress has been general‑purpose compute and large‑scale search, which consistently surpasses methods that rely on human‑engineered knowledge across domains such as chess, Go, speech recognition, and computer vision.

AI · Deep Learning · brute force
0 likes · 7 min read
Bilibili Tech
Apr 21, 2023 · Artificial Intelligence

Design and Optimization of Bilibili's Large-Scale Video Duplicate Detection System

Bilibili built a massive video‑duplicate detection platform that trains a self‑supervised ResNet‑50 feature extractor, removes black borders, and uses a two‑stage ANN‑plus‑segment‑level matching pipeline accelerated by custom GPU decoding and inference, boosting duplicate rejection 7.5×, recall 3.75×, and cutting manual misses from 65 to 5 per day.

Deep Learning · GPU Acceleration · feature extraction
0 likes · 19 min read
DataFunSummit
Apr 11, 2023 · Artificial Intelligence

OneFlow Coop: Joint Optimization of Dynamic‑Graph Recomputation and Memory Allocation

This article introduces OneFlow Coop, a memory‑optimization technique that jointly optimizes dynamic‑graph recomputation strategies and GPU memory allocation by analyzing existing DTR limitations, proposing recomputable in‑place, op‑guided tensor allocation, and layout‑aware eviction modules, and demonstrating superior experimental results.

Deep Learning · Dynamic Graph · GPU Memory
0 likes · 18 min read
Programmer DD
Apr 10, 2023 · Artificial Intelligence

Why ChatGPT Sparks Panic and What Its Real Technical Foundations Are

In this talk, AI expert Wu Jun explains why ChatGPT has caused widespread fear, traces the historical development of language models from the 1970s to today, clarifies the massive computational and data requirements, and discusses the real impact and opportunities of large‑scale AI systems.

AI hype · ChatGPT · Deep Learning
0 likes · 20 min read
DataFunTalk
Apr 3, 2023 · Artificial Intelligence

Implementing RNN, LSTM, and GRU with PyTorch

This article introduces the basic architectures of recurrent neural networks (RNN), LSTM, and GRU, explains the PyTorch APIs nn.RNN, nn.LSTM, and nn.GRU and their parameters, demonstrates code examples for building and testing these models, and provides practical insights for deep learning practitioners.

Deep Learning · GRU · LSTM
0 likes · 9 min read
DataFunTalk
Apr 1, 2023 · Artificial Intelligence

Nvidia Meets OpenAI: Highlights from the GTC Fireside Chat on GPT‑4, Deep Learning History, and the Future of AI

In a GTC fireside chat, Nvidia CEO Jensen Huang and OpenAI co‑founder Ilya Sutskever discuss GPT‑4's multimodal advances, the evolution of deep learning from early neural networks to large‑scale models, the pivotal role of GPUs and datasets like ImageNet, and their vision for more reliable, scalable artificial intelligence.

Deep Learning · GPT-4 · Neural Networks
0 likes · 10 min read
21CTO
Mar 31, 2023 · Artificial Intelligence

From Student to AI Pioneer: Ilya Sutskever’s Journey Behind ChatGPT

This article chronicles Ilya Sutskever’s two‑decade rise from a young researcher to a leading figure in artificial intelligence, highlighting his early mentorship, breakthroughs in image recognition, language translation, the founding of OpenAI, and the development of GPT and DALL‑E models.

AI research · Deep Learning · GPT
0 likes · 13 min read
DataFunTalk
Mar 24, 2023 · Artificial Intelligence

Deep UPLIFT Modeling: Techniques, Challenges, and FinTech Applications

This article provides a comprehensive overview of deep UPLIFT models, covering their fundamentals, key technical challenges such as confounding bias and inductive bias, the evolution of meta‑learner and deep architectures, and practical case studies in financial technology marketing.

Deep Learning · FinTech · Marketing Optimization
0 likes · 14 min read
Python Programming Learning Circle
Mar 22, 2023 · Artificial Intelligence

Overview of PyTorch 2.0 Features and New APIs

The article provides a detailed overview of PyTorch 2.0, highlighting its stable and beta features such as torch.compile, accelerated transformers, MPS backend, new quantization support, and prototype parallelism tools, while emphasizing performance improvements for dynamic shapes, distributed training, and CPU/GPU inference.

AI · Accelerated Transformers · Deep Learning
0 likes · 6 min read