Tag: deep learning

Architects' Tech Alliance
Jun 15, 2025 · Fundamentals

Master GPU Fundamentals: Architecture, Performance, and Programming Insights

This comprehensive guide covers GPU definitions, evolution, core components, architectural designs, performance metrics, programming models, deep‑learning applications, comparisons with other processors, practical use cases, optimization techniques, and future trends, providing a solid foundation for anyone interested in modern graphics and compute acceleration.
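
As a quick, hands-on taste of the performance topics the guide covers, the sketch below times the same matrix multiplication on CPU and GPU with PyTorch; the matrix size, iteration count, and the use of PyTorch itself are illustrative choices, not taken from the article.

```python
import time
import torch

def time_matmul(device: str, n: int = 2048, iters: int = 10) -> float:
    """Average seconds per n x n matrix multiplication on the given device."""
    a = torch.randn(n, n, device=device)
    b = torch.randn(n, n, device=device)
    _ = a @ b  # warm-up so one-time setup cost is not measured
    if device == "cuda":
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        _ = a @ b
    if device == "cuda":
        torch.cuda.synchronize()  # wait for queued GPU kernels to finish
    return (time.perf_counter() - start) / iters

print(f"CPU: {time_matmul('cpu'):.4f} s/iter")
if torch.cuda.is_available():
    print(f"GPU: {time_matmul('cuda'):.4f} s/iter")
```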

Computer Architecture · GPU · Hardware
43 min read
Zhihu Tech Column
Jun 11, 2025 · Artificial Intelligence

How Minute‑Level Time Decay Boosts User Retention Modeling in Recommendation Systems

This article presents a novel minute‑level future‑reward framework with dual‑delay incentives, activity‑based attribution, multi‑task delayed modeling, and sequential streaming training that dramatically improves user retention prediction accuracy and real‑time performance in large‑scale recommendation platforms.
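
The framework itself is proprietary, but the basic idea of minute-level time decay, down-weighting a future reward by how many minutes separate it from the triggering behavior, can be sketched with a simple exponential decay; the half-life below is an invented placeholder, not a value from the article.

```python
def decay_weight(elapsed_minutes: float, half_life_minutes: float = 30.0) -> float:
    """Exponential time decay: 1.0 right now, 0.5 after one half-life."""
    # half_life_minutes is a hypothetical placeholder, not the article's value.
    return 0.5 ** (elapsed_minutes / half_life_minutes)

# Credit attributed to a behavior observed 45 minutes before the return event.
print(f"decayed reward weight: {decay_weight(45.0):.3f}")  # ~0.354
```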

deep learning · multi‑task modeling · real‑time prediction
17 min read
Kuaishou Audio & Video Technology
Jun 11, 2025 · Artificial Intelligence

Kuaishou Showcases 12 Cutting-Edge CVPR 2025 Papers on Video Generation and AI

Kuaishou presented twelve peer‑reviewed papers at CVPR 2025 covering video quality assessment, large‑scale video datasets, dynamic 3D avatar reconstruction, 4D scene simulation, controllable video generation, scaling laws for diffusion transformers, multimodal foundations, and more, highlighting the company's leading research in computer vision and AI.

AI research · CVPR2025 · Computer Vision
21 min read
DaTaobao Tech
May 16, 2025 · Artificial Intelligence

JianYi: AI‑Powered Image Segmentation and Matting System for Taobao Home‑Decoration

The article introduces JianYi, a self‑developed image segmentation and matting system for Taobao's home‑decoration business that supports product, human, and panoramic segmentation with multi‑modal interaction, achieving high‑precision real‑time performance and powering AI tools such as "Jiazuo" and "Fang Wo Jia".

Artificial Intelligence · Computer Vision · deep learning
11 min read
Amap Tech
May 8, 2025 · Artificial Intelligence

FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis

FantasyTalking generates high-fidelity, coherent talking portraits from a single static image through a two-stage audio-visual alignment: segment-level global motion synthesis followed by frame-level lip refinement. Combined with face-centric cross-attention for identity preservation and a motion-intensity module that lets users control expression and body movement, the method achieves superior realism, synchronization, and overall performance compared with prior methods.
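
The paper's exact face-centric cross-attention is not reproduced here, but the underlying mechanism, letting frame features attend to identity features extracted from the reference portrait, can be sketched with plain scaled dot-product attention; learned query/key/value projections are omitted and all shapes are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def cross_attention(frame_feats: torch.Tensor, face_feats: torch.Tensor) -> torch.Tensor:
    """Frame tokens (queries) attend to reference-face tokens (keys/values)."""
    d_k = face_feats.shape[-1]
    scores = frame_feats @ face_feats.transpose(-2, -1) / d_k ** 0.5
    return F.softmax(scores, dim=-1) @ face_feats  # identity-aware frame features

frames = torch.randn(16, 64)  # 16 frame tokens of dim 64 (illustrative shapes)
face = torch.randn(4, 64)     # 4 identity tokens from the source portrait
print(cross_attention(frames, face).shape)  # torch.Size([16, 64])
```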

audio-visual alignment · deep learning · identity preservation
10 min read
Architects' Tech Alliance
May 6, 2025 · Artificial Intelligence

Evolution of NVIDIA GPU Architectures for AI from Volta to Blackwell

The article reviews NVIDIA's GPU architecture progression—from Volta's pioneering Tensor Cores through Turing, Ampere, Hopper, and the latest Blackwell and Rubin designs—highlighting key innovations, performance gains for deep learning, and related resource updates for AI practitioners.

Artificial Intelligence · GPU architecture · High Performance Computing
9 min read
IT Services Circle
May 2, 2025 · Artificial Intelligence

Understanding Gradient Vanishing in Deep Neural Networks and How to Mitigate It

The article explains why deep networks suffer from gradient vanishing—especially when using sigmoid or tanh activations—covers the underlying mathematics, compares activation functions, and presents practical techniques such as proper weight initialization, batch normalization, residual connections, and code examples to visualize the phenomenon.
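
A quick way to see the phenomenon the article visualizes is to backpropagate through a deep stack of sigmoid layers and compare the first layer's gradient norm against a ReLU stack; the depth and widths below are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn

def first_layer_grad_norm(activation: nn.Module, depth: int = 20) -> float:
    """Gradient norm at the first layer after one backward pass through `depth` blocks."""
    layers = []
    for _ in range(depth):
        layers += [nn.Linear(32, 32), activation]
    net = nn.Sequential(*layers)
    net(torch.randn(8, 32)).sum().backward()
    return net[0].weight.grad.norm().item()

torch.manual_seed(0)
print(f"sigmoid: {first_layer_grad_norm(nn.Sigmoid()):.2e}")  # vanishingly small
print(f"relu:    {first_layer_grad_norm(nn.ReLU()):.2e}")     # orders of magnitude larger
```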

ResNet · activation functions · batch normalization
7 min read
JD Tech
Apr 30, 2025 · Artificial Intelligence

TimeHF: A Billion‑Scale Time Series Forecasting Model Guided by Human Feedback

The JD Supply Chain algorithm team introduces TimeHF, a billion‑parameter time‑series large model that leverages RLHF to boost demand‑forecast accuracy by over 10%, detailing dataset construction, the PCTLM architecture, a custom RLHF framework (TPO), and extensive SOTA experimental results.

Big Data · RLHF · deep learning
10 min read
Didi Tech
Apr 24, 2025 · Artificial Intelligence

Algorithmic Foundations and Evolution of Natural Language Processing

This installment of the Algorithmic Foundations of Engineering R&D series traces NLP's evolution from rule-based systems to today's multimodal large-model era, reviewing core machine-learning and deep-learning techniques, transformer breakthroughs, representation learning, optimization methods, and emerging research such as retrieval-augmented generation and AI agents.

AI · NLP · Transformer
43 min read
Cognitive Technology Team
Apr 12, 2025 · Artificial Intelligence

Analyzing a Trained Neural Network: Visualizing Hidden Layers and Understanding Its Limitations

This article walks through an interactive exploration of a simple two‑hidden‑layer neural network, showing how real‑time visualizations reveal its learned representations, accuracy limits, and why constrained training leads to over‑confident yet unintelligent predictions before introducing backpropagation.
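
For readers who want to attempt a similar inspection in code, PyTorch forward hooks can capture hidden-layer activations for plotting; the two-hidden-layer structure below mirrors the article's description, but the layer sizes and the use of PyTorch are assumptions.

```python
import torch
import torch.nn as nn

# A two-hidden-layer network as described; the 784-16-16-10 sizes are assumptions.
net = nn.Sequential(
    nn.Linear(784, 16), nn.Sigmoid(),  # hidden layer 1
    nn.Linear(16, 16), nn.Sigmoid(),   # hidden layer 2
    nn.Linear(16, 10),                 # output scores
)

activations = {}

def save_activation(name):
    def hook(module, inputs, output):
        activations[name] = output.detach()  # stash for later visualization
    return hook

net[1].register_forward_hook(save_activation("hidden1"))
net[3].register_forward_hook(save_activation("hidden2"))

net(torch.randn(1, 784))  # one fake flattened 28x28 image
for name, act in activations.items():
    print(name, tuple(act.shape), f"mean={act.mean():.3f}")
```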

backpropagation · deep learning · hidden layers
10 min read
JD Tech Talk
Apr 11, 2025 · Artificial Intelligence

A Billion-Scale Pure Time Series Large Model: PCTLM with SFT and TPO for Forecasting

This article presents a pioneering billion‑parameter pure time‑series large model (PCTLM) trained on a 1.5‑billion‑sample dataset, introduces a novel RLHF framework (TPO) for time‑series forecasting, and demonstrates state‑of‑the‑art performance across multiple public benchmarks, surpassing existing models such as GPT4TS.

Big Data · PCTLM · RLHF
11 min read
Cognitive Technology Team
Apr 9, 2025 · Artificial Intelligence

How Neural Networks Learn: Gradient Descent and Loss Functions

This article explains how neural networks learn by using labeled training data, describing the role of weights, biases, activation functions, and how gradient descent iteratively adjusts parameters to minimize loss, illustrated with the MNIST digit‑recognition example.
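
The mechanics described, computing a loss on labeled data and nudging parameters against the gradient, fit in a few lines; the sketch below fits a one-weight linear model by gradient descent on mean squared error, with made-up data and learning rate rather than the article's MNIST setup.

```python
import numpy as np

# Toy labeled data: y = 2x + 1 plus noise (invented, not MNIST).
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)
y = 2 * x + 1 + 0.1 * rng.standard_normal(100)

w, b, lr = 0.0, 0.0, 0.1  # initial weight, bias, learning rate
for _ in range(200):
    err = (w * x + b) - y
    loss = (err ** 2).mean()        # mean squared error
    grad_w = 2 * (err * x).mean()   # dLoss/dw
    grad_b = 2 * err.mean()         # dLoss/db
    w -= lr * grad_w                # step against the gradient
    b -= lr * grad_b

print(f"w={w:.2f}, b={b:.2f}, final loss={loss:.4f}")  # w near 2, b near 1
```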

MNIST · deep learning · gradient descent
16 min read
JD Tech
Apr 8, 2025 · Artificial Intelligence

MaRCA: Multi‑Agent Reinforcement Learning Computation Allocation for Full‑Chain Advertising Systems

The article presents MaRCA, a multi‑agent reinforcement learning framework that models user value, compute consumption, and action reward to allocate limited computation resources across the entire advertising recommendation pipeline, achieving higher ad revenue while keeping system load stable under fluctuating traffic and diverse request values.

Advertising Systems · Multi-Agent Reinforcement Learning · computation allocation
16 min read
Cognitive Technology Team
Apr 8, 2025 · Artificial Intelligence

Understanding Neural Networks: Structure, Layers, and Activation

This article explains how a simple neural network can recognize handwritten digits by preprocessing images, organizing neurons into input, hidden, and output layers, using weighted sums, biases, sigmoid compression, and matrix multiplication to illustrate the fundamentals of deep learning.
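
The named building blocks, weighted sums, biases, sigmoid compression, and matrix multiplication, reduce to one line of NumPy per layer; the 784-16-16-10 layer sizes below follow the classic MNIST example, and the random weights are placeholders for what training would learn.

```python
import numpy as np

def sigmoid(z):
    """Compress a weighted sum into the (0, 1) range."""
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
# Random placeholder weights; a trained network would have learned these.
W1, b1 = 0.01 * rng.standard_normal((16, 784)), np.zeros(16)
W2, b2 = 0.01 * rng.standard_normal((16, 16)), np.zeros(16)
W3, b3 = 0.01 * rng.standard_normal((10, 16)), np.zeros(10)

x = rng.random(784)            # one flattened 28x28 image (fake pixels)
a1 = sigmoid(W1 @ x + b1)      # hidden layer 1: weighted sum + bias, squashed
a2 = sigmoid(W2 @ a1 + b2)     # hidden layer 2
scores = W3 @ a2 + b3          # output layer: one score per digit
print("predicted digit:", int(np.argmax(scores)))
```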

Layers · activation functions · deep learning
16 min read
Python Programming Learning Circle
Apr 3, 2025 · Artificial Intelligence

Accelerating PyTorch Model Training: Techniques, Benchmarks, and Code

This article explains how to dramatically speed up PyTorch model training using code optimizations, mixed‑precision, torch.compile, distributed data parallelism, and DeepSpeed, presenting benchmark results that show up to 11.5× acceleration on multiple GPUs while maintaining high accuracy.
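
Two of the benchmarked techniques, automatic mixed precision and torch.compile, take only a few added lines around an ordinary training step; the model, data, and hyperparameters below are placeholders, not the article's benchmark setup.

```python
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
use_amp = device == "cuda"

model = torch.compile(nn.Linear(512, 10).to(device))  # JIT-compile the model
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(64, 512, device=device)         # placeholder batch
y = torch.randint(0, 10, (64,), device=device)  # placeholder labels

opt.zero_grad()
with torch.autocast(device_type=device, enabled=use_amp):
    loss = loss_fn(model(x), y)   # forward pass runs in float16 where safe
scaler.scale(loss).backward()     # scale loss so fp16 gradients don't underflow
scaler.step(opt)
scaler.update()
```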

DeepSpeed · GPU · Mixed Precision
6 min read
Baidu Tech Salon
Apr 2, 2025 · Artificial Intelligence

PaddlePaddle Framework 3.0 Released: Five Core Innovations for Large Models and Scientific Computing

PaddlePaddle 3.0, launched on April 1, 2025, introduces five core innovations: dynamic-static unified automatic parallelism, a training-inference integrated PIR, high-order automatic differentiation for scientific computing, a one-stage CINN compiler, and heterogeneous multi-chip adaptation. Together these dramatically reduce distributed-training code, boost performance by up to four-fold, and extend the framework to aerospace, automotive, meteorology, and life-science applications, all while remaining fully compatible with the 2.0 API.

Large Models · Neural Network Compiler · PaddlePaddle
21 min read
Cognitive Technology Team
Mar 31, 2025 · Artificial Intelligence

Recommendation Algorithms: Using Mathematical Methods for Efficient Information Matching

Recommendation algorithms, rooted in machine learning and deep learning, transform massive user‑generated data into mathematical models that filter and personalize content, covering traditional collaborative filtering, matrix factorization, cosine similarity, and modern deep models such as Wide & Deep and Two‑Tower retrieval, illustrating their evolution and practical applications.
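
Cosine similarity, one of the classical tools the article covers, scores how aligned two rating vectors are regardless of their magnitude; the tiny user-item rating matrix below is invented purely for illustration.

```python
import numpy as np

def cosine_sim(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors, in [-1, 1]."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Rows are users, columns are items; entries are ratings (0 = unrated).
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
])

# Compare items by their rating columns: items 0 and 1 are co-liked.
print(f"{cosine_sim(ratings[:, 0], ratings[:, 1]):.2f}")  # high, ~0.96
print(f"{cosine_sim(ratings[:, 0], ratings[:, 2]):.2f}")  # low, ~0.27
```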

collaborative filtering · deep learning · machine learning
14 min read
Cognitive Technology Team
Mar 31, 2025 · Artificial Intelligence

Understanding Douyin's Recommendation Algorithm: From Behavior Prediction to Value Modeling

The article explains how Douyin's recommendation system uses machine‑learning and deep‑learning models to predict user actions, assign value weights, and dynamically adjust scores, highlighting both its efficiency in large‑scale content distribution and its inherent limitations compared to human understanding.
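
The value-weighting step described, combining predicted action probabilities into a single ranking score, reduces to a weighted sum; both the predictions and the weights below are invented placeholders, not Douyin's actual model outputs or business weights.

```python
# Hypothetical predicted probabilities for one candidate video.
predictions = {"finish": 0.62, "like": 0.08, "comment": 0.02, "share": 0.01}

# Hypothetical business value weight per action, tuned by the platform.
value_weights = {"finish": 1.0, "like": 2.0, "comment": 4.0, "share": 6.0}

score = sum(value_weights[a] * p for a, p in predictions.items())
print(f"ranking score: {score:.2f}")  # 0.62 + 0.16 + 0.08 + 0.06 = 0.92
```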

AI · deep learning · machine learning
7 min read
Architects' Tech Alliance
Mar 28, 2025 · Artificial Intelligence

Evolution of NVIDIA GPU Architectures for Deep Learning: From Volta to Blackwell and Rubin

The article traces NVIDIA’s GPU architecture evolution from the Volta era’s pioneering Tensor Cores through Turing, Ampere, Hopper, and the latest Blackwell and Rubin designs, highlighting key innovations such as mixed‑precision support, sparsity, NVLink, and their impact on deep‑learning performance.

AI hardware · GPU · Nvidia
10 min read
JD Retail Technology
Mar 18, 2025 · Artificial Intelligence

Multi‑Agent Reinforcement Learning Based Full‑Chain Computation Allocation (MaRCA) for Advertising Systems

MaRCA, a multi‑agent reinforcement‑learning framework, allocates compute across JD's full advertising delivery chain by jointly estimating user value, resource consumption, and action outcomes while dynamically adjusting to real‑time load, achieving roughly 15% higher ad revenue without additional compute resources.

advertising · compute scheduling · deep learning
18 min read