Tagged articles

Residual Networks

4 articles
Data Party THU
Apr 3, 2026 · Artificial Intelligence

Can Attention Replace Residuals? Inside the New Attention Residuals Breakthrough

The article reviews the Kimi team's Attention Residuals approach, which replaces traditional ResNet additive shortcuts with learned attention‑based weighting. It explains the theoretical motivation linking depth to time, details the full‑attention and block‑wise implementations, and presents experimental results showing up to 1.25× compute efficiency and improved performance on reasoning and knowledge tasks.

Attention Mechanism · Deep Learning · Model Efficiency
0 likes · 11 min read
Python Programming Learning Circle
Jul 6, 2021 · Artificial Intelligence

Understanding ResNet and Building It from Scratch with PyTorch

This article explains the motivation behind residual networks, describes the architecture of ResNet including residual blocks and skip connections, lists available Keras implementations, and provides a step‑by‑step PyTorch tutorial with complete code to construct and test ResNet‑50/101/152 models.

CNN · Deep Learning · PyTorch
0 likes · 10 min read
DataFunTalk
Dec 25, 2019 · Artificial Intelligence

Exploring Depth in Graph Convolutional Networks (GCN): Architecture, Experiments, and Future Work

This article examines the challenges of deepening Graph Convolutional Networks (GCN), introduces ResGCN, DenseGCN, and skip‑neighbor designs to enable deeper architectures, presents experimental results showing improved performance with 28‑layer models, and outlines future research directions.

Deep Learning · GCN · Graph Neural Networks
0 likes · 7 min read
Alibaba Cloud Developer
Aug 26, 2016 · Artificial Intelligence

ICML Tutorial Highlights: Deep Residual Nets, Stochastic Gradient, Deep RL

At the ICML pre‑conference tutorial, experts presented deep residual networks, stochastic gradient methods for large‑scale learning, and deep reinforcement learning, highlighting architectural innovations, optimization theory, noise‑reduction techniques, and practical considerations for building scalable, high‑performance AI models.

Deep Learning · Residual Networks · Stochastic Gradient
0 likes · 14 min read