
IT Services Circle
May 2, 2025 · Artificial Intelligence

Understanding Gradient Vanishing in Deep Neural Networks and How to Mitigate It

This article explains why deep networks suffer from vanishing gradients, especially with sigmoid or tanh activations. It covers the underlying mathematics, compares activation functions, and presents practical mitigations such as proper weight initialization, batch normalization, and residual connections, alongside code examples that visualize the phenomenon.
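As a quick taste of the phenomenon the article covers, here is a minimal NumPy sketch (layer count, width, and initialization scale are illustrative assumptions, not taken from the article) that tracks how the chain-rule factors contributed by a stack of sigmoid layers shrink the gradient signal. Since sigmoid'(z) = sigmoid(z)(1 - sigmoid(z)) is at most 0.25, each layer multiplies the gradient by a small factor, and the product collapses geometrically with depth:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
depth, width = 20, 32          # illustrative: 20 sigmoid layers, 32 units each
a = rng.normal(size=(1, width))

# Accumulate the chain-rule factors layer by layer (a forward-mode
# proxy for the backpropagated gradient magnitude).
grad = np.ones_like(a)
grad_norms = []
for _ in range(depth):
    W = rng.normal(scale=1.0 / np.sqrt(width), size=(width, width))
    a = sigmoid(a @ W)
    # Local derivative of sigmoid: a * (1 - a), bounded above by 0.25.
    grad = (grad @ W) * a * (1 - a)
    grad_norms.append(np.linalg.norm(grad))

# The gradient norm after the last layer is orders of magnitude
# smaller than after the first: the vanishing-gradient effect.
print(f"layer 1 norm:  {grad_norms[0]:.3e}")
print(f"layer {depth} norm: {grad_norms[-1]:.3e}")
```

Swapping `sigmoid` for ReLU (derivative 0 or 1) or adding a residual skip (`a = a + sigmoid(a @ W)`) largely removes this geometric decay, which is the intuition behind the mitigations the article discusses.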

Neural Networks · ResNet · activation functions
7 min read