Tagged articles

Adam

4 articles · Page 1 of 1

Aug 2, 2025 · Artificial Intelligence

Deep Learning Optimizers Demystified: Momentum, AdaGrad, RMSProp & Adam Explained

This article breaks down the core deep‑learning optimizers (gradient descent, Momentum, AdaGrad, RMSProp and Adam), showing why vanilla gradient descent converges slowly, how each method draws on the history of past gradients to accelerate training (exponential moving averages in Momentum, RMSProp and Adam; a running sum of squared gradients in AdaGrad), and why Adam is generally the preferred choice.

AdaGrad · Adam · Deep Learning

0 likes · 8 min read

Rare Earth Juejin Tech Community

Jul 26, 2023 · Artificial Intelligence

Building and Training a Fully Connected Neural Network for Fashion-MNIST Classification with PyTorch

This tutorial demonstrates how to download the Fashion‑MNIST dataset, build a four‑layer fully connected neural network with PyTorch, and train it using loss functions, the Adam optimizer, learning‑rate strategies, and Dropout to achieve high‑accuracy multi‑class image classification.

Adam · Deep Learning · Dropout

0 likes · 17 min read

Code DAO

Dec 6, 2021 · Artificial Intelligence

Why So Many Optimizers? Core Algorithms Behind Neural Network Training

This article explains the fundamental gradient‑descent optimizers used in neural networks—SGD, Momentum, RMSProp, Adam and their variants—illustrates loss‑surface challenges such as local minima, saddle points and ravines, and shows how techniques like mini‑batching, momentum, adaptive learning rates and scheduling address these issues.

Adam · Deep Learning · Momentum

0 likes · 11 min read

Hulu Beijing

Jan 4, 2018 · Artificial Intelligence

Why SGD Fails and How Momentum, AdaGrad, and Adam Fix It

This article explains why vanilla Stochastic Gradient Descent often struggles in deep learning, describes the challenges of valleys and saddle points, and introduces three major SGD variants—Momentum, AdaGrad, and Adam—detailing their motivations, update rules, and advantages.

AdaGrad · Adam · Momentum

0 likes · 13 min read
