Tagged articles

SGD

6 articles · Page 1 of 1

May 16, 2022 · Artificial Intelligence

How to Build a Simple Neural Network from Scratch with NumPy

This article walks through implementing a basic multi‑layer neural network using only NumPy, covering terminology, network architecture, forward and backward propagation, activation functions, loss calculation, and parameter updates with SGD, and then compares the custom model with a Keras implementation.

Backpropagation · Neural Network · NumPy

0 likes · 17 min read

Code DAO

Dec 6, 2021 · Artificial Intelligence

Why So Many Optimizers? Core Algorithms Behind Neural Network Training

This article explains the fundamental gradient‑descent optimizers used in neural networks (SGD, Momentum, RMSProp, Adam, and their variants), illustrates loss‑surface challenges such as local minima, saddle points, and ravines, and shows how mini‑batching, momentum, adaptive learning rates, and learning‑rate scheduling address them.

Adam · Deep Learning · Momentum

0 likes · 11 min read

360 Tech Engineering

Sep 16, 2019 · Artificial Intelligence

Backpropagation Algorithm for Fully Connected Neural Networks with Python Implementation

This article explains the backpropagation training algorithm for fully connected artificial neural networks, covering its gradient‑descent basis, mathematical derivation, and matrix formulation, and provides a complete Python implementation with mini‑batch stochastic gradient descent, momentum, and learning‑rate decay, along with experimental results.

Backpropagation · Mini-Batch · Neural Network

0 likes · 14 min read

Hulu Beijing

Jan 4, 2018 · Artificial Intelligence

Why SGD Fails and How Momentum, AdaGrad, and Adam Fix It

This article explains why vanilla Stochastic Gradient Descent often struggles in deep learning, describes the challenges of valleys and saddle points, and introduces three major SGD variants—Momentum, AdaGrad, and Adam—detailing their motivations, update rules, and advantages.

AdaGrad · Adam · Momentum

0 likes · 13 min read

21CTO

Aug 21, 2015 · Artificial Intelligence

How Facebook Scales Recommendations with Distributed Machine Learning and Giraph

This article explains how Facebook handles recommendation data at massive scale (over 100 billion ratings) using distributed collaborative filtering, matrix factorization, a hybrid of the SGD and ALS algorithms, and a novel worker‑to‑worker communication scheme built on Apache Giraph to achieve high performance and scalability.

ALS · Apache Giraph · Facebook

0 likes · 9 min read

Art of Distributed System Architecture Design

Aug 21, 2015 · Artificial Intelligence

Facebook’s Distributed Recommendation System: Architecture, Algorithms, and Performance

This article explains how Facebook built a large‑scale distributed recommendation system using Apache Giraph, collaborative filtering with matrix factorization, the SGD and ALS algorithms, a novel worker‑to‑worker communication scheme, and performance optimizations that achieve ten‑fold speedups on billions of ratings.

ALS · Apache Giraph · Facebook

0 likes · 9 min read