Understanding Gradient Descent: Basics, Advantages, and Limitations
This article explains the fundamental principle of gradient descent as the steepest‑descent optimization method, derives its direction using Taylor expansion and the Cauchy‑Schwarz inequality, illustrates why it can be slow on functions like Rosenbrock, and discusses its advantages and convergence properties.
Optimization is at the core of modern deep learning, and gradient descent is one of the oldest and simplest algorithms for unconstrained optimization.
At each iteration, gradient descent moves along the negative gradient, the direction in which the objective function decreases fastest locally.
Derivation: a first‑order Taylor expansion at the point x with step size α > 0 gives f(x + αd) ≈ f(x) + α∇f(x)^T d, so f decreases for small α whenever ∇f(x)^T d < 0. The choice d = -∇f(x) satisfies this condition whenever the gradient is nonzero.
By the Cauchy‑Schwarz inequality, |∇f(x)^T d| ≤ ‖∇f(x)‖ ‖d‖, with equality exactly when the two vectors are collinear. For a direction d of fixed length, the largest first‑order decrease is therefore achieved when d points along -∇f(x).
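The two steps of the derivation can be written out as follows. This is a sketch under stated assumptions: the direction is normalized to unit Euclidean length, the gradient is nonzero, and the iteration index k and step size α_k are notation introduced here for illustration.

```latex
\begin{align*}
f(x + \alpha d) &= f(x) + \alpha\,\nabla f(x)^{\top} d + o(\alpha), \qquad \alpha > 0, \\
\bigl|\nabla f(x)^{\top} d\bigr| &\le \|\nabla f(x)\|\,\|d\|
  \qquad \text{(Cauchy--Schwarz, equality iff $d$ and $\nabla f(x)$ are collinear)}, \\
\min_{\|d\| = 1} \nabla f(x)^{\top} d &= -\|\nabla f(x)\|
  \qquad \text{attained at } d = -\frac{\nabla f(x)}{\|\nabla f(x)\|}, \\
x_{k+1} &= x_k - \alpha_k\,\nabla f(x_k).
\end{align*}
```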
Despite the name, the direction is “steepest” only locally. Over a whole run the method can converge very slowly, especially on ill‑conditioned functions such as the Rosenbrock (banana) function, whose minimum lies in a long, narrow, curved valley.
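The slow progress on the Rosenbrock function can be reproduced with a few lines of plain Python. This is a minimal sketch, not the article's own experiment: it assumes the standard form f(x, y) = (1 - x)^2 + 100 (y - x^2)^2 with minimum at (1, 1), a fixed step size of 1e-3 instead of a line search, and the conventional starting point (-1.2, 1). The function names are illustrative.

```python
import math


def rosenbrock(x, y):
    """Standard Rosenbrock function; its minimum is f(1, 1) = 0."""
    return (1.0 - x) ** 2 + 100.0 * (y - x * x) ** 2


def rosenbrock_grad(x, y):
    """Analytic gradient of the Rosenbrock function."""
    dx = -2.0 * (1.0 - x) - 400.0 * x * (y - x * x)
    dy = 200.0 * (y - x * x)
    return dx, dy


def gradient_descent(x, y, step, iterations):
    """Fixed-step gradient descent (assumption: no line search is used)."""
    for _ in range(iterations):
        gx, gy = rosenbrock_grad(x, y)
        x, y = x - step * gx, y - step * gy
    return x, y


start = (-1.2, 1.0)   # conventional starting point, assumed here
step = 1e-3           # assumed fixed step size

for n in (10, 100, 1000):
    x, y = gradient_descent(start[0], start[1], step, n)
    dist = math.hypot(x - 1.0, y - 1.0)
    print(f"{n:5d} iterations: f = {rosenbrock(x, y):.6f}, "
          f"distance to (1, 1) = {dist:.4f}")
```

The objective value drops quickly in the first few iterations as the iterate falls into the valley, after which the distance to the minimum shrinks only slowly.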
The zig‑zag trajectory shown in the figures illustrates this behavior: with an exact line search, each search direction is orthogonal to the previous one, and the steps become shorter and shorter as the iterates approach the minimum.
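The orthogonality of successive directions can be checked numerically. The sketch below assumes a simple quadratic test problem, f(x) = 0.5 (x1^2 + 10 x2^2), chosen here because the exact line-search step has the closed form α = (g·g) / (g·Ag); the problem and all names are illustrative.

```python
# Assumed test problem: f(x) = 0.5 * (x1^2 + 10 * x2^2), i.e. A = diag(1, 10).
DIAG = (1.0, 10.0)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def grad(x):
    """Gradient A x of the quadratic."""
    return [a * xi for a, xi in zip(DIAG, x)]


def exact_line_search_step(x):
    """One gradient step with the exact minimizing step size for a quadratic."""
    g = grad(x)
    Ag = [a * gi for a, gi in zip(DIAG, g)]
    alpha = dot(g, g) / dot(g, Ag)
    return [xi - alpha * gi for xi, gi in zip(x, g)]


x = [10.0, 1.0]
inner_products = []
for _ in range(5):
    x_next = exact_line_search_step(x)
    # Inner product of the old and new gradients (the two search directions).
    inner_products.append(dot(grad(x), grad(x_next)))
    x = x_next

print(inner_products)
```

Each printed inner product is zero up to floating-point rounding, which is the right-angle turn visible in a zig‑zag plot.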
Advantages: it is simple to implement, each iteration is cheap, it places no special requirements on the initial point, and it supplies the search direction used by many more advanced algorithms.
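Convergence: with exact line search, gradient descent is globally convergent, meaning that under standard smoothness assumptions it approaches a stationary point from any starting point. The convergence rate, however, is only linear.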
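The linear rate can be observed on a small example. The sketch below assumes the quadratic f(x) = 0.5 (x1^2 + κ x2^2) with condition number κ = 10 and the starting point (κ, 1), for which the classical bound f(x_{k+1}) ≤ ((κ - 1) / (κ + 1))^2 f(x_k) for strictly convex quadratics holds with equality. The problem, the starting point, and the names are choices made here for illustration.

```python
KAPPA = 10.0  # assumed condition number of the quadratic


def f(x):
    """f(x) = 0.5 * (x1^2 + KAPPA * x2^2); the minimum value is 0 at the origin."""
    return 0.5 * (x[0] ** 2 + KAPPA * x[1] ** 2)


def exact_line_search_step(x):
    """One gradient step with the exact step size alpha = (g.g) / (g.Ag)."""
    g = (x[0], KAPPA * x[1])
    gg = g[0] ** 2 + g[1] ** 2
    gAg = g[0] ** 2 + KAPPA * g[1] ** 2
    alpha = gg / gAg
    return (x[0] - alpha * g[0], x[1] - alpha * g[1])


x = (KAPPA, 1.0)  # worst-case starting point for this quadratic
ratios = []
for _ in range(8):
    x_next = exact_line_search_step(x)
    ratios.append(f(x_next) / f(x))  # per-iteration reduction factor
    x = x_next

bound = ((KAPPA - 1.0) / (KAPPA + 1.0)) ** 2
print("reduction factors:", [round(r, 6) for r in ratios])
print("classical bound:  ", round(bound, 6))
```

The error shrinks by the same constant factor at every iteration, and that factor moves toward 1 as κ grows, which is why ill‑conditioned problems converge slowly.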
References: Wikipedia pages on Cauchy‑Schwarz inequality, gradient descent, and Rosenbrock function.
360 Tech Engineering
Official tech channel of 360, building the most professional technology aggregation platform for the brand.
