Visualizing How Neural Networks Approximate Any Function
This article explains the universal approximation theorem: even a simple neural network with one hidden layer can approximate any continuous function by adjusting its weights and biases. Visual examples of step and bump functions illustrate the process, and the theory is linked to the recent Nobel recognition of neural-network research.
The latest Nobel Prizes in Physics and Chemistry were awarded to scientists whose work relies heavily on AI, particularly neural network algorithms.
Neural networks, the cornerstone of modern AI, possess the remarkable ability to approximate any function, a property known as the universal approximation theorem. In other words, given enough neurons and layers, a network can closely approximate the values of even the most complex functions.
Complex Function
Consider a complex function that combines polynomials, sine, cosine, and other components, making manual computation at each point cumbersome.
Simple Neural Network
To understand the approximation process, imagine a basic network with an input layer, one hidden layer containing two neurons, and an output layer.
<code>input (x) → hidden layer (h1, h2) → output (y)</code>
Each neuron performs a weighted sum of its inputs followed by a non-linear activation, typically the sigmoid function.
The mathematical formulation assumes an input x, weights w, and biases b for the connections from the input to the hidden layer and from the hidden layer to the output.
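This formulation can be sketched in a few lines of Python. The function and parameter names below are illustrative, not taken from the article:

```python
import math

def sigmoid(z):
    # Standard logistic activation: squashes any real z into (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def tiny_network(x, w1, b1, w2, b2, v1, v2, c):
    # Hidden layer: two sigmoid neurons, each computing a weighted
    # input plus a bias, passed through the activation.
    h1 = sigmoid(w1 * x + b1)
    h2 = sigmoid(w2 * x + b2)
    # Output layer: a weighted sum of the hidden activations plus a bias.
    return v1 * h1 + v2 * h2 + c
```

All of the network's expressive power comes from choosing these eight numbers; the rest of the article varies them one at a time.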
Approximating a Step Function
By visualizing the network’s behavior, we can see how it gradually approximates a step function.
Changing Bias
Adjusting the bias shifts the sigmoid curve left or right, altering the neuron’s response to different input values.
These bias adjustments enable neurons to respond differently across the input range.
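The shift can be checked numerically: for a sigmoid neuron computing sigmoid(w·x + b), the midpoint of the curve (where the output is 0.5) sits at x = −b/w, so changing b slides the curve along the x-axis. A minimal sketch, with illustrative values:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def neuron(x, w, b):
    return sigmoid(w * x + b)

# With weight w, the midpoint (output 0.5) sits at x = -b / w:
print(neuron(0, 1, 0))    # midpoint at the origin when b = 0
print(neuron(5, 1, -5))   # b = -5 moves the midpoint to x = 5
```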
Adjusting Weights
Increasing the weight makes the sigmoid curve steeper, approaching a step function; decreasing the weight yields a smoother curve.
When the weight is set very high, the neuron's output closely resembles a step function; such near-step outputs can then be combined to approximate more complex nonlinear functions.
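The sharpening effect is easy to verify. The weights below are illustrative; both neurons have their midpoint at x = 0.5 (since b = −0.5·w), and only the steepness differs:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def neuron(x, w, b):
    return sigmoid(w * x + b)

# A gentle weight gives a smooth ramp around the midpoint x = 0.5:
print(round(neuron(0.4, 5, -2.5), 3))    # well away from 0 or 1
# A large weight sharpens the same midpoint into a near-step:
print(round(neuron(0.4, 500, -250), 6))  # ~0 just left of the step
print(round(neuron(0.6, 500, -250), 6))  # ~1 just right of the step
```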
Approximating a Bump Function
Using two adjacent step functions, the network can generate a “bump function”. By tuning the weights and biases of multiple hidden neurons, each bump can be positioned and shaped precisely, allowing the network to approximate the target complex function piecewise.
Adding more hidden neurons creates additional bump functions, enhancing the overall approximation accuracy.
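A bump is simply the difference of two steep sigmoids: one rises at the bump's left edge, the other at its right edge, and subtracting them leaves a plateau in between. A minimal sketch with an illustrative sharpness constant:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def step(x, position, sharpness=500.0):
    # Steep sigmoid: ~0 left of `position`, ~1 right of it.
    return sigmoid(sharpness * (x - position))

def bump(x, left, right, height):
    # Subtracting a step that rises at `right` from one that rises
    # at `left` leaves a plateau of `height` on [left, right].
    return height * (step(x, left) - step(x, right))

print(bump(0.5, 0.4, 0.6, 2.0))  # ~2.0 inside the bump
print(bump(0.9, 0.4, 0.6, 2.0))  # ~0.0 outside it
```

In network terms, each bump costs two hidden neurons (the two steps) plus one output weight (the height).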
Improving Approximation Ability
Increasing the number of hidden neurons continuously strengthens the network’s ability to approximate the target function, eventually achieving arbitrary precision as guaranteed by the universal approximation theorem.
The theorem states that a neural network with a single hidden layer and sufficiently many neurons can approximate any continuous function, regardless of its complexity.
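This improvement can be demonstrated directly: tile [0, 1] with bumps, set each bump's height to the target's value at the subinterval midpoint, and watch the worst-case error shrink as bumps are added. The target function and sharpness below are illustrative stand-ins, not the article's example; the sigmoid is written in a numerically safe form so large negative inputs do not overflow:

```python
import math

def sigmoid(z):
    # Numerically safe logistic function for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def step(x, position, sharpness=2000.0):
    return sigmoid(sharpness * (x - position))

def target(x):
    # Illustrative "complex" target mixing trigonometric and polynomial terms.
    return math.sin(3 * x) + 0.5 * x * x

def approximate(x, n_bumps):
    # One bump per subinterval of [0, 1], height = target at the midpoint.
    width = 1.0 / n_bumps
    total = 0.0
    for i in range(n_bumps):
        left = i * width
        mid = left + width / 2
        total += target(mid) * (step(x, left) - step(x, left + width))
    return total

# More hidden neurons (more bumps) -> smaller worst-case error on [0, 1].
xs = [i / 200 for i in range(1, 200)]
for n in (5, 20, 80):
    err = max(abs(approximate(x, n) - target(x)) for x in xs)
    print(n, round(err, 4))
```

Each doubling of the neuron count roughly halves the worst-case error here, which is exactly the "arbitrary precision with enough neurons" behavior the theorem guarantees.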
Multiple Inputs and Outputs
In real applications, networks often handle many inputs and outputs, such as image classification where millions of pixel values map to a single label. Each input has its own weight, and the network learns to extract key patterns from the combined inputs.
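With multiple inputs, a neuron's weighted sum simply runs over all of them. A minimal sketch with a hypothetical four-"pixel" input (the values and weights are invented for illustration):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    # Each input has its own weight; the neuron sums the weighted
    # inputs, adds its bias, and squashes the result with a sigmoid.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

# The weight pattern determines which inputs the neuron responds to:
pixels = [0.0, 1.0, 1.0, 0.0]
weights = [0.1, 2.0, 2.0, 0.1]
print(neuron(pixels, weights, bias=-3.0))  # fires strongly only when the middle pixels light up
```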
Although the universal approximation theorem guarantees the existence of such networks, constructing efficient models in practice requires large datasets, optimization techniques, and sufficient computational resources.
The 2024 Nobel Prize in Physics was awarded to John J. Hopfield and Geoffrey E. Hinton for their foundational contributions to artificial neural networks, underscoring the pivotal role of AI in modern science.
These breakthroughs demonstrate how neural network theory, especially the universal approximation theorem, enables machines to emulate complex learning and memory processes, paving the way for future innovations across all fields.
The visual proof ideas presented here are drawn from the book Deep Learning and Neural Networks Made Easy, which is highly recommended for beginners.
Model Perspective
Insights, knowledge, and enjoyment from a mathematical modeling researcher and educator. Hosted by Haihua Wang, a modeling instructor and author of "Clever Use of Chat for Mathematical Modeling", "Modeling: The Mathematics of Thinking", "Mathematical Modeling Practice: A Hands‑On Guide to Competitions", and co‑author of "Mathematical Modeling: Teaching Design and Cases".