
Visualizing How Neural Networks Approximate Any Function

This article explains the universal approximation theorem, showing how even a simple neural network with one hidden layer can approximate any continuous function by adjusting weights and biases, and illustrates the process with visual examples of step and bump functions, linking theory to recent Nobel recognitions.

Model Perspective

The latest Nobel Prizes in Physics and Chemistry were awarded to scientists whose work relies heavily on AI, particularly neural network algorithms.

Neural networks, the cornerstone of modern AI, possess the remarkable ability to approximate any function, a property known as the universal approximation theorem. In other words, given enough neurons and layers, a network can closely compute the values of even the most complex functions.

Complex Function

Consider a complex function that combines polynomials, sine, cosine, and other components, making manual computation at each point cumbersome.

Simple Neural Network

To understand the approximation process, imagine a basic network with an input layer, one hidden layer containing two neurons, and an output layer.

<code>Input (x) → Hidden layer (h1, h2) → Output (y)</code>

Each neuron performs a weighted sum of its inputs followed by a non‑linear activation, typically the sigmoid function.

The mathematical formulation assumes an input x, weights w, and biases b for the connections from the input to the hidden layer and from the hidden layer to the output.
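This setup can be sketched in a few lines of code. The function and parameter names below (`tiny_network`, `w1`, `v1`, etc.) are illustrative choices, not taken from the article:

```python
import math

def sigmoid(z):
    # Logistic activation: squashes any real number into (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def tiny_network(x, w1, b1, w2, b2, v1, v2, c):
    # Hidden layer: each neuron applies sigmoid(weight * input + bias)
    h1 = sigmoid(w1 * x + b1)
    h2 = sigmoid(w2 * x + b2)
    # Output layer: weighted sum of the hidden activations plus an output bias
    return v1 * h1 + v2 * h2 + c
```

Every quantity the article mentions appears here: one input x, one weight and bias per hidden neuron, and one weight per hidden-to-output connection.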

Approximating a Step Function

By visualizing the network’s behavior, we can see how it gradually approximates a step function.

Changing Bias

Adjusting the bias shifts the sigmoid curve left or right, altering the neuron’s response to different input values.

These bias adjustments enable neurons to respond differently across the input range.

Adjusting Weights

Increasing the weight makes the sigmoid curve steeper, approaching a step function; decreasing the weight yields a smoother curve.

When the weight is set very high, the neuron's output closely resembles a step function. Step functions like this can then be combined to approximate more complex nonlinear functions.
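Both effects are easy to verify numerically. A neuron computing sigmoid(w·x + b) transitions around x = −b/w, so the bias sets the step's position and the weight sets its steepness (the specific numbers below are illustrative, not from the article):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def neuron(x, w, b):
    # Output is sigmoid(w*x + b); the transition point is where w*x + b = 0,
    # i.e. x = -b/w. The bias shifts this point; the weight sets the steepness.
    return sigmoid(w * x + b)

# Small weight: a gentle slope around the transition point x = 0.5
gentle = neuron(0.6, 1.0, -0.5)
# Large weight, same transition point: output is nearly a hard 0-to-1 step
steep = neuron(0.6, 100.0, -50.0)
```

Just past the transition point, the low-weight neuron outputs only slightly above 0.5, while the high-weight neuron is already almost at 1.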

Approximating a Bump Function

Using two adjacent step functions, the network can generate a “bump function”. By tuning the weights and biases of multiple hidden neurons, each bump can be positioned and shaped precisely, allowing the network to approximate the target complex function piecewise.

Adding more hidden neurons creates additional bump functions, enhancing the overall approximation accuracy.
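A bump can be built as the difference of two steep sigmoids, one stepping up at the bump's left edge and one at its right edge. This sketch assumes a hypothetical helper `bump` with a hand-picked `sharpness` value:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bump(x, left, right, height, sharpness=200.0):
    # Two steep sigmoids: one steps up at `left`, one at `right`.
    # Their difference is ~height inside [left, right] and ~0 outside,
    # using two hidden neurons per bump.
    step_up = sigmoid(sharpness * (x - left))
    step_down = sigmoid(sharpness * (x - right))
    return height * (step_up - step_down)
```

Inside the interval only the first sigmoid has fired, so the difference is close to `height`; outside, both sigmoids agree and the terms cancel.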

Improving Approximation Ability

Increasing the number of hidden neurons continuously strengthens the network’s ability to approximate the target function, eventually achieving arbitrary precision as guaranteed by the universal approximation theorem.

The theorem states that a neural network with a single hidden layer and sufficiently many neurons can approximate any continuous function, regardless of its complexity.

Multiple Inputs and Outputs

In real applications, networks often handle many inputs and outputs, such as image classification where millions of pixel values map to a single label. Each input has its own weight, and the network learns to extract key patterns from the combined inputs.
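With many inputs, each neuron still computes a single weighted sum, just over a vector of inputs. A minimal sketch (the function name is a hypothetical choice):

```python
import math

def neuron_multi(inputs, weights, bias):
    # z = w1*x1 + w2*x2 + ... + wn*xn + b, then the activation is applied;
    # one weight per input, exactly as in the single-input case.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))
```

In image classification, `inputs` would be the pixel values and `weights` would be learned so that the neuron responds to a particular pattern across them.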

Although the universal approximation theorem guarantees the existence of such networks, constructing efficient models in practice requires large datasets, optimization techniques, and sufficient computational resources.

The 2024 Nobel Prize in Physics was awarded to John J. Hopfield and Geoffrey E. Hinton for their foundational contributions to artificial neural networks, underscoring the pivotal role of AI in modern science.

These breakthroughs demonstrate how neural network theory, especially the universal approximation theorem, enables machines to emulate complex learning and memory processes, paving the way for future innovations across all fields.

The visual proof ideas presented here are drawn from the book Deep Learning and Neural Networks Made Easy, which is highly recommended for beginners.

Written by

Model Perspective

Insights, knowledge, and enjoyment from a mathematical modeling researcher and educator. Hosted by Haihua Wang, a modeling instructor and author of "Clever Use of Chat for Mathematical Modeling", "Modeling: The Mathematics of Thinking", "Mathematical Modeling Practice: A Hands‑On Guide to Competitions", and co‑author of "Mathematical Modeling: Teaching Design and Cases".
