
Demystifying Neural Networks: A Mathematical Approach

The article explains how basic mathematical principles—starting with simple predictors and linear classifiers, then extending to multi‑classifier systems, activation functions, and weight adjustments—underpin neural network architecture, illustrating each step with concrete examples to show how mathematics drives AI model training and performance.

Tencent Cloud Developer

This article discusses the mathematical concepts behind simple neural networks, demonstrating how mathematics plays a crucial role in building AI models.

Before diving into neural networks, the article explores fundamental concepts starting with a simple predictor and classifier, then building up to neural networks themselves.

Simplified Predictor

The problem-solving process of humans is compared with that of computers. A machine learning example converts kilometres to miles using the equation miles = kilometres × c, where c is an unknown constant. Through iterative testing with the values 0.5, 0.6, and 0.7, and then refining the best guess to 0.61, the model produces an output of 61 miles for 100 kilometres, close to the correct 62.137. This demonstrates a simple prediction algorithm that adjusts its parameter based on the error against known examples.
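The trial-and-error search described above can be sketched in a few lines of Python. The function name `predict` and the print loop are illustrative, not from the article; only the known example (100 km = 62.137 miles) and the candidate values come from the text.

```python
# A minimal sketch of the kilometres-to-miles predictor described above.
# The true conversion constant is unknown to the model; we try candidate
# values of c and compare against one known example: 100 km = 62.137 miles.

def predict(km, c):
    return km * c

known_km, known_miles = 100.0, 62.137

for c in [0.5, 0.6, 0.7]:
    output = predict(known_km, c)
    error = known_miles - output
    print(f"c={c}: output={output}, error={error:+.3f}")
# c=0.5 undershoots and c=0.7 overshoots, so a value near 0.6 is the
# best of the three; nudging c upward to 0.61 shrinks the error further.
```

Running it shows the error changing sign between 0.6 and 0.7, which is the signal that tells the "model" which direction to adjust its parameter.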

Simplified Classifier

A classification problem distinguishes between caterpillars (long and thin) and ladybugs (wide and short) based on bug width and length. Linear functions produce straight lines, with the parameter 'c' defining the slope. The goal is to find a line that correctly classifies unknown bugs, with three possible scenarios when randomly placing a line—two failing to separate classes, and one successfully separating them.
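A dividing line of this kind is easy to express in code. This is a sketch, not the article's implementation: the slope value and the bug measurements below are illustrative.

```python
# A point above the line y = A*x (long relative to its width) is classed as
# a caterpillar; a point below it (wide and short) as a ladybug.

def classify(width, length, A):
    return "caterpillar" if length > A * width else "ladybug"

A = 0.5  # slope of the dividing line (illustrative value)
print(classify(1.0, 3.0, A))  # long, thin  -> caterpillar
print(classify(3.0, 1.0, A))  # wide, short -> ladybug
```

Whether this classifier works depends entirely on the slope A: the two failing scenarios in the text correspond to slopes so steep or so shallow that both species end up on the same side of the line.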

Training the Classifier

Classifiers learn by comparing their output against labelled training data (examples with known answers). Using the equation y = Ax, where y is bug length, x is width, and A is the slope, the training process computes the error as E = desired target − actual output. The parameter update follows as δA = E / x. For example, with x = 3.0 and target y = 1.1, an initial A = 0.25 gives y = 0.75, so E = 0.35, δA ≈ 0.1167, and the updated A ≈ 0.3667.
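The single training step worked through above can be written out directly. The function name `train_step` is illustrative; the numbers match the worked example in the text.

```python
# One update step for the classifier y = A*x:
#   error  E  = target - actual
#   update δA = E / x

def train_step(A, x, target):
    actual = A * x
    error = target - actual
    delta_A = error / x
    return A + delta_A, error

A = 0.25  # initial slope, which gives y = 0.75 at x = 3.0
A, E = train_step(A, x=3.0, target=1.1)
print(f"error={E:.2f}, updated A={A:.4f}")  # error=0.35, updated A=0.3667
```

Note that after this one step the line passes exactly through the training point (0.3667 × 3.0 = 1.1). Jumping all the way to each new example like this makes the classifier forget earlier examples, which is why practical training moderates the update with a learning rate.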

Simplified Multi-Classifier

Neural networks consist of multiple classifiers working together. The limitations of simple linear classifiers are demonstrated through Boolean functions (AND, OR, XOR). While AND and OR can be solved with linear classifiers, XOR cannot—a single straight line cannot separate green (true) from red (false) regions. The solution uses multiple lines (multiple classifiers), which forms the fundamental principle of neural networks.
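The separability claim can be checked by brute force. The sketch below (not from the article) searches a coarse grid of weights and a bias for a single linear threshold unit: a fit exists for AND and OR, but no setting works for XOR.

```python
# Brute-force check: can one linear threshold unit (w1*x1 + w2*x2 + b > 0)
# reproduce a given Boolean truth table?
import itertools

def fits(truth_table):
    grid = [v / 2 for v in range(-4, 5)]  # candidate values -2.0 .. 2.0
    for w1, w2, b in itertools.product(grid, repeat=3):
        if all((w1 * x1 + w2 * x2 + b > 0) == out
               for (x1, x2), out in truth_table.items()):
            return True
    return False

AND = {(0, 0): False, (0, 1): False, (1, 0): False, (1, 1): True}
OR  = {(0, 0): False, (0, 1): True,  (1, 0): True,  (1, 1): True}
XOR = {(0, 0): False, (0, 1): True,  (1, 0): True,  (1, 1): False}

print(fits(AND), fits(OR), fits(XOR))  # True True False
```

The `False` for XOR is not an artifact of the coarse grid: no straight line can put (0,1) and (1,0) on one side and (0,0) and (1,1) on the other, which is exactly why a second line, and hence a second layer of classifiers, is needed.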

Neural Network Architecture

Biological neurons receive electrical signals through dendrites and transmit them along axons. Unlike linear functions, neurons fire only when their combined input reaches a certain threshold, which artificial networks model with activation functions.

Activation Functions:

· Step Function: Output is zero for low inputs, jumps at threshold. Good for binary classification but fails for multi-classifier problems.

· Sigmoid Function: Smoother than the step function; formula: y = 1/(1 + e^(−x)). Output range (0, 1). Suffers from the vanishing-gradient problem, where learning becomes very slow because the gradient approaches zero for large positive or negative inputs.

· Tanh Function: Scaled version of sigmoid, symmetric around origin, range (-1,1). Steeper gradients but also has vanishing gradient issues.

· ReLU (Rectified Linear Unit): The most widely used; formula: f(x) = max(0, x). Only neurons with positive input activate, so just a subset of the network is active at any time, keeping computational cost low. It has its own gradient issue: neurons whose input stays negative receive zero gradient and can stop learning (the "dying ReLU" problem).
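The four functions listed above take only a few lines each; this is a minimal sketch using the formulas from the text.

```python
# The four activation functions described above.
import math

def step(x):
    return 1.0 if x >= 0 else 0.0

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    return math.tanh(x)  # equivalently: 2 * sigmoid(2 * x) - 1

def relu(x):
    return max(0.0, x)

for f in (step, sigmoid, tanh, relu):
    print(f.__name__, [round(f(x), 3) for x in (-2.0, 0.0, 2.0)])
```

Evaluating each function at a few points makes the differences concrete: the step function jumps from 0 to 1, sigmoid and tanh change smoothly (saturating toward their range limits), and ReLU is zero for negative inputs and linear for positive ones.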

Artificial neurons receive multiple inputs, sum them, and pass through an activation function to control output. A three-layer neural network model shows nodes in each layer connected to all nodes in adjacent layers. Connection strength is adjusted through weights—low weights suppress signals while high weights amplify them. During training, some weights become zero, meaning those connections don't contribute to the network.
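A forward pass through such a three-layer, fully connected network can be sketched as below. The layer sizes, input values, and weight matrices are illustrative, not from the article; each node sums its weighted inputs and applies the sigmoid activation.

```python
# Minimal forward pass for a fully connected 3-3-3 network.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights):
    # each output node sums its weighted inputs, then applies the activation
    return [sigmoid(sum(w * i for w, i in zip(row, inputs))) for row in weights]

# Illustrative weights: row j holds the connection strengths into node j.
# The 0.0 entry is a connection whose weight has gone to zero, so it
# contributes nothing to the network, as noted above.
w_hidden = [[0.9, 0.3, 0.4], [0.2, 0.8, 0.2], [0.1, 0.5, 0.6]]
w_output = [[0.3, 0.7, 0.5], [0.6, 0.5, 0.2], [0.8, 0.0, 0.9]]

hidden = layer([0.9, 0.1, 0.8], w_hidden)
output = layer(hidden, w_output)
print([round(o, 3) for o in output])
```

Training would then compare `output` against target values and propagate the error back through the weights (backpropagation), nudging each weight in the direction that reduces the error, just as the single-slope classifier did with δA.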

Tags: neural networks, machine learning, deep learning, activation function, backpropagation, classifier, mathematics, XOR problem
Written by

Tencent Cloud Developer

Official Tencent Cloud community account that brings together developers, shares practical tech insights, and fosters an influential tech exchange community.
