Artificial Intelligence · 8 min read

Neural Network Fundamentals: Building Your Own Neural Network from Scratch in Python

This tutorial explains neural network fundamentals by defining layers, weights, biases, and sigmoid activation, then walks through building a Python class that implements forward propagation, a sum‑of‑squared‑error loss, and backpropagation using the chain rule and gradient descent to train a simple two‑layer network.

Tencent Cloud Developer

Most introductory articles about neural networks mention the brain analogy. However, a simpler way to describe neural networks is as mathematical functions that map given inputs to desired outputs.

A neural network consists of the following components:

· Input layer, x

· Any number of hidden layers

· Output layer, ŷ

· Weights and biases between each layer, W and b

· For each hidden layer, an activation function, σ. In this tutorial, we use the Sigmoid activation function.
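The Sigmoid function squashes any real number into the range (0, 1), which is what makes it convenient as an activation. A minimal sketch, assuming NumPy:

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation: sigma(z) = 1 / (1 + e^(-z)), mapping the reals into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))
```

`sigmoid(0)` is exactly 0.5, and the function saturates toward 0 and 1 for large negative and positive inputs, respectively.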

A simple 2-layer neural network is shown in the diagram (note that when counting layers in a neural network, the input layer is typically excluded).

Creating a neural network class in Python is straightforward.
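A minimal sketch of such a class, assuming NumPy; the 4-unit hidden layer and the uniform random initialization are illustrative assumptions, not requirements:

```python
import numpy as np

class NeuralNetwork:
    def __init__(self, x, y):
        self.input = x                                          # training inputs, one row per sample
        self.weights1 = np.random.rand(self.input.shape[1], 4)  # input -> hidden (4 units)
        self.weights2 = np.random.rand(4, 1)                    # hidden -> output
        self.y = y                                              # true outputs
        self.output = np.zeros(self.y.shape)                    # predictions, filled in later
```

Bias terms are omitted here, matching the simplification made later in the tutorial.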

Neural Network Training

The output ŷ of a simple 2-layer neural network depends on the weights W and biases b. The values of the weights and biases determine the strength of the predictions. The process of fine-tuning the weights and biases using the input data is called training the neural network.

Each iteration of the training process consists of:

· Calculating the predicted output ŷ, known as forward propagation

· Updating weights and biases, known as backpropagation

Forward Propagation

As shown in the sequence diagram, forward propagation is a straightforward calculation. For a basic 2-layer neural network, the output is ŷ = σ(W₂ · σ(W₁ · x + b₁) + b₂).

We can add a feedforward function to our Python class to compute this. For simplicity, we assume the biases are zero.
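A sketch of that feedforward step, with the class repeated so the example runs standalone (the sigmoid activation and 4-unit hidden layer are the same assumptions as before):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class NeuralNetwork:
    def __init__(self, x, y):
        self.input = x
        self.weights1 = np.random.rand(self.input.shape[1], 4)
        self.weights2 = np.random.rand(4, 1)
        self.y = y
        self.output = np.zeros(self.y.shape)

    def feedforward(self):
        # layer1 = sigma(x . W1), output = sigma(layer1 . W2); biases assumed zero
        self.layer1 = sigmoid(np.dot(self.input, self.weights1))
        self.output = sigmoid(np.dot(self.layer1, self.weights2))
```

Because the activation is a sigmoid, every entry of `self.output` lies strictly between 0 and 1.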

Loss Function

There are many loss functions available, and the choice depends on the nature of the problem. In this tutorial, we use a simple sum of squared errors as our loss function.

That is, the sum of squared errors is simply the sum of the differences between each predicted value and the actual value, with each difference squared so that we measure its magnitude regardless of sign.
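In symbols, Loss = Σᵢ (yᵢ − ŷᵢ)². A direct translation into code, assuming NumPy:

```python
import numpy as np

def sum_squared_error(y_true, y_pred):
    """Sum of squared errors: sum over all samples of (y - y_hat)^2."""
    return np.sum((y_true - y_pred) ** 2)
```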

The goal of training is to find the best set of weights and biases that minimize the loss function.

Backpropagation

Now that we have measured the error (loss) of our predictions, we need to find a way to propagate the error back and update our weights and biases.

To know the appropriate amount to adjust the weights and biases, we need to know the derivative of the loss function with respect to the weights and biases.

Recall from calculus that the derivative of a function is simply the slope of the function.

If we have the derivative, we can update the weights and biases by increasing or decreasing them in the direction that reduces the loss. This is called gradient descent.

However, the loss function is written in terms of the prediction ŷ, not directly in terms of the weights and biases, so we cannot compute this derivative directly. We therefore need the chain rule to help us calculate it.
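Concretely, with Loss = Σ(y − ŷ)², the chain rule gives ∂Loss/∂W₂ = ∂Loss/∂ŷ · ∂ŷ/∂z₂ · ∂z₂/∂W₂, where z₂ is the pre-activation of the output layer. A sketch of a backprop method built on the class from earlier; using a fixed step size of 1 (no explicit learning rate) is an assumption of this simple sketch:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_derivative(a):
    # derivative of the sigmoid, expressed in terms of its output a = sigma(z)
    return a * (1.0 - a)

class NeuralNetwork:
    def __init__(self, x, y):
        self.input = x
        self.weights1 = np.random.rand(self.input.shape[1], 4)
        self.weights2 = np.random.rand(4, 1)
        self.y = y
        self.output = np.zeros(self.y.shape)

    def feedforward(self):
        self.layer1 = sigmoid(np.dot(self.input, self.weights1))
        self.output = sigmoid(np.dot(self.layer1, self.weights2))

    def backprop(self):
        # chain rule: 2*(y - output) * sigma'(output) is -dLoss/dz2,
        # so adding these updates steps downhill on the loss
        error = 2 * (self.y - self.output) * sigmoid_derivative(self.output)
        d_weights2 = np.dot(self.layer1.T, error)
        # propagate the error back through weights2 to get the gradient for weights1
        d_weights1 = np.dot(self.input.T,
                            np.dot(error, self.weights2.T) * sigmoid_derivative(self.layer1))
        self.weights1 += d_weights1
        self.weights2 += d_weights2
```

Note the sign convention: since ∂Loss/∂ŷ = −2(y − ŷ), the quantities above are the negatives of the gradients, which is why they are added rather than subtracted.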

Summary

Now that we have the complete Python code for forward propagation and backpropagation, let's apply our neural network to an example to see how well it performs.

Our neural network should learn the ideal set of weights to represent this function. Note that working out those weights purely by inspection would not be easy.

Let's train the neural network for 1500 iterations and see what happens. Plotting the loss at each iteration, we can clearly see it decreasing monotonically toward a minimum. This is consistent with the gradient descent algorithm discussed earlier.
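A self-contained sketch of such a training run. Since the example data is not reproduced here, the dataset below is a hypothetical stand-in: four 3-bit inputs whose target is the XOR of the first two bits.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_derivative(a):
    return a * (1.0 - a)

class NeuralNetwork:
    def __init__(self, x, y):
        self.input = x
        self.weights1 = np.random.rand(self.input.shape[1], 4)
        self.weights2 = np.random.rand(4, 1)
        self.y = y
        self.output = np.zeros(self.y.shape)

    def feedforward(self):
        self.layer1 = sigmoid(np.dot(self.input, self.weights1))
        self.output = sigmoid(np.dot(self.layer1, self.weights2))

    def backprop(self):
        error = 2 * (self.y - self.output) * sigmoid_derivative(self.output)
        d_weights2 = np.dot(self.layer1.T, error)
        d_weights1 = np.dot(self.input.T,
                            np.dot(error, self.weights2.T) * sigmoid_derivative(self.layer1))
        self.weights1 += d_weights1
        self.weights2 += d_weights2

np.random.seed(0)

# Hypothetical stand-in data: target = XOR of the first two input bits
X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

nn = NeuralNetwork(X, y)
losses = []
for _ in range(1500):
    nn.feedforward()
    nn.backprop()
    losses.append(np.sum((y - nn.output) ** 2))
```

Plotting `losses` against the iteration number reproduces the loss curve described above.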

Let's look at the final predictions (output) from the neural network after 1500 iterations.

We did it! Our forward propagation and backpropagation algorithms successfully trained the neural network, and the predictions converged to the true values.

Note that there is a slight difference between the predicted and actual values. This is desirable, because it prevents overfitting.

Fortunately, our journey is not over. There is much more to neural networks and deep learning, such as:

· Using activation functions other than Sigmoid

· Using a learning rate when training the network

· Using convolutions for image classification tasks

Building your own neural network from scratch teaches you a lot. Although deep learning libraries like TensorFlow and Keras make it easy to build deep networks without fully understanding the internal workings of neural networks, I find that having a deeper understanding of neural networks is very important for becoming an excellent data scientist in the future.

Tags: neural network, Python, machine learning, deep learning, gradient descent, activation function, backpropagation, forward propagation
Written by

Tencent Cloud Developer

Official Tencent Cloud community account that brings together developers, shares practical tech insights, and fosters an influential tech exchange community.
