How to Build a Breast Cancer Prediction Neural Network from Scratch in Python
This article walks through creating a Python‑based neural network to predict breast cancer using the Wisconsin dataset, covering network architecture, weight and bias initialization, back‑propagation, gradient descent, and the role of activation functions such as sigmoid, tanh, ReLU and Leaky ReLU.
In constructing a breast cancer prediction neural network, the process is divided into three major parts: (1) creating a neural network from scratch in Python and training it with gradient descent; (2) applying the Wisconsin breast cancer dataset with nine features to classify tumors as benign or malignant; (3) exploring how back‑propagation and gradient descent work.
Although powerful libraries such as TensorFlow, PyTorch, Fast.ai, Keras, MXNet, and DL4J exist, building the model by hand exposes essential concepts that high‑level APIs tend to hide.
Understanding Functions and Supervised Learning
Functions map inputs to outputs, and in machine learning three common learning types exist: supervised learning (learning from labeled data), unsupervised learning (learning from unlabeled data), and reinforcement learning (learning from rewards). This tutorial focuses on supervised learning, using a dataset of input‑output pairs to discover the underlying function.
Simple Two‑Layer Neural Network
A basic network consists of an input layer (matching the number of features, e.g., nine for the Wisconsin dataset), a hidden layer, and an output layer. Although deeper networks are possible, a two‑layer structure is sufficient to illustrate core ideas.
Each neuron has one weight per input and a single bias. The linear combination is computed as W*X+b, where W holds the weights, X is the input vector, and b is the bias term.
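As a minimal sketch of the linear step, assuming NumPy and made-up shapes (nine input features, four hidden neurons, three samples), the computation can be written as follows. Note that in NumPy the matrix product is written with `@`; the `*` in W*X+b would be element-wise.

```python
import numpy as np

# Hypothetical sizes chosen only for illustration.
n_features, n_hidden, n_samples = 9, 4, 3

rng = np.random.default_rng(0)
X = rng.random((n_features, n_samples))                  # one column per sample
W = rng.standard_normal((n_hidden, n_features)) * 0.01   # one row of weights per neuron
b = np.zeros((n_hidden, 1))                              # one bias per neuron

# Linear combination for every neuron and every sample.
# b has shape (4, 1) and is broadcast across the sample columns.
Z = W @ X + b

print(Z.shape)
```

Storing samples as columns is a design choice of this sketch; a row-per-sample layout works equally well with the product written as X @ W.T + b.T.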
Activation Functions
Linear functions alone cannot model complex, non‑linear relationships, so activation functions are added after each linear transformation. Four typical activation functions are discussed:
Sigmoid : 1/(1+e**-x), output range (0,1), suitable for binary classification but prone to vanishing gradients.
Tanh : (2/(1+e**(-2x)))-1, output range (-1,1), zero‑centered and steeper than sigmoid but saturates in the same way.
ReLU : max(0,x), output range [0,∞), simple and efficient, yet can cause dead neurons: the gradient is zero for negative inputs, so a neuron whose inputs stay negative stops updating.
Leaky ReLU : max(αx, x) with a small slope α (commonly 0.01), output range (-∞,∞), keeps a small gradient for negative inputs and so mitigates the dead‑neuron problem.
These functions introduce non‑linearity, enabling the network to approximate complex mappings, but they also bring challenges such as vanishing and exploding gradients.
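The four functions can be sketched in NumPy roughly as follows. The function names are illustrative, and Leaky ReLU is taken here in its usual form, max(alpha*x, x), with alpha=0.01 as an assumed default slope.

```python
import numpy as np

def sigmoid(x):
    # Squashes any real input into the open interval (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Same formula as in the text; equivalent to np.tanh(x).
    return 2.0 / (1.0 + np.exp(-2.0 * x)) - 1.0

def relu(x):
    # Passes positive inputs through and zeroes out the rest.
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # Like ReLU, but negative inputs keep a small slope alpha (assumed value).
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, 0.0, 2.0])
for fn in (sigmoid, tanh, relu, leaky_relu):
    print(fn.__name__, fn(x))
```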
Putting It All Together
For a two‑layer network, the forward computation becomes:
Ŷ = A2 = Sigmoid(W2*ReLU(W1*X + b1) + b2)
The network must learn the unknown parameters W and b. Training involves initializing these parameters randomly and then iteratively updating them using gradient descent based on the loss between the predicted output Ŷ and the true target Y.
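The forward computation and the loss it feeds can be sketched as below. This is an illustration under assumptions, not the article's own code: the helper names, the hidden-layer size of 8, the 0.01 initialization scale, and the choice of binary cross-entropy as the loss are all hypothetical (the text does not name a specific loss), and the inputs are random stand-ins rather than the Wisconsin data.

```python
import numpy as np

def init_params(n_x, n_h, n_y=1, seed=0):
    # Small random weights break symmetry; biases can start at zero.
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.standard_normal((n_h, n_x)) * 0.01,
        "b1": np.zeros((n_h, 1)),
        "W2": rng.standard_normal((n_y, n_h)) * 0.01,
        "b2": np.zeros((n_y, 1)),
    }

def forward(X, p):
    # Y_hat = Sigmoid(W2 @ ReLU(W1 @ X + b1) + b2)
    Z1 = p["W1"] @ X + p["b1"]
    A1 = np.maximum(0.0, Z1)            # ReLU
    Z2 = p["W2"] @ A1 + p["b2"]
    return 1.0 / (1.0 + np.exp(-Z2))    # Sigmoid

def cross_entropy(Y, Y_hat, eps=1e-12):
    # Binary cross-entropy (assumed loss); clipping avoids log(0).
    Y_hat = np.clip(Y_hat, eps, 1 - eps)
    return float(-np.mean(Y * np.log(Y_hat) + (1 - Y) * np.log(1 - Y_hat)))

# Random stand-in data: 9 features, 5 samples, made-up labels.
X = np.random.default_rng(1).random((9, 5))
Y = np.array([[0, 1, 0, 1, 1]])

params = init_params(n_x=9, n_h=8)
Y_hat = forward(X, params)
loss = cross_entropy(Y, Y_hat)
print(Y_hat.shape, loss)
```

Gradient descent then repeatedly nudges each parameter against its gradient, for example W1 = W1 - learning_rate * dW1, where dW1 is the gradient of the loss with respect to W1 obtained by back-propagation.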
Finally, the tutorial hints at implementing the entire process in Python by defining a class that handles parameter initialization and the forward‑backward passes.
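One possible shape for such a class is sketched below. Everything here is an assumption for illustration: the class name `TwoLayerNet`, the hyperparameters, the cross-entropy loss, and the synthetic data that stands in for the Wisconsin dataset.

```python
import numpy as np

class TwoLayerNet:
    """Hypothetical sketch: ReLU hidden layer, sigmoid output."""

    def __init__(self, n_x, n_h, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.standard_normal((n_h, n_x)) * 0.1
        self.b1 = np.zeros((n_h, 1))
        self.W2 = rng.standard_normal((1, n_h)) * 0.1
        self.b2 = np.zeros((1, 1))
        self.lr = lr

    def forward(self, X):
        # Cache intermediate values; the backward pass reuses them.
        self.Z1 = self.W1 @ X + self.b1
        self.A1 = np.maximum(0.0, self.Z1)             # ReLU
        self.Z2 = self.W2 @ self.A1 + self.b2
        self.A2 = 1.0 / (1.0 + np.exp(-self.Z2))       # Sigmoid
        return self.A2

    def loss(self, Y, eps=1e-12):
        # Binary cross-entropy on the cached prediction (assumed loss).
        A2 = np.clip(self.A2, eps, 1 - eps)
        return float(-np.mean(Y * np.log(A2) + (1 - Y) * np.log(1 - A2)))

    def backward(self, X, Y):
        m = X.shape[1]
        # Chain rule from the output back to the input.
        dZ2 = self.A2 - Y                              # sigmoid + cross-entropy
        dW2 = dZ2 @ self.A1.T / m
        db2 = dZ2.sum(axis=1, keepdims=True) / m
        dZ1 = (self.W2.T @ dZ2) * (self.Z1 > 0)        # ReLU derivative
        dW1 = dZ1 @ X.T / m
        db1 = dZ1.sum(axis=1, keepdims=True) / m
        # Gradient descent update.
        self.W1 -= self.lr * dW1
        self.b1 -= self.lr * db1
        self.W2 -= self.lr * dW2
        self.b2 -= self.lr * db2

    def fit(self, X, Y, epochs=1000):
        losses = []
        for _ in range(epochs):
            self.forward(X)
            losses.append(self.loss(Y))
            self.backward(X, Y)
        return losses

# Synthetic stand-in data: 9 features, 200 samples, label from the feature sum.
rng = np.random.default_rng(42)
X = rng.standard_normal((9, 200))
Y = (X.sum(axis=0, keepdims=True) > 0).astype(float)

net = TwoLayerNet(n_x=9, n_h=8)
losses = net.fit(X, Y)
print(losses[0], losses[-1])
```

With real data, the nine feature columns would typically be scaled before training and a held-out split used to check accuracy; both steps are omitted here to keep the sketch short.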
