
Neural Networks Explained: Architecture, Training, and Reinforcement Basics

This article introduces neural networks: their layered structure; common architectures such as CNNs and RNNs; key training components including activation functions, loss, learning rate, backpropagation, dropout, and batch normalization; and basic reinforcement learning concepts including MDPs, policies, value functions, and Q‑learning.


Neural Networks

Neural networks are a class of models built from layers of interconnected units. Common specialized types include convolutional neural networks (CNNs) and recurrent neural networks (RNNs).

Structure

A neural network consists of an input layer, one or more hidden layers, and an output layer.

Let \(l\) denote the \(l\)‑th layer and \(j\) the \(j\)‑th unit in that layer. Writing \(w_j^{[l]}\), \(b_j^{[l]}\), and \(z_j^{[l]}\) for the weight vector, bias, and pre‑activation output of that unit, each unit computes \(z_j^{[l]} = {w_j^{[l]}}^{\top} a^{[l-1]} + b_j^{[l]}\), where \(a^{[l-1]}\) is the previous layer's activation.

Activation Functions

Activation functions introduce non‑linearity after the linear transformation in each hidden unit; without them, a stack of layers would collapse into a single linear map. The most common choices are the sigmoid, tanh, and ReLU functions.
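As a quick illustration, the three common activation functions can be written in a few lines of NumPy (a minimal sketch, independent of any framework):

```python
import numpy as np

def sigmoid(x):
    # Squashes inputs to (0, 1); often used for binary outputs.
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Zero-centered variant of sigmoid, range (-1, 1).
    return np.tanh(x)

def relu(x):
    # max(0, x); the default choice in most modern hidden layers.
    return np.maximum(0.0, x)
```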

Cross‑entropy loss

For binary classification with predicted probability \(z\) and true label \(y\), the commonly used cross‑entropy loss is \(L(z, y) = -\left[\, y \log(z) + (1 - y)\log(1 - z) \,\right]\).

Learning Rate

The learning rate, often denoted \(\eta\), determines how much weights are updated at each step. It can be fixed or adaptive; Adam is a popular adaptive method.

Backpropagation

Backpropagation updates network weights by propagating the error between actual and desired outputs backward through the network. Using the chain rule, the gradient of the loss \(L\) with respect to a weight \(w\) factors as \(\frac{\partial L}{\partial w} = \frac{\partial L}{\partial a} \cdot \frac{\partial a}{\partial z} \cdot \frac{\partial z}{\partial w}\), and these gradients drive the weight updates.

Weight Update Procedure

Training proceeds in four steps: (1) take a batch of data, (2) forward propagation to compute loss, (3) backpropagation to obtain gradients, (4) update weights using the gradients.
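The four steps above can be sketched on a toy one‑parameter model (a hypothetical example fitting \(y = 2x\) with mean squared error and plain gradient descent, not any particular framework's API):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(32, 1))       # (1) take a batch of data
y_true = 2.0 * X                               # toy target: y = 2x
w = np.zeros((1, 1))                           # single trainable weight
eta = 0.1                                      # learning rate

for step in range(200):
    y_pred = X @ w                             # (2) forward propagation
    loss = np.mean((y_pred - y_true) ** 2)     #     mean squared error
    grad = 2.0 * X.T @ (y_pred - y_true) / len(X)  # (3) backpropagation
    w -= eta * grad                            # (4) update weights
```

After a few hundred steps the weight converges toward the true value 2.0.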

Dropout

Dropout randomly removes units during training to prevent over‑fitting. The probability of dropping a unit is \(p\), and the probability of keeping it is \(1-p\).
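A common way to implement this is "inverted" dropout, sketched below in NumPy (`dropout_forward` is a hypothetical helper name for this sketch):

```python
import numpy as np

def dropout_forward(a, p, training=True, rng=None):
    """Inverted dropout: drop each unit with probability p during training.

    Scaling the survivors by 1/(1 - p) keeps the expected activation
    unchanged, so no rescaling is needed at test time.
    """
    if not training or p == 0.0:
        return a
    rng = rng or np.random.default_rng()
    mask = rng.random(a.shape) >= p   # keep each unit with probability 1 - p
    return a * mask / (1.0 - p)
```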

Convolutional Neural Networks

Convolution Layer Requirements

Let \(n\) be the input size, \(f\) the filter size, \(p\) the zero‑padding amount, and \(s\) the stride; the number of neurons \(N\) that fit along one dimension of the given volume is \(N = \left\lfloor \frac{n + 2p - f}{s} \right\rfloor + 1\).
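The arithmetic can be made concrete with a small helper (this assumes the standard output‑size formula, which also accounts for a stride \(s\)):

```python
def conv_output_size(n, f, p=0, s=1):
    # Number of output positions along one dimension:
    # N = floor((n + 2p - f) / s) + 1
    return (n + 2 * p - f) // s + 1

# e.g. a 32-wide input with a 5-wide filter, padding 2, stride 1
# keeps the spatial size at 32 ("same" padding).
```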

Batch Normalization

Batch normalization normalizes a layer's inputs using the batch mean \(\mu_B\) and variance \(\sigma_B^2\), then scales and shifts the result with learnable parameters \(\gamma\) and \(\beta\): \(x_i \leftarrow \gamma \, \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}} + \beta\). It is usually applied after a fully‑connected or convolutional layer and before the non‑linearity, allowing higher learning rates and reducing dependence on initialization.
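The normalization step can be sketched as follows (training‑time forward pass only; the running statistics used at inference time are omitted for brevity):

```python
import numpy as np

def batchnorm_forward(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch, then scale and shift.

    gamma and beta are learnable parameters; eps avoids division by zero.
    """
    mu = x.mean(axis=0)                  # per-feature batch mean
    var = x.var(axis=0)                  # per-feature batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta
```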

Recurrent Neural Networks

Gate Types

Input gate \(i\): decides whether to write to the cell.

Forget gate \(f\): decides whether to erase the cell.

Output gate \(o\): decides how much of the cell to reveal.

Gate \(g\): decides how much to write to the cell.

Long Short‑Term Memory (LSTM)

An LSTM augments the plain RNN with a memory cell and gates, including a forget gate, which mitigates the vanishing‑gradient problem over long sequences.
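A single LSTM time step can be sketched with the four gates listed above (a minimal NumPy version; the stacked weight layout is an assumption of this sketch, not a fixed convention):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step.

    W has shape (4*H, D + H): stacked weights for the input (i),
    forget (f), output (o), and gate (g) transforms; b has shape (4*H,).
    """
    H = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    i = sigmoid(z[0:H])        # input gate: whether to write to the cell
    f = sigmoid(z[H:2*H])      # forget gate: whether to erase the cell
    o = sigmoid(z[2*H:3*H])    # output gate: how much to reveal the cell
    g = np.tanh(z[3*H:4*H])    # gate: how much to write to the cell
    c = f * c_prev + i * g     # cell state update
    h = o * np.tanh(c)         # new hidden state
    return h, c
```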

Reinforcement Learning and Control

Reinforcement learning trains an agent to act within an environment so as to maximize cumulative reward.

Markov Decision Processes (MDP)

An MDP is a 5‑tuple \((S, A, P, \gamma, R)\) where \(S\) is a set of states, \(A\) a set of actions, \(P\) the state‑transition probabilities, \(\gamma\) the discount factor, and \(R\) the reward function to be maximized.

Policy

A policy \(\pi\) maps states to actions.

Note: a policy is implemented when, given a state \(s\), it produces the action \(a = \pi(s)\).

Value Function

For a given policy \(\pi\) and state \(s\), the value function \(V^{\pi}(s)\) is the expected discounted return obtained by starting in \(s\) and following \(\pi\): \(V^{\pi}(s) = E\!\left[\sum_{t \ge 0} \gamma^{t} R(s_t) \,\middle|\, s_0 = s, \pi\right]\).

Bellman Equation

The Bellman optimality equation characterizes the value function of the optimal policy \(\pi^*\): \(V^{\pi^*}(s) = R(s) + \max_{a \in A} \gamma \sum_{s' \in S} P_{sa}(s')\, V^{\pi^*}(s')\).

Note: for each state \(s\), the optimal policy \(\pi^*\) chooses the action attaining this maximum.

Value Iteration Algorithm

The algorithm consists of two steps: (1) initialize the value estimate \(V_0(s) = 0\) for every state \(s\); (2) iteratively update \(V_{i+1}(s) = R(s) + \max_{a \in A} \gamma \sum_{s' \in S} P_{sa}(s')\, V_i(s')\) until the values converge.
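The two steps can be sketched directly in NumPy (a minimal version assuming a state‑only reward \(R(s)\) and a transition tensor indexed as `P[a][s][s']`; real implementations would iterate until a convergence tolerance rather than a fixed count):

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, iters=500):
    """Value iteration on a finite MDP.

    P[a][s][s'] is the probability of moving s -> s' under action a;
    R[s] is the reward for state s.
    """
    V = np.zeros(len(R))                         # step 1: initialize values
    for _ in range(iters):                       # step 2: iterate updates
        V = R + gamma * np.max([P_a @ V for P_a in P], axis=0)
    return V
```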

Maximum Likelihood Estimate

The maximum likelihood estimate of the state‑transition probabilities is obtained from observed state‑action frequencies: \(P_{sa}(s') = \frac{\#\,\text{times action } a \text{ was taken in state } s \text{ and led to } s'}{\#\,\text{times action } a \text{ was taken in state } s}\).

Q‑learning

Q‑learning is a model‑free algorithm with the update rule \(Q(s,a) \leftarrow Q(s,a) + \alpha [r + \gamma \max_{a'} Q(s',a') - Q(s,a)]\).
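A single tabular update can be sketched as follows (exploration strategy omitted; `q_learning_update` is a hypothetical helper name):

```python
import numpy as np

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    # Model-free temporal-difference update:
    # Q(s,a) <- Q(s,a) + alpha * [r + gamma * max_a' Q(s',a') - Q(s,a)]
    td_target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q
```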

Reference: MLEveryday https://github.com/MLEveryday/Machine-Learning-Cheatsheets

Tags: CNN, Machine Learning, deep learning, Neural Networks, reinforcement learning, RNN
Written by

Model Perspective

Insights, knowledge, and enjoyment from a mathematical modeling researcher and educator. Hosted by Haihua Wang, a modeling instructor and author of "Clever Use of Chat for Mathematical Modeling", "Modeling: The Mathematics of Thinking", "Mathematical Modeling Practice: A Hands‑On Guide to Competitions", and co‑author of "Mathematical Modeling: Teaching Design and Cases".
