An Overview of Machine Learning and Deep Learning: Definitions, Core Concepts, and Typical Architectures
This article provides a comprehensive introduction to machine learning and deep learning, covering their definitions, differences, key concepts such as generalization, regularization, and overfitting, as well as typical algorithms and network architectures like CNN and RNN, illustrated with numerous diagrams.
Introduction
Artificial intelligence (AI) is often called the new industrial revolution: algorithms and computing power replace repetitive mental labor, freeing humans to focus on creative and scientific work and dramatically increasing productivity.
Figure 1 illustrates the impact of the industrial revolution; Figure 2 shows how AI replaces repetitive mental tasks, further boosting productivity.
About Machine Learning
Machine learning (ML) enables computers to discover patterns from large historical datasets using statistical algorithms, producing models that can recognize new samples or predict future outcomes without explicit programming. An ML system consists of data, algorithms, and models; data + algorithm → model → prediction or pattern recognition.
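The data + algorithm → model → prediction pipeline can be made concrete with a minimal sketch (pure Python; the data and the `fit_line` helper are illustrative, not from any library) that fits a line by ordinary least squares and then predicts a new sample:

```python
# Minimal sketch of "data + algorithm -> model -> prediction" using
# simple linear regression fit by ordinary least squares.

def fit_line(xs, ys):
    """Algorithm: estimate slope and intercept from historical data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept  # the learned "model" is just two numbers

# Data: past observations (e.g., hours studied -> exam score).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.0, 8.1]

slope, intercept = fit_line(xs, ys)   # data + algorithm -> model
prediction = slope * 5.0 + intercept  # model -> prediction for a new sample
```

The "model" here is deliberately tiny (two numbers), but the flow is the same one the article describes: historical data plus a statistical algorithm produce a model that generalizes to new inputs.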
Figure 3 compares machine learning to cooking to aid understanding.
Learning Paradigms
ML algorithms can be classified by learning paradigm: supervised, unsupervised, semi‑supervised, and reinforcement learning.
1. Supervised Learning
Training data are labeled; models are iteratively adjusted until predictions reach a desired accuracy. Typical tasks include classification and regression, using algorithms such as Logistic Regression and Back‑Propagation Neural Networks.
2. Unsupervised Learning
Data are unlabeled; the goal is to infer intrinsic structure, e.g., clustering or association rule mining (Apriori, K‑Means).
3. Semi‑Supervised Learning
Only part of the data are labeled; models first learn the underlying structure and then leverage the labeled portion for prediction (e.g., Graph Inference, Laplacian SVM).
4. Reinforcement Learning
Feedback is provided as rewards rather than explicit labels; agents adjust policies in real time (e.g., Q‑Learning, Temporal‑Difference learning).
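To make the supervised paradigm concrete, here is a minimal sketch (pure Python; the toy dataset and hyper-parameters are illustrative) of logistic regression trained by iterative gradient-descent adjustments against the labels:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Labeled training data: one feature x, binary label y.
data = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]

w, b, lr = 0.0, 0.0, 0.5
for _ in range(1000):              # iterative adjustment of the model
    for x, y in data:
        p = sigmoid(w * x + b)     # current prediction
        grad = p - y               # error signal relative to the label
        w -= lr * grad * x
        b -= lr * grad

# After training, points on the positive side get probability > 0.5.
```

This is the supervised loop in miniature: labeled examples drive repeated weight adjustments until predictions match the labels.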
Algorithm Similarity
Algorithms are also grouped by functional similarity, such as tree‑based methods, neural‑network‑based methods, etc.
1. Regression Algorithms
Methods that model relationships between variables, including Ordinary Least Squares, Logistic Regression, Stepwise Regression, Multivariate Adaptive Regression Splines, and LOESS.
2. Instance‑Based Algorithms
Techniques that compare new samples to stored instances, such as K‑Nearest Neighbors, Learning Vector Quantization, and Self‑Organizing Maps.
3. Regularization Methods
Extensions of regression that control model complexity to avoid over‑fitting, e.g., Ridge Regression, LASSO, Elastic Net.
4. Decision‑Tree Learning
Tree‑structured models for classification and regression, including CART, ID3, C4.5, CHAID, Decision Stump, Random Forest, MARS, and Gradient Boosting Machines.
5. Kernel‑Based Methods
Algorithms that implicitly map data into high‑dimensional feature spaces, most notably Support Vector Machines, Radial Basis Function (RBF) networks, and kernelized variants of Linear Discriminant Analysis.
6. Clustering Algorithms
Methods that group data by similarity, such as K‑Means and Expectation‑Maximization.
7. Association‑Rule Learning
Techniques that discover useful relationships in large datasets, e.g., Apriori and Eclat.
8. Dimensionality‑Reduction Algorithms
Approaches that uncover intrinsic structure while reducing dimensionality, including PCA, PLS, Sammon Mapping, MDS, and Projection Pursuit.
9. Ensemble Methods
Combine multiple weak learners to improve overall performance; examples are Boosting, Bagging, AdaBoost, Stacking, Gradient Boosting, and Random Forest.
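As one concrete example from the instance‑based family above, here is a minimal K‑Nearest Neighbors sketch (pure Python; the toy dataset and the `knn_predict` helper are illustrative, not from any library):

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest stored instances."""
    # train: list of ((features...), label) pairs
    by_distance = sorted(train, key=lambda p: sum((a - b) ** 2
                                                  for a, b in zip(p[0], query)))
    labels = [label for _, label in by_distance[:k]]
    return Counter(labels).most_common(1)[0][0]

train = [((0.0, 0.0), "A"), ((0.1, 0.2), "A"), ((0.2, 0.1), "A"),
         ((1.0, 1.0), "B"), ((0.9, 1.1), "B")]

print(knn_predict(train, (0.0, 0.1)))   # query near the "A" cluster
print(knn_predict(train, (1.0, 0.9)))   # query near the "B" cluster
```

Note that nothing is "trained" here: the stored instances themselves are the model, which is exactly what distinguishes instance‑based methods from the regression and tree families above.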
Figure 4 shows a roadmap of ML classification and practice.
About Deep Learning
Deep learning (DL) is a specialized subset of ML that represents the world as a hierarchy of nested concepts, from simple to complex, enabling powerful and flexible modeling.
Figure 5 illustrates the hierarchical concept system of deep learning.
Figure 6 depicts a typical deep network: visible layers receive raw inputs (e.g., pixels), followed by successive hidden layers that extract increasingly abstract features, culminating in an output layer that performs the final task.
Differences Between Deep Learning and Machine Learning
Data scale: DL requires massive datasets (often millions of labeled samples) to achieve or surpass human performance.
Feature handling: DL automatically learns feature representations, whereas traditional ML relies on manually engineered features.
Figure 7 (Venn diagram) shows that deep learning is both representation learning and a form of machine learning.
Figure 8 illustrates how AI system components relate across different AI disciplines, highlighting the parts that learn from data.
Fundamental Concepts of Neural Networks
A neural network consists of layers of neurons connected by weighted edges and activation functions. The input layer receives data, hidden layers transform representations, and the output layer produces predictions. Training adjusts the weights via back‑propagation.
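The forward and backward passes described above can be sketched in a few dozen lines (pure Python; the XOR task, network size, initial weights, and learning rate are all illustrative choices, not a production recipe):

```python
import math

sig = lambda z: 1.0 / (1.0 + math.exp(-z))

# XOR truth table: the classic task a single layer cannot solve.
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

# Network: 2 inputs -> 3 hidden neurons -> 1 output (fixed toy weights).
W1 = [[0.5, -0.4], [-0.3, 0.6], [0.2, 0.1]]
b1 = [0.0, 0.0, 0.0]
W2 = [0.3, -0.2, 0.4]
b2 = 0.0

def forward(x1, x2):
    """Input layer -> hidden layer -> output layer."""
    h = [sig(w[0] * x1 + w[1] * x2 + b) for w, b in zip(W1, b1)]
    o = sig(sum(wj * hj for wj, hj in zip(W2, h)) + b2)
    return h, o

def mse():
    return sum((forward(x1, x2)[1] - y) ** 2 for (x1, x2), y in data) / len(data)

before = mse()
lr = 1.0
for _ in range(5000):
    for (x1, x2), y in data:
        h, o = forward(x1, x2)
        d_o = (o - y) * o * (1 - o)                # output-layer gradient
        for j in range(3):
            d_h = d_o * W2[j] * h[j] * (1 - h[j])  # error propagated back
            W2[j] -= lr * d_o * h[j]
            W1[j][0] -= lr * d_h * x1
            W1[j][1] -= lr * d_h * x2
            b1[j] -= lr * d_h
        b2 -= lr * d_o
after = mse()  # back-propagation has reduced the training error
```

The essential point is the backward pass: the output error `d_o` is pushed through the output weights to obtain each hidden neuron's share of the blame (`d_h`), and every weight moves a small step against its gradient.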
Core Deep‑Learning Concepts
1. Generalization
Generalization is the ability of a model to perform well on unseen data; it is the primary challenge in ML and DL.
2. Basic Assumptions
Smoothness prior and local constancy prior assume that the target function does not change abruptly in small neighborhoods.
Manifold learning assumes high‑dimensional data lie on a lower‑dimensional manifold that can be uncovered.
3. Representation
Good representations simplify learning; common forms include low‑dimensional, sparse, and independent representations (see Figure 10).
4. Error, Over‑fitting, Under‑fitting, Capacity
Training error vs. generalization error.
Over‑fitting: large gap between training and test error.
Under‑fitting: model cannot achieve low training error.
Capacity: a model's ability to fit a wide variety of functions; Figure 11 shows the typical U‑shaped relationship between capacity and generalization error.
5. Optimization, Regularization, Hyper‑parameters
Training seeks parameters that minimize a (often non‑convex) loss function, typically using mini‑batch gradient descent. Regularization (L1, L2, Dropout) mitigates over‑fitting, while hyper‑parameters such as learning rate and regularization weight are tuned manually or via search.
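These three ideas can be combined in one small sketch (pure Python; the toy data, learning rate, regularization weight, and batch size are illustrative hyper‑parameter choices) of mini‑batch gradient descent with an L2 penalty:

```python
import random

random.seed(0)

# Toy regression data: y ~ 3x plus noise (all values illustrative).
data = [(i / 100.0, 3.0 * (i / 100.0) + random.uniform(-0.1, 0.1))
        for i in range(100)]

w = 0.0
lr = 0.1          # hyper-parameter: learning rate
lam = 0.01        # hyper-parameter: L2 regularization weight
batch_size = 10   # hyper-parameter: mini-batch size

for epoch in range(200):
    random.shuffle(data)
    for start in range(0, len(data), batch_size):
        batch = data[start:start + batch_size]
        # gradient of the mean squared error plus the L2 penalty lam * w^2
        grad = sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)
        grad += 2.0 * lam * w          # regularization shrinks the weight
        w -= lr * grad
# w converges near the true slope, slightly shrunk by the penalty
```

Each mini‑batch gives a cheap, noisy estimate of the full gradient, and the L2 term pulls the weight toward zero; the hyper‑parameters above are exactly the kind of knobs that must be tuned manually or via search.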
Convolutional Neural Networks (CNN)
CNNs are suited for data with spatial structure (e.g., images). They consist of convolutional layers that apply learnable filters and pooling layers that down‑sample feature maps.
Figure 13 illustrates convolution: a filter slides over the input image, performing element‑wise multiplication and summation to produce a feature map.
Figure 14 shows max‑pooling, which reduces spatial dimensions by retaining the maximum value within each region.
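The two operations described in Figures 13 and 14 can be sketched directly (pure Python; the toy image and the vertical‑edge filter values are made up for illustration):

```python
def conv2d(image, kernel):
    """Valid convolution, stride 1: at each position, multiply the filter
    element-wise with the patch under it and sum the results."""
    kh, kw = len(kernel), len(kernel[0])
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(len(image[0]) - kw + 1)]
            for i in range(len(image) - kh + 1)]

def max_pool(fmap, size=2):
    """Non-overlapping size x size max-pooling: keep only the largest
    value in each region, halving the spatial dimensions for size=2."""
    return [[max(fmap[i + a][j + b]
                 for a in range(size) for b in range(size))
             for j in range(0, len(fmap[0]) - size + 1, size)]
            for i in range(0, len(fmap) - size + 1, size)]

# A 5x5 image with a vertical edge between columns 1 and 2.
image = [[1, 1, 0, 0, 0]] * 5
edge_filter = [[1, -1],
               [1, -1]]            # responds strongly at vertical edges
fmap = conv2d(image, edge_filter)  # 4x4 feature map
pooled = max_pool(fmap)            # 2x2 after pooling
```

The feature map lights up only where the filter's pattern (here, a left‑to‑right intensity drop) appears, and pooling keeps that strong response while discarding exact position, which is what gives CNNs a degree of translation tolerance.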
Recurrent Neural Networks (RNN)
RNNs handle sequential data such as text and speech. Units share parameters across time steps, passing a hidden state forward; Figure 15 depicts a single RNN cell.
During training, the network is often unrolled into a fixed‑length computation graph (Figure 16).
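The shared‑parameter recurrence can be sketched with scalar states (pure Python; the sequence values and the weights `w_xh`, `w_hh`, `b` are illustrative and untrained):

```python
import math

def rnn_step(x, h, w_xh, w_hh, b):
    """One RNN cell: new hidden state from the current input and the
    previous state. The same parameters are reused at every time step."""
    return math.tanh(w_xh * x + w_hh * h + b)

# Unroll over a fixed-length sequence (scalar inputs for simplicity).
sequence = [0.5, -1.0, 0.25]
w_xh, w_hh, b = 0.8, 0.5, 0.0   # shared across all time steps
h = 0.0                          # initial hidden state

states = []
for x in sequence:               # the unrolled computation graph
    h = rnn_step(x, h, w_xh, w_hh, b)
    states.append(h)
# `h` now summarizes the whole sequence
```

Unrolling the loop produces exactly the fixed‑length graph of Figure 16: three copies of the same cell wired in a chain, with the hidden state carrying information forward between them.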
Reflections on Deep Learning
With frameworks like TensorFlow and Caffe, building and deploying ML/DL models has become much easier, shifting engineers' focus from algorithmic details to application design and accelerating development cycles.
Figure 17 visualizes the perception that constructing DL models is akin to stacking building blocks.
Figure 18 highlights that only a small fraction of a real‑world ML system is the ML code itself; the surrounding infrastructure is vast and complex.
Hujiang Technology
We focus on the real-world challenges developers face, delivering authentic, practical content and a direct platform for technical networking among developers.