Numpy‑ML: A Pure NumPy Implementation of Machine Learning Algorithms

The Numpy‑ML project, created by UC Berkeley’s David Bourgin, provides a comprehensive pure‑NumPy implementation of over 30 machine‑learning algorithms—including probabilistic models, neural‑network layers, optimizers, and reinforcement‑learning agents—along with extensive data‑preprocessing utilities, all in a single open‑source repository.

Python Programming Learning Circle

Jul 27, 2024

The Numpy‑ML repository (https://github.com/ddbourgin/numpy-ml) is a pure‑NumPy implementation of a wide range of machine‑learning algorithms, authored by UC Berkeley researcher David Bourgin. The project contains more than 30,000 lines of code, covering both core algorithms and extensive data‑preprocessing utilities.

Implemented algorithms include:

Gaussian Mixture Model: EM training.

Hidden Markov Model: Viterbi decoding, Likelihood computation, MLE parameter estimation via Baum‑Welch / forward‑backward.
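To make the Viterbi decoding listed for the Hidden Markov Model concrete, here is a minimal pure‑NumPy sketch for a discrete HMM. It is not taken from the repository; the function name `viterbi` and its argument layout are assumptions chosen for illustration, and it works in log space to avoid underflow on long sequences.

```python
import numpy as np

def viterbi(log_pi, log_A, log_B, obs):
    """Most likely hidden-state path for a discrete HMM (illustrative sketch).

    log_pi: (S,)   log initial-state probabilities
    log_A:  (S, S) log transitions, log_A[i, j] = log P(state j | state i)
    log_B:  (S, V) log emissions,   log_B[s, o] = log P(obs o | state s)
    obs:    sequence of observation indices
    """
    T, S = len(obs), log_pi.shape[0]
    score = np.empty((T, S))            # best log-probability ending in each state
    back = np.zeros((T, S), dtype=int)  # argmax predecessor for each state

    score[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        # cand[i, j]: best path ending in state i at t-1, then moving to j
        cand = score[t - 1][:, None] + log_A
        back[t] = cand.argmax(axis=0)
        score[t] = cand.max(axis=0) + log_B[:, obs[t]]

    # Trace the best path backwards from the best final state
    path = np.empty(T, dtype=int)
    path[-1] = score[-1].argmax()
    for t in range(T - 2, -1, -1):
        path[t] = back[t + 1, path[t + 1]]
    return path, score[-1].max()
```

The function returns both the state path and its joint log‑probability, so the result can be checked by hand on a small model.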

Latent Dirichlet Allocation (topic model): Standard model with MLE parameter estimation via variational EM, Smoothed model with MAP parameter estimation via MCMC.

Neural Networks

Layers / operations: Add, Flatten, Multiply, Softmax, Dense, Sparse Evolutionary Connections, LSTM, Elman‑style RNN, Max/Avg pooling, Dot‑product attention, RBM (with CD‑n), 2D transposed convolution, 2D/1D convolutions (with padding, dilation, stride, causality).

Modules: Bidirectional LSTM, ResNet‑style residual blocks, WaveNet‑style dilated causal blocks, Transformer‑style multi‑head scaled dot‑product attention.

Regularization: Dropout.

Normalization: Batch Normalization (spatial & temporal), Layer Normalization (spatial & temporal).

Optimizers: SGD with momentum, AdaGrad, RMSProp, Adam.

Learning‑rate schedulers: Constant, Exponential, Noam/Transformer, Dlib scheduler.

Weight initializers: Glorot/Xavier (uniform & normal), He/Kaiming (uniform & normal), Standard & truncated normal.

Loss functions: Cross‑entropy, Mean‑squared error, Bernoulli VAE loss, Wasserstein loss with gradient penalty.

Activations: ReLU, Tanh, Affine, Sigmoid, Leaky ReLU.

Models: Bernoulli Variational Auto‑Encoder, Wasserstein GAN with gradient penalty.

Neural‑network utilities: col2im / im2col (MATLAB ports), conv1D, conv2D, deconv2D, minibatch handling.
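Among the neural‑network components listed, scaled dot‑product attention is compact enough to show in full. The following is a minimal single‑head sketch in plain NumPy, without masking or learned projections; the function names are assumptions for illustration and do not reflect the repository's actual class interfaces.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax: subtract the max before exponentiating."""
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Single-head attention (illustrative sketch, no masking).

    Q: (..., n_queries, d_k), K: (..., n_keys, d_k), V: (..., n_keys, d_v)
    Returns the attended values (..., n_queries, d_v) and the attention weights.
    """
    d_k = Q.shape[-1]
    # Similarity of every query with every key, scaled by sqrt(d_k)
    scores = Q @ K.swapaxes(-1, -2) / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)  # each row sums to 1 over the keys
    return weights @ V, weights
```

A multi‑head module would apply this function to several learned projections of the inputs and concatenate the results.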

Tree‑based models: Decision Tree (CART), Random Forest (Bagging), Gradient‑Boosted Decision Trees (Boosting).

Linear models: Ridge Regression, Logistic Regression, Ordinary Least Squares, Bayesian Linear Regression with conjugate priors.
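For the linear models, ridge regression has a closed‑form solution that fits in a few lines of NumPy. The sketch below is an illustration under simplifying assumptions (no intercept term, a hypothetical function name `ridge_fit`), not the repository's implementation. Setting `alpha=0` reduces it to ordinary least squares when `X.T @ X` is invertible.

```python
import numpy as np

def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge weights: solve (X^T X + alpha * I) w = X^T y.

    X: (n_samples, n_features), y: (n_samples,), alpha: L2 penalty strength.
    No intercept is fitted; center X and y beforehand if one is needed.
    """
    n_features = X.shape[1]
    A = X.T @ X + alpha * np.eye(n_features)
    # Solving the linear system is more stable than forming the inverse
    return np.linalg.solve(A, X.T @ y)
```

Increasing `alpha` shrinks the weight vector toward zero, which trades a little bias for lower variance.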

n‑gram sequence models: Maximum‑likelihood scoring, Additive/Lidstone smoothing, Simple Good‑Turing smoothing.
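Additive (Lidstone) smoothing for n‑gram models is easy to state in code. The sketch below handles the bigram case only; the function name and the closure‑based interface are assumptions made for illustration, not the repository's API. Each count is increased by `k`, and the denominator grows by `k` times the vocabulary size so that the probabilities for any context still sum to one.

```python
from collections import Counter
import numpy as np

def bigram_log_prob(tokens, vocab, k=1.0):
    """Return a function computing log P(w | prev) with add-k smoothing.

    tokens: list of training tokens, vocab: collection of all known words,
    k: smoothing constant (k=1 gives Laplace smoothing).
    """
    bigrams = Counter(zip(tokens[:-1], tokens[1:]))  # counts of (prev, w)
    contexts = Counter(tokens[:-1])                  # counts of prev
    V = len(vocab)

    def log_prob(prev, w):
        # Unseen pairs and unseen contexts fall back to count 0 via Counter
        return np.log((bigrams[(prev, w)] + k) / (contexts[prev] + k * V))

    return log_prob
```

With `k=0` and a seen context, the expression reduces to the maximum‑likelihood estimate, which assigns zero probability to unseen bigrams.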

Reinforcement‑learning agents: Cross‑entropy method agent, On‑policy Monte‑Carlo agent, Weighted incremental importance‑sampling Monte‑Carlo agent, Expected SARSA agent, TD‑0 Q‑learning agent, Dyna‑Q / Dyna‑Q+ with prioritized sweeping.
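The core of a tabular TD‑0 Q‑learning agent is a one‑line update rule. The sketch below shows that update together with an epsilon‑greedy action choice; both function names and the table layout (`Q[state, action]`) are assumptions for illustration rather than the repository's agent interface.

```python
import numpy as np

def q_learning_update(Q, s, a, r, s_next, done, lr=0.1, gamma=0.99):
    """One tabular Q-learning step on Q[state, action], updated in place.

    The target bootstraps from the greedy value of the next state,
    unless the episode has ended, in which case it is just the reward.
    """
    target = r if done else r + gamma * Q[s_next].max()
    Q[s, a] += lr * (target - Q[s, a])
    return Q

def epsilon_greedy(Q, s, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return int(rng.integers(Q.shape[1]))
    return int(Q[s].argmax())
```

Expected SARSA differs only in the target: it replaces the maximum over next‑state actions with their expectation under the current policy.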

Non‑parametric models: Nadaraya‑Watson kernel regression, k‑Nearest Neighbour classification & regression.
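Nadaraya‑Watson kernel regression predicts each query point as a kernel‑weighted average of the training targets. The sketch below assumes a Gaussian kernel and a hypothetical function name; the repository may use other kernels or a different interface.

```python
import numpy as np

def nadaraya_watson(X_train, y_train, X_query, bandwidth=1.0):
    """Kernel-weighted average of y_train for each query (Gaussian kernel).

    X_train: (n, d), y_train: (n,), X_query: (m, d); returns (m,) predictions.
    """
    # Squared Euclidean distance between every query and every training point
    diff = X_query[:, None, :] - X_train[None, :, :]
    d2 = (diff ** 2).sum(axis=-1)
    w = np.exp(-0.5 * d2 / bandwidth ** 2)
    # Normalize the weights per query so they sum to 1
    return (w @ y_train) / w.sum(axis=1)
```

A small bandwidth makes the fit follow nearby points closely, while a large one approaches the global mean of the targets. Queries very far from all training points can underflow to zero weights, which this sketch does not guard against.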

Preprocessing utilities: Discrete Fourier Transform (1D), Bilinear interpolation (2D), Nearest‑neighbour interpolation (1D & 2D), Autocorrelation (1D), Signal windowing, Text tokenization, Feature hashing, Feature standardization, One‑hot encode/decode, Huffman encode/decode, TF‑IDF encoding.
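Two of the preprocessing utilities, feature standardization and one‑hot encode/decode, can be sketched in a few lines of NumPy. The function names and return values below are assumptions for illustration and are not the repository's interfaces.

```python
import numpy as np

def standardize(X, eps=1e-12):
    """Scale each column to zero mean and (approximately) unit variance.

    eps guards against division by zero for constant columns.
    Returns the scaled data plus the mean and std needed to invert it.
    """
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    return (X - mean) / (std + eps), mean, std

def one_hot_encode(labels, n_classes):
    """Map integer labels in [0, n_classes) to rows with a single 1."""
    labels = np.asarray(labels)
    out = np.zeros((labels.size, n_classes))
    out[np.arange(labels.size), labels] = 1.0
    return out

def one_hot_decode(encoded):
    """Recover integer labels from one-hot rows."""
    return encoded.argmax(axis=1)
```

Returning the mean and standard deviation from `standardize` lets the same transform be applied to held‑out data.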

Utility tools: Similarity kernels, Distance metrics, Priority queue, Ball‑tree data structure.

The project aims to deepen understanding of these algorithms through hand‑written implementations, not to replace mature frameworks, which makes it a useful learning resource for anyone studying machine learning in Python.
