Numpy‑ML: A Pure NumPy Implementation of Machine Learning Algorithms
Numpy‑ML (https://github.com/ddbourgin/numpy-ml) is an open‑source, pure‑NumPy implementation of more than 30 machine‑learning algorithms, written by UC Berkeley researcher David Bourgin. The repository contains more than 30,000 lines of code and covers probabilistic models, neural‑network layers, optimizers, and reinforcement‑learning agents, along with extensive data‑preprocessing utilities.
Implemented algorithms include:
Gaussian Mixture Model
  EM training
Hidden Markov Model
  Viterbi decoding
  Likelihood computation
  MLE parameter estimation via the Baum‑Welch / forward‑backward algorithm
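As an illustration of the Viterbi decoding listed under the hidden Markov model, here is a minimal sketch in plain NumPy. It is not taken from the Numpy-ML repository: the function name `viterbi`, its argument layout, and the two-state toy model are assumptions made for this example.

```python
import numpy as np

def viterbi(obs, log_pi, log_A, log_B):
    """Most likely hidden-state path for an observation sequence.

    obs:    (T,) int array of observation indices
    log_pi: (N,) log initial-state probabilities
    log_A:  (N, N) log transitions, A[i, j] = P(state j | state i)
    log_B:  (N, V) log emissions,   B[i, k] = P(obs k | state i)
    """
    T, N = len(obs), len(log_pi)
    delta = np.empty((T, N))            # best log-prob of a path ending in each state
    back = np.zeros((T, N), dtype=int)  # best predecessor of each state

    delta[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        # scores[i, j]: best path ending in i at t-1, followed by the move i -> j
        scores = delta[t - 1][:, None] + log_A
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[:, obs[t]]

    # Trace the best path backwards from the best final state
    path = np.empty(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path, delta[-1].max()

# Hypothetical toy model: states 0 = healthy, 1 = fever;
# observations 0 = normal, 1 = cold, 2 = dizzy.
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1],
              [0.1, 0.3, 0.6]])
obs = np.array([0, 1, 2])

path, log_prob = viterbi(obs, np.log(pi), np.log(A), np.log(B))
print(path, np.exp(log_prob))  # path is [0 0 1], probability is about 0.01512
```

Working in log space avoids numerical underflow on long sequences, which is why the sketch adds log-probabilities instead of multiplying probabilities.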
Latent Dirichlet Allocation (topic model)
  Standard model with MLE parameter estimation via variational EM
  Smoothed model with MAP parameter estimation via MCMC
Neural Networks
  Layers / operations: Add, Flatten, Multiply, Softmax, Dense (fully connected), Sparse evolutionary connections, LSTM, Elman‑style RNN, Max/Avg pooling, Dot‑product attention, Restricted Boltzmann machine (with CD‑n training), 2D transposed convolution, 1D/2D convolutions (with padding, dilation, stride, causality).
  Modules: Bidirectional LSTM, ResNet‑style residual blocks, WaveNet‑style dilated causal blocks, Transformer‑style multi‑head scaled dot‑product attention.
  Regularizers: Dropout.
  Normalization: Batch normalization (spatial & temporal), Layer normalization (spatial & temporal).
  Optimizers: SGD with momentum, AdaGrad, RMSProp, Adam.
  Learning‑rate schedulers: Constant, Exponential, Noam/Transformer, Dlib scheduler.
  Weight initializers: Glorot/Xavier (uniform & normal), He/Kaiming (uniform & normal), Standard & truncated normal.
  Loss functions: Cross‑entropy, Mean‑squared error, Bernoulli VAE loss, Wasserstein loss with gradient penalty.
  Activations: ReLU, Tanh, Affine, Sigmoid, Leaky ReLU.
  Models: Bernoulli Variational Auto‑Encoder, Wasserstein GAN with gradient penalty.
  Neural‑network utilities: col2im / im2col (MATLAB ports), conv1D, conv2D, deconv2D, minibatch handling.
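To give a flavour of what a hand-written optimizer from the list above involves, the following is a minimal sketch of the standard Adam update rule in plain NumPy. It is an illustration only, not the repository's code: the function name `adam_step`, its signature, and the quadratic toy objective are assumptions.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update. `t` is the 1-based step count used for bias correction."""
    m = beta1 * m + (1 - beta1) * grad       # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad**2    # running mean of squared gradients
    m_hat = m / (1 - beta1**t)               # bias-corrected first moment
    v_hat = v / (1 - beta2**t)               # bias-corrected second moment
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# Toy objective (assumed for this example): f(w) = ||w - target||^2
target = np.array([3.0, -2.0])
w = np.zeros(2)
m = np.zeros_like(w)
v = np.zeros_like(w)

for t in range(1, 2001):
    grad = 2.0 * (w - target)
    w, m, v = adam_step(w, grad, m, v, t, lr=0.05)

print(w)  # should end up close to `target`
```

The optimizer state (`m`, `v`, and the step count) is passed in and returned explicitly here to keep the sketch short; a class that stores this state per parameter is the more common design.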
Tree‑based models
  Decision Tree (CART)
  Random Forest (Bagging)
  Gradient‑Boosted Decision Trees (Boosting)
Linear models
  Ridge Regression
  Logistic Regression
  Ordinary Least Squares
  Bayesian Linear Regression with conjugate priors
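The linear models above have compact closed-form solutions, which makes them a natural fit for a pure-NumPy treatment. As a minimal sketch (not the repository's implementation), ridge regression can be written as a single linear solve. The function name `ridge_fit`, the synthetic data, and the choice to omit an intercept term are assumptions made for brevity.

```python
import numpy as np

def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge solution: beta = (X^T X + alpha * I)^{-1} X^T y.

    No intercept is fitted; with alpha = 0 this reduces to ordinary least squares
    (assuming X has full column rank).
    """
    n_features = X.shape[1]
    A = X.T @ X + alpha * np.eye(n_features)
    # Solving the linear system is more stable than forming the inverse explicitly
    return np.linalg.solve(A, X.T @ y)

# Synthetic data, assumed for this example
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
true_beta = np.array([1.5, -2.0, 0.5])
y = X @ true_beta + 0.1 * rng.normal(size=50)

beta_ols = ridge_fit(X, y, alpha=0.0)     # ordinary least squares
beta_ridge = ridge_fit(X, y, alpha=10.0)  # coefficients shrunk toward zero
print(beta_ols, beta_ridge)
```

Increasing `alpha` shrinks the coefficient vector toward zero, trading a little bias for lower variance.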
n‑gram sequence models
  Maximum‑likelihood scoring
  Additive/Lidstone smoothing
  Simple Good‑Turing smoothing
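Additive (Lidstone) smoothing adds a constant k to every n-gram count so that unseen n-grams receive non-zero probability. The sketch below shows the idea for bigrams. It is an illustration under stated assumptions, not the repository's API: the function name `lidstone_bigram_model` and the toy sentence are hypothetical.

```python
from collections import Counter

def lidstone_bigram_model(tokens, k=1.0):
    """Return a function prob(prev, word) with additive smoothing constant k.

    P(word | prev) = (count(prev, word) + k) / (count(prev) + k * V),
    where V is the vocabulary size. k = 1 gives Laplace (add-one) smoothing.
    """
    vocab = sorted(set(tokens))
    V = len(vocab)
    context_counts = Counter(tokens[:-1])              # how often each word is a context
    bigram_counts = Counter(zip(tokens[:-1], tokens[1:]))

    def prob(prev, word):
        return (bigram_counts[(prev, word)] + k) / (context_counts[prev] + k * V)

    return prob, vocab

tokens = "the cat sat on the mat".split()
prob, vocab = lidstone_bigram_model(tokens, k=1.0)

print(prob("the", "cat"))  # seen bigram:   (1 + 1) / (2 + 5) = 2/7
print(prob("the", "sat"))  # unseen bigram: (0 + 1) / (2 + 5) = 1/7
```

For any fixed context, the smoothed probabilities over the vocabulary still sum to one, which is the property that distinguishes smoothing from simply adding a floor value.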
Reinforcement‑learning agents
  Cross‑entropy method agent
  On‑policy Monte‑Carlo agent
  Weighted incremental importance‑sampling Monte‑Carlo agent
  Expected SARSA agent
  TD‑0 Q‑learning agent
  Dyna‑Q / Dyna‑Q+ with prioritized sweeping
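The core of a tabular TD(0) Q-learning agent is a single backup rule. The sketch below applies it to a tiny chain environment. Everything here is assumed for illustration and is not the repository's agent interface: the helper names `q_learning_update` and `step`, the four-state chain, and the hyperparameters.

```python
import numpy as np

def q_learning_update(Q, s, a, r, s_next, done, lr=0.5, gamma=0.9):
    """One TD(0) Q-learning backup, applied in place to the table Q."""
    target = r if done else r + gamma * Q[s_next].max()
    Q[s, a] += lr * (target - Q[s, a])

def step(s, a):
    """Hypothetical 4-state chain: action 0 moves left (bounded at 0),
    action 1 moves right. Reaching state 3 ends the episode with reward 1."""
    s_next = max(s - 1, 0) if a == 0 else s + 1
    done = s_next == 3
    return s_next, float(done), done

rng = np.random.default_rng(0)
Q = np.zeros((4, 2))

for _ in range(500):                 # episodes
    s = 0
    for _ in range(100):             # step cap per episode
        a = rng.integers(2)          # uniformly random behaviour policy
        s_next, r, done = step(s, a)
        q_learning_update(Q, s, a, r, s_next, done)
        s = s_next
        if done:
            break

greedy = Q[:3].argmax(axis=1)        # greedy action in each non-terminal state
print(greedy)
```

Because Q-learning is off-policy, the table converges toward the optimal values even though actions are chosen uniformly at random during training; the greedy policy read off at the end is what the agent would actually follow.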
Non‑parametric models
  Nadaraya‑Watson kernel regression
  k‑Nearest Neighbour classification & regression
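Nadaraya-Watson kernel regression predicts a query point's target as a kernel-weighted average of the training targets. A minimal NumPy sketch with a Gaussian kernel follows. The function name `nadaraya_watson`, the kernel choice, and the sample data are assumptions for this example rather than the repository's interface.

```python
import numpy as np

def nadaraya_watson(X_train, y_train, X_query, bandwidth=1.0):
    """Kernel-weighted average of y_train, using a Gaussian kernel.

    X_train: (n_train, d), y_train: (n_train,), X_query: (n_query, d)
    """
    # Pairwise squared Euclidean distances, shape (n_query, n_train)
    diff = X_query[:, None, :] - X_train[None, :, :]
    sq_dist = (diff ** 2).sum(axis=-1)
    weights = np.exp(-0.5 * sq_dist / bandwidth**2)
    # Each prediction is a convex combination of the training targets
    return (weights @ y_train) / weights.sum(axis=1)

# Sample data, assumed for this example
X_train = np.linspace(0.0, 1.0, 11)[:, None]
y_train = np.sin(2 * np.pi * X_train[:, 0])
X_query = np.array([[0.25], [0.5]])

print(nadaraya_watson(X_train, y_train, X_query, bandwidth=0.1))
```

The bandwidth controls the smoothness of the fit: a very small value makes predictions at training points reproduce their own targets, while a very large value pushes every prediction toward the global mean. Note that a query far from all training points relative to the bandwidth can underflow to zero total weight, which this sketch does not guard against.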
Preprocessing utilities
  Discrete Fourier Transform (1D)
  Bilinear interpolation (2D)
  Nearest‑neighbour interpolation (1D & 2D)
  Autocorrelation (1D)
  Signal windowing
  Text tokenization
  Feature hashing
  Feature standardization
  One‑hot encode/decode
  Huffman encode/decode
  TF‑IDF encoding
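Several of these utilities follow directly from their textbook definitions. As one example, a 1D discrete Fourier transform can be written as a matrix-vector product. This is a naive O(N^2) sketch for illustration, not the repository's code, and the function name `dft` is an assumption.

```python
import numpy as np

def dft(x):
    """Naive 1D DFT: X[k] = sum_n x[n] * exp(-2j * pi * k * n / N)."""
    x = np.asarray(x, dtype=complex)
    N = len(x)
    n = np.arange(N)
    # W[k, n] = exp(-2j * pi * k * n / N), the N x N DFT matrix
    W = np.exp(-2j * np.pi * np.outer(n, n) / N)
    return W @ x

signal = np.array([1.0, 2.0, 0.0, -1.0, 0.5, 3.0])
spectrum = dft(signal)

# Cross-check against NumPy's FFT, which uses the same sign convention
print(np.allclose(spectrum, np.fft.fft(signal)))
```

Writing the transform this way makes the definition explicit at the cost of speed; `np.fft.fft` computes the same result in O(N log N).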
Utility tools
  Similarity kernels
  Distance metrics
  Priority queue
  Ball‑tree data structure
The project’s goal is to deepen understanding of algorithms through hand‑crafted implementations rather than replace mature frameworks, and it serves as a valuable learning resource for Python‑based machine‑learning research.
Python Programming Learning Circle
A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.
