Tagged articles

MLP

17 articles

Nov 4, 2025 · Artificial Intelligence

Can Tiny Networks Beat Giant LLMs? Inside the Tiny Recursive Model (TRM) Breakthrough

A recent study from Samsung's SAIL Montreal lab shows that a 7‑million‑parameter, two‑layer Tiny Recursive Model can surpass large language models on challenging reasoning benchmarks by using recursive self‑correction instead of attention, offering a new efficient path for AI inference.

LLM comparison · MLP · efficient-ai

0 likes · 7 min read

Bighead's Algorithm Notes

Oct 21, 2025 · Artificial Intelligence

KANMixer: A New KAN‑Centric Paradigm for Long‑Term Time Series Forecasting

This article reviews the KANMixer model, which places Kolmogorov‑Arnold Networks at the core of a lightweight architecture for long‑term time series forecasting, detailing its design, extensive benchmark experiments on seven real‑world datasets, ablation analyses, and its computational trade‑offs versus MLP and Transformer baselines.

Ablation Study · KAN · Long-term Time Series Forecasting

0 likes · 8 min read

AI Algorithm Path

Oct 20, 2025 · Artificial Intelligence

Building a Flow Matching Model from Scratch: Complete Code Walkthrough

This article walks through the full implementation of a flow‑matching generative model in PyTorch, covering dataset creation, a small MLP that learns a time‑dependent velocity field, the flow‑matching loss, training loop, ODE‑based sampling, visualisation of the learned vector field, and a discussion of the method's limitations and possible extensions.

MLP · PyTorch · flow matching

0 likes · 13 min read

21CTO

Aug 15, 2023 · Artificial Intelligence

Why Do Neural Networks Suddenly ‘Grok’ After Long Training? Insights from Google

Google’s recent research reveals that when small neural networks are trained for extended periods on tasks like modular addition, they can abruptly shift from memorizing training data to genuinely generalizing—a sudden “grokking” phenomenon driven by weight decay and the emergence of periodic weight structures.

AI research · MLP · Weight Decay

0 likes · 9 min read

Alimama Tech

Oct 19, 2022 · Artificial Intelligence

Understanding the One-Epoch Overfitting Phenomenon in Deep Click-Through Rate Models

The study reveals that industrial deep click‑through‑rate models often overfit dramatically after the first training epoch—a “one‑epoch phenomenon” caused by the embedding‑plus‑MLP architecture, fast optimizers, and highly sparse features, with performance dropping sharply unless sparsity is reduced or training is limited to a single pass.

CTR · Embedding · MLP

0 likes · 15 min read

Model Perspective

Aug 9, 2022 · Artificial Intelligence

Build a Regression MLP with Keras: Predict California Housing Prices

Learn how to load the California housing dataset, preprocess features, construct a Keras sequential regression MLP, train it with SGD, evaluate performance, and make predictions, all illustrated with concise Python code snippets.

California Housing · Keras · MLP

0 likes · 3 min read

Model Perspective

Aug 8, 2022 · Artificial Intelligence

Build a Multi‑Layer Perceptron with Keras: Step‑by‑Step Guide

This tutorial walks through using Keras to create, compile, train, and evaluate a multi‑layer perceptron for image classification on the Fashion MNIST dataset, covering data loading, model construction with the Sequential API, hyperparameter choices, and prediction of new samples.

Fashion-MNIST · Keras · MLP

0 likes · 16 min read

Python Programming Learning Circle

Jul 4, 2022 · Artificial Intelligence

Building an Advertising Recommendation Model with Python and PyTorch

This article walks through the development of a simple advertising recommendation system using Python, covering data collection, preprocessing with label encoding, text embedding via Torch, constructing an MLP model, and initiating training, while reflecting on the challenges faced by Python developers in the big‑data era.

Embedding · MLP · PyTorch

0 likes · 5 min read

Sohu Tech Products

Jul 21, 2021 · Artificial Intelligence

Kaggle Jane Street Market Prediction Competition Summary and Model Insights

This article summarizes the author's participation in the Kaggle Jane Street Market Prediction competition, detailing the anonymous feature dataset, utility‑score metric, data preprocessing, the combined AE‑MLP and XGBoost modeling approach, threshold tuning, experimental findings, and references for further study.

Autoencoder · Kaggle · MLP

0 likes · 8 min read

360 Tech Engineering

Aug 28, 2019 · Artificial Intelligence

Understanding TensorFlow Internals with TensorSlow: Computational Graph, Forward/Backward Propagation, and Building an MLP

This article explains how Huajiao Live leverages Spark for data preprocessing and TensorFlow (augmented by the TensorSlow project) for distributed deep‑learning training, detailing computational‑graph concepts, forward and backward propagation, loss construction, gradient‑descent optimization, and a step‑by‑step Python implementation of a multi‑layer perceptron.

Computational Graph · MLP · Python

0 likes · 14 min read

iQIYI Technical Product Team

Aug 9, 2019 · Artificial Intelligence

iQIYI 2019 Multimodal Video Person Recognition Competition Report by Zheey Team

The Zheey team from Beijing University of Posts and Telecommunications tackled the iQIYI 2019 Multimodal Video Person Recognition Challenge with a three‑layer MLP trained on the official face features, raising a baseline score of 0.8742 to 0.8949 through model fusion, quality filtering, and fine‑tuning. The team ultimately ranked sixth and open‑sourced its code.

MLP · Multimodal · competition

0 likes · 9 min read

iQIYI Technical Product Team

Jul 19, 2019 · Artificial Intelligence

Face Quality‑Driven Feature Denoising and Fusion for iQIYI‑VID‑2019 Video Person Recognition

The seefun team leveraged face detection scores and quality metrics to denoise and weight‑fuse facial features during training and testing, using a three‑layer MLP with Swish activation and dropout, and achieved a 0.8983 mAP (fourth place) on the iQIYI‑VID‑2019 video person‑recognition challenge.

MLP · face quality weighting · feature fusion

0 likes · 10 min read

360 Tech Engineering

Jul 2, 2019 · Artificial Intelligence

Understanding TensorFlow Internals with TensorSlow: A Deep Learning Guide

This article explains how TensorFlow powers Huajiao Live's recommendation system, introduces the TensorSlow project for demystifying TensorFlow's core, and walks through deep‑learning fundamentals, computational‑graph concepts, forward and backward propagation, loss construction, gradient‑descent optimization, and building a multi‑layer perceptron with Python code examples.

Computational Graph · MLP · Python

0 likes · 13 min read

Huajiao Technology

Jul 2, 2019 · Artificial Intelligence

Understanding Deep Learning with TensorFlow: Applications, Computational Graphs, and MLP Implementation

This article introduces deep learning applications at Huajiao Live, explains TensorFlow's computational graph architecture, details core concepts such as placeholders, variables, operations, forward and backward propagation, and provides complete Python-like code examples for building and training a multi-layer perceptron.

Computational Graph · MLP · Python

0 likes · 14 min read

iQIYI Technical Product Team

Mar 22, 2019 · Artificial Intelligence

Experience Report of the 2018 iQIYI Multimodal Video Person Identification Challenge (WitcheR Team)

The WitcheR team won the 2018 iQIYI multimodal video person identification challenge with a fast pipeline that combined a custom face‑and‑keypoint detector, ArcFace‑trained face embeddings, scene classification, and a three‑layer MLP trained with several tricks, reaching a final mAP of 88.6%. The report also stresses the value of rapid idea validation and of open‑sourced code for future challenges.

MLP · Model Fusion · Multimodal

0 likes · 12 min read

Tencent Cloud Developer

Jun 25, 2018 · Artificial Intelligence

Using MLP for Image Classification: Implementation, Results, and Limitations

The article demonstrates how a simple fully‑connected MLP can be trained on a small 64×64×3 cat‑vs‑non‑cat dataset, achieving perfect training accuracy but only 78% test accuracy, and explains that parameter explosion, vanishing gradients, and lack of spatial invariance limit MLPs, motivating the shift to CNNs.

H5py · MLP · Python

0 likes · 15 min read

Hulu Beijing

Dec 28, 2017 · Artificial Intelligence

Designing MLPs for XOR and Any Boolean Function: Layers and Node Requirements

This article explains how to construct a multi‑layer perceptron that implements XOR and arbitrary n‑bit Boolean functions, detailing the hidden‑node count for single‑ and multi‑layer designs and the minimal number of network layers needed.

Boolean functions · MLP · xor

0 likes · 6 min read
