Master Deep Learning Foundations and 14 Cutting-Edge Recommendation Models

This article introduces core deep-learning architectures (MLP, RNN, CNN, auto-encoders, and RBM), explains common activation and loss functions, and then surveys fourteen influential recommendation algorithms, most of them deep-learning-based, such as FM, Wide&Deep, DeepFM, NCF, GBDT+LR, seq2seq, and YouTube DNN, complete with model diagrams and reference links.

Alibaba Cloud Developer

Deep Learning Basics

We first introduce the most frequently used neural network structures and the typical methods applied during model training.

MLP Network

MLP (including DNN) is a feed-forward artificial neural network that maps an input vector to an output vector. The structure consists of an input layer, several hidden layers, and an output layer, with each layer applying a linear transform followed by an activation function. Common activation functions are sigmoid, tanh, and ReLU.
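As a rough illustration of the input, hidden, and output layers described above, the following is a minimal sketch of an MLP forward pass in plain Python. The function names (`dense`, `mlp_forward`) and the weight values are made up for this example and are not taken from any particular library.

```python
import math

def relu(v):
    return [max(0.0, x) for x in v]

def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-x)) for x in v]

def dense(x, weights, bias):
    """One fully connected layer; `weights` holds one row per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def mlp_forward(x, layers):
    """Apply each (weights, bias, activation) tuple in order."""
    for weights, bias, activation in layers:
        x = activation(dense(x, weights, bias))
    return x

# Hypothetical weights: 2 inputs -> 2 hidden units (ReLU) -> 1 output (sigmoid).
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0], relu),
    ([[1.0, 1.0]], [0.0], sigmoid),
]
y = mlp_forward([2.0, 1.0], layers)
```

Here the hidden layer yields `[1.0, 1.5]`, so the output is the sigmoid of 2.5.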

RNN Network

RNN connects nodes in a directed cycle, allowing the network to use the output of the previous time step together with the current input. This enables modeling of sequential data.
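The recurrence can be sketched as follows, assuming the simplest (Elman-style) formulation in which the hidden state is `h_t = tanh(W_x x_t + W_h h_{t-1} + b)`. The parameter names and the toy weights are assumptions for illustration only.

```python
import math

def rnn_step(x_t, h_prev, W_x, W_h, b):
    """Combine the current input with the previous hidden state."""
    h_t = []
    for i in range(len(b)):
        s = b[i]
        s += sum(W_x[i][j] * x_t[j] for j in range(len(x_t)))
        s += sum(W_h[i][j] * h_prev[j] for j in range(len(h_prev)))
        h_t.append(math.tanh(s))
    return h_t

def rnn_forward(xs, h0, W_x, W_h, b):
    """Run the same step over a sequence, carrying the state forward."""
    h = h0
    states = []
    for x_t in xs:
        h = rnn_step(x_t, h, W_x, W_h, b)
        states.append(h)
    return states

# Toy example: 1-dimensional input and hidden state, all weights set to 1.
states = rnn_forward([[1.0], [0.0]], [0.0], [[1.0]], [[1.0]], [0.0])
```

Even though the second input is zero, the second state is non-zero because it still depends on the first state through `W_h`.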

CNN Network

CNN is a feed-forward network that uses convolution operations to recognize patterns in local, contiguous regions of the input, which makes it especially effective for image processing. Its layers include input, convolution, pooling, fully-connected, and output layers.
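To make the convolution and pooling layers concrete, here is a minimal one-dimensional sketch (images would use the two-dimensional analogue). As in most deep-learning frameworks, the "convolution" below is implemented as a sliding dot product without flipping the kernel; the signal and kernel values are invented for the example.

```python
def conv1d(signal, kernel):
    """Slide the kernel over the signal and take a dot product at each position."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def max_pool1d(values, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]

signal = [0, 0, 1, 1, 0, 0]
kernel = [-1, 1]                             # responds to rising and falling edges
feature_map = conv1d(signal, kernel)         # [0, 1, 0, -1, 0]
pooled = max_pool1d(feature_map, 2)          # [1, 0]
```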

Auto‑Encoder (AE) Network

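AE is an unsupervised network that aims to reconstruct its input as closely as possible. In its simplest form it is a two-layer MLP: an encoder that compresses the input into a hidden code and a decoder that reconstructs the input from that code. The reconstruction error is measured by mean-square error or cross-entropy; L1 regularization can be added to encourage sparse codes, and noise can optionally be added to the input to make the model more robust.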

Restricted Boltzmann Machine (RBM)

RBM is an unsupervised two‑layer stochastic neural network with symmetric connections and no intra‑layer links. It serves as an effective feature extractor and can be stacked to form Deep Belief Networks (DBN).

Deep‑Learning and Traditional Model Fusion

Fusion can be loose (e.g., pre-training embeddings and then training an MLP on top of them) or tight (e.g., Wide&Deep, where the LR and MLP parameters are trained jointly). Loose coupling offers flexibility, while tight coupling lets all parameters be optimized jointly, end to end.

Common Loss Functions

Two widely used loss functions are cross‑entropy and mean‑square error.
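A minimal sketch of both losses, assuming the binary form of cross-entropy (labels in {0, 1}, predictions as probabilities). The clipping constant `eps` is an assumption added here to avoid `log(0)`.

```python
import math

def mean_squared_error(y_true, y_pred):
    """Average of squared differences; typical for regression."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Average negative log-likelihood; typical for click/no-click prediction."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to keep log() finite
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

mse = mean_squared_error([1, 0], [0.5, 0.5])    # 0.25
bce = binary_cross_entropy([1, 0], [0.5, 0.5])  # log(2)
```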

Gradient Descent

Model parameters are optimized by minimizing the loss function using gradient descent. At each iteration, the parameters move a small step in the direction opposite to the gradient, scaled by a learning rate (e.g., 0.01). Iteration stops when a convergence criterion is met: either a maximum number of iterations is reached or the gradient norm becomes sufficiently small.
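The update rule and both stopping criteria can be sketched as below. The quadratic objective, the tolerance `1e-6`, and the iteration cap are assumptions chosen so that the example converges quickly; only the learning rate of 0.01 comes from the text.

```python
import math

def gradient_descent(grad_fn, params, learning_rate=0.01,
                     max_iters=10000, tol=1e-6):
    """Plain (full-batch) gradient descent on a list of parameters."""
    params = list(params)
    for _ in range(max_iters):                 # criterion 1: iteration cap
        grad = grad_fn(params)
        grad_norm = math.sqrt(sum(g * g for g in grad))
        if grad_norm < tol:                    # criterion 2: small gradient
            break
        params = [p - learning_rate * g for p, g in zip(params, grad)]
    return params

# Toy objective f(x, y) = (x - 3)^2 + (y + 1)^2, minimized at (3, -1).
def grad_f(p):
    return [2.0 * (p[0] - 3.0), 2.0 * (p[1] + 1.0)]

solution = gradient_descent(grad_f, [0.0, 0.0])
```

In practice, recommendation models are usually trained with mini-batch variants of this loop rather than the full-batch form shown here.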

Deep Learning Recommendation Algorithms

We present fourteen influential deep‑learning‑based recommendation models drawn from leading conferences and journals.

FM Model

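Factorization Machines (FM) extend linear regression by adding pairwise feature interactions. Instead of learning a separate weight for every feature pair, FM assigns each feature a latent vector (embedding) and uses the inner product of two vectors as the weight of their interaction.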
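A minimal sketch of the second-order FM prediction, `y = w0 + sum_i w_i x_i + sum_{i<j} <v_i, v_j> x_i x_j`, using the well-known linear-time reformulation of the pairwise term. The parameter values are invented for illustration.

```python
def fm_predict(x, w0, w, V):
    """x: feature values, w0: bias, w: linear weights, V[i]: latent vector of feature i."""
    linear = sum(wi * xi for wi, xi in zip(w, x))
    pairwise = 0.0
    for f in range(len(V[0])):
        # 0.5 * ((sum_i v_if x_i)^2 - sum_i (v_if x_i)^2) equals the sum over
        # all pairs i < j of v_if * v_jf * x_i * x_j for latent dimension f.
        s = sum(V[i][f] * x[i] for i in range(len(x)))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(len(x)))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise

x = [1.0, 1.0, 0.0]                       # third feature is inactive
V = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # hypothetical 2-dimensional latent vectors
y = fm_predict(x, 0.1, [0.2, 0.3, 0.4], V)
```

Only the first two features are active, so the pairwise term reduces to the inner product of their latent vectors (3 + 8 = 11).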

FNN Model

Factorization-machine supported Neural Network (FNN) combines FM embeddings with an MLP: the embedding layer is first pre-trained as in FM, and the MLP is then trained on top of it.

PNN Model

Product-based Neural Network (PNN) inserts a Product Layer between the embedding layer and the MLP to explicitly model pairwise interactions between feature embeddings.

Wide&Deep Model

Wide&Deep merges a linear LR component (handling ID‑type feature crosses) with an MLP that processes dense embeddings of users and items, training both parts jointly.

DeepFM Model

DeepFM replaces the LR part of Wide&Deep with FM, automatically learning feature crosses while retaining the deep component.

NFM Model

Neural Factorization Machine (NFM) combines LR, FM, and an MLP. Its Bi-Interaction (Bi) layer takes the element-wise product of every pair of feature embeddings and sums the results into a single vector, which is then fed into the deep network.
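The Bi-Interaction pooling step can be sketched as follows, assuming the formulation from the NFM paper in which the pooled vector is `0.5 * ((sum_i x_i v_i)^2 - sum_i (x_i v_i)^2)` computed element-wise. The embeddings are made-up values.

```python
def bi_interaction(x, V):
    """Sum of element-wise products over all feature pairs, in linear time."""
    pooled = []
    for f in range(len(V[0])):
        s = sum(x[i] * V[i][f] for i in range(len(x)))
        s_sq = sum((x[i] * V[i][f]) ** 2 for i in range(len(x)))
        pooled.append(0.5 * (s * s - s_sq))
    return pooled

x = [1.0, 1.0, 1.0]
V = [[1.0, 0.0], [2.0, 1.0], [3.0, 2.0]]  # hypothetical embeddings
pooled = bi_interaction(x, V)             # [11.0, 2.0]
```

Unlike FM, which reduces the interactions to one scalar, the output keeps the embedding dimension so that the MLP on top can model higher-order effects.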

AFM Model

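Attentional Factorization Machine (AFM) adds an attention mechanism on top of the pairwise interactions: rather than summing all interactions with equal weight, as the Bi layer of NFM does, it learns a weight for each pairwise interaction.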

DSSM Model

Deep Structured Semantic Model (DSSM) uses two MLPs to embed queries and documents into the same vector space and measures their relevance with cosine similarity.
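The scoring step can be sketched as below. In DSSM the vectors would come from the query and document MLPs; here they are hard-coded, hypothetical values, and `rank_documents` is a helper name introduced only for this example. Vectors are assumed to be non-zero.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_documents(query_vec, doc_vecs):
    """Return document indices sorted from most to least similar."""
    scores = [(cosine_similarity(query_vec, d), i)
              for i, d in enumerate(doc_vecs)]
    return [i for _, i in sorted(scores, reverse=True)]

query = [1.0, 0.0]
docs = [[0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
ranking = rank_documents(query, docs)  # [2, 1, 0]
```

Because cosine similarity ignores vector length, the third document scores highest even though its norm differs from the query's.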

MV‑DNN Model

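Multi-View DNN (MV-DNN) extends DSSM to multiple views: it assigns a separate MLP to the user view and to each item view (for example, each item domain), so that every view gets its own representation in a shared space.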

DCN Model

Deep & Cross Network (DCN) consists of an embedding layer, a deep MLP, and a cross network that explicitly performs feature‑wise cross operations.
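A minimal sketch of the cross network, assuming the layer definition from the DCN paper, `x_{l+1} = x_0 * (x_l^T w_l) + b_l + x_l`. The weights are toy values, and the embedding layer and deep MLP are omitted.

```python
def cross_layer(x0, xl, w, b):
    """One cross layer: rescale x0 by the scalar x_l . w, then add bias and residual."""
    scalar = sum(xi * wi for xi, wi in zip(xl, w))
    return [x0_i * scalar + b_i + xl_i
            for x0_i, b_i, xl_i in zip(x0, b, xl)]

def cross_network(x0, layers):
    """Stack cross layers; each layer raises the interaction order by one."""
    x = x0
    for w, b in layers:
        x = cross_layer(x0, x, w, b)
    return x

x0 = [1.0, 2.0]
layers = [([1.0, 1.0], [0.0, 0.0]),   # hypothetical (w, b) for layer 1
          ([1.0, 0.0], [0.0, 0.0])]   # hypothetical (w, b) for layer 2
out = cross_network(x0, layers)       # [8.0, 16.0]
```

Each layer adds only two vectors of parameters (`w` and `b`), which is why the cross network stays cheap compared with the deep part.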

NCF Model

Neural Collaborative Filtering (NCF) concatenates user and item embeddings and feeds them into an MLP.
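The concatenate-then-MLP idea can be sketched as below with a single hidden layer. The embedding tables, weights, and the function name `ncf_score` are assumptions for illustration; a trained model would learn all of them from interaction data.

```python
import math

def ncf_score(user_id, item_id, user_emb, item_emb,
              hidden_w, hidden_b, out_w, out_b):
    """Look up both embeddings, concatenate, and pass through a small MLP."""
    z = user_emb[user_id] + item_emb[item_id]          # list concatenation
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z)) + b)  # ReLU layer
              for row, b in zip(hidden_w, hidden_b)]
    logit = sum(w * h for w, h in zip(out_w, hidden)) + out_b
    return 1.0 / (1.0 + math.exp(-logit))              # interaction probability

user_emb = {0: [1.0, 0.0]}   # hypothetical 2-dimensional embeddings
item_emb = {0: [0.0, 1.0]}
score = ncf_score(0, 0, user_emb, item_emb,
                  hidden_w=[[1.0, 1.0, 1.0, 1.0], [-1.0, -1.0, -1.0, -1.0]],
                  hidden_b=[0.0, 0.0],
                  out_w=[1.0, 1.0], out_b=-2.0)
```

With these toy weights the hidden layer outputs `[2.0, 0.0]`, the logit is 0, and the score is 0.5.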

GBDT+LR Model

Gradient Boosted Decision Trees (GBDT) generate leaf‑node features that are fed into a Logistic Regression model, effectively using GBDT as feature engineering.
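The feature transformation can be sketched as follows. The two hand-written trees stand in for a trained GBDT, and the feature names (`age`, `clicks`), thresholds, and LR weights are all hypothetical; the point is only that each sample is encoded by the leaf it falls into in every tree.

```python
import math

# Two hypothetical trees, each returning the index of the leaf a sample reaches.
def tree_0(x):                      # 2 leaves
    return 0 if x["age"] < 30 else 1

def tree_1(x):                      # 3 leaves
    if x["clicks"] < 5:
        return 0
    return 1 if x["clicks"] < 20 else 2

TREES = [(tree_0, 2), (tree_1, 3)]

def leaf_features(x):
    """Concatenate one one-hot vector per tree, marking the reached leaf."""
    features = []
    for tree, n_leaves in TREES:
        one_hot = [0.0] * n_leaves
        one_hot[tree(x)] = 1.0
        features.extend(one_hot)
    return features

def lr_predict(features, weights, bias):
    """Logistic regression on top of the leaf features."""
    logit = sum(w * f for w, f in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-logit))

sample = {"age": 35, "clicks": 12}
feats = leaf_features(sample)       # [0.0, 1.0, 0.0, 1.0, 0.0]
p = lr_predict(feats, [0.1, -0.2, 0.3, 0.4, -0.5], 0.0)
```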

Seq2Seq Model

Sequence‑to‑Sequence (seq2seq) employs an encoder‑decoder LSTM architecture to map an input sequence (e.g., user browsing history) to an output sequence, enabling sequential recommendation.

DNN for YouTube

The YouTube recommendation model feeds user signals such as watch history, search logs, location, and social attributes into an MLP to extract a user vector, and then multiplies that vector with video embeddings to score candidate videos.

References

wide&deep: https://arxiv.org/pdf/1606.07792.pdf

seq2seq: https://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf

PNN: https://arxiv.org/pdf/1611.00144.pdf

NFM: https://www.comp.nus.edu.sg/~xiangnan/papers/sigir17-nfm.pdf

NCF: https://www.comp.nus.edu.sg/~xiangnan/papers/ncf.pdf

MV‑DNN: https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/frp1159-songA.pdf

GBDT+LR: http://quinonero.net/Publications/predicting-clicks-facebook.pdf

FNN: https://arxiv.org/pdf/1601.02376.pdf

FM: https://www.csie.ntu.edu.tw/~b97053/paper/Rendle2010FM.pdf

DNN‑YouTube: https://static.googleusercontent.com/media/research.google.com/zh-CN//pubs/archive/45530.pdf

deepFM: https://www.ijcai.org/proceedings/2017/0239.pdf

DCN: https://arxiv.org/pdf/1708.05123.pdf

AFM: https://arxiv.org/pdf/1708.04617.pdf

