Master Deep Learning Foundations and 14 Cutting-Edge Recommendation Models
This article introduces core deep-learning architectures (MLP, RNN, CNN, auto-encoders, and RBM), explains common activation and loss functions, and then surveys fourteen influential deep-learning-based recommendation algorithms, including FM, Wide&Deep, DeepFM, NCF, GBDT+LR, Seq2Seq, and YouTube DNN, complete with model diagrams and reference links.
Deep Learning Basics
We first introduce the most frequently used neural network structures and the typical methods applied during model training.
MLP Network
MLP (called a DNN when it has many hidden layers) is a feed-forward artificial neural network that maps an input vector to an output vector. It consists of an input layer, one or more hidden layers, and an output layer. Common activation functions are sigmoid, tanh, and ReLU.
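The forward pass described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a training-ready implementation: the helper names (`dense`, `mlp_forward`), the 2-3-1 layer sizes, and the hand-picked weights are all assumptions made for the example.

```python
import math

def relu(v):
    return [max(0.0, x) for x in v]

def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-x)) for x in v]

def dense(x, weights, bias):
    """Affine layer: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def mlp_forward(x, layers):
    """`layers` is a list of (weights, bias, activation) applied in order."""
    h = x
    for weights, bias, activation in layers:
        h = activation(dense(h, weights, bias))
    return h

# Hypothetical 2-3-1 network: one ReLU hidden layer, sigmoid output.
layers = [
    ([[1.0, -1.0], [0.5, 0.5], [-1.0, 2.0]], [0.0, 0.0, 0.0], relu),
    ([[1.0, 1.0, -1.0]], [0.0], sigmoid),
]
y = mlp_forward([1.0, 2.0], layers)
```

For the input `[1.0, 2.0]` the hidden layer yields `[0.0, 1.5, 3.0]`, so the output is the sigmoid of `-1.5`.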
RNN Network
RNN connects nodes in a directed cycle, so at each time step the network combines the hidden state carried over from the previous step with the current input. This enables modeling of sequential data.
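A minimal sketch of that recurrence, assuming the common vanilla form h_t = tanh(W_x x_t + W_h h_{t-1} + b). The function names and the tiny weights are illustrative, not taken from any specific library.

```python
import math

def rnn_step(x_t, h_prev, w_x, w_h, b):
    """One time step: mix the current input with the previous hidden state."""
    h_t = []
    for i in range(len(b)):
        z = b[i]
        z += sum(w * x for w, x in zip(w_x[i], x_t))
        z += sum(w * h for w, h in zip(w_h[i], h_prev))
        h_t.append(math.tanh(z))
    return h_t

def rnn_forward(xs, h0, w_x, w_h, b):
    """Run the same step (same weights) over the whole sequence."""
    h = h0
    states = []
    for x_t in xs:
        h = rnn_step(x_t, h, w_x, w_h, b)
        states.append(h)
    return states

# Hypothetical sizes: 1 input feature, 2 hidden units.
w_x = [[0.5], [-0.5]]
w_h = [[0.1, 0.0], [0.0, 0.1]]
b = [0.0, 0.0]
states = rnn_forward([[1.0], [0.0], [-1.0]], [0.0, 0.0], w_x, w_h, b)
```

Note that the second input is zero, yet the second hidden state is not: it still carries information from the first step through `w_h`.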
CNN Network
CNN is a feed-forward network that uses convolution operations to recognize patterns in local, contiguous regions of the input, which makes it especially effective for image processing. Its layers include input, convolution, pooling, fully-connected, and output layers.
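To make the convolution and pooling layers concrete, here is a small sketch on a 5x5 toy image. It assumes a "valid" convolution with stride 1 (implemented as cross-correlation, as most deep-learning frameworks do) and non-overlapping max pooling; the kernel is a hand-written vertical-edge detector chosen for illustration.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding (stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling with a size x size window."""
    return [
        [
            max(feature_map[i + di][j + dj]
                for di in range(size) for dj in range(size))
            for j in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for i in range(0, len(feature_map) - size + 1, size)
    ]

# Toy image: dark on the left, bright on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
edge_kernel = [[-1, 1], [-1, 1]]   # responds to a left-to-right increase

features = conv2d_valid(image, edge_kernel)   # 4x4 feature map
pooled = max_pool2d(features)                 # 2x2 after pooling
```

The feature map is non-zero only in the column where the brightness changes, and pooling keeps that response while halving the resolution.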
Auto‑Encoder (AE) Network
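AE is an unsupervised network that aims to reconstruct its input as closely as possible. In its simplest form it is a two-layer MLP: an encoder maps the input to a hidden code and a decoder maps the code back to the input space. The reconstruction error is measured by mean-square error or cross-entropy. L1 regularization can be added to encourage sparse codes, and noise can be added to the input to make the learned representation more robust.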
Restricted Boltzmann Machine (RBM)
RBM is an unsupervised two‑layer stochastic neural network with symmetric connections and no intra‑layer links. It serves as an effective feature extractor and can be stacked to form Deep Belief Networks (DBN).
Deep‑Learning and Traditional Model Fusion
Fusion can be loose (e.g., pre-training embeddings and then training an MLP on top of them) or tight (e.g., Wide&Deep, where the LR and MLP parameters are trained jointly). Loose coupling offers flexibility, because each part can be trained and replaced independently, while tight coupling lets all parameters be optimized end to end against the same objective.
Common Loss Functions
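Two widely used loss functions are cross-entropy, typically used for classification targets such as click prediction, and mean-square error, typically used for regression targets such as rating prediction.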
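Both losses are short enough to write out directly. The sketch below assumes binary labels for cross-entropy and averages over the batch; the `eps` clipping is an added safeguard against `log(0)`, not something the article specifies.

```python
import math

def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Average negative log-likelihood of binary labels under predicted probabilities."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)   # keep log() finite
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)

labels = [1, 0]
uncertain = [0.5, 0.5]
confident = [0.9, 0.1]

mse_uncertain = mean_squared_error(labels, uncertain)
bce_uncertain = binary_cross_entropy(labels, uncertain)
bce_confident = binary_cross_entropy(labels, confident)
```

A model that always predicts 0.5 pays a cross-entropy of ln 2 per example, and the loss shrinks as correct predictions become more confident.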
Gradient Descent
Model parameters are optimized by minimizing the loss function with gradient descent. The algorithm repeatedly moves the parameters a small step against the gradient, scaled by a learning rate (e.g., 0.01), until a stopping criterion is met: either a maximum number of iterations is reached or the gradient norm becomes sufficiently small.
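The update loop and both stopping criteria can be sketched as follows. The quadratic objective is a stand-in chosen so the minimum is known in advance; the function names, tolerance, and iteration cap are assumptions for the example.

```python
def gradient_descent(grad, w0, learning_rate=0.01, max_iters=10_000, tol=1e-6):
    """Plain (full-batch) gradient descent.

    Stops when the gradient norm drops below `tol` or after `max_iters` steps.
    Returns the final parameters and the number of updates performed.
    """
    w = list(w0)
    for step in range(max_iters):
        g = grad(w)
        grad_norm = sum(gi * gi for gi in g) ** 0.5
        if grad_norm < tol:
            return w, step
        w = [wi - learning_rate * gi for wi, gi in zip(w, g)]
    return w, max_iters

def quadratic_grad(w):
    """Gradient of f(w) = (w0 - 3)^2 + (w1 + 1)^2, minimized at (3, -1)."""
    return [2.0 * (w[0] - 3.0), 2.0 * (w[1] + 1.0)]

w_opt, steps = gradient_descent(quadratic_grad, [0.0, 0.0])
```

With a learning rate of 0.01 each step shrinks the distance to the minimum by only 2%, so the loop here exits on the gradient-norm criterion after several hundred iterations rather than on the iteration cap.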
Deep Learning Recommendation Algorithms
We present fourteen influential deep‑learning‑based recommendation models drawn from leading conferences and journals.
FM Model
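Factorization Machines (FM) extend linear regression with pairwise feature interactions. Each feature has a latent vector (embedding), and the weight of an interaction between two features is the inner product of their vectors, so interactions can be estimated even for feature pairs that rarely co-occur.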
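A sketch of the second-order FM prediction, y = w0 + sum_i w_i x_i + sum_{i<j} <v_i, v_j> x_i x_j. It shows both the direct double loop and the well-known linear-time reformulation of the pairwise term; the parameter values are made up for illustration.

```python
def fm_predict_naive(x, w0, w, v):
    """Direct O(n^2 k) evaluation of the FM equation."""
    y = w0 + sum(wi * xi for wi, xi in zip(w, x))
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            inner = sum(a * b for a, b in zip(v[i], v[j]))
            y += inner * x[i] * x[j]
    return y

def fm_predict(x, w0, w, v):
    """O(n k) evaluation: 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i (v_if x_i)^2]."""
    y = w0 + sum(wi * xi for wi, xi in zip(w, x))
    k = len(v[0])
    for f in range(k):
        s = sum(v[i][f] * x[i] for i in range(len(x)))
        s_sq = sum((v[i][f] * x[i]) ** 2 for i in range(len(x)))
        y += 0.5 * (s * s - s_sq)
    return y

# Hypothetical model: 3 features, embedding size k = 2.
x = [1.0, 0.0, 1.0]
w0 = 0.1
w = [0.2, 0.3, 0.4]
v = [[1.0, 2.0], [0.5, 0.5], [3.0, -1.0]]

y_fast = fm_predict(x, w0, w, v)
y_slow = fm_predict_naive(x, w0, w, v)
```

Only features 0 and 2 are active, so the single surviving interaction is <v_0, v_2> = 1, and both functions return the same value up to floating-point rounding.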
FNN Model
Factorization-machine supported Neural Network (FNN) combines FM embeddings with an MLP: the embedding layer is first pre-trained with FM, and the MLP is then trained on top of it.
PNN Model
Product-based Neural Network (PNN) inserts a product layer between the embedding layer and the MLP to explicitly model pairwise interactions between feature embeddings.
Wide&Deep Model
Wide&Deep combines a wide linear (LR) component, which handles crosses of ID-type features, with a deep MLP that processes dense embeddings of users and items. The two parts are trained jointly.
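The joint structure can be sketched as a single forward pass in which the wide and deep logits are summed before one shared sigmoid, so that during training the loss gradient would flow into both parts at once. Everything below (function name, one hidden layer, the weights) is a simplified assumption; the original model uses multiple hidden layers and learned embeddings.

```python
import math

def wide_and_deep_predict(cross_features, wide_weights,
                          dense_input, hidden_w, hidden_b, out_w, bias):
    # Wide part: linear model over (one-hot) cross features.
    wide_logit = sum(w * x for w, x in zip(wide_weights, cross_features))

    # Deep part: one ReLU hidden layer over the dense embedding input.
    hidden = [
        max(0.0, sum(w * x for w, x in zip(row, dense_input)) + b)
        for row, b in zip(hidden_w, hidden_b)
    ]
    deep_logit = sum(w * h for w, h in zip(out_w, hidden))

    # Joint output: both logits share one bias and one sigmoid.
    return 1.0 / (1.0 + math.exp(-(wide_logit + deep_logit + bias)))

p = wide_and_deep_predict(
    cross_features=[1.0, 0.0], wide_weights=[0.5, -0.5],
    dense_input=[1.0, 2.0],
    hidden_w=[[1.0, 0.0], [0.0, -1.0]], hidden_b=[0.0, 0.0],
    out_w=[0.5, 1.0], bias=-1.0,
)
```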
DeepFM Model
DeepFM replaces the LR part of Wide&Deep with FM, so feature crosses are learned automatically instead of being hand-crafted, while the deep component is retained. The FM and deep parts share the same feature embeddings.
NFM Model
Neural Factorization Machine (NFM) combines LR, FM, and an MLP; its Bi-Interaction layer pools the element-wise products of all pairs of feature embeddings into a single vector, which is then fed into the deep network.
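A sketch of the pairwise pooling step, assuming the Bi-Interaction form from the NFM paper: the sum over all pairs i<j of (x_i v_i) element-wise-times (x_j v_j), computed with the same linear-time identity used for FM. Unlike FM, the result is a k-dimensional vector rather than a scalar. The inputs are illustrative.

```python
def bi_interaction(x, v):
    """Pool all pairwise element-wise products of feature embeddings.

    Uses 0.5 * [(sum_i x_i v_i)^2 - sum_i (x_i v_i)^2], applied per dimension.
    """
    k = len(v[0])
    pooled = []
    for f in range(k):
        s = sum(x[i] * v[i][f] for i in range(len(x)))
        s_sq = sum((x[i] * v[i][f]) ** 2 for i in range(len(x)))
        pooled.append(0.5 * (s * s - s_sq))
    return pooled

# Hypothetical input: features 0 and 2 are active, embedding size k = 2.
x = [1.0, 0.0, 1.0]
v = [[1.0, 2.0], [0.5, 0.5], [3.0, -1.0]]
pooled = bi_interaction(x, v)
```

With only two active features the pooled vector is simply the element-wise product of their embeddings, `[1*3, 2*(-1)]`. In NFM this vector would then be passed through the MLP.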
AFM Model
Attentional Factorization Machine (AFM) adds an attention network on top of the pairwise embedding interactions, learning a weight for each interaction instead of treating all of them as equally important.
DSSM Model
Deep Structured Semantic Model (DSSM) uses two MLPs to embed queries and documents into a shared semantic space and scores each query-document pair by the cosine similarity of the two embeddings.
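The scoring step can be sketched independently of the two towers. The vectors below stand in for the outputs of the query MLP and the document MLP; the document ids and values are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_documents(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

query_vec = [1.0, 0.0]
doc_vecs = {"doc_a": [2.0, 0.0], "doc_b": [0.0, 3.0], "doc_c": [1.0, 1.0]}
ranking = rank_documents(query_vec, doc_vecs)
```

Because cosine similarity ignores vector length, `doc_a` scores a full 1.0 even though its embedding is twice as long as the query's.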
MV‑DNN Model
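Multi-View DNN (MV-DNN) extends DSSM to several views: one MLP embeds the user, and a separate MLP is assigned to each item view (item domain), allowing view-specific item representations to be matched against a shared user representation.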
DCN Model
Deep & Cross Network (DCN) consists of an embedding layer, a deep MLP, and a cross network that explicitly performs feature‑wise cross operations.
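A sketch of the cross network, assuming the layer form given in the DCN paper, x_{l+1} = x_0 (x_l^T w_l) + b_l + x_l, where x_0 is the embedded input. The weights are illustrative, and the deep MLP that runs in parallel is omitted.

```python
def cross_layer(x0, xl, w, b):
    """One cross layer: x0 * (xl . w) + b + xl (the last term is a residual)."""
    scalar = sum(xi * wi for xi, wi in zip(xl, w))   # xl^T w
    return [x0_i * scalar + b_i + xl_i
            for x0_i, b_i, xl_i in zip(x0, b, xl)]

def cross_network(x0, params):
    """Stack cross layers; every layer re-uses the original input x0."""
    xl = x0
    for w, b in params:
        xl = cross_layer(x0, xl, w, b)
    return xl

x0 = [1.0, 2.0]
params = [
    ([0.5, 0.5], [0.0, 0.0]),   # layer 1: (w, b)
    ([1.0, 0.0], [0.1, 0.1]),   # layer 2: (w, b)
]
first = cross_layer(x0, x0, *params[0])
out = cross_network(x0, params)
```

Each additional layer raises the highest polynomial degree of the feature crosses by one while adding only two vectors of parameters.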
NCF Model
Neural Collaborative Filtering (NCF) concatenates user and item embeddings and feeds them into an MLP.
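A minimal forward-pass sketch of that idea: look up the two embeddings, concatenate them, and score the pair with a small MLP. The embedding tables, ids, layer sizes, and weights are all hypothetical; in a real model they would be learned from interaction data.

```python
import math

# Hypothetical embedding tables (2 dimensions each).
USER_EMB = {"u1": [1.0, 0.0], "u2": [0.0, 1.0]}
ITEM_EMB = {"i1": [0.0, 1.0], "i2": [1.0, 0.0]}

HIDDEN_W = [[1.0, 0.0, 0.0, 1.0], [-1.0, 0.0, 0.0, 0.0]]
HIDDEN_B = [0.0, 0.0]
OUT_W = [1.0, 1.0]
OUT_B = -2.0

def ncf_score(user_id, item_id):
    """Predicted interaction probability for one user-item pair."""
    concat = USER_EMB[user_id] + ITEM_EMB[item_id]        # list concatenation
    hidden = [
        max(0.0, sum(w * x for w, x in zip(row, concat)) + b)
        for row, b in zip(HIDDEN_W, HIDDEN_B)
    ]
    logit = sum(w * h for w, h in zip(OUT_W, hidden)) + OUT_B
    return 1.0 / (1.0 + math.exp(-logit))

score_u1_i1 = ncf_score("u1", "i1")
score_u2_i2 = ncf_score("u2", "i2")
```

Replacing the fixed inner product of matrix factorization with an MLP over the concatenated embeddings is what lets the model learn a non-linear interaction function.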
GBDT+LR Model
Gradient Boosted Decision Trees (GBDT) generate leaf‑node features that are fed into a Logistic Regression model, effectively using GBDT as feature engineering.
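The feature transformation can be illustrated without a boosting library. In the sketch below two hand-written functions stand in for trained trees (an assumption for the example); each sample is mapped to the index of the leaf it lands in, the indices are one-hot encoded and concatenated, and the resulting sparse vector is what the Logistic Regression model consumes.

```python
import math

# Hypothetical stand-ins for two trained trees: each returns a leaf index.
def tree_a(x):
    return 0 if x["age"] < 30 else 1              # 2 leaves

def tree_b(x):
    if x["clicks"] < 5:
        return 0
    return 1 if x["clicks"] < 20 else 2           # 3 leaves

TREES = [(tree_a, 2), (tree_b, 3)]                # (tree, number of leaves)

def leaf_one_hot(x, trees=TREES):
    """Concatenate one one-hot block per tree, marking the leaf that x reaches."""
    features = []
    for tree, n_leaves in trees:
        one_hot = [0.0] * n_leaves
        one_hot[tree(x)] = 1.0
        features.extend(one_hot)
    return features

def lr_predict(features, weights, bias):
    """Logistic Regression on top of the leaf features."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

sample = {"age": 42, "clicks": 7}
encoded = leaf_one_hot(sample)    # [0.0, 1.0, 0.0, 1.0, 0.0]
p_click = lr_predict(encoded, [0.1, -0.2, 0.0, 0.7, 1.2], bias=-0.5)
```

Each leaf corresponds to one path of threshold tests, so every one-hot feature acts as an automatically discovered feature cross, and the LR weights decide how much each cross contributes.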
Seq2Seq Model
Sequence‑to‑Sequence (seq2seq) employs an encoder‑decoder LSTM architecture to map an input sequence (e.g., user browsing history) to an output sequence, enabling sequential recommendation.
DNN for YouTube
The YouTube recommendation model feeds user watch history, search logs, location, and demographic attributes into an MLP to produce a user vector, scores that vector against video embeddings by inner product to retrieve candidates, and then ranks the candidates with a second deep network.
References
wide&deep: https://arxiv.org/pdf/1606.07792.pdf
seq2seq: https://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf
PNN: https://arxiv.org/pdf/1611.00144.pdf
NFM: https://www.comp.nus.edu.sg/~xiangnan/papers/sigir17-nfm.pdf
NCF: https://www.comp.nus.edu.sg/~xiangnan/papers/ncf.pdf
MV‑DNN: https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/frp1159-songA.pdf
GBDT+LR: http://quinonero.net/Publications/predicting-clicks-facebook.pdf
FNN: https://arxiv.org/pdf/1601.02376.pdf
FM: https://www.csie.ntu.edu.tw/~b97053/paper/Rendle2010FM.pdf
DNN‑YouTube: https://static.googleusercontent.com/media/research.google.com/zh-CN//pubs/archive/45530.pdf
deepFM: https://www.ijcai.org/proceedings/2017/0239.pdf
DCN: https://arxiv.org/pdf/1708.05123.pdf
AFM: https://arxiv.org/pdf/1708.04617.pdf
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.
