
Deep Sparse Network (NON): A Novel Deep Neural Network Model for Recommendation Systems

This article introduces the Deep Sparse Network (NON), a new deep neural architecture for recommendation systems that combines field‑wise networks, across‑field interaction networks, and an operation‑fusion network, and demonstrates its superior performance through extensive experiments and ablation studies.

DataFunTalk

The article presents the Deep Sparse Network (NON), a deep neural network model for recommendation systems introduced in a paper accepted at SIGIR 2020.

Background: Recommendation systems aim to predict user preferences for items; deep learning has become a key technique, with models such as DNN, Wide&Deep, DeepFM, xDeepFM, and AutoInt.

Related Work: Existing approaches are categorized into content‑based, collaborative filtering, hybrid, and model‑based methods, each using various interaction operations (LR, FM, DNN, etc.).

NON Model Overview: NON consists of three hierarchical components:

Field‑wise Network: A dedicated DNN for each feature field (categorical features are embedded, numeric features are fed directly). A gate function (e.g., concatenation or element‑wise product) combines the field output with its input.
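The field-wise stage can be sketched in a few lines. This is a minimal numpy illustration, not the paper's implementation: it assumes a one-hidden-layer DNN per field and uses the element-wise product gate (the article also mentions concatenation as an alternative); all dimension sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def field_wise_network(field_embedding, W1, b1, W2, b2):
    """One field-specific DNN: raw embedding -> hidden -> refined embedding."""
    hidden = np.maximum(0.0, field_embedding @ W1 + b1)  # ReLU
    return hidden @ W2 + b2

def gate_elementwise(field_input, field_output):
    """Element-wise product gate: fuse the field's raw embedding
    with the output of its dedicated DNN."""
    return field_input * field_output

# One categorical field embedded into 8 dimensions (illustrative sizes).
emb_dim, hidden_dim = 8, 16
x = rng.normal(size=(4, emb_dim))            # batch of 4 examples
W1 = rng.normal(size=(emb_dim, hidden_dim)) * 0.1
b1 = np.zeros(hidden_dim)
W2 = rng.normal(size=(hidden_dim, emb_dim)) * 0.1
b2 = np.zeros(emb_dim)

refined = gate_elementwise(x, field_wise_network(x, W1, b1, W2, b2))
print(refined.shape)  # (4, 8): same shape as the input embedding
```

Because the gate preserves the embedding shape, the refined per-field representations can be passed directly to the across-field stage.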

Across‑Field Network: Learns interactions between fields using a set of operations (LR, DNN, FM, Bi‑Interaction, multi‑head self‑attention, etc.) that are treated as hyper‑parameters and selected automatically based on validation performance.
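Treating the operation set as a hyper-parameter amounts to a search over candidate subsets scored on validation data. The sketch below shows only that selection loop; `evaluate` is a deterministic placeholder standing in for "train the model with this operation set and return validation AUC", and the operation names are taken from the list above.

```python
from itertools import combinations

OPERATIONS = ["LR", "DNN", "FM", "Bi-Interaction", "self-attention"]

def evaluate(op_set):
    """Placeholder score. In practice: train NON with `op_set` in the
    across-field stage and return the validation AUC."""
    return (sum(len(op) for op in op_set) % 7) / 10 + 0.5

# Candidate operation sets of size 1 to 3 (illustrative search space).
candidates = [set(c) for r in (1, 2, 3) for c in combinations(OPERATIONS, r)]

# Data-driven selection: keep the set with the best validation score.
best = max(candidates, key=evaluate)
print(sorted(best))
```

This mirrors the article's point that the best-performing operation set differs per dataset, so the choice must be made empirically rather than fixed in advance.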

Operation‑Fusion Network: Concatenates the outputs of the across‑field operations and feeds them into another DNN to capture high‑order nonlinear feature interactions.
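The fusion step is concatenation followed by another DNN. A minimal numpy sketch, assuming three illustrative operations (an LR-style sum, the standard FM second-order term, and a one-layer DNN) applied to the per-field embeddings; sizes and weights are made up for demonstration.

```python
import numpy as np

rng = np.random.default_rng(1)
batch, n_fields, emb_dim = 2, 3, 4
# Refined per-field embeddings from the field-wise stage (illustrative values).
E = rng.normal(size=(batch, n_fields, emb_dim))

def op_linear(E):
    """LR-style operation: sum of all embedding entries per example."""
    return E.reshape(batch, -1).sum(axis=1, keepdims=True)     # (batch, 1)

def op_fm(E):
    """FM second-order interaction: 0.5 * ((sum v)^2 - sum v^2)."""
    s = E.sum(axis=1)                                          # (batch, emb_dim)
    sq = (E ** 2).sum(axis=1)
    return 0.5 * (s ** 2 - sq)                                 # (batch, emb_dim)

def op_dnn(E, W):
    """A one-layer DNN over the flattened embeddings."""
    return np.maximum(0.0, E.reshape(batch, -1) @ W)           # (batch, 6)

W_dnn = rng.normal(size=(n_fields * emb_dim, 6)) * 0.1
# Operation-fusion: concatenate each operation's output ...
fused_in = np.concatenate([op_linear(E), op_fm(E), op_dnn(E, W_dnn)], axis=1)
# ... and feed the concatenation into another DNN for high-order interactions.
W_fuse = rng.normal(size=(fused_in.shape[1], 1)) * 0.1
logit = fused_in @ W_fuse
print(fused_in.shape, logit.shape)  # (2, 11) (2, 1)
```

The fusion DNN sees all operation outputs side by side, which is how the model can combine, say, FM-style second-order signals with attention outputs in a single nonlinear head.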

To mitigate gradient vanishing in this deep architecture, auxiliary losses are added to each DNN layer, inspired by GoogLeNet.
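The auxiliary-loss idea can be sketched as a per-layer classifier head whose loss is added, with a weight, to the main loss, giving each hidden layer a short gradient path. A hedged numpy illustration, assuming log loss, ReLU layers, and an arbitrary auxiliary weight; none of these specifics are from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(p, y):
    eps = 1e-7
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

# A 3-layer DNN; each hidden layer gets its own auxiliary classifier head.
dims = [8, 16, 16, 1]
Ws = [rng.normal(size=(a, b)) * 0.1 for a, b in zip(dims[:-1], dims[1:])]
aux_heads = [rng.normal(size=(d, 1)) * 0.1 for d in dims[1:-1]]

x = rng.normal(size=(4, 8))
y = np.array([[1.0], [0.0], [1.0], [0.0]])

h, aux_losses = x, []
for i, W in enumerate(Ws[:-1]):
    h = np.maximum(0.0, h @ W)
    # Auxiliary head: a shortcut gradient path into this hidden layer.
    aux_losses.append(log_loss(sigmoid(h @ aux_heads[i]), y))
main_loss = log_loss(sigmoid(h @ Ws[-1]), y)

alpha = 0.3  # weight of the auxiliary terms (illustrative)
total_loss = main_loss + alpha * sum(aux_losses)
print(len(aux_losses))  # 2 auxiliary losses, one per hidden layer
```

During backpropagation, the gradients of the auxiliary terms reach the lower layers without passing through the full depth of the network, which is what counteracts vanishing gradients.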

Experimental Results:

Auxiliary losses accelerate training by 1.67× on the Criteo dataset while achieving comparable AUC.

Ablation studies show progressive performance gains when adding Field‑wise Network, Across‑Field Network, and the full NON model.

Field‑wise Network improves intra‑field embedding similarity (cosine similarity increases by 1–2 orders of magnitude) and makes inter‑field embeddings more distinguishable.
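The intra-field metric here is mean pairwise cosine similarity across a field's embedding vectors. A small illustration with made-up embeddings, showing that vectors clustered around one direction score much higher than scattered ones:

```python
import numpy as np

def mean_pairwise_cosine(emb):
    """Average cosine similarity over all distinct pairs of embedding rows."""
    normed = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sims = normed @ normed.T
    iu = np.triu_indices(len(emb), k=1)   # upper triangle: distinct pairs
    return sims[iu].mean()

rng = np.random.default_rng(3)
# Embeddings concentrated near one direction vs. random directions.
clustered = np.array([1.0, 0.0, 0.0]) + 0.05 * rng.normal(size=(5, 3))
scattered = rng.normal(size=(5, 3))
print(mean_pairwise_cosine(clustered) > mean_pairwise_cosine(scattered))  # True
```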

Operation studies reveal that no single operation set dominates across all datasets; data‑driven selection of operations is essential.

Compared with SOTA models (FFM, DNN, Wide&Deep, NFM, xDeepFM, AutoInt), NON consistently achieves the highest AUC improvements (0.64%–0.99%).

The article concludes that NON’s hierarchical design effectively captures both intra‑field and inter‑field information, leading to superior recommendation performance.

Images illustrating model architecture, field‑wise networks, across‑field networks, and experimental plots are included throughout the original content.

Tags: machine learning, recommendation, deep learning, CTR prediction, feature interaction, sparse network