
Path‑based Deep Network (PDN) for E‑commerce Recommendation Recall

This paper proposes a Path‑based Deep Network (PDN) that combines similarity‑index and embedding‑based retrieval paradigms to model user‑item interactions via Trigger Net and Similarity Net, achieving significant improvements in click‑through rate, GMV, and diversity on Taobao’s homepage feed.

DataFunTalk

Abstract The common recall paradigms in recommendation systems—similarity‑index (I2I) and embedding‑based retrieval (EBR)—each have drawbacks: I2I struggles with sparse co‑occurrence and cannot model user‑to‑item (U2I) interactions, while EBR lacks fine‑grained item‑level modeling and diversity. To fuse their strengths, we introduce Path‑based Deep Network (PDN), which uses TriggerNet for U2I modeling and SimNet for I2I modeling, enabling end‑to‑end U2I2I learning. Deployed on Taobao’s homepage feed, PDN yields ~20% gains in clicks, GMV, and diversity and was accepted at SIGIR 2021.

Background Recommendation in Taobao aims to bridge users to items they like, with recall being the bottleneck of the four‑stage pipeline (recall → rough ranking → fine ranking → re‑ranking). Traditional industrial recall methods fall into two categories: item‑to‑item (I2I) indexing and embedding‑based retrieval (EBR). I2I excels at relevance but suffers from cold‑start and limited user modeling; EBR captures user preferences via vectors but cannot incorporate item co‑occurrence.

Motivation Existing methods either ignore user side‑info or item co‑occurrence, and multiple I2I indices coexist online, leading to fragmented similarity measures. A unified approach that leverages both user behavior and item relationships is needed.

Method Overview PDN constructs a two‑hop graph where the first hop models user interest in each interacted item and the second hop models similarity between that item and the target item. The overall score aggregates n+1 path weights (n two‑hop paths plus one direct bias path).

Embedding Layer Four feature groups are embedded: user info (z_u), item info (x_i), behavior info, and item‑item relevance info. Each group's embeddings are concatenated into a dense vector of dimension d_u, d_i, d_a, or d_c, respectively.

Trigger Net & Similarity Net TriggerNet takes user, behavior, and interacted‑item features to produce a per‑item interest score t_{uj}, forming a variable‑length user representation T_u = [t_{u1}, …, t_{un}]. SimNet computes item‑item similarity s_{ji} using item features and side information, yielding a variable‑length target‑item vector S_i = [s_{1i}, …, s_{ni}]. The two vectors are combined to obtain each two‑hop path weight.
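To make the path construction concrete, here is a minimal numpy sketch of how the two-hop path weights might be aggregated into a single relevance score. The function name, the product merge of trigger and similarity scores, and the log1p aggregation are illustrative assumptions, not the paper's exact formulation; only the exp() positivity constraint is stated in the text.

```python
import numpy as np

def pdn_score(trigger_logits, sim_logits, direct_score):
    """Aggregate the n two-hop path weights plus the direct path (sketch).

    trigger_logits : (n,) raw TriggerNet outputs t_uj over the user's
                     n interacted items
    sim_logits     : (n,) raw SimNet outputs s_ji for those items vs.
                     the target item i
    direct_score   : scalar weight of the direct user-to-item path
    """
    # exp() keeps every path weight positive (the paper's constraint);
    # merging trigger and similarity by product is one plausible choice.
    path_weights = np.exp(trigger_logits) * np.exp(sim_logits)
    # sum the path contributions in log1p space to damp very large weights
    return direct_score + np.log1p(path_weights).sum()
```

Because the user representation T_u and target-item vector S_i are both length n, the aggregation naturally handles a variable number of interacted items per user.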

Direct & Bias Net A shallow tower learns position and user bias; its output is added during training but removed at inference to keep the main model unbiased.

Loss Function The final relevance score is mapped to a click probability p = 1 − exp(−softplus(score)), and the model is trained with binary cross‑entropy against click labels y_{u,i}.
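The transform and loss can be written out directly; this numpy sketch uses only the formulas stated above (the function names are illustrative):

```python
import numpy as np

def softplus(x):
    # log1p(exp(x)) is fine for moderate x
    return np.log1p(np.exp(x))

def click_probability(score):
    # p = 1 - exp(-softplus(score)) maps any real score into (0, 1)
    return 1.0 - np.exp(-softplus(score))

def bce_loss(score, label):
    # binary cross-entropy against a 0/1 click label y_{u,i}
    p = click_probability(score)
    return -(label * np.log(p) + (1 - label) * np.log(1.0 - p))
```

Note that algebraically 1 − exp(−softplus(x)) simplifies to sigmoid(x), so this transform can be read as a logistic link written in a form that pairs naturally with the model's positive path scores.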

Constrained Learning To avoid negative path weights that cause over‑fitting, the last layers of TriggerNet and SimNet use exp() activations, enforcing positive outputs and stabilizing training.

Online Deployment PDN is served via a greedy two‑stage path retrieval: (1) TriggerNet ranks the top‑m interacted items per user; (2) SimNet’s pre‑computed similarity index retrieves the top‑k items for each of those m triggers. This two‑stage lookup approximates full path scoring without evaluating every candidate item online; as noted above, the bias net is dropped at inference.
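A compact sketch of the two-stage lookup, assuming per-user TriggerNet scores and a precomputed SimNet index are available as dictionaries; the names and the product merge of trigger and similarity scores are illustrative assumptions:

```python
import heapq

def retrieve(trigger_scores, sim_index, m=2, k=2):
    """Greedy two-stage path retrieval (sketch).

    trigger_scores : {interacted_item: TriggerNet score} for one user
    sim_index      : {item: [(candidate, SimNet score), ...]},
                     precomputed and sorted best-first
    """
    # Stage 1: keep the user's top-m triggers by TriggerNet score
    triggers = heapq.nlargest(m, trigger_scores.items(), key=lambda kv: kv[1])
    # Stage 2: expand each trigger through its precomputed top-k list,
    # scoring each two-hop path as trigger_score * similarity
    best = {}
    for item, t in triggers:
        for cand, s in sim_index.get(item, [])[:k]:
            best[cand] = max(best.get(cand, 0.0), t * s)
    # rank candidates by their best path score
    return sorted(best, key=best.get, reverse=True)
```

In production the second stage would be an index lookup rather than an in-memory dict, but the control flow is the same.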

Index Generation Candidate item pairs are enumerated from session co‑occurrence and side‑info (e.g., same brand). SimNet scores each pair, and the top‑k per item are stored, compressing the N×N similarity matrix to N×k.
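The offline index build can be sketched as follows, with a stand-in scoring function in place of SimNet; candidate enumeration from session co-occurrence and the N×k compression follow the description above, while the function names are illustrative:

```python
import heapq
from collections import defaultdict
from itertools import combinations

def build_sim_index(sessions, score_fn, k=2):
    """Compress the N x N similarity matrix to N x k (sketch).

    sessions : iterable of item lists; co-occurring pairs become candidates
    score_fn : stand-in for SimNet scoring an (item, item) pair
    """
    # enumerate candidate pairs from session co-occurrence; side-info
    # sources (e.g., same-brand pairs) could be added the same way
    pairs = set()
    for session in sessions:
        for a, b in combinations(sorted(set(session)), 2):
            pairs.add((a, b))
            pairs.add((b, a))
    neighbors = defaultdict(list)
    for a, b in pairs:
        neighbors[a].append((b, score_fn(a, b)))
    # keep only the k best-scoring neighbors per item
    return {a: heapq.nlargest(k, nbrs, key=lambda t: t[1])
            for a, nbrs in neighbors.items()}
```

Restricting scoring to enumerated candidate pairs is what keeps the build tractable: SimNet never has to score all N×N combinations.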

Experiments Offline evaluation on exposure‑click logs shows PDN (v43) outperforms Swing I2I, RankI2I, and earlier PDN versions in Hit‑Rate and Precision for both TOP‑3 and TOP‑8 recall. Online A/B tests confirm PDN replaces multiple index‑based recall components, reducing the share of tower‑based recall to 6% while boosting overall metrics. Additional analyses demonstrate PDN’s robustness across users with varying numbers of triggers and its superiority on public datasets (DSSM, YouTube‑DNN, BST, Item‑CF, SLIM, DIN).

Discussion & Future Work PDN can be viewed as an inner product of two sparse high‑dimensional vectors, offering high capacity without fixed dimensionality constraints. It also serves as a ranking model, improving coarse‑ranking AUC by 0.6% and approaching fine‑ranking performance.

Acknowledgments Thanks to mentors Deng Hongbo, Piao Xue, and Prof. Li Chenliang (Wuhan University), as well as the collaborating team.

References
[1] Learning Deep Structured Semantic Models for Web Search using Clickthrough Data.
[2] Deep Neural Networks for YouTube Recommendations.
[3] Behavior Sequence Transformer for E‑commerce Recommendation in Alibaba.
[4] Item‑Based Collaborative Filtering Recommendation Algorithms.
[5] SLIM: Sparse Linear Methods for Top‑N Recommender Systems.
[6] Deep Interest Network for Click‑Through Rate Prediction.
