How Graph Embedding Boosts Cross-Category Bundle Recommendations on E‑Commerce

This article explains how graph embedding techniques, including a BSP‑based distributed LINE implementation and a cross‑category probabilistic graph model, are applied to improve the diversity and relevance of bundle (凑单) recommendations during large‑scale shopping events.

Alibaba Cloud Developer

Background

In this year's bundle (凑单) scenario, the second-page recommendations added a "tips" card (锦囊) to increase product richness and to surface personalized category tags. The system also supported personalized recommendations for the Tmall "万券齐发" (mass coupon release) venue and for mixed cross-category subsidies. The focus remains on enhancing user exploration and the shopping experience by improving recommendation diversity and cross-category relevance. This article mainly discusses the Graph Embedding work, including our attempts at a parallel algorithm and its applications.

Algorithm

Problem Abstraction and Description

The basic user‑item purchase relationship on an e‑commerce platform can be modeled as a bipartite graph of users and items. Solid blue edges represent direct interactions (click, purchase), while dashed black edges capture item‑item relationships derived from shared user behaviors. If node attributes are considered, the bipartite graph becomes a more complex attributed graph.

To compute item‑to‑item (I2I) relationships, a common approach converts the bipartite graph into a homogeneous item graph using memory‑based collaborative filtering (e.g., Adamic‑Adar, Swing) or samples weighted random walks to generate co‑occurrence samples, then trains item embeddings with Skip‑Gram models (DeepWalk, Node2Vec, LINE). These embeddings are used for link prediction and classification.
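As a minimal sketch of the memory-based route, the following computes Adamic-Adar scores between items that share buyers. The input format (a dict from user id to a set of item ids) and the function name are assumptions made for illustration, not part of the production pipeline.

```python
import math
from collections import defaultdict
from itertools import combinations


def adamic_adar_i2i(user_items):
    """Score item pairs by Adamic-Adar over the users they share.

    user_items: dict mapping user id -> set of item ids (hypothetical format).
    Returns a dict mapping (item_a, item_b), with item_a < item_b, to a score.
    """
    scores = defaultdict(float)
    for items in user_items.values():
        degree = len(items)
        if degree < 2:
            continue  # a user with a single item links no pair
        # Each shared user contributes 1 / log(degree): heavy buyers count less.
        weight = 1.0 / math.log(degree)
        for a, b in combinations(sorted(items), 2):
            scores[(a, b)] += weight
    return dict(scores)


purchases = {"u1": {"a", "b"}, "u2": {"a", "b", "c"}}
i2i = adamic_adar_i2i(purchases)
```

Here the pair ("a", "b") is supported by both users, so it scores higher than ("a", "c"), which only the heavier buyer u2 supports.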

In the bundle scenario, recommending items that the user has already added to the cart can be counter-productive; therefore, the recommendation emphasizes cross-category diversity. Building on last year's Graph Embedding deployment, this year we strengthened cross-category training, attempted a BSP-based distributed LINE implementation, and designed a cross-category probabilistic graph model.

Distributed LINE Implementation on BSP

SGNS (Skip-Gram with Negative Sampling) is the classic Word2Vec model and is widely adopted in Graph Embedding. LINE combines first-order and second-order proximity to learn node vectors. Its objective functions O1 (first-order) and O2 (second-order) are optimized with negative sampling.
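For reference, the LINE paper (Tang et al., 2015) defines the two objectives and their negative-sampling approximation as follows. The notation ($\vec{u}_i$ for a vertex vector, $\vec{u}'_j$ for a context vector, $w_{ij}$ for an edge weight, $d_v$ for the degree of vertex $v$, $K$ for the number of negatives) follows that paper and may differ from the exact form used in the production system.

```latex
\begin{align}
p_1(v_i, v_j) &= \frac{1}{1 + \exp(-\vec{u}_i^{\top} \vec{u}_j)}, &
O_1 &= -\sum_{(i,j) \in E} w_{ij} \log p_1(v_i, v_j) \\
p_2(v_j \mid v_i) &= \frac{\exp(\vec{u}_j'^{\top} \vec{u}_i)}{\sum_{k=1}^{|V|} \exp(\vec{u}_k'^{\top} \vec{u}_i)}, &
O_2 &= -\sum_{(i,j) \in E} w_{ij} \log p_2(v_j \mid v_i)
\end{align}

% Negative sampling replaces \log p_2(v_j | v_i) for each edge (i, j) with:
\begin{equation}
\log \sigma(\vec{u}_j'^{\top} \vec{u}_i)
+ \sum_{n=1}^{K} \mathbb{E}_{v_n \sim P_n(v)}
  \left[ \log \sigma(-\vec{u}_n'^{\top} \vec{u}_i) \right],
\qquad P_n(v) \propto d_v^{3/4}
\end{equation}
```

The first term of the per-edge objective involves only the two endpoints of an observed edge, while the second term draws vertices from the whole graph.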

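In a BSP framework, the neighbor-wise updates (the positive-sample term of the objective) parallelize well, while global negative sampling (the negative-sample term) is harder because a negative can be any vertex in the graph. Prior work (Stergiou et al., 2017) introduced Target Negative Sampling to parallelize negative sampling across partitions.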

Using the Odps-Graph BSP framework, we implemented a distributed LINE algorithm in which vertices store the node vectors and negative sampling is confined to each worker, with inter-worker messages providing approximate global sampling. Gradient updates for positive and negative samples are performed in two separate super-steps via vertex messaging.
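The two-super-step scheme described above can be sketched as a single-process simulation. This is not the Odps-Graph implementation: the function name, the modulo partitioning, the uniform worker-local negative sampler, and all hyperparameters are assumptions chosen to keep the example short.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def train_line_bsp(edges, num_nodes, num_workers=2, dim=8,
                   negatives=2, lr=0.05, rounds=20, seed=0):
    """Simulate second-order LINE with BSP-style super-steps (illustrative)."""
    rng = random.Random(seed)
    emb = [[(rng.random() - 0.5) / dim for _ in range(dim)]
           for _ in range(num_nodes)]
    ctx = [[0.0] * dim for _ in range(num_nodes)]
    # Assumed partitioning: vertex v lives on worker v % num_workers.
    partitions = [[v for v in range(num_nodes) if v % num_workers == w]
                  for w in range(num_workers)]

    def superstep(messages):
        """Process (src, dst, label) messages, then apply updates at the barrier."""
        emb_grad, ctx_grad = {}, {}
        for src, dst, label in messages:
            # SGNS gradient coefficient: label is 1 for positives, 0 for negatives.
            g = lr * (label - sigmoid(dot(emb[src], ctx[dst])))
            eg = emb_grad.setdefault(src, [0.0] * dim)
            cg = ctx_grad.setdefault(dst, [0.0] * dim)
            for k in range(dim):
                eg[k] += g * ctx[dst][k]
                cg[k] += g * emb[src][k]
        # Barrier: vectors change only after every message has been read.
        for table, grads in ((emb, emb_grad), (ctx, ctx_grad)):
            for v, grad in grads.items():
                for k in range(dim):
                    table[v][k] += grad[k]

    for _round in range(rounds):
        # Super-step 1: positive samples travel along observed edges.
        superstep([(s, t, 1.0) for s, t in edges])
        # Super-step 2: negatives are drawn only from the source's own worker.
        neg_messages = []
        for s, t in edges:
            local = [v for v in partitions[s % num_workers] if v not in (s, t)]
            for _draw in range(min(negatives, len(local))):
                neg_messages.append((s, rng.choice(local), 0.0))
        superstep(neg_messages)
    return emb, ctx


edges = [(0, 1), (1, 0), (1, 2), (2, 1), (3, 4), (4, 3)]
emb, ctx = train_line_bsp(edges, num_nodes=6, rounds=5)
```

A real deployment would sample negatives in proportion to degree and would rely on cross-worker messages to approximate the global distribution; the uniform local sampler here only shows where that step sits in the super-step cycle.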

Cross‑Category Probabilistic Graph Model

Traditional Graph Embedding models treat any two nodes equally when computing similarity. For bundle recommendations, we need to emphasize cross-category learning. Inspired by the RaRE algorithm, we extend it by weakening similarity between items of different categories based on category distance, and we embed category vectors themselves.

The probabilistic graph model incorporates item embeddings, category embeddings, and cross-category embeddings, trained via a maximum a posteriori (MAP) objective. The model captures the idea that when a user interacts with two items, part of the similarity may stem from shared category attributes rather than from the item embeddings alone.
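One way to read this decomposition is sketched below. The additive form (item term plus category term inside a sigmoid), the Gaussian priors expressed as an L2 penalty, and every name in the code are assumptions for illustration. The article does not give the exact model, and the cross-category embeddings it mentions are left out here.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def edge_probability(i, j, item_vec, cat_vec, cat_of):
    """P(items i and j co-occur) under the assumed additive decomposition."""
    item_term = dot(item_vec[i], item_vec[j])               # item-specific affinity
    cat_term = dot(cat_vec[cat_of[i]], cat_vec[cat_of[j]])  # category-level affinity
    return sigmoid(item_term + cat_term)


def neg_log_posterior(samples, item_vec, cat_vec, cat_of, l2=0.01):
    """MAP objective: Bernoulli negative log-likelihood plus L2 (Gaussian prior)."""
    nll = 0.0
    for i, j, label in samples:
        p = edge_probability(i, j, item_vec, cat_vec, cat_of)
        nll -= math.log(p) if label == 1 else math.log(1.0 - p)
    prior = sum(x * x for vec in list(item_vec) + list(cat_vec) for x in vec)
    return nll + 0.5 * l2 * prior


# Three items with zero item vectors; items 0 and 1 share category 0.
item_vec = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
cat_vec = [[1.0, 0.0], [0.0, 1.0]]
cat_of = [0, 0, 1]
same_cat = edge_probability(0, 1, item_vec, cat_vec, cat_of)
cross_cat = edge_probability(0, 2, item_vec, cat_vec, cat_of)
```

With the item vectors at zero, the same-category pair already scores above 0.5 purely from the category term. Because the category vectors absorb that shared signal during training, the item vectors are left to explain the affinity that category alone does not.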

Practical Cases

From a modern decorative painting, the system recalls oil paintings, switch stickers, tableware, water kettles, wall stickers, etc., across categories.

From a trench coat, the system recalls facial cream, mascara, BB cream, earrings, dresses, and other cross‑category items.

Summary

Graph Embedding is a crucial branch of graph learning that represents nodes with vectors, enabling the capture of high‑order relationships beyond first and second order. It improves recommendation richness and novelty. Ongoing research in the company’s algorithm and system teams continues to deepen these techniques.

Outlook

Future work will incorporate more attribute features (product, user) into the graph, explore meta‑path based embeddings, and integrate embedding‑based I2I retrieval with ranking models. Improving the completeness of bundle entry points and reducing repetitive exposure will also be key research directions.

Project Summary

This year the bundle project upgraded both system and algorithm components, deploying deep learning models, group-knapsack optimization, cross-category graph models, and real-time learning-to-rank (LTR) for weight learning, resulting in notable increases in payment amount and conversion rates.

References

[1] Adamic, L. A., & Adar, E. (2003). Friends and neighbors on the web. Social Networks, 25(3), 211-230.

[2] Tang, J., Qu, M., Wang, M., Zhang, M., Yan, J., & Mei, Q. (2015). LINE: Large-scale Information Network Embedding. In WWW.

[3] Gu, Y., Sun, Y., Li, Y., & Yang, Y. (2018). RaRE: Social Rank Regulated Large-scale Network Embedding. In WWW.

[4] Perozzi, B., Al-Rfou, R., & Skiena, S. (2014). DeepWalk: Online learning of social representations. In KDD.

[5] Grover, A., & Leskovec, J. (2016). node2vec: Scalable feature learning for networks. In KDD.

[6] Stergiou, S., Straznickas, Z., Wu, R., & Tsioutsiouliklis, K. (2017). Distributed Negative Sampling for Word Embeddings. In AAAI.
