How Alibaba’s Graph‑Based Bundle Mining Doubles Conversion in E‑Commerce

Alibaba’s latest bundle‑mining system uses weighted graph embedding and real‑time sampling to recommend complementary products, replacing traditional item‑to‑item similarity. During the Double‑11 promotion, the offline algorithm raised IPV by 13% over the baseline and the real‑time version added a further 4%, with the model trained on hundreds of millions of edges and scoring billions of candidate pairs.

Alibaba Cloud Developer

Background

Bundle purchasing is a crucial step in the shopping‑coupon flow, helping users find items that meet a discount threshold (e.g., spend 400 and get 50 off). Full‑discount coupons are the most widely used promotion during large sales because they increase both user savings and average order value.

Alibaba’s recent redesign of the bundle feature introduced two major changes. First, the product page now supports search, price and category filters, and sorting. Second, the algorithm shifted from traditional item‑to‑item similarity to a graph‑based bundle mining approach that discovers multi‑hop purchase relationships.

Figure: Bundle mining overview

Core Algorithm

1. Basic Idea

Graphs provide a high‑level abstraction for representing entities (nodes) and their relationships (edges). User purchase behavior naturally forms a graph where nodes are items and edges represent co‑purchase events, weighted by frequency, time, or amount.

Graph embedding learns low‑dimensional vectors for nodes, enabling efficient similarity computation. DeepWalk and Node2Vec sample random walks on the graph and train embeddings with Skip‑Gram models, while LINE instead optimizes first‑ and second‑order proximity directly over sampled edges.
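To make "efficient similarity computation" concrete, here is a minimal sketch of ranking items by cosine similarity over their embedding vectors. The item names and the three‑dimensional vectors are made up for illustration; real embeddings would come from a trained model and have far more dimensions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def most_similar(item, embeddings, k=2):
    """Return the k items whose vectors are closest to `item`, best first."""
    scores = [
        (other, cosine(embeddings[item], vec))
        for other, vec in embeddings.items()
        if other != item
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:k]

# Hypothetical embeddings, not taken from the article.
embeddings = {
    "phone": [0.9, 0.1, 0.0],
    "case": [0.8, 0.2, 0.1],
    "charger": [0.7, 0.0, 0.3],
    "shampoo": [0.0, 0.9, 0.4],
}
neighbors = most_similar("phone", embeddings)
```

At catalog scale a brute‑force scan like this would likely be replaced by an approximate nearest‑neighbor index; the sketch only shows the scoring step.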

2. Main Techniques

a) Graph Construction

A weighted item graph is built: vertices are products, edges connect items bought together, and edge weights reflect co‑purchase counts, timestamps, or monetary value. Weighted walks bias the sampling toward high‑frequency edges, so that rare, low‑frequency co‑purchases do not dominate the sampled sequences.
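The construction step can be sketched as follows, assuming the simplest of the weighting options mentioned above: each edge weight is the number of orders in which the two items appear together. The function name `build_copurchase_graph` and the toy orders are hypothetical; the article does not describe the production schema.

```python
from collections import defaultdict
from itertools import combinations

def build_copurchase_graph(orders):
    """Return {item: {neighbor: weight}}, where weight counts shared orders."""
    graph = defaultdict(lambda: defaultdict(int))
    for items in orders:
        # Deduplicate so an item bought twice in one order counts once.
        for a, b in combinations(sorted(set(items)), 2):
            graph[a][b] += 1
            graph[b][a] += 1  # co-purchase is treated as symmetric here
    return {item: dict(neighbors) for item, neighbors in graph.items()}

# Hypothetical toy orders; item ids are made up for illustration.
orders = [
    ["phone", "case", "charger"],
    ["phone", "case"],
    ["case", "screen_film"],
]
graph = build_copurchase_graph(orders)
```

Weighting by recency or order amount would only change the increment applied to each edge, not the structure of the graph.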

Figure: Algorithm framework

b) Sampling

Traditional random walks treat all neighbors equally, which is unsuitable for a product graph with millions of nodes and many low‑frequency edges. A weighted walk samples each next neighbor in proportion to edge weight, so frequently co‑purchased items are more likely to appear in the walk, while multi‑hop relationships (e.g., A → C → D) are still captured.
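A minimal sketch of such a weighted walk, using the standard library's `random.choices` to pick the next hop in proportion to edge weight. The graph, the walk length, and the function name `weighted_walk` are illustrative assumptions, not the ODPS implementation.

```python
import random

def weighted_walk(graph, start, length, rng):
    """Walk up to `length` nodes, choosing each hop proportionally to edge weight."""
    walk = [start]
    while len(walk) < length:
        neighbors = graph.get(walk[-1])
        if not neighbors:
            break  # dead end: no outgoing edges
        items = list(neighbors)
        weights = [neighbors[item] for item in items]
        walk.append(rng.choices(items, weights=weights, k=1)[0])
    return walk

# Hypothetical graph: A-C is a strong edge, A-B a weak one, and D is two hops from A.
graph = {
    "A": {"C": 9, "B": 1},
    "B": {"A": 1},
    "C": {"A": 9, "D": 5},
    "D": {"C": 5},
}
rng = random.Random(0)  # seeded for repeatability
walks = [weighted_walk(graph, "A", 4, rng) for _ in range(1000)]
share_via_c = sum(walk[1] == "C" for walk in walks) / len(walks)
```

With these weights roughly nine in ten walks should leave A through C, and some of them continue to D, which is how a two‑hop pair such as A and D can end up in the same sequence.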

Figure: Weighted product graph

c) Embedding

The sequences generated by weighted walks are transformed into item‑item pairs and fed into a supervised DNN embedding model, overcoming the evaluation limitations of unsupervised methods. The model learns to predict the likelihood that two items should be bundled.
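The article does not say how walk sequences become labeled training examples. One common approach, shown here purely as an assumption, treats items that co‑occur within a sliding window as positive pairs and samples random non‑co‑occurring items as negatives. All names (`walk_to_pairs`, `labeled_examples`, the toy catalog) are hypothetical.

```python
import random

def walk_to_pairs(walk, window):
    """Emit (center, context) pairs for items within `window` steps of each other."""
    pairs = []
    for i, center in enumerate(walk):
        lo, hi = max(0, i - window), min(len(walk), i + window + 1)
        for j in range(lo, hi):
            if j != i and walk[j] != center:
                pairs.append((center, walk[j]))
    return pairs

def labeled_examples(walks, catalog, window, negatives_per_positive, rng):
    """Build (item_a, item_b, label) rows: 1 for window pairs, 0 for sampled negatives."""
    positives = set()
    for walk in walks:
        positives.update(walk_to_pairs(walk, window))
    examples = [(a, b, 1) for a, b in sorted(positives)]
    for a, _ in sorted(positives):
        # Negatives are items never seen in a window with `a` (an assumed heuristic).
        candidates = [c for c in catalog if c != a and (a, c) not in positives]
        count = min(negatives_per_positive, len(candidates))
        examples.extend((a, c, 0) for c in rng.sample(candidates, count))
    return examples

walks = [["A", "C", "D"], ["B", "A", "C"]]
catalog = ["A", "B", "C", "D", "E"]
examples = labeled_examples(
    walks, catalog, window=1, negatives_per_positive=1, rng=random.Random(0)
)
```

Widening the window to 2 would also emit the two‑hop pair (A, D) from the first walk, which is the kind of relationship direct co‑purchase counts would not contain.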

Figure: Supervised embedding model

Implementation

Offline

Training: 50 days of historical transaction data (≈30 M items, 200 M edges) are processed on the ODPS graph platform with weighted walks, producing 200 M item‑item pairs. A TensorFlow model on PAI then trains for about 2 hours until convergence.

Prediction: Scores are computed for billions of candidate pairs across the entire catalog.

Online Serving: For each seed item, the top‑N bundled items are indexed in the search engine and retrieved at query time.
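The step from scored pairs to a per‑seed top‑N list can be sketched as below. The scores and item names are invented, and `build_topn_index` is a hypothetical helper; in production the result would be written to the search engine rather than kept in a dictionary.

```python
import heapq
from collections import defaultdict

def build_topn_index(scored_pairs, n):
    """Map each seed item to its n highest-scoring bundle candidates, best first."""
    by_seed = defaultdict(list)
    for seed, candidate, score in scored_pairs:
        by_seed[seed].append((score, candidate))
    return {
        seed: [candidate for _, candidate in heapq.nlargest(n, candidates)]
        for seed, candidates in by_seed.items()
    }

# Hypothetical model scores for (seed, candidate) pairs.
scored_pairs = [
    ("phone", "case", 0.92),
    ("phone", "charger", 0.81),
    ("phone", "shampoo", 0.05),
    ("case", "screen_film", 0.77),
]
index = build_topn_index(scored_pairs, n=2)
```

Precomputing the list offline keeps the online path to a single lookup per seed item.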

Real‑time

Real‑time logs are streamed through Porsche, transformed into the same graph format, and fed to the ODPS graph platform for weighted walks. The resulting sequences are scored by the DNN model and the top bundles are written back to the recommendation engine within minutes.
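As a rough illustration of the real‑time path, the sketch below applies a small batch of new orders to an existing weighted graph and reports which items had edges change, so that only those seeds need fresh walks and re‑scoring. This is an assumed design: the article does not describe how Porsche or the graph platform apply updates, and `apply_order_batch` is a hypothetical name.

```python
from itertools import combinations

def apply_order_batch(graph, orders):
    """Add co-purchase counts from new orders in place; return the touched items."""
    touched = set()
    for items in orders:
        for a, b in combinations(sorted(set(items)), 2):
            # Update both directions so the graph stays symmetric.
            for src, dst in ((a, b), (b, a)):
                neighbors = graph.setdefault(src, {})
                neighbors[dst] = neighbors.get(dst, 0) + 1
            touched.update((a, b))
    return touched

# Hypothetical existing graph and a micro-batch of streamed orders.
graph = {"phone": {"case": 2}, "case": {"phone": 2}}
touched = apply_order_batch(graph, [["phone", "case"], ["case", "earbuds"]])
```

Restricting downstream work to the touched items is one way a system could keep end‑to‑end latency in the range of minutes.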

Experiments and Results

Click‑through Rate

The offline bundle algorithm increased IPV (item page views) by 13% over the baseline; the real‑time version added a further 4% uplift.

Figure: CTR improvement

Richness

Bundle mining raised average exposure of leaf categories by 88% and first‑level categories by 43%.

Figure: Richness improvement

Summary

The weighted graph’s propagation ability captures multi‑hop purchase relationships that traditional item‑to‑item similarity misses, significantly improving coverage and conversion. Offline experiments show a clear AUC gain over statistical baselines, and the system can update up to 100 k edges per minute, supporting high‑traffic events like Double‑11.

Future Work

Planned enhancements include real‑time price preview for bundle progress, better seed‑item capture, and exploration of graph‑bandit methods to introduce novel yet relevant items while balancing exploration and exploitation.
