How Alibaba’s Graph‑Based Bundle Mining Doubles Conversion in E‑Commerce
Alibaba’s latest bundle‑mining system uses weighted graph embedding and real‑time sampling to recommend complementary products, replacing traditional item‑to‑item similarity. During the Double‑11 promotion, the offline algorithm raised IPV by 13% over the baseline and the real‑time version added a further 4%, on a graph of roughly 200 M edges with billions of scored candidate pairs.
Background
Bundle purchasing is a crucial step in the shopping‑coupon flow, helping users find items that meet a discount threshold (e.g., spend 400 and get 50 off). Full‑discount coupons are the most widely used promotion during large sales because they increase both user savings and average order value.
Alibaba’s recent redesign of the bundle feature introduced two major changes. First, the product page now supports search, price and category filters, and sorting. Second, the algorithm shifted from traditional item‑to‑item similarity to a graph‑based bundle mining approach that discovers multi‑hop purchase relationships.
Core Algorithm
1. Basic Idea
Graphs provide a high‑level abstraction for representing entities (nodes) and their relationships (edges). User purchase behavior naturally forms a graph where nodes are items and edges represent co‑purchase events, weighted by frequency, time, or amount.
Graph embedding learns low‑dimensional vectors for nodes, enabling efficient similarity computation. Techniques such as DeepWalk, LINE, and Node2Vec sample random walks on the graph and train embeddings with Skip‑Gram models.
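To make "efficient similarity computation" concrete, here is a minimal sketch of ranking items by cosine similarity between their vectors. The three‑dimensional embeddings and item names are invented for illustration; the article does not say which similarity measure Alibaba uses, so cosine is an assumption.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_u * norm_v)

# Hypothetical embeddings; real ones would be learned from the graph.
embeddings = {
    "phone": [0.9, 0.1, 0.0],
    "phone_case": [0.8, 0.2, 0.1],
    "garden_hose": [0.0, 0.1, 0.9],
}

def most_similar(seed, embeddings, top_n=2):
    """Rank every other item by cosine similarity to the seed item."""
    scores = [
        (other, cosine_similarity(embeddings[seed], vec))
        for other, vec in embeddings.items()
        if other != seed
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]

ranked = most_similar("phone", embeddings)
```

With these made‑up vectors, "phone_case" ranks above "garden_hose" for the seed "phone" because its vector points in nearly the same direction.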
2. Main Techniques
a) Graph Construction
A weighted item graph is built: vertices are products, edges connect items bought together, and edge weights reflect co‑purchase counts, timestamps, or monetary value. Walking the graph in proportion to these weights biases sampling toward frequently co‑purchased pairs, so that rare, noisy edges do not dominate the sampled sequences.
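The construction step can be sketched as follows, assuming the simplest weighting the article mentions: the number of orders in which two items appear together. The function name, the order format (a list of item IDs per order), and the sample data are hypothetical; the production system runs on the ODPS graph platform rather than in‑memory Python.

```python
from collections import defaultdict
from itertools import combinations

def build_copurchase_graph(orders):
    """Return {item: {neighbor: weight}}, weight = number of shared orders."""
    graph = defaultdict(lambda: defaultdict(int))
    for items in orders:
        # Deduplicate within an order so repeated quantities do not
        # inflate the co-purchase count.
        for a, b in combinations(sorted(set(items)), 2):
            graph[a][b] += 1
            graph[b][a] += 1  # undirected: store both directions
    return graph

# Hypothetical orders, each a list of item IDs bought together.
orders = [
    ["phone", "case", "charger"],
    ["phone", "case"],
    ["case", "screen_protector"],
]
graph = build_copurchase_graph(orders)
```

Here the phone/case edge ends up with weight 2 and the phone/charger edge with weight 1. Weighting by recency or order amount, which the article also mentions, would replace the `+= 1` with a different increment.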
b) Sampling
Traditional random walks treat all neighbors equally, which is unsuitable for a product graph with millions of nodes and many low‑frequency edges. A weighted walk instead samples neighbors in proportion to edge weight, so strongly co‑purchased items are more likely to appear in the walk. Because a walk spans several hops, it also captures multi‑hop relationships (e.g., A → C → D), linking A and D through C even if they were never bought together.
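A minimal sketch of such a weighted walk, using the standard library's `random.choices` to pick the next node in proportion to edge weight. The graph, the walk length, and the seed value are illustrative assumptions, and the article does not describe restart or return parameters, so none are modeled here.

```python
import random

def weighted_walk(graph, start, length, rng=random):
    """Walk up to `length` nodes, choosing each next hop by edge weight."""
    walk = [start]
    current = start
    for _ in range(length - 1):
        neighbors = graph.get(current)
        if not neighbors:
            break  # dead end: no outgoing edges
        nodes = list(neighbors)
        weights = [neighbors[n] for n in nodes]
        current = rng.choices(nodes, weights=weights, k=1)[0]
        walk.append(current)
    return walk

# Hypothetical graph: A is bought with C far more often than with B.
graph = {
    "A": {"C": 9, "B": 1},
    "B": {"A": 1},
    "C": {"A": 9, "D": 5},
    "D": {"C": 5},
}
rng = random.Random(42)  # seeded only to make the sketch repeatable
walks = [weighted_walk(graph, "A", 4, rng) for _ in range(1000)]
```

Starting from A, about nine walks in ten step to C first, and some of those continue to D, which is how the A → C → D relationship enters the training data even though A and D share no edge.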
c) Embedding
The sequences generated by weighted walks are transformed into item‑item pairs and fed into a supervised DNN embedding model, overcoming the evaluation limitations of unsupervised methods. The model learns to predict the likelihood that two items should be bundled.
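The article does not say how sequences become item‑item pairs. One common choice, assumed here, is a Skip‑Gram style sliding window: every two items within `window` steps of each other in a walk form a positive pair. The function name and window size are hypothetical, and the labeling and negative sampling needed for supervised training are left out because the source does not describe them.

```python
def walks_to_pairs(walks, window=2):
    """Emit (center, context) pairs for items within `window` steps."""
    pairs = []
    for walk in walks:
        for i, center in enumerate(walk):
            lo = max(0, i - window)
            hi = min(len(walk), i + window + 1)
            for j in range(lo, hi):
                # Skip the center itself and self-pairs from revisits.
                if j != i and walk[j] != center:
                    pairs.append((center, walk[j]))
    return pairs

pairs = walks_to_pairs([["A", "C", "D"]], window=2)
```

With a window of 2, the walk A → C → D yields the two‑hop pair (A, D) in addition to the direct pairs; with a window of 1 it would not.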
Implementation
Offline
Training: Historical 50‑day transaction data (≈30 M items, 200 M edges) are processed on the ODPS graph platform to perform weighted walks, producing 200 M item‑item pairs. A TensorFlow model on PAI trains for ~2 hours until convergence.
Prediction: Scores are computed for billions of candidate pairs across the entire catalog.
Online Serving: For each seed item, the top‑N bundled items are indexed in the search engine and retrieved during user queries.
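The serving step above amounts to keeping, for each seed item, only the N highest‑scoring candidates. A minimal in‑memory sketch follows; the tuple layout `(seed, candidate, score)`, the function name, and the scores are assumptions, and the real system writes this index into a search engine rather than a Python dict.

```python
import heapq
from collections import defaultdict

def build_topn_index(scored_pairs, n):
    """Map each seed item to its n best candidates, best first."""
    by_seed = defaultdict(list)
    for seed, candidate, score in scored_pairs:
        by_seed[seed].append((score, candidate))
    return {
        seed: [cand for _, cand in heapq.nlargest(n, cands)]
        for seed, cands in by_seed.items()
    }

# Hypothetical model scores for candidate bundles.
scored_pairs = [
    ("phone", "case", 0.92),
    ("phone", "charger", 0.85),
    ("phone", "garden_hose", 0.03),
    ("case", "screen_protector", 0.77),
]
index = build_topn_index(scored_pairs, n=2)
```

A lookup such as `index["phone"]` then returns the precomputed bundle candidates without scoring anything at query time.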
Real‑time
Real‑time logs are streamed through Porsche, transformed into the same graph format, and fed to the ODPS graph platform for weighted walks. The resulting sequences are scored by the DNN model and the top bundles are written back to the recommendation engine within minutes.
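One way to picture the real‑time path is as a micro‑batch merge: new co‑purchase events adjust edge weights in the existing graph, and only the items whose neighborhoods changed need fresh walks and rescoring. This is an illustrative assumption, not a description of how Porsche or the ODPS graph platform actually work; the event format `(item_a, item_b, weight_delta)` and the function name are hypothetical.

```python
def apply_edge_updates(graph, events):
    """Merge (item_a, item_b, weight_delta) events; return touched items."""
    touched = set()
    for a, b, delta in events:
        if a == b:
            continue  # ignore self-loops
        graph.setdefault(a, {})
        graph.setdefault(b, {})
        graph[a][b] = graph[a].get(b, 0) + delta
        graph[b][a] = graph[b].get(a, 0) + delta
        touched.update((a, b))
    return touched

# Hypothetical existing graph and one micro-batch of new events.
graph = {"phone": {"case": 2}, "case": {"phone": 2}}
events = [("phone", "case", 1), ("phone", "earbuds", 1)]
touched = apply_edge_updates(graph, events)
```

After this batch the phone/case weight is 3, a new phone/earbuds edge exists, and `touched` lists the three items that would be candidates for re‑walking.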
Experiments and Results
Click‑through Rate
The offline bundle algorithm increased IPV (item page views) by 13% compared to the baseline; the real‑time version added another 4% uplift.
Richness
Bundle mining raised average exposure of leaf categories by 88% and first‑level categories by 43%.
Summary
The weighted graph’s propagation ability captures multi‑hop purchase relationships that traditional item‑to‑item similarity misses, significantly improving coverage and conversion. Offline experiments show a clear AUC gain over statistical baselines, and the system can update up to 100 k edges per minute, supporting high‑traffic events like Double‑11.
Future Work
Planned enhancements include real‑time price preview for bundle progress, better seed‑item capture, and exploration of graph‑bandit methods to introduce novel yet relevant items while balancing exploration and exploitation.
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.
