
How Alibaba’s Brand‑Level Ranking Boosts E‑Commerce Clicks with Attention‑GRU

This article presents Alibaba’s first brand‑level ranking system that personalizes product ordering by modeling user brand preferences with an enhanced Attention‑GRU, detailing feature engineering, model improvements, extensive offline experiments on a massive Tmall dataset, and a successful online A/B test that increased CTR, ATIP, and GMV.

Alibaba Cloud Developer

Jul 20, 2018

1.1 Introduction

On e‑commerce platforms such as Taobao, brands play an increasingly important role in users' click and purchase decisions because users associate brands with product quality. Existing ranking systems do not model brand preference, so results mix items from many brands and users must spend extra effort browsing to find the ones they prefer.

We propose the first brand‑level ranking system that aggregates items of the same brand and orders brands according to individual user preferences.

1.2 Related Work

1.2.1 RNN, GRU and Attention‑GRU

Recurrent neural networks (RNNs) have shown strong performance on sequential data. The gated recurrent unit (GRU) mitigates the vanishing‑gradient problem and is computationally efficient. Attention‑GRU adds an attention mechanism that assigns different weights to different behaviors in the sequence.
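For readers who want the recurrences behind these terms, the GRU update (in one common convention; some papers swap the roles of \(z_m\) and \(1 - z_m\)) and a generic additive attention layer over the hidden states can be written as below. The notation is assumed here rather than taken from the article: \(x_m\) is the input at step \(m\), \(h_m\) the hidden state, \(L\) the sequence length, and \(W\), \(U\), \(b\), \(v\) are learned parameters. The attention form is one common choice, not necessarily the exact variant used in the system.

\[
\begin{aligned}
z_m &= \sigma(W_z x_m + U_z h_{m-1} + b_z) \\
r_m &= \sigma(W_r x_m + U_r h_{m-1} + b_r) \\
\tilde{h}_m &= \tanh\big(W_h x_m + U_h (r_m \odot h_{m-1}) + b_h\big) \\
h_m &= (1 - z_m) \odot h_{m-1} + z_m \odot \tilde{h}_m
\end{aligned}
\]

\[
e_m = v^\top \tanh(W_a h_m + b_a), \qquad
\alpha_m = \frac{\exp(e_m)}{\sum_{k=1}^{L} \exp(e_k)}, \qquad
c = \sum_{m=1}^{L} \alpha_m h_m
\]

Here \(z_m\) and \(r_m\) are the update and reset gates, and \(c\) is the attention‑weighted summary of the sequence that a final prediction layer would consume.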

1.2.2 RNN models for behavior modeling

Previous work focuses on session‑based RNNs or basket recommendation; our approach differs by integrating brand‑level signals and the time intervals between behaviors.

1.3 Task Definition and Model Adaptation

Let \(U\) be the set of \(M\) users and \(B\) the set of \(N\) brands. For each user \(u \in U\) we record a behavior sequence \(\{(b_m, a_m, t_m)\}_{m=1}^L\) of length \(L\), where \(b_m \in B\) is a brand, \(a_m\) is the action type (click or purchase), and \(t_m\) is the timestamp. The goal is to predict the probability that user \(u\) will perform an action on brand \(b\) at a future time.

We adapt traditional RNN models by encoding brand features, action‑type one‑hot vectors, and time‑gap information, and by extending the GRU with attention and a time gate (Time‑Attention‑GRU).
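The inputs described above can be sketched as a small data structure. This is a minimal illustration, assuming timestamps are integer seconds; the names `Behavior`, `one_hot_action`, and `time_gaps` are hypothetical and do not come from the article.

```python
from dataclasses import dataclass
from typing import List, Sequence

ACTIONS = ("click", "purchase")  # the two action types named in the text


@dataclass(frozen=True)
class Behavior:
    """One element (b_m, a_m, t_m) of a user's behavior sequence."""
    brand_id: int   # b_m
    action: str     # a_m, one of ACTIONS
    timestamp: int  # t_m, assumed here to be in seconds


def one_hot_action(action: str) -> List[float]:
    """Encode the action type as a one-hot vector over ACTIONS."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    return [1.0 if action == name else 0.0 for name in ACTIONS]


def time_gaps(sequence: Sequence[Behavior]) -> List[int]:
    """Gap between each behavior and the previous one; the first gap is 0."""
    if not sequence:
        return []
    gaps = [0]
    for prev, cur in zip(sequence, sequence[1:]):
        gaps.append(cur.timestamp - prev.timestamp)
    return gaps


# Toy sequence with made-up brand ids and timestamps.
history = [
    Behavior(brand_id=12, action="click", timestamp=1000),
    Behavior(brand_id=12, action="purchase", timestamp=1600),
    Behavior(brand_id=7, action="click", timestamp=5200),
]
gaps = time_gaps(history)
encoded_actions = [one_hot_action(b.action) for b in history]
```

Keeping the gap as a separate per-step value, rather than folding it into the brand input, is what lets a time gate consume it later.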

1.4 Brand‑Level Ranking System

1.4.1 Feature Engineering

We design brand features based on a seven‑level price hierarchy and eight e‑commerce metrics per price level, yielding a 56‑dimensional vector (7 × 8) for each brand.
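The 56‑dimensional layout can be illustrated as a flattened 7 × 8 grid. The sketch below assumes a price‑level‑major ordering and uses placeholder numbers; the article does not name the eight metrics, so none are named here, and `brand_feature_vector` is a hypothetical helper.

```python
from typing import List, Sequence

NUM_PRICE_LEVELS = 7  # from the article
NUM_METRICS = 8       # from the article; the metrics themselves are not listed


def brand_feature_vector(metrics_by_level: Sequence[Sequence[float]]) -> List[float]:
    """Flatten a 7 x 8 grid (price level x metric) into one 56-dim vector."""
    if len(metrics_by_level) != NUM_PRICE_LEVELS:
        raise ValueError("expected one row per price level")
    vector: List[float] = []
    for metrics in metrics_by_level:
        if len(metrics) != NUM_METRICS:
            raise ValueError("expected eight metrics per price level")
        vector.extend(float(m) for m in metrics)
    return vector


# Placeholder values only: row index = price level, column index = metric.
example_grid = [
    [level + 0.1 * k for k in range(NUM_METRICS)]
    for level in range(NUM_PRICE_LEVELS)
]
features = brand_feature_vector(example_grid)
```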

1.4.2 Model Design

We build an Attention‑GRU model and introduce three key improvements:

1. Integrate heuristic brand features with learned brand embeddings.

2. Model different action types (click vs. purchase) with separate matrices to capture their interaction with brands.

3. Incorporate a time gate to model intervals between behaviors, forming the Time‑Attention‑GRU.

These improvements are illustrated in Figure 1.4.
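Since the article gives no formulas for the three improvements, the following is only a sketch of one plausible form for each. Every function name, every parameter shape, the additive combination in `brand_input`, and the log‑scaled gap in `time_gate` are assumptions made for illustration, not the published design.

```python
import math
from typing import Dict, List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def brand_input(embedding: Vector, features: Vector, w_feat: Matrix) -> Vector:
    """Improvement 1 (assumed form): learned embedding plus projected heuristic features."""
    projected = matvec(w_feat, features)
    return [e + p for e, p in zip(embedding, projected)]


def action_aware_input(brand_vec: Vector, action: str,
                       action_mats: Dict[str, Matrix]) -> Vector:
    """Improvement 2 (assumed form): a separate matrix per action type."""
    return matvec(action_mats[action], brand_vec)


def time_gate(x: Vector, gap_seconds: float, w_x: Matrix, w_t: Vector) -> Vector:
    """Improvement 3 (assumed form): a gate in (0, 1) driven by the input and the time gap."""
    scaled_gap = math.log1p(gap_seconds)  # compress large gaps; an assumption
    pre = matvec(w_x, x)
    return [sigmoid(p + wt * scaled_gap) for p, wt in zip(pre, w_t)]


# Toy parameters, chosen only so the arithmetic is easy to follow.
embedding = [0.5, -0.5]
features = [1.0, 2.0, 3.0]
w_feat = [[0.1, 0.0, 0.0], [0.0, 0.1, 0.0]]
x = brand_input(embedding, features, w_feat)

action_mats = {
    "click": [[1.0, 0.0], [0.0, 1.0]],
    "purchase": [[2.0, 0.0], [0.0, 2.0]],
}
x_purchase = action_aware_input(x, "purchase", action_mats)

gate = time_gate(x_purchase, 3600.0, [[0.0, 0.0], [0.0, 0.0]], [1.0, -1.0])
```

In a full model the gate would scale how much of the candidate state enters the GRU update, so a long gap can weaken or strengthen the influence of an old behavior depending on the learned sign of `w_t`.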

1.5 Offline Experiments

1.5.1 Dataset

We collected a large Tmall dataset containing 3,591,372 users, 90,529 brands and 82,960,693 interactions. Each user’s sequence is split into short sequences of length 11 for training.
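The article does not say how the length‑11 sequences are cut, so the sketch below assumes overlapping sliding windows with a configurable stride; `split_into_windows` is a hypothetical helper, and sequences shorter than the window are simply dropped.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_into_windows(sequence: Sequence[T], window: int = 11,
                       stride: int = 1) -> List[Sequence[T]]:
    """Cut one long behavior sequence into fixed-length short sequences."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    last_start = len(sequence) - window
    return [sequence[i:i + window] for i in range(0, last_start + 1, stride)]


# A toy sequence of 13 behaviors (represented by their indices).
windows = split_into_windows(list(range(13)))
```

With `stride=window` the same function produces non‑overlapping chunks instead, which is the other obvious reading of the sentence above.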

1.5.2 Baselines and Metrics

Baselines include GRU, vanilla Attention‑GRU, Time‑LSTM, Session‑RNN, and libFM. We evaluate using AUC and F1.
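As a reminder of what the two metrics measure, here is a dependency‑free sketch. AUC is computed as the fraction of positive/negative pairs ranked correctly (ties count half), and F1 uses a 0.5 decision threshold, which is an assumption since the article does not state one. The labels and scores are toy values, not results from the paper.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive is scored above a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1_score(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Harmonic mean of precision and recall at the given threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example: two positives, two negatives.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.4, 0.35, 0.1]
toy_auc = auc(toy_labels, toy_scores)
toy_f1 = f1_score(toy_labels, toy_scores)
```

The pairwise AUC is quadratic in the number of examples, so at the scale of the dataset described above a rank‑based implementation would be the practical choice.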

1.5.3 Results

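Attention‑GRU‑3M, the Attention‑GRU with all three improvements, outperforms all baselines, confirming the benefit of attention and of the three improvements. Ablation studies show that removing any single improvement degrades performance, with the largest drop when improvement 1 (combining heuristic brand features with learned brand embeddings) is removed.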

1.6 Online Experiments

We conducted a 7‑day A/B test on Tmall, comparing the original ranking with a version that adds a “Brand” button to switch to the brand‑level ranking. The new version achieved a higher click‑through rate (CTR) and ATIP, and a 3.51% increase in gross merchandise volume (GMV).
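A percentage increase of this kind is usually a relative lift of the treatment bucket over the control bucket. The sketch below assumes that definition; the article does not spell out how the figure was computed, and the numbers in the example are hypothetical, not Tmall data.

```python
def relative_lift(control: float, treatment: float) -> float:
    """Relative change of the treatment metric with respect to control."""
    if control == 0:
        raise ValueError("control value must be non-zero")
    return (treatment - control) / control


# Hypothetical metric values: a 5% lift.
lift = relative_lift(control=100.0, treatment=105.0)
```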

1.7 Conclusion

We introduced a brand‑level ranking system that personalizes brand ordering by combining carefully engineered brand features with an enhanced Attention‑GRU (Attention‑GRU‑3M). Offline and online experiments demonstrate significant improvements in user engagement and revenue. Future work may extend the framework to triple‑wise relations among users, tags, and brands.

