From Zero to One: Building and Optimizing Dropdown Recommendation in Shopee Chatbot
The article details Shopee Chatbot’s end‑to‑end development of a dropdown recommendation (auto‑completion) feature: the retrieve‑then‑rank architecture with BM25 and vector recall, multilingual pre‑training and distillation, DeepFM‑based ranking, experimental gains in CTR and conversion, deployment infrastructure, business impact, and future directions.
Business background: As Shopee expands, the volume of customer service queries grows. The Chatbot team aims to combine an AI‑driven chatbot with human agents, and dropdown recommendation becomes a key feature for helping users express their intent faster.
Overall solution: A classic retrieve‑then‑rank pipeline is adopted. User input triggers multi‑way recall (textual BM25‑based retrieval and vector‑based retrieval) to generate candidate suggestions, followed by a ranking model that outputs the top‑5 suggestions.
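The retrieve‑then‑rank flow can be sketched as follows; the function names and the toy recall/rank callables are illustrative, not from the article:

```python
from typing import Callable, Dict, List

def recommend(query: str,
              recalls: List[Callable[[str], Dict[str, float]]],
              rank: Callable[[str, List[str]], Dict[str, float]],
              top_k: int = 5) -> List[str]:
    """Retrieve-then-rank: union candidates from every recall
    channel, then let the ranking model pick the top-k."""
    candidates = set()
    for recall in recalls:
        candidates.update(recall(query))      # each recall returns {suggestion: score}
    scores = rank(query, sorted(candidates))  # ranker returns {suggestion: score}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

The recall channels only need to produce candidate sets; their raw scores can be passed on to the ranker as features rather than used directly for the final ordering.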
Candidate pool construction: Data sources include solution titles, intent‑labelled data, and large chat logs. After cleaning (length filtering, duplicate removal via edit distance or clustering) and manual market‑specific review, a high‑quality pool of suggestions is built.
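A minimal sketch of the length‑filter plus near‑duplicate removal step, using a character‑level similarity ratio in place of raw edit distance; the thresholds here are assumptions, not values from the article:

```python
from difflib import SequenceMatcher

def dedup_suggestions(suggestions, min_len=5, max_len=60, sim_threshold=0.9):
    """Length-filter candidates, then greedily drop near-duplicates
    whose character-level similarity to an already-kept suggestion
    exceeds the threshold."""
    kept = []
    for s in suggestions:
        s = s.strip()
        if not (min_len <= len(s) <= max_len):
            continue  # too short or too long to be a useful suggestion
        if any(SequenceMatcher(None, s.lower(), k.lower()).ratio() >= sim_threshold
               for k in kept):
            continue  # near-duplicate of a kept suggestion
        kept.append(s)
    return kept
```

The greedy pass is quadratic in pool size; for large chat‑log pools, clustering on embeddings (as the article mentions) scales better.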
Multi‑way recall:
• Textual recall uses Elasticsearch BM25 scores weighted by solution CTR.
• Vector recall employs dense embeddings to match semantically similar queries and suggestions, handling cross‑language cases (e.g., a Chinese query retrieving an English suggestion). Two encoder approaches are explored: a dual‑tower model and a pretrained multilingual language model (XLM).
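The article does not give the exact CTR weighting formula; a simple multiplicative boost is one common choice and could look like this (the function name and `alpha` are assumptions):

```python
def weighted_textual_score(bm25_score: float, ctr: float, alpha: float = 1.0) -> float:
    """Boost the Elasticsearch BM25 relevance score by the suggestion's
    historical CTR; alpha controls the strength of the boost, and a
    CTR of zero leaves the BM25 score unchanged."""
    return bm25_score * (1.0 + alpha * ctr)
```

A multiplicative form keeps the relevance ordering from BM25 dominant while letting historically popular suggestions edge ahead of near‑ties.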
Multilingual & multi‑task pretraining: XLM is further pre‑trained on large unlabeled chat logs (masked language modeling), then fine‑tuned on intent classification using weakly labelled click logs, and finally on the downstream intent task. Knowledge distillation compresses the large teacher model into a lightweight student (e.g., TextCNN) for online serving.
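The distillation objective is not spelled out in the article; the standard Hinton‑style soft‑target term, shown here in NumPy under that assumption, matches the student's distribution to the teacher's temperature‑smoothed distribution:

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-smoothed softmax; higher T flattens the distribution."""
    z = np.asarray(logits, dtype=float) / T
    z -= z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Soft cross-entropy between the teacher's and student's
    temperature-smoothed distributions, scaled by T^2 as in
    Hinton-style knowledge distillation."""
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    return float(-(p_teacher * np.log(p_student + 1e-12)).sum() * T * T)
```

In practice this term is combined with the ordinary cross‑entropy on hard labels, and the lightweight TextCNN student is what serves online.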
Ranking module: Besides the recall score, additional features (user, query, solution ID, statistics) are fed into a DeepFM model. Variants with different cross‑encoders (ALBERT, RE2, ESIM) are evaluated. Multi‑objective ranking (CTR + conversion rate) is explored using ESMM and ESMM+MMoE architectures.
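The FM component of DeepFM captures pairwise feature interactions in linear time via the classic sum‑of‑squares identity; a NumPy sketch of that second‑order term (the function name is illustrative):

```python
import numpy as np

def fm_second_order(embeddings: np.ndarray) -> float:
    """Second-order interaction term of the FM component in DeepFM:
    0.5 * sum_k [ (sum_i v_ik)^2 - sum_i v_ik^2 ],
    where `embeddings` has shape (num_active_features, embed_dim).
    Equals the sum of dot products over all feature pairs, computed
    in O(n * k) instead of O(n^2 * k)."""
    sum_sq = embeddings.sum(axis=0) ** 2    # (sum_i v_i)^2, per dimension
    sq_sum = (embeddings ** 2).sum(axis=0)  # sum_i v_i^2, per dimension
    return float(0.5 * (sum_sq - sq_sum).sum())
```

With only two active features, the term reduces to their embedding dot product, which makes the identity easy to sanity‑check.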
Experiments:
1. On the intent‑recognition task, continued pre‑training improves accuracy from 0.642 to 0.673, with only a 1% drop after distillation.
2. On the recall task, vector recall (SentBERT or a distilled TextCNN) outperforms BM25 (recall@5: 0.841 vs 0.834).
3. CTR‑prediction experiments indicate DeepFM+ESIM achieves the best NDCG@5 (0.793).
4. Multi‑goal ranking experiments show ESMM variants improve both CTR and CVR AUC over a pure CTR model.
System implementation: Offline pipelines periodically retrain the text encoder and ranking model using exposure and click logs. Online serving uses ONNX‑deployed models, Redis caching for frequent queries, Elasticsearch for text recall, and Faiss (HNSW index) for vector recall. The architecture supports easy A/B testing of different recall and ranking components.
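The hot‑query shortcut can be illustrated with an in‑process LRU cache standing in for Redis; in production the cache would be Redis keyed by the normalized query, and `retrieve_and_rank` here is a hypothetical stand‑in for the full pipeline:

```python
from functools import lru_cache

CALLS = {"count": 0}  # counts how often the expensive pipeline actually runs

def retrieve_and_rank(query: str) -> tuple:
    """Stand-in for the full multi-way recall + ranking pipeline."""
    CALLS["count"] += 1
    return ("suggestion for " + query,)

@lru_cache(maxsize=10_000)
def cached_suggestions(query: str) -> tuple:
    """Frequent queries hit the cache and skip recall + ranking entirely."""
    return retrieve_and_rank(query)
```

Since suggestion pools and models are retrained periodically, a real cache would also need a TTL or explicit invalidation on model refresh, which Redis provides natively.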
Business impact: Launched in June 2021 and rolled out to all markets by September 2021. Results include a 1% increase in overall solution rate, a 2% lift in CTR after multi‑way recall, and up to a 6% CTR improvement from advanced ranking models.
Future work: Expand the candidate pool, improve multilingual recall, incorporate more user and context features for personalization, explore additional multi‑task learning methods, and address cold‑start challenges with knowledge‑base exploration.
Shopee Tech Team
How to innovate and solve technical challenges in diverse, complex overseas scenarios? The Shopee Tech Team will explore cutting‑edge technology concepts and applications with you.