Beam Search Aware Training for Optimal Tree-Based Retrieval Models
This article presents a comprehensive study of tree-based deep models for large-scale matching, introduces the theoretical framework of optimal tree models, proposes a Beam Search aware training algorithm (BSAT/OTM) to address training-test mismatch, and demonstrates significant recall improvements on Amazon Books and UserBehavior datasets.
The recall stage is a critical component in internet search, recommendation, and advertising systems, determining overall service quality. Traditional approaches evolved from heuristic rules to vector retrieval, but both limited model capacity. In 2017, Alibaba's targeted advertising team introduced a new generation of deep tree matching (TDM) that enables arbitrary complex models to perform full‑library optimal retrieval.
TDM decomposes large‑scale matching into three modules—model, index, and retrieval—using a tree index and beam search to achieve near‑optimal recall. Subsequent versions (TDM 2.0 and BSAT) jointly optimize the tree structure and scoring model, addressing the training‑test distribution mismatch that degrades recall performance.
The paper formalizes the optimal tree model problem, proves existence for any probability distribution, and defines a loss function satisfying top‑k calibration. Because the arg‑top‑k operator is non‑differentiable, two approximations are introduced: estimating optimal node labels with the current model and fixing beam‑search results during partial optimization.
A practical training algorithm (OTM) is derived, combining the calibrated loss with weighted binary cross‑entropy over tree nodes. Experiments on the large public datasets Amazon Books and UserBehavior compare OTM with baselines such as Item‑CF, YouTube product‑DNN, hierarchical softmax (HSM), PLT, and the previous JTM method.
Results show OTM achieves the highest recall, improving over JTM by 29.8% on Amazon Books and 6.3% on UserBehavior. Ablation studies confirm that using beam‑search generated nodes for training, rather than heuristic node sets, drives the performance gains.
The study concludes that incorporating the retrieval component into model training substantially enhances recall quality, and outlines future directions such as end‑to‑end trainable retrieval modules and deeper joint optimization of model, index, and retrieval.
Additionally, the article includes a recruitment notice for Alibaba’s targeted advertising algorithm team, inviting interested candidates to contact [email protected].
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.