Joint Optimization of Tree‑based Index and Deep Model (JTM) for Large‑Scale Recommendation

This article presents JTM, a framework that jointly learns a tree-based index and a deep scoring model instead of building them separately, as traditional recommendation pipelines do. With hierarchical user interest modeling and an efficient tree learning algorithm, JTM improves recall over the baselines on the Amazon Books and Alibaba UserBehavior datasets.

DataFunTalk

Search, recommendation, and advertising are core services for internet content providers, and they are just as critical on Alibaba's e-commerce platform. The paper introduces a new generation of recommendation technology that optimizes the model, the index, and the retrieval algorithm together.

Background : Recommendation, search, and advertising share similar technical components; search can be viewed as recommendation with a query constraint, and advertising adds a price constraint. Recommendation algorithms have advanced from Item-CF to vector retrieval, but vector retrieval has hit a ceiling because inner-product scoring has limited expressiveness.

The authors previously proposed the Tree-based Deep Model (TDM), which achieved notable gains on large-scale tasks; the JTM paper that follows up on it was accepted at NeurIPS 2019. However, existing systems often design the model, the index, and the retrieval algorithm independently, which leads to sub-optimal performance.

Problems in Existing Systems : In large-scale scenarios, a recommendation pipeline consists of three components: the model, the index, and the retrieval algorithm. Item-CF uses a handcrafted inverted index with a fixed scoring rule. Vector retrieval relies on inner-product similarity, which rules out more expressive scoring models. TDM allows arbitrary scoring models and learns the model and the tree structure alternately, but the two are trained toward different objectives, so model learning and tree construction can work against each other.

To address these issues, the paper proposes JTM (Joint Optimization of Tree‑based Index and Deep Model), which jointly optimizes the deep scoring model θ and the tree projection function π under a unified loss, alternating between gradient‑based model updates and combinatorial tree optimization.
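The unified loss described above can be written down compactly. The following is a sketch of one form consistent with that description, not a verbatim quotation of the paper; the notation is an assumption made here: $(u_i, c_i)$ are the $n$ positive user-item pairs, $\pi(c_i)$ is the leaf that item $c_i$ is mapped to, $b_j(\cdot)$ returns a node's ancestor at level $j$, $l_{\max}$ is the leaf level, and $\hat{p}$ is the model's estimated preference probability.

```latex
% Joint loss: negative log-likelihood of every ancestor of each positive item
\mathcal{L}(\theta, \pi)
  = -\sum_{i=1}^{n} \sum_{j=0}^{l_{\max}}
    \log \hat{p}\bigl(b_j(\pi(c_i)) \mid u_i;\, \theta\bigr)

% Alternating optimization: fix the tree and update the model, then fix the model and update the tree
\theta^{(t+1)} = \arg\min_{\theta}\, \mathcal{L}\bigl(\theta, \pi^{(t)}\bigr),
\qquad
\pi^{(t+1)} = \arg\min_{\pi}\, \mathcal{L}\bigl(\theta^{(t+1)}, \pi\bigr)
```

Under this reading, both steps decrease the same quantity, which is the property that separates a joint objective from training the model and the tree toward different targets.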

3.1 TDM Overview : TDM treats recommendation as a hierarchical retrieval problem on a tree index, where each node's preference probability p(u, n) for user u and node n is modeled by a deep network. Retrieval for a user walks the tree top-down with a top-k beam search, which scores only O(log N) nodes for a corpus of N items and places no restriction on the form of the deep scoring model.
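The layer-wise beam search can be illustrated with a minimal sketch. The names `children`, `score`, and `beam_search_retrieve` are hypothetical and chosen for this example; `score` stands in for the deep network's p(u, n) with the user already fixed.

```python
from typing import Callable, Dict, List


def beam_search_retrieve(
    children: Dict[int, List[int]],
    root: int,
    score: Callable[[int], float],
    k: int,
) -> List[int]:
    """Return up to k leaf nodes found by a top-k, level-by-level beam search.

    children: maps a node id to its child ids; leaves are absent or map to [].
    score:    hypothetical stand-in for the model's preference p(u, n).
    """
    frontier = [root]
    leaves: List[int] = []
    while frontier:
        candidates: List[int] = []
        for node in frontier:
            kids = children.get(node, [])
            if kids:
                candidates.extend(kids)  # expand an internal node
            else:
                leaves.append(node)      # reached a leaf (an item)
        # Keep only the k best-scoring children for the next level.
        frontier = sorted(candidates, key=score, reverse=True)[:k]
    return sorted(leaves, key=score, reverse=True)[:k]


# Toy tree: root 0, internal nodes 1 and 2, leaves 3..6.
toy_children = {0: [1, 2], 1: [3, 4], 2: [5, 6]}
toy_scores = {1: 0.9, 2: 0.2, 3: 0.1, 4: 0.8, 5: 0.95, 6: 0.3}
top1 = beam_search_retrieve(toy_children, 0, lambda n: toy_scores[n], k=1)
top2 = beam_search_retrieve(toy_children, 0, lambda n: toy_scores[n], k=2)
```

In this toy tree, `k=1` returns leaf 4 even though leaf 5 scores higher, because leaf 5's parent was pruned one level up. The example is meant to show why the quality of the tree matters: a leaf can only be retrieved if its ancestors also score well.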

3.2 Joint Optimization Framework : The loss function combines model parameters θ and tree structure π. The authors formulate the joint objective as minimizing a global empirical loss over positive samples, then solve it by alternating optimization: gradient descent for θ and a combinatorial max‑matching problem for π, approximated with a segment‑wise tree learning algorithm.
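The tree-learning step can be pictured as assigning items to nodes so that the total weight is high while every node holds a bounded number of items. The sketch below is a simplified greedy approximation of such a capacity-constrained matching at a single level, not the paper's segment-wise algorithm; `weights` is a hypothetical table of item-to-node scores (for example, summed log-likelihoods under the current model).

```python
from typing import Dict, List, Tuple


def greedy_balanced_assignment(
    weights: Dict[Tuple[str, int], float],
    items: List[str],
    nodes: List[int],
    capacity: int,
) -> Dict[str, int]:
    """Assign every item to one node, with at most `capacity` items per node.

    Pairs are visited from highest to lowest weight; a pair is accepted when
    the item is still unassigned and the node still has room.
    """
    if capacity * len(nodes) < len(items):
        raise ValueError("total capacity is smaller than the number of items")
    load = {n: 0 for n in nodes}
    assignment: Dict[str, int] = {}
    # Missing (item, node) pairs get -inf so they are only used as a last resort.
    pairs = sorted(
        ((weights.get((c, n), float("-inf")), c, n) for c in items for n in nodes),
        key=lambda t: t[0],
        reverse=True,
    )
    for _, item, node in pairs:
        if item not in assignment and load[node] < capacity:
            assignment[item] = node
            load[node] += 1
    return assignment


# Toy data: four items, two nodes, two slots per node.
toy_weights = {
    ("a", 0): 0.9, ("a", 1): 0.1,
    ("b", 0): 0.8, ("b", 1): 0.2,
    ("c", 0): 0.7, ("c", 1): 0.3,
    ("d", 0): 0.1, ("d", 1): 0.6,
}
toy_assignment = greedy_balanced_assignment(toy_weights, ["a", "b", "c", "d"], [0, 1], 2)
```

Here item `c` prefers node 0 but ends up on node 1, because node 0 is already full with the two higher-weight items. In the full alternating procedure, a step like this would be followed by further gradient updates of θ on the new tree.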

3.3 Hierarchical User Interest Representation : User behavior items are projected to their ancestor nodes at each tree level, producing level‑specific embeddings that feed the scoring model. This reduces noise from sharing a single embedding across levels and captures user preferences at appropriate granularities.
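The projection of behavior items onto a tree level can be sketched as follows. The example assumes a complete binary tree stored with heap-style node ids (root 0, children of node i at 2i+1 and 2i+2); this layout and the names `item_to_leaf` and `project_behaviors` are assumptions for illustration only.

```python
from typing import Dict, List


def node_level(node: int) -> int:
    """Depth of a node in a heap-indexed complete binary tree (root is level 0)."""
    return (node + 1).bit_length() - 1


def ancestor_at_level(node: int, level: int) -> int:
    """Walk up from `node` until reaching its ancestor at `level`."""
    if level < 0 or level > node_level(node):
        raise ValueError("level must be between 0 and the node's own level")
    while node_level(node) > level:
        node = (node - 1) // 2  # parent in heap indexing
    return node


def project_behaviors(
    behaviors: List[str], item_to_leaf: Dict[str, int], level: int
) -> List[int]:
    """Replace each behavior item by its ancestor node at the given level."""
    return [ancestor_at_level(item_to_leaf[item], level) for item in behaviors]


# Toy mapping: four items sitting on level-3 leaves (ids 7..14).
toy_item_to_leaf = {"a": 7, "b": 8, "c": 11, "d": 14}
history = ["a", "b", "c", "d"]
coarse = project_behaviors(history, toy_item_to_leaf, level=1)
finer = project_behaviors(history, toy_item_to_leaf, level=2)
```

When scoring a node at level 1, the model would see the history as `coarse`, where items `a` and `b` collapse into the same node; at level 2 it would see the finer-grained `finer` sequence. Each node id would then be looked up in an embedding table before being fed to the scoring model.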

4 Experiments : The authors evaluate JTM, JTM-J (joint optimization without hierarchical features), JTM-H (hierarchical features with a fixed tree), TDM, Item-CF, YouTube product-DNN, HSM, and a full-scoring DNN baseline on the Amazon Books and Alibaba UserBehavior datasets. Metrics include Precision, Recall, and F-measure.
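For reference, the three metrics can be computed per user as in the sketch below, using the standard set-based definitions; the paper's exact evaluation protocol (list length, averaging over users) is not restated here, and the function name is hypothetical.

```python
from typing import Iterable, Tuple


def precision_recall_f(
    recommended: Iterable[int], ground_truth: Iterable[int]
) -> Tuple[float, float, float]:
    """Set-based Precision, Recall, and F-measure for one user."""
    rec, truth = set(recommended), set(ground_truth)
    hits = len(rec & truth)
    precision = hits / len(rec) if rec else 0.0
    recall = hits / len(truth) if truth else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f_measure


# Toy example: 4 recommended items, 5 relevant items, 2 in common.
p, r, f = precision_recall_f([1, 2, 3, 4], [2, 4, 6, 8, 10])
```

In the toy example, two of the four recommended items are relevant, so precision is 0.5 and recall is 0.4. Dataset-level numbers are typically obtained by averaging these per-user values.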

Results show that JTM outperforms all baselines, achieving 45.3% and 8.1% relative recall improvements over the best baseline, DNN, on the two datasets, respectively. Hierarchical modeling (JTM-H) alleviates data sparsity, and joint tree learning (JTM-J) boosts performance further; combining the two in the full JTM yields a synergistic "1 + 1 > 2" effect.

4.3 Tree Structure Convergence : Visualizations reveal that JTM’s tree learning converges to a more stable and effective structure compared to clustering‑based tree construction, which tends to overfit in later iterations.

5 Conclusion : JTM provides a unified, data‑driven framework that jointly optimizes deep scoring models and tree indices, enabling efficient, high‑accuracy large‑scale recommendation and representing a significant technical advancement for search, recommendation, and advertising systems.
