Optimizing Individual Diversity in Recommendation Systems: Architecture, Algorithms, and Practical Implementation
This article presents a comprehensive approach to improving recommendation system performance by optimizing individual diversity, detailing architectural layers, practical implementations of MMR and DPP algorithms, custom distance metrics, experimental results, and their impact on key business metrics such as click‑through and view rates.
Background: In recommendation systems, besides relevance, diversity is a crucial metric, but it often conflicts with relevance. This article explores how to balance diversity and relevance from a business perspective.
Challenges: (1) Vague optimization objectives for diversity; (2) Conflict between business metrics and diversity metrics.
Solution Overview: 58 Community's recommendation system introduces three layers of diversity optimization: recall layer, rule layer, and diversity layer, each enhancing data variety while maintaining relevance.
Recall Layer: Multi‑path recall with diversity‑aware strategies, increasing topic and category coverage by ~120% and ~100% respectively.
Rule Layer: Type‑wise bucketing and diversity control after coarse ranking, improving topic coverage by ~80% and category coverage by ~70%.
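The article does not spell out the exact rules, so the following is only a minimal sketch of one plausible reading: bucket coarse-ranked items by type, and within each bucket keep at most a fixed number of items per category. The tuple layout, the function name `bucket_and_cap`, and the `max_per_category` parameter are all hypothetical and not taken from the original system.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical item layout: (item_id, item_type, category, coarse_score)
Item = Tuple[str, str, str, float]


def bucket_and_cap(items: List[Item], max_per_category: int) -> Dict[str, List[Item]]:
    """Bucket items by type; keep at most max_per_category items per category in each bucket."""
    buckets: Dict[str, List[Item]] = defaultdict(list)
    counts: Dict[Tuple[str, str], int] = defaultdict(int)
    # Visit items from highest to lowest coarse-ranking score so the cap drops the weakest ones.
    for item in sorted(items, key=lambda it: it[3], reverse=True):
        _, item_type, category, _ = item
        if counts[(item_type, category)] < max_per_category:
            buckets[item_type].append(item)
            counts[(item_type, category)] += 1
    return dict(buckets)


# Toy data (made up for illustration): three "food" posts compete for two slots.
candidates = [
    ("a", "post", "food", 0.9),
    ("b", "post", "food", 0.8),
    ("c", "post", "food", 0.7),
    ("d", "post", "cars", 0.6),
    ("e", "video", "food", 0.5),
]
result = bucket_and_cap(candidates, max_per_category=2)
post_ids = [item[0] for item in result["post"]]
video_ids = [item[0] for item in result["video"]]
```

With the cap set to 2, the third "food" post is dropped while the lower-scored "cars" post survives, which is the kind of coverage gain a rule layer aims for.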
Diversity Layer: Global re‑ranking using MMR and DPP algorithms with custom distance measures to ensure diverse final results.
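MMR Principle: Maximal Marginal Relevance selects items greedily, at each step picking the candidate that maximizes a weighted combination of its relevance and a penalty for its similarity to the items already selected. In its standard formulation: $\mathrm{MMR} = \arg\max_{d_i \in R \setminus S} \left[ \lambda \, \mathrm{Sim}_1(d_i, q) - (1 - \lambda) \max_{d_j \in S} \mathrm{Sim}_2(d_i, d_j) \right]$, where $R$ is the candidate set, $S$ is the set already selected, $q$ is the query or user, and $\lambda \in [0, 1]$ controls the trade-off between relevance and diversity. The implementation flow is shown in the accompanying diagram.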
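The greedy selection can be sketched in a few lines of plain Python. This is an illustrative sketch rather than the system's actual code: the function name `mmr_rerank`, the trade-off parameter `lam`, and the toy relevance and similarity values are assumptions made for the example.

```python
from typing import Callable, List, Sequence


def mmr_rerank(
    relevance: Sequence[float],
    similarity: Callable[[int, int], float],
    k: int,
    lam: float = 0.7,
) -> List[int]:
    """Greedy MMR: return the indices of up to k items, in selection order."""
    remaining = list(range(len(relevance)))
    selected: List[int] = []
    while remaining and len(selected) < k:
        def score(i: int) -> float:
            # The penalty is the highest similarity to any item already chosen.
            penalty = max((similarity(i, j) for j in selected), default=0.0)
            return lam * relevance[i] - (1.0 - lam) * penalty

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy example (made-up numbers): items 0 and 1 are near-duplicates, item 2 is different.
rel = [0.9, 0.85, 0.6]
sim_matrix = [
    [1.0, 0.95, 0.1],
    [0.95, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
mmr_order = mmr_rerank(rel, lambda i, j: sim_matrix[i][j], k=3, lam=0.5)
relevance_only = mmr_rerank(rel, lambda i, j: sim_matrix[i][j], k=3, lam=1.0)
```

With `lam=0.5` the less relevant but dissimilar item 2 is promoted ahead of the near-duplicate item 1, while `lam=1.0` falls back to a pure relevance ordering.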
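DPP Principle: A Determinantal Point Process models the probability of selecting a subset of items in a way that favors diverse subsets. The kernel matrix is built using the custom distances, and the implementation uses greedy MAP inference with incremental updates. In the standard formulation, the probability of selecting a subset $Y$ is proportional to the determinant of the corresponding kernel submatrix: $P(Y) \propto \det(L_Y)$.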
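Greedy MAP inference repeatedly adds the item that most increases the determinant of the selected kernel submatrix, and an incremental Cholesky-style update avoids recomputing determinants from scratch. The sketch below follows that commonly used scheme in plain Python instead of EJML. The function name `dpp_greedy_map`, the `eps` stopping threshold, and the toy kernel are assumptions for illustration, and the kernel is assumed to be symmetric positive semidefinite.

```python
import math
from typing import List


def dpp_greedy_map(kernel: List[List[float]], k: int, eps: float = 1e-10) -> List[int]:
    """Greedy MAP inference for a DPP with incremental updates of the marginal gains."""
    n = len(kernel)
    d2 = [kernel[i][i] for i in range(n)]          # current marginal gain of each item
    c: List[List[float]] = [[] for _ in range(n)]  # incremental Cholesky rows
    remaining = list(range(n))
    selected: List[int] = []
    while remaining and len(selected) < k:
        j = max(remaining, key=lambda i: d2[i])
        if d2[j] < eps:
            break  # no remaining item adds meaningful volume
        selected.append(j)
        remaining.remove(j)
        dj = math.sqrt(d2[j])
        for i in remaining:
            # Component of item i along the newly selected item j.
            e = (kernel[j][i] - sum(a * b for a, b in zip(c[j], c[i]))) / dj
            c[i].append(e)
            d2[i] -= e * e
    return selected


# Toy kernel L_ij = q_i * q_j * S_ij (made-up quality scores and similarities).
quality = [0.9, 0.85, 0.6]
sim = [
    [1.0, 0.95, 0.1],
    [0.95, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
toy_kernel = [[quality[i] * quality[j] * sim[i][j] for j in range(3)] for i in range(3)]
dpp_order = dpp_greedy_map(toy_kernel, k=3)
dpp_top2 = dpp_greedy_map(toy_kernel, k=2)
```

Each selection step costs time proportional to the number of remaining items times the number already selected, which is what makes this style of update practical at re-ranking time.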
Custom Distance: Three types were evaluated – Jaccard/Hamming distance, tree‑model distance, and others. The tree‑model distance provided the best interpretability and business alignment.
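For reference, the two simplest distances named above can be sketched as follows. The tag sets and binary feature vectors are hypothetical inputs chosen for the example. The tree-model distance is not sketched because the article does not describe how it is constructed.

```python
from typing import Sequence, Set


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """1 - |A ∩ B| / |A ∪ B|; two empty sets are treated as identical."""
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> float:
    """Fraction of positions at which two equal-length vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    if not a:
        return 0.0
    return sum(x != y for x, y in zip(a, b)) / len(a)


# Hypothetical item tags and binary attribute vectors.
jd = jaccard_distance({"food", "travel"}, {"travel", "cars"})
hd = hamming_distance([1, 0, 1, 1], [1, 1, 0, 1])
```

Both functions return values in [0, 1], so a similarity for MMR or a DPP kernel can be taken as one minus the distance.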
Experimental Results: Both MMR and DPP improved key metrics. Compared with the original heuristic, MMR increased PVCTR by +3.4%, VVCTR by +5.4%, and avgPV by +4.2%; DPP achieved +5.8%, +7.9%, and +6.0% respectively. Latency for DPP with incremental updates was ~4 ms for 100 items.
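Implementation Details: The system uses the EJML Java matrix library. Key expressions in the code include the relevance weight $\exp(\alpha r_u)$ with $\alpha = \theta / (2(1 - \theta))$, and the similarity $S_{ij} = (1 + \langle f_i, f_j \rangle) / 2$.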
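A minimal sketch of how these expressions might combine into a DPP kernel is shown below, in plain Python instead of EJML. The article lists only the three expressions, so the combination $L_{ij} = \exp(\alpha r_i)\, S_{ij}\, \exp(\alpha r_j)$ is an assumption, as are the readings of $r_u$ as a relevance score, $f_i$ as an L2-normalized feature vector, and $\theta \in [0, 1)$ as the relevance/diversity trade-off. The function name `build_kernel` is hypothetical.

```python
import math
from typing import List, Sequence


def build_kernel(
    relevance: Sequence[float],
    features: Sequence[Sequence[float]],
    theta: float,
) -> List[List[float]]:
    """Assumed kernel: L_ij = exp(alpha * r_i) * S_ij * exp(alpha * r_j)."""
    if not 0.0 <= theta < 1.0:
        raise ValueError("theta must be in [0, 1)")
    alpha = theta / (2.0 * (1.0 - theta))
    weights = [math.exp(alpha * r) for r in relevance]
    n = len(relevance)
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(features[i], features[j]))
            s_ij = (1.0 + dot) / 2.0  # maps a cosine in [-1, 1] to [0, 1]
            kernel[i][j] = weights[i] * s_ij * weights[j]
    return kernel


# Two orthogonal unit feature vectors with made-up relevance scores.
demo_kernel = build_kernel([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], theta=0.5)
```

Under this reading, a larger `theta` raises `alpha` and so gives relevance more weight in the kernel, while `theta = 0` leaves only the similarity term.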
Conclusion and Future Work: Diversity optimization can boost business metrics while preserving relevance. Future directions include learning‑based diversity methods and reinforcement‑learning approaches.
DataFunTalk