Sliding Spectrum Decomposition (SSD) for Diversified Recommendation in Re‑ranking
This article reviews the Sliding Spectrum Decomposition (SSD) model presented by Xiaohongshu at KDD 2021, explaining how it incorporates sliding‑window diversity into the re‑ranking stage, combines content‑based and collaborative‑filtering embeddings via the CB2CF framework, and demonstrates its effectiveness through offline and online A/B experiments.
Introduction
The paper "Sliding Spectrum Decomposition for Diversified Recommendation" proposes the SSD model for the re‑ranking stage of feed‑flow recommendation, aiming to balance recommendation quality and diversity by considering both the current sliding window and items outside the window.
Challenges and Innovations
Two main shortcomings of existing methods are identified: (1) traditional DPP models only account for diversity within the current window, ignoring historical content, and (2) embedding vectors derived from sparse collaborative‑filtering data perform poorly for long‑tail items. SSD addresses the first issue by treating the recommendation sequence as a time‑series and applying sliding‑window analysis, while CB2CF tackles the second by learning item embeddings that combine content‑based (CB) and collaborative‑filtering (CF) signals.
SSD Model
SSD formulates the recommendation problem as selecting items that jointly maximize quality and diversity. It constructs a trajectory tensor from the sliding window and applies Singular Spectrum Analysis (SSA) to decompose it into orthogonal components; the volume spanned by these components serves as the diversity measure. The overall objective combines a quality score (e.g., CTR) with this diversity term, weighted by a hyper-parameter.
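The greedy form of this objective can be sketched as follows. This is an illustrative simplification, not the paper's exact formulation: the diversity contribution of a candidate is taken as the norm of its residual after projecting out the embeddings already selected in the window (which is how each new item incrementally grows the spanned volume); `ssd_score`, `lam`, and the argument names are hypothetical.

```python
import numpy as np

def ssd_score(candidate_emb, window_basis, quality, lam):
    """Illustrative SSD-style greedy score: quality plus a weighted
    diversity term. The diversity term is the norm of the candidate's
    residual after projecting out the (assumed already orthogonalized)
    embeddings of items in the current sliding window."""
    residual = candidate_emb.copy()
    for w in window_basis:
        w_unit = w / np.linalg.norm(w)
        residual -= (residual @ w_unit) * w_unit  # remove component along w
    return quality + lam * np.linalg.norm(residual)
```

An item whose embedding lies entirely in the span of the window contributes zero diversity and is scored by quality alone, while an orthogonal item gets the full diversity bonus.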
Greedy Inference
Because exact maximization is NP‑hard, the authors design an efficient greedy inference algorithm that updates orthogonalization results incrementally using a modified Gram‑Schmidt process and a circular queue to handle sliding windows with O(k) time and O(k) space complexity.
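The circular-queue idea can be sketched in a simplified form. This is not the paper's O(k) incremental algorithm: for clarity it re-runs modified Gram-Schmidt over the window on each query (O(k²) per step), but it shows how a fixed-size circular queue lets the oldest item drop out of the diversity computation automatically. All class and method names here are illustrative.

```python
from collections import deque
import numpy as np

class SlidingOrthogonalizer:
    """Simplified sketch of sliding-window orthogonalization: keep the
    last k selected embeddings in a circular queue and measure a
    candidate's diversity as its residual norm against a
    modified-Gram-Schmidt basis of the window."""

    def __init__(self, k):
        self.window = deque(maxlen=k)  # circular queue of raw embeddings

    def add(self, emb):
        self.window.append(emb)  # oldest embedding is evicted automatically

    def residual_norm(self, candidate):
        # Build an orthonormal basis of the current window (modified
        # Gram-Schmidt), then project the candidate against it.
        basis = []
        for v in self.window:
            r = v.copy()
            for b in basis:
                r -= (r @ b) * b
            n = np.linalg.norm(r)
            if n > 1e-10:  # skip vectors already in the span
                basis.append(r / n)
        r = candidate.copy()
        for b in basis:
            r -= (r @ b) * b
        return np.linalg.norm(r)
```

The paper's contribution is to update the orthogonalization incrementally as items enter and leave the queue instead of recomputing it, which is what brings the per-step cost down to O(k).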
CB2CF Framework
CB2CF learns item embeddings by training a siamese network that fuses textual features (via BERT) and visual features (via Inception-V3). Positive samples are constructed from user-interacted seed items and their high-exposure candidates, while negative samples are random items drawn from ItemCF recommendations. Embeddings are L2-normalized so that similarity is measured by cosine distance, and an extra dimension with constant value 1 is appended to align cosine distance with the volume-based diversity measure.
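The append-1 trick has a neat geometric consequence that a short sketch makes concrete (the helper name is illustrative, not from the paper): after L2-normalizing, appending a constant 1, and normalizing again, the squared Euclidean distance between two augmented embeddings equals one minus their original cosine similarity, so the volume-based diversity term operates directly on cosine dissimilarity.

```python
import numpy as np

def augment_embedding(emb):
    """Illustrative append-1 trick: L2-normalize the raw embedding,
    append a constant 1, and normalize again. Squared Euclidean
    distances in the augmented space then equal 1 - cosine similarity
    of the original embeddings."""
    unit = emb / np.linalg.norm(emb)
    aug = np.append(unit, 1.0)          # extra dimension of value 1
    return aug / np.linalg.norm(aug)    # back onto the unit sphere
```

To see why: for unit vectors u and v, the augmented vectors are (u, 1)/√2 and (v, 1)/√2, so their dot product is (u·v + 1)/2 and the squared distance is 2 − (u·v + 1) = 1 − u·v.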
Experiments
Offline experiments compare CF and CB2CF embeddings on long‑tail items, showing superior similarity retrieval for CB2CF. Online A/B tests on the re‑ranking stage compare DPP (control) with SSD/SSD* (treatment) and report significant improvements in both diversity metrics (ILAD, MRT) and quality metrics.
Conclusion
SSD introduces a novel way to incorporate sliding‑window diversity into recommendation re‑ranking, and CB2CF provides robust embeddings for long‑tail items; together they achieve higher diversity and quality than traditional DPP‑based methods.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.