Overview of Recommendation and Search System Architecture: Recall and Ranking Techniques
This article explains the architecture of recommendation and search systems, detailing various recall methods such as collaborative filtering, matrix factorization, and vector‑based approaches, as well as ranking models like LR, FM, and DeepFM, and discusses re‑ranking and traffic control strategies.
Introduction
Recommendation and search systems are essential components of most Internet services, handling massive amounts of information through stages of recall, ranking, and re‑ranking to deliver a limited set of items that match user interests or explicit queries.
Recall in Recommendation Systems
Collaborative Filtering (CF) Recall
itemCF
Item‑based CF retrieves items similar to those a user has previously interacted with, using similarity measures derived from textual information, user behavior, or graph embeddings.
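As a rough illustration of the behavior-based variant, the sketch below computes item-to-item cosine similarity from a small interaction matrix and recalls unseen items that are most similar to a user's history. The matrix `R`, the function name `item_cf_recall`, and the scoring rule (summed similarity) are assumptions made for this example, not the article's implementation.

```python
import numpy as np

def item_cf_recall(interactions, user, top_n=2):
    """Recall items similar to those `user` has interacted with.

    interactions: (num_users, num_items) 0/1 matrix of user behavior.
    """
    # Cosine similarity between item columns, based on co-occurrence.
    norms = np.linalg.norm(interactions, axis=0)
    norms[norms == 0] = 1.0  # avoid division by zero for items nobody touched
    sim = (interactions.T @ interactions) / np.outer(norms, norms)
    np.fill_diagonal(sim, 0.0)  # an item should not recommend itself

    seen = interactions[user] > 0
    # Score each candidate by its summed similarity to the user's history.
    scores = sim[:, seen].sum(axis=1)
    scores[seen] = -np.inf  # do not recall items the user already saw
    ranked = np.argsort(-scores)
    return [int(i) for i in ranked[:top_n] if np.isfinite(scores[i])]

# Toy data: 4 users x 5 items (hypothetical).
R = np.array([
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
], dtype=float)

recalled = item_cf_recall(R, user=0)
```

Here user 0 has interacted with items 0 and 1, so item 2, which co-occurs with both, comes back first. Swapping rows and columns of the same computation gives the user-based variant.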
userCF
User‑based CF computes similarity between users and recalls items that similar users have interacted with. Because the user space is usually much larger than the item space, it is often less efficient than itemCF.
Matrix Factorization (MF) Recall
MF treats user‑item interactions as a rating matrix and factorizes it into low‑rank user and item matrices. Because the matrix is sparse, with most entries unobserved, Alternating Least Squares (ALS) is used instead of SVD.
ALS approximates the original rating matrix (U×I) by the product of two low‑rank matrices (U×K and K×I). It minimizes the squared error loss on the observed entries by alternately fixing one matrix and solving for the other.
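The alternating procedure can be sketched in a few lines of NumPy. This is a minimal version under stated assumptions: zeros in the toy matrix mean "unobserved", an L2 regularization term `reg` is added (the article does not mention one), and the names `als`, `P`, and `Q` are chosen for illustration.

```python
import numpy as np

def als(R, mask, k=2, reg=0.1, iters=20, seed=0):
    """Factorize R (U x I) into P (U x K) and Q (K x I) on observed entries."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    P = rng.normal(scale=0.1, size=(n_users, k))
    Q = rng.normal(scale=0.1, size=(k, n_items))
    eye = reg * np.eye(k)
    for _ in range(iters):
        # Fix Q, then solve a small ridge regression for each user row.
        for u in range(n_users):
            obs = mask[u]
            Qo = Q[:, obs]                       # (k, n_obs)
            P[u] = np.linalg.solve(Qo @ Qo.T + eye, Qo @ R[u, obs])
        # Fix P, then solve a small ridge regression for each item column.
        for i in range(n_items):
            obs = mask[:, i]
            Po = P[obs]                          # (n_obs, k)
            Q[:, i] = np.linalg.solve(Po.T @ Po + eye, Po.T @ R[obs, i])
    return P, Q

R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [1, 0, 0, 4],
              [0, 1, 5, 4]], dtype=float)
mask = R > 0                                     # 0 is treated as "unobserved"
P, Q = als(R, mask)
rmse = np.sqrt(np.mean((R - P @ Q)[mask] ** 2))  # error on observed entries only
```

Each inner step has a closed-form solution because the loss is quadratic once one factor is held fixed, which is what makes the alternation practical on sparse data. The product `P @ Q` also fills in the unobserved cells, and those predicted scores are what recall would draw on.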
Vector‑Based Recall
The model ingests a sequence of interacted items, user attributes, and context features. Discrete features are embedded, the item sequence is avg‑pooled, and the pooled vector is concatenated with the user embeddings. The result is fed into a DNN that predicts click probability via softmax.
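The forward pass described above might look like the following sketch. The weights are random and untrained, a single ReLU layer stands in for the DNN, and the feature set (one `gender` attribute) and all sizes are hypothetical; the point is only to show the data flow of embedding, pooling, concatenation, and softmax.

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_ITEMS, NUM_GENDERS, EMB, HIDDEN = 6, 2, 4, 8   # hypothetical sizes

item_emb = rng.normal(size=(NUM_ITEMS, EMB))       # embeddings for item ids
gender_emb = rng.normal(size=(NUM_GENDERS, EMB))   # embeddings for one user attribute
W1 = rng.normal(size=(2 * EMB, HIDDEN))
W_out = rng.normal(size=(HIDDEN, NUM_ITEMS))       # softmax layer: one column per item

def user_vector(item_seq, gender):
    pooled = item_emb[item_seq].mean(axis=0)       # avg-pooling over the item sequence
    x = np.concatenate([pooled, gender_emb[gender]])
    return np.maximum(x @ W1, 0.0)                 # one ReLU layer stands in for the DNN

def softmax(z):
    e = np.exp(z - z.max())                        # subtract max for numerical stability
    return e / e.sum()

u = user_vector(item_seq=[0, 2, 3], gender=1)
probs = softmax(u @ W_out)                         # probability over all candidate items
top_items = np.argsort(-probs)[:3]                 # highest-probability items are recalled
```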
Recall in Search Systems
Search recall starts from product textual information and retrieves items related to the query after query processing (QP).
Query Processing (QP)
Query Rewriting
Normalization (e.g., trimming spaces, case conversion).
Spelling correction using HMM or edit distance.
Query expansion based on semantics or intent.
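The first two rewriting steps can be sketched as follows, using edit distance for spelling correction (the HMM alternative is not shown). The vocabulary `VOCAB`, the distance threshold `max_dist`, and the function names are assumptions for illustration.

```python
def normalize(query):
    # Trim and collapse whitespace, then lowercase.
    return " ".join(query.split()).lower()

def edit_distance(a, b):
    # Classic Levenshtein distance computed with a rolling row.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def correct(term, vocabulary, max_dist=2):
    # Keep known terms; otherwise pick the closest vocabulary word.
    if term in vocabulary:
        return term
    best = min(vocabulary, key=lambda w: edit_distance(term, w))
    return best if edit_distance(term, best) <= max_dist else term

VOCAB = ["laptop", "printer", "projector"]             # hypothetical vocabulary

def rewrite(query):
    return " ".join(correct(t, VOCAB) for t in normalize(query).split())

rewritten = rewrite("  Lapttop   PRINTER ")
```

With this toy vocabulary the query is normalized to "lapttop printer" and then corrected to "laptop printer". Terms farther than `max_dist` from every vocabulary word are left unchanged rather than forced to a match.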
Query Intent Recognition
Category prediction via statistical or content‑based methods.
Named‑entity recognition for brand, product, model terms.
Term‑Based Recall
Uses inverted indexes to match query terms with documents; BM25 computes relevance scores.
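A minimal sketch of term-based recall, assuming whitespace tokenization and a common smoothed form of the BM25 IDF; the documents, the parameters `k1` and `b`, and the function name `bm25_search` are chosen for the example.

```python
import math
from collections import Counter, defaultdict

DOCS = {  # hypothetical product titles
    1: "laser printer black and white",
    2: "color inkjet printer",
    3: "office laptop 14 inch",
}

tokens = {d: text.split() for d, text in DOCS.items()}
avg_len = sum(len(t) for t in tokens.values()) / len(tokens)

# Inverted index: term -> {doc_id: term frequency}
index = defaultdict(dict)
for d, toks in tokens.items():
    for term, tf in Counter(toks).items():
        index[term][d] = tf

def bm25_search(query, k1=1.5, b=0.75):
    scores = defaultdict(float)
    n_docs = len(tokens)
    for term in query.split():
        postings = index.get(term, {})
        if not postings:
            continue  # term does not occur in any document
        # Smoothed IDF variant that stays non-negative.
        idf = math.log(1 + (n_docs - len(postings) + 0.5) / (len(postings) + 0.5))
        for d, tf in postings.items():
            # Longer-than-average documents are penalized through `norm`.
            norm = 1 - b + b * len(tokens[d]) / avg_len
            scores[d] += idf * tf * (k1 + 1) / (tf + k1 * norm)
    return sorted(scores.items(), key=lambda kv: -kv[1])

results = bm25_search("color printer")
```

Only documents that appear in some posting list are scored, so document 3 is never touched for this query. That is the main reason an inverted index keeps recall cheap on large catalogs.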
Semantic Vector Recall
DSSM Deep Recall Model
Trains a DNN to embed queries and document titles into low‑dimensional vectors using click logs, then computes cosine similarity for retrieval.
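The retrieval step alone can be illustrated as below. The hand-written vectors stand in for the outputs of the trained query and title encoders, which are not reproduced here, and the brute-force scan is an assumption made for brevity; at scale an approximate nearest-neighbor index would typically take its place.

```python
import numpy as np

def cosine_top_k(query_vec, doc_vecs, k=2):
    # L2-normalize so that the dot product equals cosine similarity.
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    sims = d @ q
    order = np.argsort(-sims)[:k]
    return [(int(i), float(sims[i])) for i in order]

# Hypothetical 3-dimensional embeddings standing in for the DNN outputs.
doc_vecs = np.array([[1.0, 0.0, 0.0],
                     [0.9, 0.1, 0.0],
                     [0.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0]])
query_vec = np.array([1.0, 0.5, 0.0])
hits = cosine_top_k(query_vec, doc_vecs)
```

Document 1 ranks above document 0 even though its vector is shorter, because cosine similarity depends only on direction.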
BERT‑Based Retrieval
Similar to DSSM but uses a pre‑trained BERT encoder to obtain semantic vectors.
Multimodal Retrieval
Combines text (product title) and image features using a BERT‑based encoder; the three input modalities are concatenated, separated by [SEP] tokens, and the model is fine‑tuned on click logs.
Ranking in Search & Recommendation Systems
CTR estimation models have evolved from linear models (LR) to factorization machines (FM/FFM) and then to deep models (DNN, DeepFM), each step capturing higher‑order feature interactions.
Logistic Regression (LR)
Simple linear model with fast inference but limited expressive power.
FM/FFM
Factorization machines introduce second‑order feature interactions, improving performance while still requiring feature engineering.
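To make the second-order term concrete, the sketch below scores one sample with an FM and checks the usual O(nk) reformulation against the naive pairwise sum. Dropping the pairwise term leaves exactly the LR logit. The weights are random and all names are illustrative.

```python
import numpy as np

def fm_score(x, w0, w, V):
    """FM logit: w0 + sum_i w_i x_i + sum_{i<j} <v_i, v_j> x_i x_j."""
    linear = w0 + w @ x                       # this part alone is the LR logit
    xv = x @ V                                # (k,) sum_i v_i x_i
    x2v2 = (x ** 2) @ (V ** 2)                # (k,) sum_i v_i^2 x_i^2
    pairwise = 0.5 * np.sum(xv ** 2 - x2v2)   # O(n k) instead of O(n^2 k)
    return linear + pairwise

def fm_score_naive(x, w0, w, V):
    # Direct double loop over feature pairs, used only as a reference.
    n = len(x)
    s = w0 + w @ x
    for i in range(n):
        for j in range(i + 1, n):
            s += (V[i] @ V[j]) * x[i] * x[j]
    return s

rng = np.random.default_rng(0)
n, k = 6, 3
x = rng.integers(0, 2, size=n).astype(float)  # sparse binary features
w0, w, V = 0.1, rng.normal(size=n), rng.normal(size=(n, k))
ctr = 1.0 / (1.0 + np.exp(-fm_score(x, w0, w, V)))
```

Because every feature pair shares the latent vectors in `V`, an interaction weight can be estimated even for pairs that rarely co-occur in the training data.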
DeepFM
Combines an FM component for low‑order feature interactions with a DNN component for high‑order interactions; both components share the same input embeddings, which makes training efficient.
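The shared-embedding idea can be sketched as a single forward pass. This is an untrained toy under stated assumptions: three categorical fields with hypothetical vocabulary sizes, one hidden layer for the deep part, and random weights.

```python
import numpy as np

rng = np.random.default_rng(0)
FIELD_SIZES = [5, 3, 4]          # hypothetical vocabulary size per categorical field
K, HIDDEN = 4, 8

# One embedding table per field, shared by the FM part and the DNN part.
emb = [rng.normal(scale=0.1, size=(n, K)) for n in FIELD_SIZES]
first_order = [rng.normal(scale=0.1, size=n) for n in FIELD_SIZES]
W1 = rng.normal(scale=0.1, size=(len(FIELD_SIZES) * K, HIDDEN))
w2 = rng.normal(scale=0.1, size=HIDDEN)

def deepfm_predict(feature_ids):
    e = np.stack([emb[f][i] for f, i in enumerate(feature_ids)])   # (fields, K)
    # FM component: first-order weights plus pairwise inner products.
    y_first = sum(first_order[f][i] for f, i in enumerate(feature_ids))
    y_second = 0.5 * np.sum(e.sum(axis=0) ** 2 - (e ** 2).sum(axis=0))
    # Deep component: the same embeddings, concatenated, through an MLP.
    h = np.maximum(e.reshape(-1) @ W1, 0.0)
    y_deep = h @ w2
    # Both components are summed before the sigmoid.
    return 1.0 / (1.0 + np.exp(-(y_first + y_second + y_deep)))

p = deepfm_predict([2, 0, 3])
```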
Re‑ranking
After initial scoring, re‑ranking refines the final list to improve user experience and control traffic.
User Experience
Disperses items of the same category, shop, or similar images to avoid fatigue and encourage exploration.
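One simple way to implement such dispersal is a greedy sliding-window rule, sketched below. The window size, the fallback behavior, and the function name `disperse` are assumptions; the article does not specify the algorithm.

```python
def disperse(items, key, window=2):
    """Greedy scatter: avoid repeating `key` within the last `window` slots."""
    remaining = list(items)          # assumed sorted by ranking score, best first
    result = []
    while remaining:
        recent = {key(it) for it in result[-window:]} if window else set()
        # Take the best-scored item whose key was not shown recently;
        # fall back to the top item if every candidate conflicts.
        pick = next((it for it in remaining if key(it) not in recent), remaining[0])
        remaining.remove(pick)
        result.append(pick)
    return result

ranked = [("a1", "shoes"), ("a2", "shoes"), ("a3", "shoes"),
          ("b1", "bags"), ("c1", "hats")]
mixed = disperse(ranked, key=lambda it: it[1])
order = [it[0] for it in mixed]
```

With the toy list the three "shoes" items no longer occupy the top three positions: the order becomes a1, b1, c1, a2, a3. The same function works for shops or image clusters by changing `key`.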
Traffic Control
Two types: weight‑adjustment for short‑term business needs (e.g., holiday promotions) and volume‑preservation to support cold‑start or new items.
Weight‑adjustment: rule‑based score boosts or sample weighting in loss functions.
Volume‑preservation: allocate guaranteed exposure to new or under‑exposed items.
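Both mechanisms can be sketched with plain rules, as below. The multiplicative boost, the fixed reserved positions, and all item names are hypothetical choices for illustration; the loss-weighting variant of weight adjustment is not shown.

```python
def apply_boosts(scored, boosts):
    # Weight adjustment: multiply the model score by a rule-based factor.
    return sorted(((item, score * boosts.get(item, 1.0)) for item, score in scored),
                  key=lambda pair: -pair[1])

def reserve_slots(ranked, new_items, slots):
    # Volume preservation: place new items at reserved positions (0-based).
    organic = [item for item in ranked if item not in new_items]
    pending = list(new_items)
    out = []
    for pos in range(len(organic) + len(pending)):
        if pending and (pos in slots or not organic):
            out.append(pending.pop(0))   # reserved slot, or no organic items left
        else:
            out.append(organic.pop(0))
    return out

scored = [("tv", 0.9), ("phone", 0.8), ("gift_box", 0.5), ("cable", 0.3)]
boosted = apply_boosts(scored, {"gift_box": 2.0})      # e.g. a holiday promotion
ranked = [item for item, _ in boosted]
final = reserve_slots(ranked, new_items=["new_lamp"], slots={2})
```

In this toy run the boosted "gift_box" moves to the top, and the cold-start item "new_lamp" is guaranteed the third position regardless of its model score.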
Conclusion
Both recommendation and search systems share a similar overall architecture consisting of recall, ranking, and re‑ranking layers; detailed algorithmic implementations will be covered in future articles.
ZCY Technology (政采云技术)
ZCY Technology Team (Zero), based in Hangzhou, is a growth-oriented team passionate about technology and craftsmanship. With around 500 members, we are building comprehensive engineering, project management, and talent development systems. We are committed to innovation and creating a cloud service ecosystem for government and enterprise procurement. We look forward to your joining us.