Overview of Recommendation Systems: Definitions, Architecture, Recall, Ranking, and Re‑ranking

This article provides a comprehensive overview of recommendation systems, covering their definition, basic framework, request flow, AB testing, recall strategies (both non‑personalized and personalized), collaborative‑filtering methods, vector‑based retrieval, wide‑and‑deep models, and the MMR re‑ranking algorithm with code examples.

JD Retail Technology

Recommendation systems address information overload: they mine latent interests from users' behavior data and surface the items each user is likely to care about.

The core pipeline consists of three stages: recall, ranking, and re‑ranking. A typical request flow starts with the front‑end sending user identifiers (e.g., pin or UUID) to back‑end services, which first split traffic according to the AB‑test configuration and then invoke the matching recall, ranking, and re‑ranking modules. Multiple recall channels generate candidate items, which are then ranked and re‑ranked before being presented to the user.
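The three stages can be sketched as a single function. This is only an illustration under stated assumptions: the names recommend, hot_items, tag_based, and toy_ranker, as well as the item scores, are hypothetical and do not come from the original system.

```python
from typing import Callable, Dict, List

# Hypothetical signatures, chosen for this sketch only.
RecallChannel = Callable[[str], List[str]]              # user_id -> candidate item ids
Ranker = Callable[[str, List[str]], Dict[str, float]]   # user_id, items -> scores


def recommend(user_id: str,
              channels: List[RecallChannel],
              ranker: Ranker,
              top_n: int = 3) -> List[str]:
    # Recall: gather candidates from every channel and drop duplicates
    # while keeping first-seen order.
    candidates = list(dict.fromkeys(
        item for channel in channels for item in channel(user_id)))

    # Ranking: score every candidate and sort by descending score.
    scores = ranker(user_id, candidates)
    ranked = sorted(candidates, key=lambda item: scores[item], reverse=True)

    # Re-ranking placeholder: this sketch only truncates the list; a real
    # system would apply diversity or business rules at this point.
    return ranked[:top_n]


# Toy recall channels and ranker with made-up data.
def hot_items(user_id: str) -> List[str]:
    return ["i1", "i2", "i3"]


def tag_based(user_id: str) -> List[str]:
    return ["i3", "i4"]


def toy_ranker(user_id: str, items: List[str]) -> Dict[str, float]:
    base = {"i1": 0.2, "i2": 0.9, "i3": 0.5, "i4": 0.7}
    return {item: base[item] for item in items}


result = recommend("user-42", [hot_items, tag_based], toy_ranker)
print(result)
```

Item i3 is returned by both channels but appears only once in the candidate set, which is why deduplication happens before ranking.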

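AB testing is essential for evaluating model performance online, enabling rapid iteration and risk mitigation. Users are typically assigned to traffic buckets with a hash such as hash(uuid+experimentId+timestamp)%100. For a user to stay in the same bucket across requests, the timestamp is presumably a value fixed per experiment (for example, its creation time); a per‑request timestamp would move the same user between buckets.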
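A minimal sketch of hash-based bucketing follows. It assumes the timestamp term is a constant per experiment and passes it as a hypothetical salt argument; the function names and the 10% treatment share are also illustrative. SHA-256 from hashlib is used instead of Python's built-in hash(), because the built-in string hash is randomized per process and would not give stable buckets.

```python
import hashlib


def ab_bucket(uuid: str, experiment_id: str, salt: str = "", buckets: int = 100) -> int:
    """Map a user to a stable bucket in the range [0, buckets)."""
    key = f"{uuid}:{experiment_id}:{salt}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return int(digest, 16) % buckets


def assign_variant(uuid: str, experiment_id: str, treatment_percent: int = 10) -> str:
    """Send the first treatment_percent buckets to the new model."""
    if ab_bucket(uuid, experiment_id) < treatment_percent:
        return "treatment"
    return "control"


print(ab_bucket("user-42", "exp-rank-v2"), assign_variant("user-42", "exp-rank-v2"))
```

Including the experiment id in the key means the same user can land in different buckets for different experiments, which keeps experiments from sharing the same user split.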

Recall strategies are divided into non‑personalized (e.g., hot items, new arrivals) and personalized approaches. Personalized recall includes tag‑based, region‑based, and collaborative‑filtering (CF) methods. CF can be user‑based, item‑based, or model‑based (latent‑semantic models). Item similarity is often computed using cosine similarity or Jaccard index, as illustrated by the following formulas:

Cosine similarity (for two vectors (x1, y1) and (x2, y2)): cosθ = (x1*x2 + y1*y2) / (√(x1² + y1²) * √(x2² + y2²))

Jaccard similarity: J(A,B) = |A∩B| / |A∪B|
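Both formulas translate directly into code. The sketch below generalizes the cosine formula to vectors of any length and treats each item as the set of users who interacted with it for the Jaccard case; the function names and sample data are illustrative assumptions.

```python
import math
from typing import Sequence, Set


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def jaccard_similarity(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union."""
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(union)


# Made-up data: users who clicked each item.
item_a_users = {"u1", "u2", "u3"}
item_b_users = {"u2", "u3", "u4"}

print(jaccard_similarity(item_a_users, item_b_users))  # 2 shared users out of 4
print(cosine_similarity([3.0, 4.0], [4.0, 3.0]))
```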

Vector‑based recall transforms users and items into low‑dimensional embeddings (e.g., word2vec) and performs nearest‑neighbor search in the vector space. The offline stage trains the embeddings, while the online stage looks up the user's vector and retrieves the nearest item vectors for fast recall. Common vector search algorithms include Locality‑Sensitive Hashing (LSH), Hierarchical Navigable Small World graphs (HNSW), and Product Quantization (PQ).
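As a baseline for what the online stage computes, here is an exact brute-force nearest-neighbor search over a toy set of item embeddings. The vectors and the function name top_k_by_cosine are made up for illustration. LSH, HNSW, and PQ approximate this result without scanning every item, which is what makes them practical at catalog scale.

```python
import heapq
import math
from typing import Dict, List, Sequence


def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_by_cosine(user_vec: Sequence[float],
                    item_vecs: Dict[str, Sequence[float]],
                    k: int) -> List[str]:
    """Return the ids of the k items most similar to the user vector."""
    # Iterating over the dict yields item ids; the key function scores each id.
    return heapq.nlargest(
        k, item_vecs, key=lambda item_id: _cosine(user_vec, item_vecs[item_id]))


# Made-up 2-dimensional embeddings.
item_embeddings = {
    "a": [0.9, 0.1],
    "b": [0.0, 1.0],
    "c": [0.7, 0.7],
    "d": [-1.0, 0.0],
}
user_embedding = [1.0, 0.0]

print(top_k_by_cosine(user_embedding, item_embeddings, k=2))
```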

Two popular model architectures for ranking are:

Dual‑tower model: separate user and item towers ingest respective features (user ID, demographics, click sequence; item ID, category, price, recent sales) and produce embeddings that are combined for scoring.

Wide & Deep model: the wide part captures memorization via linear models on raw and crossed features, while the deep part (embedding + multi‑layer perceptron) provides generalization for unseen feature combinations.
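To make the Wide & Deep description concrete, the following is a toy forward pass in plain Python. It is a sketch under assumptions, not a production model: every weight, feature name, and embedding below is invented, there is a single hidden layer, and no training is shown. The wide and deep logits are summed and passed through a sigmoid to give a click probability.

```python
import math
from typing import Dict, List, Sequence


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def relu_layer(inputs: Sequence[float],
               weights: Sequence[Sequence[float]],
               biases: Sequence[float]) -> List[float]:
    # weights[j] holds the input weights of hidden unit j.
    return [max(0.0, sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def wide_and_deep_score(wide_features: Sequence[str],
                        wide_weights: Dict[str, float],
                        embedding_keys: Sequence[str],
                        embeddings: Dict[str, Sequence[float]],
                        hidden_w: Sequence[Sequence[float]],
                        hidden_b: Sequence[float],
                        out_w: Sequence[float],
                        bias: float) -> float:
    # Wide part: linear model over active raw and crossed binary features.
    wide_logit = sum(wide_weights.get(f, 0.0) for f in wide_features)
    # Deep part: concatenate embeddings, apply one ReLU layer, then a linear output.
    deep_input = [v for key in embedding_keys for v in embeddings[key]]
    hidden = relu_layer(deep_input, hidden_w, hidden_b)
    deep_logit = sum(w * h for w, h in zip(out_w, hidden))
    return sigmoid(wide_logit + deep_logit + bias)


# Made-up parameters for one (user, item) pair.
wide_weights = {"gender=F": 0.1, "gender=F&category=shoes": 0.8}
embeddings = {"user:u1": [0.5, -0.2], "category:shoes": [0.3, 0.4]}
hidden_w = [[0.2, -0.1, 0.4, 0.3], [-0.3, 0.5, 0.1, 0.2]]
hidden_b = [0.0, 0.1]
out_w = [0.6, -0.4]

score = wide_and_deep_score(
    ["gender=F", "gender=F&category=shoes"], wide_weights,
    ["user:u1", "category:shoes"], embeddings,
    hidden_w, hidden_b, out_w, bias=-0.5)
print(score)
```

The crossed feature gender=F&category=shoes is the memorization path: its weight only fires for that exact combination, while the embedding path can still produce a score for combinations that never appeared together.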

Re‑ranking (or post‑ranking) fine‑tunes the ordered list to satisfy business goals such as diversity, exposure control, or sensitive filtering. The Maximal Marginal Relevance (MMR) algorithm balances relevance and diversity, and its Python implementation is shown below:

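def MMR(itemScoreDict, similarityMatrix, lambdaConstant=0.5, topN=20):
    # s: items selected so far (in output order); r: remaining candidates
    s, r = [], list(itemScoreDict.keys())
    # Stop as soon as topN items are selected or the candidates run out
    while r and len(s) < topN:
        # Start from -inf so that candidates with negative scores can still win
        score = float("-inf")
        selectOne = None
        for i in r:
            # Relevance of candidate i to the user
            firstPart = itemScoreDict[i]
            # Highest similarity between i and any already selected item
            secondPart = 0
            for j in s:
                sim2 = similarityMatrix[i][j]
                if sim2 > secondPart:
                    secondPart = sim2
            # MMR score: lambda * relevance - (1 - lambda) * max similarity
            equationScore = lambdaConstant * firstPart - (1 - lambdaConstant) * secondPart
            if equationScore > score:
                score = equationScore
                selectOne = i
        r.remove(selectOne)
        s.append(selectOne)
    return s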

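The algorithm selects items that are highly relevant to the user while being dissimilar to already chosen items. A straightforward implementation rescans every remaining candidate against every selected item at each step, so selecting k items from n candidates costs O(n·k²) similarity lookups, which becomes O(n³) when the whole list is re‑ranked. Caching each candidate's running maximum similarity reduces this to O(n·k), and limiting the candidate set lowers the cost further.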
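One way to cut the cost is to remember, for every candidate, its highest similarity to the items picked so far and update that value only against the newest pick. The sketch below does this with the standard MMR score, lambda * relevance - (1 - lambda) * max similarity. The function name mmr_cached and the toy scores are assumptions for illustration, and the similarity table is assumed to be a nested dict covering every pair.

```python
from typing import Dict, List


def mmr_cached(item_scores: Dict[str, float],
               similarity: Dict[str, Dict[str, float]],
               lambda_constant: float = 0.5,
               top_n: int = 20) -> List[str]:
    selected: List[str] = []
    # max_sim[i]: highest similarity between candidate i and any selected item.
    max_sim = {item: 0.0 for item in item_scores}
    while max_sim and len(selected) < top_n:
        best = max(
            max_sim,
            key=lambda i: lambda_constant * item_scores[i]
            - (1 - lambda_constant) * max_sim[i])
        selected.append(best)
        del max_sim[best]
        # Only the newly selected item can raise a candidate's max similarity.
        for i in max_sim:
            max_sim[i] = max(max_sim[i], similarity[i][best])
    return selected


# Made-up data: a and b are near-duplicates, c is different but less relevant.
scores = {"a": 0.9, "b": 0.85, "c": 0.5}
sim = {
    "a": {"b": 0.95, "c": 0.1},
    "b": {"a": 0.95, "c": 0.1},
    "c": {"a": 0.1, "b": 0.1},
}

print(mmr_cached(scores, sim, lambda_constant=0.5, top_n=2))  # diversity-aware
print(mmr_cached(scores, sim, lambda_constant=1.0, top_n=2))  # relevance only
```

With lambda_constant=0.5 the near-duplicate b is pushed behind c after a is chosen; with lambda_constant=1.0 the similarity penalty vanishes and the order follows relevance alone.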

Overall, the article introduces the full recommendation pipeline, key algorithms, and practical considerations for building scalable, high‑performance recommender systems.
