Collaborative Filtering Recommendation Systems: Evaluation Metrics, User‑Based and Item‑Based CF with Python Implementations
This article reviews recommendation system evaluation metrics such as precision, recall, coverage and novelty, explains the principles of user‑based and item‑based collaborative filtering, provides complete Python code for each method, and compares their characteristics and suitable application scenarios.
Evaluation Metrics for Recommendation Systems
To assess the quality of recommendation algorithms, several metrics are used:
Precision – the proportion of recommended items that are relevant, i.e. that appear in the user's test set.
Recall – the proportion of all relevant items that are recommended.
Coverage – the fraction of the whole item space that appears in recommendations.
Novelty – measured here by the average log‑popularity of recommended items; a lower average means less popular items are being recommended, indicating how well long‑tail items are covered.
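As a quick illustration of the four definitions above, here is a minimal standalone sketch on a toy dataset. The function name `evaluate`, its dictionary inputs, and the toy users and items are assumptions made for this example only; they are not part of the article's classes.

```python
import math

def evaluate(recommendations, test, train):
    """Compute recall, precision, coverage and average log-popularity.

    recommendations: {user: [item, ...]}  lists produced by some recommender
    test:            {user: {item, ...}}  held-out relevant items
    train:           {user: {item, ...}}  items seen during training
    (all three structures are hypothetical inputs for this sketch)
    """
    hits = sum(len(set(recs) & test.get(u, set()))
               for u, recs in recommendations.items())
    n_recommended = sum(len(recs) for recs in recommendations.values())
    n_relevant = sum(len(test.get(u, set())) for u in recommendations)

    # Popularity of an item = number of training users who interacted with it.
    popularity = {}
    for items in train.values():
        for i in items:
            popularity[i] = popularity.get(i, 0) + 1

    recommended_items = {i for recs in recommendations.values() for i in recs}
    avg_log_pop = sum(math.log(1 + popularity.get(i, 0))
                      for recs in recommendations.values()
                      for i in recs) / n_recommended

    return {
        "recall": hits / n_relevant,
        "precision": hits / n_recommended,
        "coverage": len(recommended_items) / len(popularity),
        "avg_log_popularity": avg_log_pop,
    }

# Toy data: 3 users, 4 items.
train = {"u1": {"a", "b"}, "u2": {"b", "c"}, "u3": {"a", "d"}}
test = {"u1": {"c", "d"}, "u2": {"a"}, "u3": {"b"}}
recs = {"u1": ["c"], "u2": ["a", "d"], "u3": ["c"]}

print(evaluate(recs, test, train))
```

On this toy data, 2 of the 4 recommended items are hits and 2 of the 4 relevant items are found, so precision and recall are both 0.5; 3 of the 4 catalog items appear in some list, so coverage is 0.75.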
Code implementations of these metrics:
import math

# The functions below are methods of the recommender classes (UserBasedCF / ItemBasedCF):
# they rely on self.train, self.test and self.Recommend.
# train is the training set, test is the held-out set; N items are recommended per user.
def RecallAndPrecision(self, train=None, test=None, K=3, N=10):
    train = train or self.train
    test = test or self.test
    hit = 0
    recall = 0
    precision = 0
    for user in train.keys():
        tu = test.get(user, {})
        rank = self.Recommend(user, K=K, N=N)
        for i, _ in rank.items():
            if i in tu:
                hit += 1
        recall += len(tu)
        # Assumes Recommend returns exactly N items for every user.
        precision += N
    recall = hit / (recall * 1.0)
    precision = hit / (precision * 1.0)
    return (recall, precision)

# Coverage
def Coverage(self, train=None, test=None, K=3, N=10):
    train = train or self.train
    recommend_items = set()
    all_items = set()
    for user, items in train.items():
        for i in items.keys():
            all_items.add(i)
        # Pass N explicitly so coverage is measured on the same list length as the other metrics.
        rank = self.Recommend(user, K=K, N=N)
        for i, _ in rank.items():
            recommend_items.add(i)
    return len(recommend_items) / (len(all_items) * 1.0)

# Novelty
def Popularity(self, train=None, test=None, K=3, N=10):
    train = train or self.train
    # Popularity of an item = number of training users who interacted with it.
    item_popularity = dict()
    for user, items in train.items():
        for i in items.keys():
            item_popularity.setdefault(i, 0)
            item_popularity[i] += 1
    ret = 0  # novelty result (sum of log-popularities)
    n = 0  # total recommended items
    for user in train.keys():
        rank = self.Recommend(user, K=K, N=N)
        for item, _ in rank.items():
            ret += math.log(1 + item_popularity[item])
            n += 1
    ret /= n * 1.0
    return ret

User‑Based Collaborative Filtering
The core idea is to find users with similar interests to the target user and recommend items liked by those similar users, typically using cosine similarity.
Algorithm steps:
Build an item‑to‑user inverted index.
Construct the co‑occurrence matrix C[u][v] counting common items between users u and v.
Compute the similarity matrix W[u][v] = C[u][v] / sqrt(N[u] * N[v]), where N[u] is the number of items user u has interacted with.
Estimate a target user's interest in an item i by aggregating scores from the K most similar users.
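The four steps above can be traced on a tiny dataset. The following is a minimal sketch, assuming implicit feedback (every interaction counts as 1, so ratings are ignored); the function names `user_similarity` and `recommend` and the toy users A to D are hypothetical and chosen only for this illustration.

```python
import math
from collections import defaultdict

def user_similarity(train):
    """train: {user: set of items}. Returns W[u][v] = C[u][v] / sqrt(N[u] * N[v])."""
    # Step 1: item -> users inverted index.
    item_users = defaultdict(set)
    for user, items in train.items():
        for item in items:
            item_users[item].add(user)

    # Step 2: co-occurrence counts C[u][v] = number of items shared by u and v.
    C = defaultdict(lambda: defaultdict(int))
    for users in item_users.values():
        for u in users:
            for v in users:
                if u != v:
                    C[u][v] += 1

    # Step 3: normalize by the sizes of both users' item sets.
    return {
        u: {v: cuv / math.sqrt(len(train[u]) * len(train[v]))
            for v, cuv in related.items()}
        for u, related in C.items()
    }

def recommend(user, train, W, K=3, N=10):
    """Step 4: score unseen items using the K most similar users."""
    seen = train[user]
    scores = defaultdict(float)
    neighbours = sorted(W.get(user, {}).items(),
                        key=lambda x: x[1], reverse=True)[:K]
    for v, wuv in neighbours:
        for item in train[v]:
            if item not in seen:
                scores[item] += wuv  # each interaction counts as 1
    return sorted(scores.items(), key=lambda x: x[1], reverse=True)[:N]

train = {
    "A": {"a", "b", "d"},
    "B": {"a", "c"},
    "C": {"b", "e"},
    "D": {"c", "d", "e"},
}
W = user_similarity(train)
print(W["A"])
print(recommend("A", train, W, K=3))
```

Here user A shares one item with each of B, C and D, so W[A][B] = W[A][C] = 1/sqrt(6) and W[A][D] = 1/3. Items c and e each collect a score of 1/sqrt(6) + 1/3 for user A.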
import math

class UserBasedCF:
    def __init__(self, train_file, test_file):
        self.train_file = train_file
        self.test_file = test_file
        self.readData()

    def readData(self):
        # Each line holds four tab-separated fields: user, item, score, and a fourth field that is ignored.
        self.train = dict()
        with open(self.train_file) as f:
            for line in f:
                user, item, score, _ = line.strip().split("\t")
                self.train.setdefault(user, {})
                self.train[user][item] = int(score)
        self.test = dict()
        with open(self.test_file) as f:
            for line in f:
                user, item, score, _ = line.strip().split("\t")
                self.test.setdefault(user, {})
                self.test[user][item] = int(score)

    def UserSimilarity(self):
        # Inverted index: item -> set of users who interacted with it.
        self.item_users = dict()
        for user, items in self.train.items():
            for i in items.keys():
                self.item_users.setdefault(i, set()).add(user)
        C = dict()  # C[u][v]: number of items shared by users u and v
        N = dict()  # N[u]: number of items user u interacted with
        for i, users in self.item_users.items():
            for u in users:
                N.setdefault(u, 0)
                N[u] += 1
                C.setdefault(u, {})
                for v in users:
                    if u == v:
                        continue
                    C[u].setdefault(v, 0)
                    C[u][v] += 1
        self.W = dict()
        for u, related in C.items():
            self.W.setdefault(u, {})
            for v, cuv in related.items():
                self.W[u][v] = cuv / math.sqrt(N[u] * N[v])
        return self.W

    def Recommend(self, user, K=3, N=10):
        # UserSimilarity() must be called first so that self.W exists.
        rank = dict()
        action_item = self.train[user].keys()
        # .get avoids a KeyError for users with no neighbors.
        for v, wuv in sorted(self.W.get(user, {}).items(), key=lambda x: x[1], reverse=True)[:K]:
            for i, rvi in self.train[v].items():
                if i in action_item:
                    continue
                rank.setdefault(i, 0)
                rank[i] += wuv * rvi
        return dict(sorted(rank.items(), key=lambda x: x[1], reverse=True)[:N])

Item‑Based Collaborative Filtering
Item‑Based CF recommends items similar to those the user has already liked. Similarity between items i and j is computed from co‑occurrence across users.
Algorithm steps:
Build a user‑to‑item inverted index.
Construct the item co‑occurrence matrix C[i][j].
Compute the similarity matrix W[i][j] = C[i][j] / sqrt(N[i] * N[j]), where N[i] is the number of users who have interacted with item i.
Estimate a user's interest in an unseen item j by aggregating similarity scores from items the user has interacted with.
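The same steps can be sketched on toy data. This is a minimal illustration under the assumption of implicit feedback (each interaction counts as 1); the names `item_similarity`, `recommend`, `n_users` and the toy users u1 to u4 are hypothetical and not taken from the article's class.

```python
import math
from collections import defaultdict

def item_similarity(train):
    """train: {user: set of items}. Returns W[i][j] = C[i][j] / sqrt(N[i] * N[j])."""
    C = defaultdict(lambda: defaultdict(int))  # C[i][j]: users who have both i and j
    n_users = defaultdict(int)                 # N[i]: users who have item i
    for items in train.values():
        for i in items:
            n_users[i] += 1
            for j in items:
                if i != j:
                    C[i][j] += 1
    return {
        i: {j: cij / math.sqrt(n_users[i] * n_users[j])
            for j, cij in related.items()}
        for i, related in C.items()
    }

def recommend(user, train, W, K=3, N=10):
    """Score unseen items by summing similarity to each item the user has."""
    seen = train[user]
    scores = defaultdict(float)
    for i in seen:
        top_k = sorted(W.get(i, {}).items(),
                       key=lambda x: x[1], reverse=True)[:K]
        for j, wij in top_k:
            if j not in seen:
                scores[j] += wij  # each interaction counts as 1
    return sorted(scores.items(), key=lambda x: x[1], reverse=True)[:N]

train = {
    "u1": {"a", "b"},
    "u2": {"a", "b", "c"},
    "u3": {"a", "c"},
    "u4": {"b"},
}
W = item_similarity(train)
print(W["a"])
print(recommend("u4", train, W))
```

In this data items a and b are each held by three users and co-occur twice, so W[a][b] = 2/3, while W[b][c] = 1/sqrt(6). User u4, who only has b, is therefore recommended a ahead of c.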
import math

class ItemBasedCF:
    def __init__(self, train_file, test_file):
        self.train_file = train_file
        self.test_file = test_file
        self.readData()

    def readData(self):
        # Each line holds four tab-separated fields: user, item, score, and a fourth field that is ignored.
        self.train = dict()
        with open(self.train_file) as f:
            for line in f:
                user, item, score, _ = line.strip().split("\t")
                self.train.setdefault(user, {})
                self.train[user][item] = int(score)
        self.test = dict()
        with open(self.test_file) as f:
            for line in f:
                user, item, score, _ = line.strip().split("\t")
                self.test.setdefault(user, {})
                self.test[user][item] = int(score)

    def ItemSimilarity(self):
        C = dict()  # C[i][j]: number of users who interacted with both i and j
        N = dict()  # N[i]: number of users who interacted with item i
        for user, items in self.train.items():
            for i in items.keys():
                N.setdefault(i, 0)
                N[i] += 1
                C.setdefault(i, {})
                for j in items.keys():
                    if i == j:
                        continue
                    C[i].setdefault(j, 0)
                    C[i][j] += 1
        self.W = dict()
        for i, related in C.items():
            self.W.setdefault(i, {})
            for j, cij in related.items():
                self.W[i][j] = cij / (math.sqrt(N[i] * N[j]))
        return self.W

    def Recommend(self, user, K=3, N=10):
        # ItemSimilarity() must be called first so that self.W exists.
        rank = dict()
        action_item = self.train[user]
        for item, score in action_item.items():
            # .get avoids a KeyError for items that never co-occur with another item.
            for j, wj in sorted(self.W.get(item, {}).items(), key=lambda x: x[1], reverse=True)[:K]:
                if j in action_item:
                    continue
                rank.setdefault(j, 0)
                rank[j] += score * wj
        return dict(sorted(rank.items(), key=lambda x: x[1], reverse=True)[:N])

Differences and Applications of UserCF vs. ItemCF
UserCF works well when the number of users is moderate, the recommendation scenario requires timely personalization, and new items are abundant; however, it struggles with new users and offers limited explainability.
ItemCF excels when the item catalog is smaller than the user base, long‑tail items are important, and explanations based on item similarity are desired; it is less suited for rapidly changing item pools.
Typical use cases: UserCF for news or real‑time feeds; ItemCF for e‑commerce, video platforms, and book recommendation where item turnover is slower.
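To make the explainability point concrete, the sketch below shows one way an ItemCF score could be broken down into per-item contributions, so that a recommendation can be presented as "because you interacted with X". The function `explain`, the similarity values and the item names are all invented for illustration; for simplicity the sketch also ignores the top-K truncation used in Recommend.

```python
def explain(history, item_sim, target):
    """List the history items that contribute to recommending `target`.

    history:  {item: score}                   the user's past interactions
    item_sim: {item: {other_item: similarity}} hypothetical precomputed values
    Returns [(history_item, contribution), ...], largest contribution first.
    """
    contributions = [
        (item, score * item_sim.get(item, {}).get(target, 0.0))
        for item, score in history.items()
    ]
    # Keep only items that actually contribute to the target's score.
    contributions = [(item, c) for item, c in contributions if c > 0]
    return sorted(contributions, key=lambda x: x[1], reverse=True)

# Made-up similarity values for a tiny catalog.
item_sim = {
    "running shoes": {"sports socks": 0.8, "water bottle": 0.5},
    "yoga mat": {"water bottle": 0.6},
}
history = {"running shoes": 1, "yoga mat": 1}

print(explain(history, item_sim, "water bottle"))
```

With these made-up values, "yoga mat" (0.6) is the strongest reason for recommending "water bottle", followed by "running shoes" (0.5). UserCF has no equally direct counterpart, because its scores come from other users rather than from items the target user recognizes.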