How Cosine Similarity Powers Movie Recommendations: A Python Guide
This tutorial explains various similarity metrics such as cosine similarity, Euclidean distance, Jaccard index, and Pearson correlation, demonstrates a Python function to compute user interest similarity, and shows how to generate movie recommendations with example code and output.
After converting a dataset, we can use similarity metrics to find movies similar to those a user has watched. Commonly used metrics include:
Cosine similarity
Euclidean distance
Jaccard index
Pearson correlation
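Euclidean distance and the Jaccard index are listed above but not developed further in this guide. As a rough illustration only, the sketch below computes both on small made-up inputs. The function names, the rating vectors, and the watched sets are assumptions introduced here and are not taken from the original code.

```python
from math import sqrt

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length rating vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def jaccard_index(set_a, set_b):
    """Size of the intersection divided by the size of the union."""
    union = set_a | set_b
    if not union:
        return 0.0  # two empty sets: treat as no similarity
    return len(set_a & set_b) / len(union)

# Hypothetical data, invented for this example.
ratings_a = [3.0, 4.0, 5.0]
ratings_b = [3.0, 1.0, 1.0]
watched_a = {"Snitch", "Just My Luck", "Superman Returns"}
watched_b = {"Snitch", "Superman Returns", "The Night Listener"}

print(euclidean_distance(ratings_a, ratings_b))  # 5.0
print(jaccard_index(watched_a, watched_b))       # 0.5
```

Note that Euclidean distance is a distance (smaller means more alike), while the Jaccard index is a similarity between 0 and 1 that ignores rating values and looks only at which items overlap.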
4.1 Cosine Similarity
Cosine similarity evaluates how alike two vectors are by calculating the cosine of the angle between them. It is easiest to visualize in 2‑dimensional space, but it applies to vectors of any dimension.
It is widely used in machine learning for measuring similarity between users or items. For two rating vectors A and B over n movies, the formula is:
$$\text{similarity}(A, B) = \cos\theta = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2}\,\sqrt{\sum_{i=1}^{n} B_i^2}}$$
The formula can be interpreted as the sum of the products of user A's and user B's ratings for each movie, divided by the product of the square roots of the sum of squares of each user's ratings.
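To make the arithmetic concrete, here is a minimal worked sketch of that computation on two-movie rating vectors. The helper name cosine and the sample ratings are assumptions made up for illustration.

```python
from math import sqrt

def cosine(a, b):
    """Cosine of the angle between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)

# Hypothetical ratings for the same two movies by two users.
user_a = [3.0, 4.0]
user_b = [4.0, 3.0]

# dot = 3*4 + 4*3 = 24; both norms are 5, so the result is 24 / 25.
print(round(cosine(user_a, user_b), 2))      # 0.96

# Scaling a vector does not change its direction, so the score is 1.0.
print(round(cosine(user_a, [6.0, 8.0]), 2))  # 1.0
```

The second call shows why cosine similarity measures direction rather than magnitude: a user who rates everything twice as high still points the same way.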
4.2 Pearson Correlation
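Pearson correlation is closely related to cosine similarity: it is the cosine similarity of the two rating vectors after each has had its own mean subtracted, so it corrects for users who rate consistently higher or lower than others. Detailed explanations can be found on Wikipedia.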
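As a rough illustration of how Pearson correlation relates to cosine similarity, the sketch below computes it as the cosine of mean-centered vectors. The function name pearson and the two sample raters are assumptions invented for this example. It is not part of the original code.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation of two equal-length rating vectors."""
    n = len(a)
    if n == 0:
        return 0.0
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    # Subtract each rater's own average before comparing directions.
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    numerator = sum(x * y for x, y in zip(centered_a, centered_b))
    denominator = (sqrt(sum(x * x for x in centered_a))
                   * sqrt(sum(y * y for y in centered_b)))
    if denominator == 0:
        return 0.0  # a rater with constant ratings has no variation
    return numerator / denominator

# Hypothetical raters: same taste, but one scores two points higher.
harsh = [1.0, 2.0, 3.0]
generous = [3.0, 4.0, 5.0]

print(round(pearson(harsh, generous), 2))  # 1.0
```

Cosine similarity on the same raw vectors comes out at roughly 0.98, slightly below 1, because the constant offset changes the angle between them. Centering removes that offset.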
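With the metric defined, we can write a function that computes cosine similarity between two entries of the converted dataset (movie1 and movie2 in the code), using only the ratings they have in common. This function forms the core of our recommendation system.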
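from math import sqrt

def cos_similarity(people, movie1, movie2):
    # Collect the keys that appear under both entries.
    si = {}
    for item in people[movie1]:
        if item in people[movie2]:
            si[item] = 1
    # No overlap means there is nothing to compare.
    if len(si) == 0:
        return 0
    sum1 = 0   # sum of products of the shared ratings
    sum21 = 0  # sum of squared ratings for movie1
    sum22 = 0  # sum of squared ratings for movie2
    for item in si:
        sum1 += people[movie1][item] * people[movie2][item]
        sum21 += pow(people[movie1][item], 2)
        sum22 += pow(people[movie2][item], 2)
    # Guard against division by zero when all shared ratings are 0.
    if sum21 == 0 or sum22 == 0:
        return 0
    return round(sum1 / (sqrt(sum21) * sqrt(sum22)), 2)
5 Output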
First, we need a collection of movies that have been watched:
movies_watched = ["You, Me and Dupree", "Catch Me If You Can", "Snitch"]
For each watched movie, the system computes its similarity to every other movie and prints the scores, for example:
------------------------------
| You, Me and Dupree |
-------------------------------
Catch Me If You Can 0.97
Just My Luck 0.85
Lady in the Water 0.96
Snakes on a Plane 0.97
Snitch 1.0
Superman Returns 0.98
The Night Listener 0.96
------------------------------
| Catch Me If You Can |
------------------------------
Just My Luck 1.0
Lady in the Water 0.98
Snakes on a Plane 0.99
Snitch 1.0
Superman Returns 1.0
The Night Listener 0.92
You, Me and Dupree 0.97
------------------------------
| Snitch |
------------------------------
Catch Me If You Can 1.0
Just My Luck 1.0
Lady in the Water 0.91
Snakes on a Plane 0.99
Superman Returns 0.99
The Night Listener 0.88
You, Me and Dupree 1.0
------------------------------
By setting a similarity threshold (e.g., 0.98), only movies scoring at or above the threshold are displayed, yielding a more concise recommendation list.
------------------------------
| You, Me and Dupree |
-------------------------------
Snitch 1.0
Superman Returns 0.98
------------------------------
| Catch Me If You Can |
------------------------------
Just My Luck 1.0
Lady in the Water 0.98
Snakes on a Plane 0.99
Snitch 1.0
Superman Returns 1.0
------------------------------
| Snitch |
------------------------------
Catch Me If You Can 1.0
Just My Luck 1.0
Snakes on a Plane 0.99
Superman Returns 0.99
You, Me and Dupree 1.0
------------------------------
The complete code is available on GitHub at https://github.com/Mitko06/Recommender-System.
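The loop that produces the listings above is not shown in this article, so the following is only a minimal sketch of how it might look, assuming the dataset is a dictionary keyed by movie title. The names recommend_similar and ratings_by_movie, the rater names, and the ratings are all hypothetical. cos_similarity is restated compactly so the block runs on its own, and the code in the linked repository may differ.

```python
from math import sqrt

def cos_similarity(people, movie1, movie2):
    """Cosine similarity over the keys shared by two entries."""
    shared = [r for r in people[movie1] if r in people[movie2]]
    if not shared:
        return 0
    dot = sum(people[movie1][r] * people[movie2][r] for r in shared)
    sq1 = sum(people[movie1][r] ** 2 for r in shared)
    sq2 = sum(people[movie2][r] ** 2 for r in shared)
    if sq1 == 0 or sq2 == 0:
        return 0
    return round(dot / (sqrt(sq1) * sqrt(sq2)), 2)

def recommend_similar(people, watched, threshold=0.0):
    """For each watched movie, list (other movie, score) pairs
    whose score is at or above the threshold (hypothetical helper)."""
    results = {}
    for movie in watched:
        scored = []
        for other in sorted(people):
            if other == movie:
                continue  # skip comparing a movie with itself
            score = cos_similarity(people, movie, other)
            if score >= threshold:
                scored.append((other, score))
        results[movie] = scored
    return results

# Hypothetical converted dataset: movie -> {rater: rating}.
ratings_by_movie = {
    "Snitch": {"Ann": 4.0, "Bob": 3.0},
    "Just My Luck": {"Ann": 4.0, "Bob": 3.0, "Cy": 2.0},
    "The Night Listener": {"Ann": 1.0, "Bob": 5.0},
}

for movie, scored in recommend_similar(ratings_by_movie, ["Snitch"], 0.98).items():
    print(movie)
    for other, score in scored:
        print(" ", other, score)
```

With these made-up ratings and a threshold of 0.98, only Just My Luck (score 1.0) is kept for Snitch. Using >= rather than > matches the filtered listing above, where scores of exactly 0.98 are still shown.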
Conclusion
We have covered the fundamentals of building a recommendation system, focusing on similarity metrics such as cosine similarity, Euclidean distance, and Pearson correlation. While real‑world systems are more complex, these basics form the backbone of most recommender engines.
21CTO
21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.
