How Collaborative Filtering Powers Recommendations: From Manhattan to Cosine Similarity
This article walks through the fundamentals of recommendation systems. It explains collaborative filtering and several similarity measures (Manhattan, Euclidean, Minkowski, Pearson correlation, and cosine similarity), discusses their suitability for dense, sparse, or biased rating data, and introduces K‑Nearest Neighbors for practical implementation.
We begin our data‑mining journey with recommendation systems, which appear everywhere; Amazon’s “Customers who bought this also bought” section is a familiar example.
In the Amazon example, two elements drive recommendations: the item you viewed and the other users who also viewed that item.
How Do We Find Similar Users?
The first step is to locate users similar to the target user. Using a simple two‑dimensional model in which users rate two books on a five‑star scale, we illustrate how similarity can be calculated.
Three users’ ratings for the two books are shown in the table below.
To recommend a book to a mysterious user X, who gave four stars to one book and two stars to the other, we compute X’s distance to each of the other users.
Manhattan Distance
The Manhattan distance between two points (x1, y1) and (x2, y2) is the sum of absolute differences: |x1 − x2| + |y1 − y2|.
Using this metric, user Amy is the closest to X, so we recommend the book Amy rated highly.
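To make this concrete, here is a minimal Python sketch of the Manhattan distance between two rating vectors. X’s scores (4, 2) come from the article; the second user’s scores are hypothetical stand‑ins, since the article’s rating table is not reproduced here.

```python
def manhattan(rating1, rating2):
    """Manhattan distance: sum of absolute differences between ratings."""
    return sum(abs(a - b) for a, b in zip(rating1, rating2))

# X's ratings from the article: four stars and two stars.
x = (4, 2)
# A hypothetical second user, for illustration only.
other = (5, 5)
print(manhattan(x, other))  # |4 - 5| + |2 - 5| = 4
```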
Euclidean Distance
The Euclidean distance is the straight‑line distance between two points, computed via the Pythagorean theorem.
Applying this to the same users yields a distance of 4 between Amy and X.
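A corresponding sketch for the Euclidean distance, using the same illustrative pair of users as above:

```python
import math

def euclidean(rating1, rating2):
    """Euclidean distance: straight-line distance via the Pythagorean theorem."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(rating1, rating2)))

x, other = (4, 2), (5, 5)   # X's ratings from the article; a hypothetical user
print(euclidean(x, other))  # sqrt(1 + 9) ≈ 3.162
```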
Minkowski Distance
Both Manhattan and Euclidean distances are special cases of the Minkowski distance, d(x, y) = (Σ |x_k − y_k|^r)^(1/r), where the parameter r determines the metric (r = 1 gives Manhattan, r = 2 gives Euclidean, and the limit r → ∞ gives Chebyshev).
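Both metrics therefore fall out of a single function. The sketch below takes r as a parameter and reproduces Manhattan at r = 1 and Euclidean at r = 2:

```python
def minkowski(rating1, rating2, r):
    """Minkowski distance: r = 1 gives Manhattan, r = 2 gives Euclidean."""
    return sum(abs(a - b) ** r for a, b in zip(rating1, rating2)) ** (1.0 / r)

x, other = (4, 2), (5, 5)      # same illustrative pair as above
print(minkowski(x, other, 1))  # 4.0, the Manhattan distance
print(minkowski(x, other, 2))  # ~3.162, the Euclidean distance
```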
Pearson Correlation Coefficient
Pearson correlation measures the linear relationship between two users’ rating vectors, ranging from −1 (perfect disagreement) to 1 (perfect agreement). Because it is insensitive to shifts and scaling, it helps compare users whose rating scales differ, such as one who rates everything high and one who rates everything low.
An approximate formula allows the coefficient to be computed in a single pass over the data, which is useful for large datasets: r ≈ (Σxy − (Σx·Σy)/n) / (√(Σx² − (Σx)²/n) · √(Σy² − (Σy)²/n)).
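A sketch of that single‑pass computation, assuming each user’s ratings are stored in a dict keyed by item (only items rated by both users are compared):

```python
from math import sqrt

def pearson(rating1, rating2):
    """Single-pass approximation of the Pearson correlation coefficient."""
    sum_xy = sum_x = sum_y = sum_x2 = sum_y2 = n = 0
    for item in rating1:
        if item in rating2:
            x, y = rating1[item], rating2[item]
            n += 1
            sum_xy += x * y
            sum_x += x
            sum_y += y
            sum_x2 += x * x
            sum_y2 += y * y
    if n == 0:
        return 0.0  # no items rated by both users
    denominator = sqrt(sum_x2 - sum_x ** 2 / n) * sqrt(sum_y2 - sum_y ** 2 / n)
    return (sum_xy - sum_x * sum_y / n) / denominator if denominator else 0.0
```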
Cosine Similarity
Cosine similarity evaluates the angle between two rating vectors and handles sparse data well, since items that neither user has rated do not affect the result. It is computed as the dot product of the vectors divided by the product of their magnitudes: cos(x, y) = (x · y) / (‖x‖ · ‖y‖).
A worked example with two users whose rating vectors point in nearly the same direction yields a cosine similarity of 0.935, close to the maximum of 1 and indicating a strong match.
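A minimal sketch of the cosine calculation over two rating vectors (the pair shown is illustrative, not the article’s 0.935 example):

```python
from math import sqrt

def cosine_similarity(rating1, rating2):
    """Cosine similarity: dot product over the product of magnitudes."""
    dot = sum(a * b for a, b in zip(rating1, rating2))
    mag1 = sqrt(sum(a * a for a in rating1))
    mag2 = sqrt(sum(b * b for b in rating2))
    return dot / (mag1 * mag2) if mag1 and mag2 else 0.0

print(cosine_similarity((4, 2), (5, 5)))  # ≈ 0.949 for this illustrative pair
```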
K‑Nearest Neighbors
In collaborative filtering, the K‑Nearest Neighbors algorithm selects the K users most similar to the target (by Pearson, Manhattan, Euclidean, or cosine similarity) and generates recommendations from their ratings, weighting each neighbor’s contribution by its similarity.
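Putting the pieces together, here is a sketch of such a KNN recommender. The nested‑dict layout and the name knn_recommend are illustrative choices, not from the article; Pearson serves as the similarity measure, and any of the other measures could be swapped in.

```python
from math import sqrt

def pearson(r1, r2):
    """Single-pass Pearson correlation (same formula as the sketch above)."""
    common = [item for item in r1 if item in r2]
    n = len(common)
    if n == 0:
        return 0.0
    sx = sum(r1[i] for i in common)
    sy = sum(r2[i] for i in common)
    sxy = sum(r1[i] * r2[i] for i in common)
    sx2 = sum(r1[i] ** 2 for i in common)
    sy2 = sum(r2[i] ** 2 for i in common)
    den = sqrt(sx2 - sx ** 2 / n) * sqrt(sy2 - sy ** 2 / n)
    return (sxy - sx * sy / n) / den if den else 0.0

def knn_recommend(target, ratings, k=3):
    """Recommend unseen items to `target` from a dict of user -> {item: score},
    weighting each of the k nearest neighbors by its share of total similarity."""
    # Rank every other user by similarity to the target, most similar first.
    neighbors = sorted(
        ((pearson(ratings[target], ratings[u]), u) for u in ratings if u != target),
        reverse=True,
    )[:k]
    total = sum(sim for sim, _ in neighbors) or 1.0
    scores = {}
    for sim, user in neighbors:
        for item, score in ratings[user].items():
            if item not in ratings[target]:  # only recommend unseen items
                scores[item] = scores.get(item, 0.0) + score * (sim / total)
    # Highest projected score first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```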
Finally, as a small project, try implementing the distance calculations above and applying a recommendation algorithm to a movie‑rating dataset.
StarRing Big Data Open Lab
Focused on big data technology research, exploring the Big Data era | [email protected]
