Build Smart Product Recommendations with Python’s Apriori Algorithm
This article explains how intelligent recommendation differs from generic marketing, introduces association‑rule concepts such as support, confidence, and lift, and provides a step‑by‑step Python implementation using the Apriori algorithm to generate and interpret market‑basket recommendations.
Intelligent recommendation differs from generic marketing by focusing on customer needs, e.g., "you may also like" in e‑commerce.
The article is divided into two parts: an explanation of the underlying principles and a hands-on Python implementation.
Common categories of recommendation systems
By application domain: e‑commerce recommendation, social (friend) recommendation, etc.
By design approach: collaborative filtering, etc.
By data used: user‑tag based recommendation, etc.
This article focuses on classic market‑basket recommendation based on association rules, which are evaluated with three measures: support, confidence, and lift.
Association rule concepts
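For a rule A → B (customers who buy A also buy B):
Support = transactions containing both A and B / total transactions.
Confidence = transactions containing both A and B / transactions containing A.
Lift = confidence / unconditional probability of B (the share of all transactions that contain B).
Lift > 1 means that buying A makes B more likely than its baseline rate, so the rule is a useful recommendation. Lift = 1 means A and B are independent, and lift < 1 means buying A makes B less likely.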
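As a quick check of these definitions, the sketch below computes the three measures for one rule on a handful of toy transactions. The data and the helper name rule_metrics are made up for illustration and are not part of the article's dataset or code.

```python
# Toy transactions (hypothetical data, not the article's bike dataset).
transactions = [
    {"bike", "helmet"},
    {"bike", "helmet", "bottle"},
    {"bike"},
    {"helmet"},
    {"bottle"},
]


def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    Assumes the antecedent and the consequent each occur in at least one
    transaction; otherwise the divisions below would fail.
    """
    n = len(transactions)
    both = sum(1 for t in transactions if antecedent <= t and consequent <= t)
    ante = sum(1 for t in transactions if antecedent <= t)
    cons = sum(1 for t in transactions if consequent <= t)
    support = both / n              # share of all transactions matching the rule
    confidence = both / ante        # share of antecedent transactions matching the rule
    lift = confidence / (cons / n)  # confidence relative to the consequent's base rate
    return support, confidence, lift


support, confidence, lift = rule_metrics(transactions, {"bike"}, {"helmet"})
print(f"support={support:.2f}, confidence={confidence:.2f}, lift={lift:.2f}")
```

Here 2 of 5 transactions contain both items, 3 contain a bike, and 3 contain a helmet, so the rule bike → helmet has support 0.40, confidence 0.67, and lift 1.11.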
Python implementation with Apriori
The data is loaded with pandas. Its columns include OrderNumber (identifies the customer and serves as the transaction ID below), LineNumber (the order in which items were purchased), and Model (the product name).
import pandas as pd
import numpy as np
df = pd.read_csv('bike_data.csv', encoding='gbk')
df.info(); df.head()
Exploratory analysis shows the number of unique products, top‑selling items, and simple visualizations.
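print(f"Data contains {df['Model'].nunique()} products")
# collect the distinct product names (assumed definition; model_names is not defined in the snippet)
model_names = df['Model'].unique().tolist()
# display product names in groups of 5
for i in range(0, len(model_names), 5):
    print(model_names[i:i+5])
Shopping baskets are created using the Apriori library: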
import Apriori as apri
baskets = apri.dataconvert(arulesdata=df, tidvar='OrderNumber',
                           itemvar='Model', data_type='inverted')
Association rules are mined with configurable thresholds (minSupport, minConf, minlen, maxlen). The resulting DataFrame, referred to as result below, contains the columns lhs, rhs, support, confidence, and lift.
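The mining call itself is not shown here, so the following is only a minimal pure‑Python sketch of what an Apriori pass does. It is not the Apriori module's implementation. The toy rows, the function name mine_rules, and its parameters min_support, min_conf, and maxlen are assumptions for illustration; they loosely mirror the thresholds named above, and minlen is omitted.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (OrderNumber, Model) rows in the same long format as the data.
rows = [
    (1, "Bike"), (1, "Helmet"),
    (2, "Bike"), (2, "Helmet"), (2, "Bottle"),
    (3, "Bike"),
    (4, "Helmet"), (4, "Bottle"),
    (5, "Bottle"),
]

# Group products into one basket (set of items) per order.
grouped = defaultdict(set)
for order_number, model in rows:
    grouped[order_number].add(model)
baskets = [frozenset(items) for items in grouped.values()]


def mine_rules(baskets, min_support=0.3, min_conf=0.3, maxlen=2):
    """Return rules as dicts with lhs, rhs, support, confidence, lift."""
    n = len(baskets)

    def support(itemset):
        return sum(1 for b in baskets if itemset <= b) / n

    # Level 1: frequent single items.
    singles = {frozenset([item]) for b in baskets for item in b}
    level = [s for s in singles if support(s) >= min_support]
    frequent = {}
    k = 1
    while level and k <= maxlen:
        for itemset in level:
            frequent[itemset] = support(itemset)
        # Join step: combine frequent k-itemsets into (k+1)-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k + 1}
        # Prune step: every k-item subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in frequent for sub in combinations(c, k))
        }
        level = [c for c in candidates if support(c) >= min_support]
        k += 1

    # Split each frequent itemset into every lhs -> rhs rule.
    rules = []
    for itemset, supp in frequent.items():
        for size in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(itemset, size)):
                rhs = itemset - lhs
                confidence = supp / frequent[lhs]
                if confidence >= min_conf:
                    rules.append({
                        "lhs": ", ".join(sorted(lhs)),
                        "rhs": ", ".join(sorted(rhs)),
                        "support": supp,
                        "confidence": confidence,
                        "lift": confidence / frequent[rhs],
                    })
    return rules


rules = mine_rules(baskets)
for rule in sorted(rules, key=lambda r: (r["lhs"], r["rhs"])):
    print(rule)
```

The prune step is what gives Apriori its name: an itemset can only be frequent if all of its subsets are frequent, so candidates with an infrequent subset are discarded before their support is counted. A list of dicts in this shape can be wrapped in a pandas DataFrame if the tabular form described above is preferred.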
Complementary products are selected by lift > 1, mutually exclusive products by lift < 1:
# hubu: complementary pairs, strongest lift first
hubu = result[result['lift'] > 1].sort_values(by='lift', ascending=False).head(20)
# huchi: mutually exclusive pairs, weakest lift first
huchi = result[result['lift'] < 1].sort_values(by='lift').head(20)
Business interpretation: a lift well above 1 indicates a high‑impact cross‑sell; a lift only slightly above 1 suggests testing the placement before committing to it; a lift below 1 signals products that should not be displayed together.
Using rules for recommendation
Depending on the goal—maximizing marketing response (high confidence) or maximizing sales uplift (high lift)—different rule subsets are chosen to recommend items after a purchase.
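The choice between the two goals can be sketched as a ranking over the rules whose left‑hand side matches the purchased item. The rule values, the function name recommend, and its goal and top_n parameters are hypothetical. Dropping rules with lift of 1 or less before ranking is an assumption based on the lift > 1 criterion above.

```python
# Hypothetical rules in the lhs / rhs / support / confidence / lift layout.
rules = [
    {"lhs": "Bike", "rhs": "Helmet", "support": 0.30, "confidence": 0.60, "lift": 1.2},
    {"lhs": "Bike", "rhs": "Bottle", "support": 0.25, "confidence": 0.50, "lift": 2.0},
    {"lhs": "Bike", "rhs": "Lock", "support": 0.10, "confidence": 0.20, "lift": 0.8},
    {"lhs": "Helmet", "rhs": "Gloves", "support": 0.20, "confidence": 0.40, "lift": 1.6},
]


def recommend(rules, purchased, goal="response", top_n=2):
    """Rank right-hand-side items for a purchased item.

    goal="response" ranks by confidence (most likely to be accepted);
    goal="uplift" ranks by lift (largest gain over the baseline rate).
    """
    metric = {"response": "confidence", "uplift": "lift"}[goal]
    # Keep only rules triggered by this purchase that beat the baseline.
    candidates = [r for r in rules if r["lhs"] == purchased and r["lift"] > 1]
    ranked = sorted(candidates, key=lambda r: r[metric], reverse=True)
    return [r["rhs"] for r in ranked[:top_n]]


by_response = recommend(rules, "Bike", goal="response")
by_uplift = recommend(rules, "Bike", goal="uplift")
print(by_response)
print(by_uplift)
```

With these made‑up numbers the two goals return the same items in a different order: ranking by confidence puts Helmet first, while ranking by lift puts Bottle first, because Bottle is bought far more often with a bike than on its own.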
Conclusion: Apriori‑based association rules provide an easy entry point for intelligent recommendation, but large‑scale scenarios usually require hybrid methods or distributed frameworks such as Spark.
