Identify Top Merchants with Pandas: First‑Order Analysis Tutorial
This article walks through a real‑world Pandas solution that calculates each merchant's total orders and first‑order counts to reveal which merchants are most popular among new customers, using AI‑assisted code generation and step‑by‑step explanation.
Hello, I am a Python enthusiast.
1. Introduction
A community member asked how to determine which merchants are most popular among new customers. The answer needs two figures per merchant: the total number of orders, and the number of first orders, where a first order is the first purchase a customer ever made. Merchants that have received no orders are excluded from the result.
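To make the two figures concrete, here is a small sketch on made-up data. The column names customer_id, merchant_id and order_timestamp follow the ones used in the solution in section 2; order_id and every value are invented for illustration. It picks each customer's earliest order with groupby and idxmin, which is one alternative to sorting, and it assumes no timestamps are missing.

```python
import pandas as pd

# Hypothetical toy data: five orders, three customers, three merchants.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4, 5],
    "customer_id": [10, 10, 11, 12, 12],
    "merchant_id": [100, 101, 100, 102, 100],
    "order_timestamp": pd.to_datetime([
        "2024-01-01 09:00", "2024-01-02 10:00", "2024-01-01 12:00",
        "2024-01-03 08:00", "2024-01-04 08:00",
    ]),
})

# Row label of the earliest order for each customer.
first_idx = orders.groupby("customer_id")["order_timestamp"].idxmin()
first_orders = orders.loc[first_idx]

# Orders per merchant, counted over all orders and over first orders only.
total_per_merchant = orders["merchant_id"].value_counts().sort_index()
first_per_merchant = first_orders["merchant_id"].value_counts().sort_index()

print(total_per_merchant.to_dict())
print(first_per_merchant.to_dict())
```

In this toy data, merchant 101 has one order but no first orders, so it should still appear in the final table, with a first-order count of 0.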
2. Implementation
I used the ChatGLM model to generate a Python solution. The code loads the Excel file, sorts the data by customer ID and order timestamp, extracts each customer's first order, aggregates total orders and first‑order counts per merchant, and merges the results with merchant names.
import pandas as pd
# Load the Excel file
df = pd.read_excel("./data.xlsx")
# Sort by customer_id and order_timestamp to get the first order for each customer
df_sorted = df.sort_values(by=['customer_id', 'order_timestamp'])
# Get the first order for each customer
df_first_orders = df_sorted.drop_duplicates(subset='customer_id', keep='first')
# Aggregate total orders and total customers per merchant
merchant_stats = df.groupby('merchant_id').agg(
total_orders=pd.NamedAgg(column='id_x', aggfunc='count'),
total_customers=pd.NamedAgg(column='customer_id', aggfunc='nunique')
).reset_index()
# Aggregate first orders per merchant
first_order_stats = df_first_orders.groupby('merchant_id').agg(
first_orders=pd.NamedAgg(column='id_x', aggfunc='count')
).reset_index()
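# Merge the two dataframes; a left join keeps merchants that have orders but no first orders
result = pd.merge(merchant_stats, first_order_stats, on='merchant_id', how='left')
result['first_orders'] = result['first_orders'].fillna(0).astype(int)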
# Merge with merchant names
result = pd.merge(result, df[['id_y', 'name']].drop_duplicates(), left_on='merchant_id', right_on='id_y').drop('id_y', axis=1)
print(result.head())
The script runs successfully on local data, producing the expected merchant statistics.
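The same steps can also be wrapped in a function and checked without the Excel file. The sketch below is a variation, not the generated solution itself: the function name merchant_first_order_stats is hypothetical, the sample rows are invented, and it assumes that id_x is the order ID and id_y the merchant ID (the names look like pandas' default merge suffixes, but the source does not say so). It uses a left join so that a merchant with orders but no first orders gets a count of 0, and it sorts by first_orders so the merchants most popular with new customers come first.

```python
import pandas as pd


def merchant_first_order_stats(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: per-merchant order totals and first-order counts."""
    # Earliest order per customer.
    first = (
        df.sort_values(["customer_id", "order_timestamp"])
        .drop_duplicates(subset="customer_id", keep="first")
    )
    # Total orders per merchant, carrying the merchant name along.
    totals = df.groupby("merchant_id").agg(
        name=("name", "first"),
        total_orders=("id_x", "count"),
    )
    # Number of customers whose first order went to each merchant.
    firsts = first.groupby("merchant_id").size().rename("first_orders")
    # Left join: merchants with orders but no first orders get 0.
    out = totals.join(firsts, how="left").fillna({"first_orders": 0})
    out["first_orders"] = out["first_orders"].astype(int)
    return out.sort_values("first_orders", ascending=False).reset_index()


# Made-up rows in the assumed layout of the merged order/merchant table.
df = pd.DataFrame({
    "id_x": [1, 2, 3, 4, 5],
    "customer_id": [10, 10, 11, 12, 12],
    "merchant_id": [100, 101, 100, 102, 100],
    "order_timestamp": pd.to_datetime(
        ["2024-01-01", "2024-01-02", "2024-01-01", "2024-01-03", "2024-01-04"]
    ),
    "id_y": [100, 101, 100, 102, 100],
    "name": ["Alpha", "Beta", "Alpha", "Gamma", "Alpha"],
})
result = merchant_first_order_stats(df)
print(result)
```

Merchants that never received an order have no rows in the order data, so they drop out of the result without any extra filtering.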
3. Conclusion
This article demonstrates a practical Pandas workflow for answering a common business‑analytics question, illustrating how AI can assist in generating functional Python code to solve real‑world data problems.
Python Crawling & Data Mining
