Fundamentals 8 min read

Using Python Pandas for Excel‑Like Data Processing: 14 Common Operations

This article demonstrates how to replace typical Excel tasks such as VLOOKUP, pivot tables, duplicate removal, missing‑value handling, multi‑condition filtering, and grouping with concise Python pandas code, providing step‑by‑step examples and practical tips for efficient data analysis.

Python Programming Learning Circle
Python Programming Learning Circle
Python Programming Learning Circle
Using Python Pandas for Excel‑Like Data Processing: 14 Common Operations

The author, motivated by learning Python, shows how to handle Excel sales data entirely with pandas to strengthen Python skills and data‑processing capability.

1. VLOOKUP equivalent – Split the original table into two DataFrames and use df_c=pd.merge(df1,df2,on="订单明细号",how="left") to bring the profit column from the second table into the first.

2. Pivot table – Summarise profit by region and salesperson with pd.pivot_table(sale, index="地区名称", columns="业务员名称", values="利润", aggfunc=[np.sum, np.mean]) .

3. Compare two columns – Create a duplicate column, modify the first ten rows, and find mismatches using result=sale.loc[sale["订单明细号"].isin(sale["订单明细号2"])==False] .

4. Remove duplicates – Drop duplicate salesperson codes with sale.drop_duplicates("业务员编码", inplace=True) .

5. Missing‑value handling – Detect missing values, fill them with zero using sale["客户名称"]=sale["客户名称"].fillna(0) , or drop rows lacking a customer code with sale.dropna(subset=["客户编码"]) .

6. Multi‑condition filtering – Example: find orders in Beijing by salesperson "张爱" with amount > 5000 using sale.loc[(sale["地区名称"]=="北京") & (sale["业务员名称"]=="张爱") & (sale["订单金额"]>5000)] .

7. Fuzzy filtering – Select rows where the product name contains "三星" or "索尼" with sale.loc[sale["存货名称"].str.contains("三星|索尼")] .

8. Group‑by aggregation – Compute total profit per region and salesperson: sale.groupby(["地区名称","业务员名称"])["利润"].sum() .

9. Binning (categorical grouping) – Create profit bins and labels, then assign groups using sale_area["分组"]=pd.cut(sale_area["利润"], bins, labels=groups) .

10. Business‑logic labeling – Tag high‑margin products as "优质商品" and low‑margin ones as "一般商品" with sale.loc[(sale["利润"]/sale["订单金额"])>0.3, "label"]="优质商品" and similar for <0.05.

The article concludes that both Excel and Python are valuable tools; the choice depends on the specific task, and a good analyst selects the most efficient method.

data analysisdata cleaningexcelpandasdata-manipulationVLOOKUP
Python Programming Learning Circle
Written by

Python Programming Learning Circle

A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.