35 Essential Pandas Operations Every Data Analyst Should Master

This guide walks through 35 core Pandas operations, from creating and inspecting DataFrames to filtering, grouping, merging, pivoting, and iterating, with a concise example for each so you can manipulate and analyze data in Python quickly.

Test Development Learning Exchange

Dec 5, 2024

Pandas is one of the most widely used Python libraries for data processing, offering powerful data structures and analysis tools. The following 35 basic operations cover the essential techniques you need to handle, explore, and transform tabular data efficiently.

1. Import library and create a DataFrame

First import Pandas and build a simple DataFrame from a dictionary:

import pandas as pd

# Column names: 姓名 = name, 年龄 = age, 性别 = gender (男 = male, 女 = female)
data = {
    '姓名': ['张三', '李四', '王五', '赵六'],
    '年龄': [25, 30, 35, 40],
    '性别': ['男', '女', '男', '女']
}

df = pd.DataFrame(data)

2. View the first 5 rows

print(df.head())

3. Inspect DataFrame information

df.info()  # info() prints the summary itself and returns None, so no print() is needed

4. Get descriptive statistics

print(df.describe())

5. List column names

print(df.columns)

6. Show the index

print(df.index)

7. Select a column

print(df['姓名'])

8. Modify a column

df['姓名'] = ['张三丰', '李四光', '王五岳', '赵六令']
print(df)

9. Filter rows by condition

filtered_df = df[df['年龄'] > 30]
print(filtered_df)
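
The filter above works because the comparison produces a boolean Series that is then used to select rows. A minimal sketch of that mechanism follows; the `people` frame and its English column names are illustrative stand-ins, not the article's data.

```python
import pandas as pd

# Hypothetical stand-in data with English column names
people = pd.DataFrame({
    "name": ["Ann", "Bob", "Cid", "Dee"],
    "age": [25, 30, 35, 40],
})

mask = people["age"] > 30         # boolean Series, one True/False per row
older = people[mask]              # keeps only the rows where mask is True
same = people.query("age > 30")   # string-based form of the same filter
```

Both forms return a new DataFrame and leave `people` unchanged.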

10. Update values based on condition

df.loc[df['年龄'] > 30, '年龄'] = 31
print(df)
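
As a rough sketch of what the conditional update does, the example below applies the same rule to a copy and compares it with `Series.where`, which expresses the rule without modifying anything in place. The `people` frame is a hypothetical stand-in for the article's data.

```python
import pandas as pd

# Hypothetical stand-in data
people = pd.DataFrame({"name": ["Ann", "Bob", "Cid"], "age": [25, 35, 40]})

# In-place style: select rows with a condition, name the column, assign
updated = people.copy()
updated.loc[updated["age"] > 30, "age"] = 31

# Non-mutating style: keep values where the condition holds, replace the rest
alt = people["age"].where(people["age"] <= 30, 31)
```

Passing the row condition and the column name to a single `.loc` call avoids the chained-assignment pitfalls of writing `df[cond]['col'] = value`.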

11. Delete rows by condition

df = df[df['年龄'] != 31]
print(df)

12. Add a new column

# Only two rows remain after step 11, so the new column needs exactly two values
df['城市'] = ['北京', '上海']
print(df)

13. Drop a column

df = df.drop('城市', axis=1)
print(df)

14. Rename a column

df = df.rename(columns={'姓名': '名字'})
print(df)

15. Set a column as the index

df = df.set_index('名字')
print(df)

16. Reset the index

df = df.reset_index()
print(df)

17. Sort by a column

sorted_df = df.sort_values(by='年龄')
print(sorted_df)

18. Group by a column and compute mean

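grouped_df = df.groupby('性别')
# Select the numeric column first; averaging the text column '名字' raises a TypeError in recent pandas versions
print(grouped_df['年龄'].mean())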
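
Grouping is not limited to a single mean. The sketch below, using hypothetical stand-in data, computes several statistics per group in one call with `agg`.

```python
import pandas as pd

# Hypothetical stand-in data
people = pd.DataFrame({
    "name": ["Ann", "Bob", "Cid", "Dee"],
    "age": [25, 30, 35, 45],
    "gender": ["F", "M", "M", "F"],
})

# One row per gender, one column per requested statistic
summary = people.groupby("gender")["age"].agg(["mean", "min", "max", "count"])
```

Selecting the `age` column before aggregating keeps text columns such as `name` out of the calculation.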

19. Concatenate two DataFrames (vertical)

data2 = {'名字': ['孙悟空', '猪八戒'], '年龄': [500, 400], '性别': ['男', '男']}

df2 = pd.DataFrame(data2)
merged_df = pd.concat([df, df2])
print(merged_df)
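
By default `pd.concat` keeps each frame's original index labels, so a vertical concatenation can end up with repeated labels. A small sketch with hypothetical frames `a` and `b` shows the difference that `ignore_index=True` makes.

```python
import pandas as pd

# Hypothetical stand-in frames
a = pd.DataFrame({"name": ["Ann", "Bob"], "age": [25, 30]})
b = pd.DataFrame({"name": ["Cid"], "age": [35]})

stacked = pd.concat([a, b])                        # index labels: 0, 1, 0
renumbered = pd.concat([a, b], ignore_index=True)  # index labels: 0, 1, 2
```

A fresh index is usually what you want if later steps select or drop rows by label.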

20. Merge two DataFrames on a column

df3 = pd.DataFrame({'名字': ['白龙马'], '年龄': [300], '性别': ['男']})
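# No name appears in both frames, so the default inner join would return an empty result;
# how='outer' keeps the rows from both sides. Overlapping columns get _x/_y suffixes.
merged_df = pd.merge(df, df3, on='名字', how='outer')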
print(merged_df)
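
The result of a merge depends heavily on the join type. The sketch below uses two hypothetical frames that share only some keys, to contrast the default inner join with a left join; `indicator=True` adds a `_merge` column showing where each row came from.

```python
import pandas as pd

# Hypothetical stand-in frames sharing the key column "name"
people = pd.DataFrame({"name": ["Ann", "Bob", "Cid"], "age": [25, 30, 35]})
cities = pd.DataFrame({"name": ["Ann", "Bob", "Eve"], "city": ["Oslo", "Rome", "Lima"]})

# Inner join (the default): only names present in both frames
inner = pd.merge(people, cities, on="name")

# Left join: every row of `people`, with NaN where no city matches
left = pd.merge(people, cities, on="name", how="left", indicator=True)
```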

21. Merge two DataFrames by index

merged_df = pd.merge(df, df3, left_index=True, right_index=True)
print(merged_df)

22. Concatenate DataFrames horizontally (column‑wise)

connected_df = pd.concat([df, df3], axis=1)
print(connected_df)

23. Slice rows

sliced_df = df.iloc[1:3]  # rows at positions 1 and 2; the end position is excluded
print(sliced_df)

24. Iterate over rows

for index, row in df.iterrows():
    print(row)
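
`iterrows()` builds a Series for every row, which tends to be slow on large frames. As a sketch of two common alternatives on hypothetical stand-in data: `itertuples()` for cases that really need a loop, and a vectorized column expression when the work is plain arithmetic.

```python
import pandas as pd

# Hypothetical stand-in data
people = pd.DataFrame({"name": ["Ann", "Bob"], "age": [25, 30]})

# itertuples yields lightweight named tuples; fields are accessed as attributes
labels = [f"{row.name} is {row.age}" for row in people.itertuples(index=False)]

# Vectorized alternative: operate on the whole column at once, no Python loop
next_year = people["age"] + 1
```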

25. Filter with multiple conditions

filtered_df = df[(df['年龄'] > 25) & (df['性别'] == '男')]
print(filtered_df)

26. Replace values in a column

df['性别'] = df['性别'].replace('男', 'M')
print(df)

27. Map values using a dictionary

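# map() turns every value missing from the dictionary into NaN. Step 26 already
# changed '男' to 'M', so 'M' must be listed as well.
mapping = {'M': 'M', '男': 'M', '女': 'F'}
df['性别'] = df['性别'].map(mapping)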
print(df)
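
`map` and `replace` differ in how they treat values that the dictionary does not mention. The sketch below illustrates this on a hypothetical Series containing one unexpected code, "X".

```python
import pandas as pd

# Hypothetical stand-in data with one value ("X") missing from the dictionary
gender = pd.Series(["M", "F", "X"])
codes = {"M": "male", "F": "female"}

mapped = gender.map(codes)        # unmapped values become NaN
replaced = gender.replace(codes)  # unmapped values are left unchanged
```

Use `map` when every value should be translated and gaps should be visible as NaN; use `replace` when only some values need changing.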

28. Create a simple pivot table

pivot_table = pd.pivot_table(df, values='年龄', index='名字', columns='性别')
print(pivot_table)

29. Pivot table with multi‑level index

pivot_table = pd.pivot_table(df, values='年龄', index=['名字', '性别'])
print(pivot_table)

30. Pivot table with aggregation function

pivot_table = pd.pivot_table(df, values='年龄', index='名字', columns='性别', aggfunc='mean')
print(pivot_table)
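
The aggregation function only matters when several rows fall into the same cell. In the hypothetical data below, two rows share the same city and gender, so the pivot table has something to average; combinations with no rows come out as NaN.

```python
import pandas as pd

# Hypothetical stand-in data: two rows fall into the (Rome, F) cell
people = pd.DataFrame({
    "city": ["Oslo", "Oslo", "Rome", "Rome"],
    "gender": ["F", "M", "F", "F"],
    "age": [20, 30, 40, 50],
})

table = pd.pivot_table(
    people, values="age", index="city", columns="gender", aggfunc="mean"
)
```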

31. Fill missing values

df['年龄'] = df['年龄'].fillna(30)
print(df)
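
The article's frame contains no missing values at this point, so the call above leaves it unchanged. A small sketch on a hypothetical Series with a gap shows the effect, and compares a fixed fill value with the column mean.

```python
import pandas as pd

# Hypothetical stand-in data with one missing age
ages = pd.Series([20.0, None, 30.0])

filled_const = ages.fillna(30)          # fixed replacement value
filled_mean = ages.fillna(ages.mean())  # mean of the non-missing values (25.0)
```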

32. Drop duplicate rows

df = df.drop_duplicates()
print(df)
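
Without arguments, `drop_duplicates` removes only rows that are identical in every column. The sketch below, on hypothetical data, uses the `subset` and `keep` parameters to deduplicate on one column instead.

```python
import pandas as pd

# Hypothetical stand-in data: "Ann" appears twice with different ages
people = pd.DataFrame({"name": ["Ann", "Ann", "Bob"], "age": [25, 26, 30]})

exact = people.drop_duplicates()                              # no fully identical rows, nothing removed
by_name = people.drop_duplicates(subset="name", keep="last")  # keep the last row per name
```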

33. Insert a row at a specific position

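# A fractional label between 0 and 1 places the new row between those two rows once the index is sorted
df.loc[0.5] = ['唐僧', 25, 'M']
df = df.sort_index().reset_index(drop=True)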
print(df)
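
The fractional-label trick relies on the index being the default 0, 1, 2, ... sequence. An alternative sketch that works by position, regardless of the index labels, splits the frame with `iloc` and concatenates the pieces around the new row. The names `people`, `new_row`, and `position` are illustrative.

```python
import pandas as pd

# Hypothetical stand-in data
people = pd.DataFrame({"name": ["Ann", "Bob", "Cid"], "age": [25, 30, 35]})
new_row = pd.DataFrame({"name": ["Eve"], "age": [28]})

position = 1  # the new row will sit at this position in the result
result = pd.concat(
    [people.iloc[:position], new_row, people.iloc[position:]],
    ignore_index=True,
)
```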

34. Delete rows by index

df = df.drop([1, 2])
print(df)

35. Transpose the DataFrame

transposed_df = df.T
print(transposed_df)

Together, these 35 operations cover data creation, inspection, selection, modification, aggregation, merging, pivoting, and iteration, which is enough to clean, transform, and analyze datasets with Pandas in real-world projects.