10 Practical Ways to Iterate and Transform a Pandas DataFrame in Python
This article demonstrates ten practical techniques for iterating over the rows, columns, and values of a pandas DataFrame and for applying common transformations: apply(), vectorized operations, map(), mask(), groupby(), cumsum(), and rolling(). Each technique is illustrated with a concise Python example.
1. Iterate over DataFrame rows
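As a supplement to the iterrows() example in this section, the sketch here uses itertuples(), which the pandas documentation describes as generally faster because it does not build a Series for every row. The variable name rows is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# itertuples() yields one namedtuple per row; the index is exposed as .Index
# and each column as an attribute named after the column
rows = []
for row in df.itertuples():
    rows.append((row.Index, row.A, row.B))
print(rows)
```

Attribute access only works for column names that are valid Python identifiers, so iterrows() can still be the simpler choice for awkwardly named columns.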
import pandas as pd
# Create a sample DataFrame
data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
df = pd.DataFrame(data)
# Iterate over each row; iterrows() yields (index, Series) pairs
for index, row in df.iterrows():
    print(f"Index: {index}, Row: {row}")
2. Iterate over DataFrame columns
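Looping over df.columns gives only the column names. When the column data is needed as well, one option is DataFrame.items(), which yields (name, Series) pairs. The following is a minimal sketch; the dictionary column_values is an illustrative name, not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# items() yields each column name together with its data as a Series
column_values = {}
for col_name, series in df.items():
    column_values[col_name] = series.tolist()
    print(f"Column Name: {col_name}, Values: {series.tolist()}")
```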
import pandas as pd
# Create a sample DataFrame
data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
df = pd.DataFrame(data)
# Iterate over each column name of the DataFrame
for col_name in df.columns:
    print(f"Column Name: {col_name}")
3. Iterate over all values in a DataFrame
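The pandas documentation recommends to_numpy() over the .values attribute for obtaining the underlying array. The sketch below flattens the array in row-major order, which is the order the values are visited in; flat_values is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# to_numpy() returns a 2-D array; ravel() flattens it row by row
flat_values = df.to_numpy().ravel().tolist()
for value in flat_values:
    print(f"Value: {value}")
```

Because the array is flattened row by row, the values of columns A and B alternate rather than appearing column by column.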
import pandas as pd
# Create a sample DataFrame
data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
df = pd.DataFrame(data)
# Iterate over every value in the DataFrame
for value in df.values.flatten():
    print(f"Value: {value}")
4. Use the apply() function
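By default apply() passes each column to the function (axis=0). To work on one row at a time, pass axis=1. The sketch below is an assumption-level illustration that sums the two columns per row; row_totals is a hypothetical name.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# axis=1 passes each row to the function as a Series indexed by column name
row_totals = df.apply(lambda row: row['A'] + row['B'], axis=1)
print(row_totals)
```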
import pandas as pd
# Create a sample DataFrame
data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
df = pd.DataFrame(data)
# Define a function that multiplies its input by 2
def multiply_by_two(column):
    return column * 2
# Apply the function to the DataFrame (column by column, since axis=0 is the default)
result = df.apply(multiply_by_two)
print(result)
5. Perform vectorized operations
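Vectorized arithmetic also works between columns, not just with a scalar. As a small sketch, the new column names Sum and Ratio below are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# Element-wise operations between whole columns, with no explicit loop
df['Sum'] = df['A'] + df['B']
df['Ratio'] = df['B'] / df['A']
print(df)
```

For simple arithmetic like this, vectorized expressions are usually both shorter and faster than apply() or an explicit loop.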
import pandas as pd
# Create a sample DataFrame
data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
df = pd.DataFrame(data)
# Multiply every element of the DataFrame by 2 directly
result = df * 2
print(result)
6. Update values with map()
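Series.map() accepts a dictionary as well as a function. The sketch below uses a hypothetical lookup table and shows that values missing from the dictionary become NaN.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# Hypothetical lookup table; the value 3 is deliberately left out
labels = df['A'].map({1: 'low', 2: 'mid'})
print(labels)
```

The third entry comes back as NaN because 3 has no key in the dictionary, which is worth checking for when mapping categorical codes.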
import pandas as pd
# Create a sample DataFrame
data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
df = pd.DataFrame(data)
# Update column A by adding 1 to each value
df['A'] = df['A'].map(lambda x: x + 1)
print(df)
7. Conditionally update values with mask()
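mask() replaces values where the condition is True. Its counterpart where() keeps values where the condition is True and replaces the rest. The sketch below, using illustrative variable names, shows the two producing the same result with inverted conditions.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# mask(): replace values where the condition holds
masked = df['A'].mask(df['A'] > 2, 0)
# where(): keep values where the condition holds, replace the others
kept = df['A'].where(df['A'] <= 2, 0)
print(masked.tolist(), kept.tolist())
```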
import pandas as pd
# Create a sample DataFrame
data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
df = pd.DataFrame(data)
# Conditionally update values: replace entries of A greater than 2 with 0
df['A'] = df['A'].mask(df['A'] > 2, 0)
print(df)
8. Group data using groupby()
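A grouped column can be summarized with several statistics at once through agg(). The following sketch assumes the same Key/Value data used in this section; summary is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({'Key': ['A', 'A', 'B', 'B', 'C'], 'Value': [1, 2, 3, 4, 5]})

# agg() with a list of function names returns one column per statistic
summary = df.groupby('Key')['Value'].agg(['sum', 'mean', 'count'])
print(summary)
```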
import pandas as pd
# Create a sample DataFrame
data = {'Key': ['A', 'A', 'B', 'B', 'C'], 'Value': [1, 2, 3, 4, 5]}
df = pd.DataFrame(data)
# Group the DataFrame by Key and compute the sum of Value for each group
grouped = df.groupby('Key')['Value'].sum()
print(grouped)
9. Compute cumulative sum with cumsum()
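cumsum() can also be combined with groupby() so that the running total restarts for each group. This is a sketch under the assumption that per-key totals are wanted; the column name GroupCumSum is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Key': ['A', 'A', 'B', 'B', 'C'], 'Value': [1, 2, 3, 4, 5]})

# The running total restarts whenever the Key changes group
df['GroupCumSum'] = df.groupby('Key')['Value'].cumsum()
print(df)
```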
import pandas as pd
# Create a sample DataFrame
data = {'Key': ['A', 'A', 'B', 'B', 'C'], 'Value': [1, 2, 3, 4, 5]}
df = pd.DataFrame(data)
# Compute the cumulative sum of the Value column
df['CumulativeSum'] = df['Value'].cumsum()
print(df)
10. Calculate rolling statistics with rolling()
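With window=2, the first row of a rolling mean is NaN because the window is not yet full. If a partial window is acceptable, min_periods=1 is one way to fill that first row. The sketch below compares both variants; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Key': ['A', 'A', 'B', 'B', 'C'], 'Value': [1, 2, 3, 4, 5]})

# Default behaviour: the first result is NaN (only one value in the window)
strict_mean = df['Value'].rolling(window=2).mean()
# min_periods=1 allows a result as soon as one value is available
partial_mean = df['Value'].rolling(window=2, min_periods=1).mean()
print(strict_mean.tolist())
print(partial_mean.tolist())
```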
import pandas as pd
# Create a sample DataFrame
data = {'Key': ['A', 'A', 'B', 'B', 'C'], 'Value': [1, 2, 3, 4, 5]}
df = pd.DataFrame(data)
# Compute the rolling mean of the Value column (window size 2)
df['RollingMean'] = df['Value'].rolling(window=2).mean()
print(df)
Test Development Learning Exchange