Advanced Pandas Techniques: Grouping, Aggregation, Window Functions, and More
This article walks through eleven practical Pandas examples: grouped aggregation, conditional filtering, rolling windows, multi-indexing, melting, broadcasting, concatenation, merging, time-series creation, missing-value handling, and custom function application. Each comes with complete Python code and expected output.
Every example is self-contained and illustrates a common data‑analysis operation.
1. Grouping aggregation
import pandas as pd
data = {
    'Category': ['A', 'B', 'A', 'B', 'A', 'B'],
    'Value': [10, 20, 30, 40, 50, 60]
}
df = pd.DataFrame(data)
# Sum and mean of Value for each Category
grouped = df.groupby('Category').agg({'Value': ['sum', 'mean']})
print(grouped)
2. Conditional aggregation
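The example in this section uses `groupby(...).filter(...)`, which keeps or drops whole groups based on a group-level test. As a companion sketch, and assuming "conditional" may also mean a row-level condition, the snippet below aggregates only the rows whose `Value` exceeds a threshold. The threshold of 25 is an arbitrary illustrative choice, not something taken from the article.

```python
import pandas as pd

df = pd.DataFrame({
    'Category': ['A', 'B', 'A', 'B', 'A', 'B'],
    'Value': [10, 20, 30, 40, 50, 60],
})

# Row-level condition: keep rows with Value > 25, then aggregate per group.
# The threshold 25 is an illustrative assumption.
row_filtered = df[df['Value'] > 25]
conditional = row_filtered.groupby('Category')['Value'].agg(['sum', 'mean'])
print(conditional)
```

With this data the rows 30 and 50 remain for A and the rows 40 and 60 remain for B, so both groups survive, whereas the group-level filter used in this section drops group A entirely.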
import pandas as pd
data = {
    'Category': ['A', 'B', 'A', 'B', 'A', 'B'],
    'Value': [10, 20, 30, 40, 50, 60]
}
df = pd.DataFrame(data)
# Keep only the groups whose mean Value exceeds 35 (here only B, whose mean is 40)
grouped_filtered = df.groupby('Category').filter(lambda x: x['Value'].mean() > 35)
# Aggregate the surviving rows
grouped = grouped_filtered.groupby('Category').agg({'Value': ['sum', 'mean']})
print(grouped)
3. Window function
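The example in this section computes one rolling mean over the whole `Value` column, so the first row is NaN and each window mixes both categories. The sketch below shows two common variations under the same data: `min_periods=1` to avoid the leading NaN, and a rolling mean computed separately within each `Category`. The new column names are illustrative assumptions.

```python
import pandas as pd

df = pd.DataFrame({
    'Category': ['A', 'B', 'A', 'B', 'A', 'B'],
    'Value': [10, 20, 30, 40, 50, 60],
})

# min_periods=1 lets a window with a single row still produce a value.
df['Rolling Mean (min_periods=1)'] = (
    df['Value'].rolling(window=2, min_periods=1).mean()
)

# Rolling mean computed inside each Category separately.
per_group = (
    df.groupby('Category')['Value']
      .rolling(window=2)
      .mean()
      .reset_index(level=0, drop=True)  # drop the Category level to realign with df
)
df['Rolling Mean per Category'] = per_group
print(df)
```

In the per-category column, the first row of each group is still NaN because its window holds only one value.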
import pandas as pd
data = {
    'Category': ['A', 'B', 'A', 'B', 'A', 'B'],
    'Value': [10, 20, 30, 40, 50, 60]
}
df = pd.DataFrame(data)
# Mean over a sliding window of 2 rows; the first row has no full window, so it is NaN
rolling_mean = df['Value'].rolling(window=2).mean()
df['Rolling Mean'] = rolling_mean
print(df)
4. Multi‑index
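The example in this section builds a two-level index from `Category` and the original row number. As a hedged sketch of what such an index is useful for, the snippet below shows a few ways to select from it. Sorting the index first is an added step, not part of the article's example, and the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Category': ['A', 'B', 'A', 'B', 'A', 'B'],
    'Value': [10, 20, 30, 40, 50, 60],
})

# Same index as in this section, sorted so lookups on it are well defined and fast.
multi_index_df = df.set_index(['Category', df.index]).sort_index()

# All rows under the outer key 'A' (the result is indexed by the inner level).
rows_a = multi_index_df.loc['A']

# A single value addressed by both levels.
value_b3 = multi_index_df.loc[('B', 3), 'Value']

# Cross-section on the inner level: every row whose original row number is 4.
row_4 = multi_index_df.xs(4, level=1)

print(rows_a)
print(value_b3)
print(row_4)
```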
import pandas as pd
data = {
    'Category': ['A', 'B', 'A', 'B', 'A', 'B'],
    'Value': [10, 20, 30, 40, 50, 60]
}
df = pd.DataFrame(data)
# Build a two-level index: Category first, then the original row number
multi_index_df = df.set_index(['Category', df.index])
print(multi_index_df)
5. Melt (reshape)
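The example in this section melts a frame that has a single value column, so the reshape is hard to see. The sketch below uses a small wide table with two hypothetical measurement columns, `Q1` and `Q2` (assumptions made for illustration), to show how `pd.melt` turns columns into rows.

```python
import pandas as pd

# Hypothetical wide table: one column per quarter.
wide = pd.DataFrame({
    'Category': ['A', 'B'],
    'Q1': [10, 20],
    'Q2': [30, 40],
})

# Unpivot Q1 and Q2 into (Quarter, Value) pairs, keeping Category as the identifier.
long = pd.melt(
    wide,
    id_vars=['Category'],
    value_vars=['Q1', 'Q2'],
    var_name='Quarter',
    value_name='Value',
)
print(long)
```

Each of the two input rows yields one output row per melted column, so the result has four rows.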
import pandas as pd
data = {
    'Category': ['A', 'B', 'A', 'B', 'A', 'B'],
    'Value': [10, 20, 30, 40, 50, 60]
}
df = pd.DataFrame(data)
# Unpivot to long format: one row per (Category, variable, value)
melted_df = pd.melt(df, id_vars=['Category'], value_vars=['Value'])
print(melted_df)
6. Broadcasting
import pandas as pd
data = {
    'Category': ['A', 'B', 'A', 'B', 'A', 'B'],
    'Value': [10, 20, 30, 40, 50, 60]
}
df = pd.DataFrame(data)
# The scalar 10 is broadcast to every element of the column
df['Value'] += 10
print(df)
7. Concatenation
import pandas as pd
data = {
    'Category': ['A', 'B', 'A', 'B', 'A', 'B'],
    'Value': [10, 20, 30, 40, 50, 60]
}
df = pd.DataFrame(data)
df2 = pd.DataFrame({
    'Category': ['A', 'B'],
    'Value': [70, 80]
})
# Stack the rows of df2 under df and renumber the index from 0
concatenated = pd.concat([df, df2], ignore_index=True)
print(concatenated)
8. Data merging (join)
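In the example in this section, both frames carry a `Value` column, so the merge result contains `Value_x` and `Value_y`. The sketch below, using the same data, shows how the `suffixes` and `validate` parameters of `DataFrame.merge` can make the result easier to read and guard against duplicate keys. The `_ref` suffix is an illustrative choice.

```python
import pandas as pd

df = pd.DataFrame({
    'Category': ['A', 'B', 'A', 'B', 'A', 'B'],
    'Value': [10, 20, 30, 40, 50, 60],
})
df2 = pd.DataFrame({
    'Category': ['A', 'B'],
    'Value': [70, 80],
})

joined = df.merge(
    df2,
    on='Category',
    how='left',
    suffixes=('', '_ref'),    # keep the left name, rename the right column to Value_ref
    validate='many_to_one',   # raise if Category is not unique in df2
)
print(joined)
```

Because each `Category` appears once in `df2`, every left row gains exactly one reference value and the row count stays at six.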
import pandas as pd
data = {
    'Category': ['A', 'B', 'A', 'B', 'A', 'B'],
    'Value': [10, 20, 30, 40, 50, 60]
}
df = pd.DataFrame(data)
df2 = pd.DataFrame({
    'Category': ['A', 'B'],
    'Value': [70, 80]
})
# Left join on Category; the overlapping Value columns become Value_x and Value_y
joined = df.merge(df2, on='Category', how='left')
print(joined)
9. Time‑series creation
import pandas as pd
data = {
    'Category': ['A', 'B', 'A', 'B', 'A', 'B'],
    'Value': [10, 20, 30, 40, 50, 60]
}
df = pd.DataFrame(data)
# Attach one consecutive calendar day to each row, starting at 2024-01-01
df['Date'] = pd.date_range(start='2024-01-01', periods=len(df), freq='D')
print(df)
10. Missing‑value handling
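The example in this section replaces a missing value with the constant 0. As a hedged alternative, the sketch below counts the missing entries and then fills each gap with the mean of its own `Category`, which is often less distorting than a fixed constant. Choosing the group mean is an assumption made for illustration, not the article's method.

```python
import pandas as pd

df = pd.DataFrame({
    'Category': ['A', 'B', 'A', 'B', 'A', 'B'],
    'Value': [10, 20, 30, None, 50, 60],  # None becomes NaN in a float column
})

# How many values are missing?
missing_count = int(df['Value'].isna().sum())

# Mean of each Category, broadcast back to the original row positions
# (missing values are skipped when computing the mean).
group_mean = df.groupby('Category')['Value'].transform('mean')

# Fill each gap with the mean of its own group.
filled = df['Value'].fillna(group_mean)
print(missing_count)
print(filled)
```

Here the missing entry belongs to group B, whose remaining values 20 and 60 average to 40.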
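import pandas as pd
data = {
    'Category': ['A', 'B', 'A', 'B', 'A', 'B'],
    'Value': [10, 20, 30, 40, 50, 60]
}
df = pd.DataFrame(data)
# Cast to float first so the column can hold NaN; newer pandas versions may warn
# or raise when a missing value is written into an integer column
df['Value'] = df['Value'].astype('float64')
df.loc[3, 'Value'] = None  # stored as NaN
# Replace every missing value with 0
df = df.fillna(0)
print(df)
11. Custom function application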
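The example in this section applies a Python function to each element with `Series.apply`. For simple arithmetic, a vectorized expression gives the same result and is usually faster, while `DataFrame.apply` with `axis=1` is useful when the function needs several columns of a row. The sketch below compares the two; the `label` helper and the `Label` column are hypothetical names introduced for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Category': ['A', 'B', 'A', 'B', 'A', 'B'],
    'Value': [10, 20, 30, 40, 50, 60],
})

def add_one(x):
    return x + 1

# Element-wise apply versus the equivalent vectorized expression.
applied = df['Value'].apply(add_one)
vectorized = df['Value'] + 1
same = applied.equals(vectorized)

# Hypothetical row-wise helper that combines two columns of each row.
def label(row):
    return f"{row['Category']}-{row['Value']}"

df['Label'] = df.apply(label, axis=1)
print(same)
print(df)
```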
import pandas as pd
data = {
    'Category': ['A', 'B', 'A', 'B', 'A', 'B'],
    'Value': [10, 20, 30, 40, 50, 60]
}
df = pd.DataFrame(data)
def add_one(x):
    return x + 1
# Call add_one on every element of the Value column
df['Value'] = df['Value'].apply(add_one)
print(df)
Each code block produces the corresponding output shown later in the article, illustrating how Pandas can be used for a wide range of data‑manipulation tasks.
Test Development Learning Exchange