Using pandas fillna() to Handle Missing Data: 10 Practical Examples
This article introduces pandas' fillna() method and demonstrates ten practical examples—including basic filling, column‑specific values, forward/backward filling, limiting fills, using other DataFrames, functions, conditional fills, dictionaries, and Series—to help developers effectively handle missing data in Python data analysis.
Dear developers, have you ever struggled with NaN or None values when processing data? In data science and analysis, these missing values can easily lead to biased results.
Fortunately, Python's pandas library offers the powerful fillna() method. Below are ten practical examples that illustrate how to master missing‑data handling.
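Before filling anything, it usually helps to see where the gaps are. The sketch below counts missing values per column with isna(); the DataFrame and its column names are made up for illustration, and both NaN and None are counted as missing.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; None and np.nan are both treated as missing.
df = pd.DataFrame({
    "A": [1, 2, np.nan, 4],
    "B": [5, None, np.nan, 8],
    "C": [9, 10, 11, 12],
})

# Count missing values per column.
missing_per_column = df.isna().sum()
print(missing_per_column)

# Total number of missing cells in the whole frame.
total_missing = int(missing_per_column.sum())
print(total_missing)
```

A column with a count of zero, such as C here, needs no fill value at all.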
Example 1: Basic Filling
import pandas as pd
import numpy as np
df = pd.DataFrame({
    'A': [1, 2, np.nan, 4],
    'B': [5, np.nan, np.nan, 8],
    'C': [9, 10, 11, 12]
})
# Replace every missing value with 0
print(df.fillna(0))
Example 2: Column‑Specific Filling
df = pd.DataFrame({
    'A': [1, 2, np.nan, 4],
    'B': [5, np.nan, np.nan, 8],
    'C': [9, 10, 11, 12]
})
# Fill column A with 0 and column B with 5; column C has no gaps
values = {'A': 0, 'B': 5}
print(df.fillna(value=values))
Example 3: Forward Fill
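df = pd.DataFrame({
    'A': [1, 2, np.nan, 4],
    'B': [5, np.nan, np.nan, 8],
    'C': [9, 10, 11, 12]
})
# Carry the last valid value forward. The method keyword of fillna()
# is deprecated since pandas 2.1, so call ffill() directly.
print(df.ffill())
Example 4: Backward Fill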
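df = pd.DataFrame({
    'A': [1, 2, np.nan, 4],
    'B': [5, np.nan, np.nan, 8],
    'C': [9, 10, 11, 12]
})
# Pull the next valid value backward. The method keyword of fillna()
# is deprecated since pandas 2.1, so call bfill() directly.
print(df.bfill())
Example 5: Limit Number of Fills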
df = pd.DataFrame({
    'A': [1, 2, np.nan, 4, np.nan],
    'B': [5, np.nan, np.nan, 8, 9],
    'C': [9, 10, 11, 12, 13]
})
# Forward-fill at most one consecutive missing value per gap;
# the second NaN in column B stays missing.
print(df.ffill(limit=1))
Example 6: Fill Using Another DataFrame
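df = pd.DataFrame({
    'A': [1, 2, np.nan, 4],
    'B': [5, np.nan, np.nan, 8],
    'C': [9, 10, 11, 12]
})
# The fill values are matched by row index and column name, so
# other_df needs a row for every row of df that should be filled.
other_df = pd.DataFrame({
    'A': [0, 0, 0, 0],
    'B': [5, 5, 5, 5],
    'C': [9, 9, 9, 9]
})
# fillna() has no `other` keyword; pass the DataFrame as the value
print(df.fillna(other_df))
Example 7: Fill with Computed Values (Mean)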
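df = pd.DataFrame({
    'A': [1, 2, np.nan, 4],
    'B': [5, np.nan, np.nan, 8],
    'C': [9, 10, 11, 12]
})
# df.mean() returns a Series of column means (NaN values are skipped);
# fillna() matches that Series to the columns by name.
mean_values = df.mean()
print(df.fillna(mean_values))
Example 8: Conditional Fill per Column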
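df = pd.DataFrame({
    'A': [1, 2, np.nan, 4],
    'B': [5, np.nan, np.nan, 8],
    'C': [9, 10, 11, 12]
})
# Build one fill value per column with a dict comprehension; the
# expression (here the column mean) can be varied per column.
print(df.fillna({col: df[col].mean() for col in df.columns}))
Example 9: Dictionary Fill for Specific Columns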
df = pd.DataFrame({
    'A': [1, 2, np.nan, 4],
    'B': [5, np.nan, np.nan, 8],
    'C': [9, 10, 11, 12]
})
# Mix a constant for column A with a computed mean for column B
values_dict = {'A': 0, 'B': df['B'].mean()}
print(df.fillna(value=values_dict))
Example 10: Fill Using a Series
df = pd.DataFrame({
    'A': [1, 2, np.nan, 4],
    'B': [5, np.nan, np.nan, 8],
    'C': [9, 10, 11, 12]
})
# The Series index is matched against the DataFrame's column names
fill_series = pd.Series([0, 5, 9], index=['A', 'B', 'C'])
print(df.fillna(fill_series))
These examples cover both the basic usage of fillna() and how to choose an appropriate filling strategy for different scenarios. Mastering this method is crucial for working with real‑world datasets and building reliable machine‑learning models.
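To make the choice of strategy concrete, the following sketch applies several of the approaches from the examples to one small Series and lines up the results. The Series values are invented for illustration, and ffill() and bfill() are called directly rather than through the method argument.

```python
import numpy as np
import pandas as pd

# Hypothetical series with a gap of two consecutive missing values.
s = pd.Series([1.0, np.nan, np.nan, 4.0])

comparison = pd.DataFrame({
    "original": s,
    "constant_0": s.fillna(0),     # fixed value
    "ffill": s.ffill(),            # carry the last valid value forward
    "bfill": s.bfill(),            # pull the next valid value backward
    "mean": s.fillna(s.mean()),    # mean of the observed values
})
print(comparison)
```

Each column gives a different answer for the same gap, which is why the right strategy depends on what the data represents.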
Properly handling missing values improves data quality, prevents bias, and ensures accurate analysis results.
We’ll continue exploring more Python and data‑analysis techniques in future articles. Remember to select filling strategies carefully, as inappropriate choices can introduce bias, and consider performance optimizations for large datasets.
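As a rough illustration of how a fill strategy can bias results, the sketch below compares summary statistics before and after filling. The measurements are hypothetical, and the comparison is only a quick sanity check, not a full treatment of imputation bias.

```python
import numpy as np
import pandas as pd

# Hypothetical measurements with two missing readings.
s = pd.Series([10.0, 12.0, np.nan, 14.0, np.nan, 16.0])

observed_mean = s.mean()   # NaN values are skipped by default
observed_std = s.std()

zero_filled = s.fillna(0)
mean_filled = s.fillna(observed_mean)

summary = pd.DataFrame(
    {
        "mean": [observed_mean, zero_filled.mean(), mean_filled.mean()],
        "std": [observed_std, zero_filled.std(), mean_filled.std()],
    },
    index=["observed only", "filled with 0", "filled with mean"],
)
print(summary)
```

In this sample, filling with 0 pulls the mean down, while filling with the mean keeps the mean but shrinks the standard deviation, so either choice changes what downstream analysis sees.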
If you encounter any issues or want to share your own data‑cleaning tips, feel free to comment and join the discussion.
Test Development Learning Exchange