Data Visualization and Exploratory Graphs with Pandas

This tutorial explains how to use Pandas for data visualization and exploratory analysis, covering line, scatter, histogram, bar, pie, box, and heatmap charts with code examples on the Iris, American Community Survey, and Boston Housing datasets.

Python Programming Learning Circle

Mar 8, 2024

Data visualization presents data using graphics or tables, allowing clear insight into data properties and relationships; exploratory graphs help users understand characteristics, discover trends, and lower the barrier to data comprehension.

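Common chart examples are demonstrated using Pandas, whose plot accessor wraps Matplotlib on DataFrames and Series, so most charts need no direct Matplotlib drawing calls. Matplotlib must still be installed, and matplotlib.pyplot is imported as plt in the examples so that plt.show() can display each figure.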
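As a minimal sketch of that integration, the snippet below builds a small hypothetical DataFrame (column names and values are invented for illustration) and shows that the plot accessor returns an ordinary Matplotlib Axes, which can then be adjusted with the usual Matplotlib calls. The Agg backend is selected only so the sketch also runs on a machine without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook or desktop session
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical toy data, not taken from the article's datasets
toy = pd.DataFrame({"a": [1, 3, 2, 5], "b": [2, 2, 4, 3]})

# pandas draws the chart through Matplotlib and returns the Axes it drew on
ax = toy.plot.line(title="Toy line chart")
ax.set(xlabel="index", ylabel="value")  # plain Matplotlib Axes API

n_lines = len(ax.get_lines())                    # one line per DataFrame column
is_axes = isinstance(ax, matplotlib.axes.Axes)   # confirms the return type
plt.close(ax.figure)
```

Because the return value is a regular Axes, anything Matplotlib supports (annotations, extra lines, saving with savefig) remains available after the one-line Pandas call.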

1. Line chart shows how the values of one or more columns change continuously along the index. Example:

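import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
# Load Iris as a DataFrame; assumes scikit-learn is installed
df_iris = load_iris(as_frame=True).frame
df_iris[["sepal length (cm)"]].plot.line()
plt.show()
# Same chart with custom color, title, line style, and axis labels
ax = df_iris[["sepal length (cm)"]].plot.line(color="green", title="Demo", style="--")
ax.set(xlabel="index", ylabel="length")
plt.show()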

2. Scatter chart plots each observation as a point to examine the relationship between two numeric variables:

df = df_iris
df.plot.scatter(x='sepal length (cm)', y='sepal width (cm)')
# Size each point by petal length and color it by class; the colormap is passed by name
df.plot.scatter(x='sepal length (cm)', y='sepal width (cm)', s=df['petal length (cm)'] * 20, c='target', cmap='Spectral', title='different circle size by petal length (cm)')
plt.show()

3. Histogram / Bar chart: a histogram displays the distribution of numeric columns, while a bar chart compares counts across categories:

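# Histogram: overlaid distributions of the four numeric features
df[["sepal length (cm)", "sepal width (cm)", "petal length (cm)", "petal width (cm)"]].plot.hist()
plt.show()
# Bar chart: number of samples per class (shown separately so it does not draw over the histogram)
df.target.value_counts().plot.bar()
plt.show()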

4. Pie chart / Box plot: a pie chart illustrates the proportion of each category, while a box plot compares how a numeric column is distributed across categories:

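# Pie chart: share of each class
df.target.value_counts().plot.pie(legend=True)
plt.show()
# Box plot: spread of one feature, split by class
df.boxplot(column=['sepal length (cm)'], by='target', figsize=(10,5))
plt.show()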
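To make the "distribution differences" a box plot conveys more concrete, the sketch below computes the quartiles that define each box directly with Pandas. The data is a hypothetical toy table (the column name length and all values are invented for illustration), not the Iris data.

```python
import pandas as pd

# Hypothetical measurements for two categories (values invented for illustration)
toy = pd.DataFrame({
    "target": [0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
    "length": [1.0, 2.0, 3.0, 4.0, 5.0, 4.0, 5.0, 6.0, 7.0, 8.0],
})

# Quartiles per category: the box spans q25..q75 and the line inside it is the median
summary = toy.groupby("target")["length"].quantile([0.25, 0.5, 0.75]).unstack()
summary.columns = ["q25", "median", "q75"]
print(summary)
```

Reading the table next to the plot is a quick way to check that a visible shift between boxes reflects a real difference in medians rather than a plotting artifact.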

Practical data exploration is then shown with two real datasets.

1. 2013 American Community Survey – a large census dataset (the survey reaches roughly 3.5 million households per year). After loading the first housing CSV, the shape and descriptive statistics are inspected. Then three person-level columns, SCHL (educational attainment), PINCP (total personal income), and ESR (employment status), are concatenated from the two person files and grouped by education level to examine the distribution and the average income.

# Read data
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("./ss13husa.csv")
print(df.shape)  # (756065, 231)
print(df.describe())

# Concatenate two parts
cols = ['SCHL', 'PINCP', 'ESR']
# Read only the needed columns; the person files are large
pusa = pd.read_csv("ss13pusa.csv", usecols=cols)
pusb = pd.read_csv("ss13pusb.csv", usecols=cols)
ac_survey = pd.concat([pusa, pusb], axis=0)

group = ac_survey.groupby('SCHL')
print('Education distribution:', group.size())
# Average only the income column; a mean of the ESR status codes is not meaningful
print('Average income:', group['PINCP'].mean())
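Since the survey files are large and not bundled with this article, the same concatenate-then-group steps can be tried on a tiny stand-in. The sketch below uses two hypothetical frames with the same three column names; every value, including the SCHL codes, is invented for illustration and says nothing about the real survey.

```python
import pandas as pd

# Two hypothetical "parts" standing in for ss13pusa.csv and ss13pusb.csv
part_a = pd.DataFrame({"SCHL": [16, 21, 21], "PINCP": [20000, 50000, 70000], "ESR": [1, 1, 1]})
part_b = pd.DataFrame({"SCHL": [16, 22], "PINCP": [30000, 90000], "ESR": [6, 1]})

cols = ["SCHL", "PINCP", "ESR"]
# Stack the rows of both parts into one table with a fresh 0..n-1 index
survey = pd.concat([part_a[cols], part_b[cols]], axis=0, ignore_index=True)

group = survey.groupby("SCHL")
counts = group.size()                # number of people per education code
mean_income = group["PINCP"].mean()  # average income per education code
print(counts)
print(mean_income)
```

Selecting the PINCP column before calling mean keeps the result to the one quantity of interest; on the real data, mean_income.plot.bar() would turn the same Series into a chart.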

2. Boston Housing dataset – 506 samples with 13 features. After loading, the shape and descriptive statistics are shown, a histogram of the target variable (MEDV) is plotted, scatter plots explore relationships (e.g., MEDV vs. RM), and a Pearson correlation matrix is visualized with a heatmap.

# Load Boston housing data
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

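# Assumes the original UCI file layout: whitespace-delimited, no header row
names = ['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT', 'MEDV']
df = pd.read_csv("./housing.data", sep=r"\s+", header=None, names=names)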
print(df.shape)  # (506, 14)
print(df.describe())

# Histogram of house price
df[['MEDV']].plot.hist()
plt.show()

# Scatter plot of price vs. number of rooms
df.plot.scatter(x='RM', y='MEDV')
plt.show()

# Correlation heatmap
corr = df.corr()
sns.heatmap(corr)
plt.show()
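A heatmap shows the whole correlation matrix at once; for picking out key variables it can help to read one column of that matrix as a ranked list. The sketch below does this on a hypothetical five-row table that borrows a few Boston column names; the values are invented for illustration and are not the real data.

```python
import pandas as pd

# Hypothetical rows using Boston-style column names (values invented for illustration)
toy = pd.DataFrame({
    "RM":    [5.0, 6.0, 6.5, 7.0, 8.0],
    "LSTAT": [20.0, 12.0, 10.0, 6.0, 3.0],
    "CHAS":  [0, 1, 0, 1, 0],
    "MEDV":  [15.0, 21.0, 24.0, 30.0, 45.0],
})

corr = toy.corr()  # Pearson correlation is the default method

# Rank features by the strength (absolute value) of their correlation with the target
ranking = corr["MEDV"].drop("MEDV").abs().sort_values(ascending=False)
print(ranking)
```

Taking the absolute value treats strong negative and strong positive relationships alike; the sign is still available in corr if the direction matters.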

These examples illustrate how Pandas can be used for quick visual exploration of datasets, helping identify key variables, relationships, and patterns before deeper modeling or analysis.
