
Master Python Data Visualization: Line, Scatter, Histogram, and Heatmap Techniques

This guide shows how to create common Python data visualizations (line charts, scatter plots, histograms, bar charts, pie charts, box plots, and heatmaps) using pandas and seaborn. It works through code examples on the Iris, American Community Survey, and Boston housing datasets and explains how to interpret the results.

Python Crawling & Data Mining

Data visualization presents data as graphics or tables. Charts make the properties of a dataset and the relationships within it visible, so readers can explore the data, understand its characteristics, spot trends, and grasp it with less effort.

Common Chart Types

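This tutorial uses pandas, whose DataFrame.plot methods wrap matplotlib, so most charts need no direct matplotlib calls. The snippets still import matplotlib.pyplot as plt so they can call plt.show() to display a figure, and they assume df_iris is a DataFrame of the Iris data with an integer target column.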
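
The snippets in this tutorial use a DataFrame named df_iris that the text never constructs. One possible way to build it, assuming scikit-learn is installed (the article does not say how the data was loaded, so this is only a sketch), is:

```python
from sklearn.datasets import load_iris

# as_frame=True returns pandas objects; .frame holds the four feature
# columns plus an integer 'target' column (0, 1, 2 for the three species).
iris = load_iris(as_frame=True)
df_iris = iris.frame

print(df_iris.shape)
print(df_iris.columns.tolist())
```

The column names produced this way, such as 'sepal length (cm)' and 'target', match the ones used in the plotting calls in this tutorial.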

1. Line Chart

Line charts are the most basic chart type. They show how a continuous variable changes along an ordered axis, here the row index. Call plot.line() to draw the chart and plt.show() to display it.

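import matplotlib.pyplot as plt

# Default line chart of sepal length against the row index
df_iris[['sepal length (cm)']].plot.line()
plt.show()

# Same chart with a custom color, line style, title, and axis labels
ax = df_iris[['sepal length (cm)']].plot.line(color='green', title='Demo', style='--')
ax.set(xlabel='index', ylabel='length')
plt.show()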
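
As a small self-contained sketch of the same idea, the example below plots two columns at once. The data and the names df_demo, temperature, and humidity are made up for illustration; the point is that plot.line() draws one line per selected column and returns a matplotlib Axes that can be customized afterwards.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two measurements recorded over ten ordered steps.
df_demo = pd.DataFrame({
    "temperature": np.linspace(20.0, 29.0, 10),
    "humidity": np.linspace(60.0, 42.0, 10),
})

# One line per column; a list of styles assigns a style to each column.
ax = df_demo[["temperature", "humidity"]].plot.line(
    title="Demo", style=["-", "--"]
)
ax.set(xlabel="step", ylabel="value")

print(len(ax.get_lines()))  # number of lines drawn on the Axes
```

In an interactive session, import matplotlib.pyplot as plt and call plt.show() afterwards to display the figure.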

2. Scatter Chart

Scatter charts show the relationship between two numeric variables, one on each axis. Use df.plot.scatter().

df = df_iris

# Basic scatter plot of sepal length against sepal width
df.plot.scatter(x='sepal length (cm)', y='sepal width (cm)')
plt.show()

# Scale marker size by petal length and color each point by its class
df.plot.scatter(x='sepal length (cm)', y='sepal width (cm)',
                s=df['petal length (cm)'] * 20,
                c='target', cmap='Spectral',
                title='different circle size by petal length (cm)')
plt.show()
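
The sketch below repeats the sized and colored scatter plot on synthetic data, so it runs without the Iris DataFrame. All column names and values are invented for illustration. It also prints the Pearson correlation, which puts a number on the trend the scatter plot shows.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 'width' depends linearly on 'length' plus noise.
rng = np.random.default_rng(0)
n = 60
df_demo = pd.DataFrame({
    "length": rng.normal(5.8, 0.8, n),
    "size_var": rng.uniform(1.0, 6.0, n),
    "group": rng.integers(0, 3, n),
})
df_demo["width"] = 0.5 * df_demo["length"] + rng.normal(0.0, 0.3, n)

ax = df_demo.plot.scatter(
    x="length", y="width",
    s=df_demo["size_var"] * 20,  # a Series: one marker size per row
    c="group", cmap="Spectral",  # color by a column, using a named colormap
)

# Pearson correlation between the two plotted variables.
r = df_demo["length"].corr(df_demo["width"])
print(round(r, 2))
```

Passing the colormap by name avoids depending on matplotlib's cm.get_cmap helper, which newer matplotlib releases no longer provide.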

3. Histogram and Bar Chart

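Histograms show the distribution of a numeric column by counting how many values fall into each bin; plotting several columns at once overlays their distributions. Bar charts compare a value, such as a count, across categories.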

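df[['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']].plot.hist(alpha=0.5)  # overlaid histograms, one per column
plt.show()
df.target.value_counts().plot.bar()  # number of samples in each class
plt.show()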
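
To make the difference between the two chart types concrete, here is a minimal sketch on made-up data (the values and label names are illustrative only). The histogram bins a numeric Series, while the bar chart plots the category counts returned by value_counts(). Each chart gets its own Axes so they do not draw on top of each other.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data: ten numeric measurements and ten category labels.
values = pd.Series([4.9, 5.0, 5.1, 5.4, 5.8, 6.0, 6.1, 6.3, 6.7, 7.2])
labels = pd.Series(["a", "b", "a", "c", "a", "b", "a", "c", "b", "a"])

fig, (ax_hist, ax_bar) = plt.subplots(1, 2, figsize=(10, 4))

# Histogram: split the numeric range into 5 bins and count values per bin.
values.plot.hist(bins=5, ax=ax_hist, edgecolor="black", title="Histogram")

# Bar chart: one bar per category, with the count as the bar height.
counts = labels.value_counts().sort_index()
counts.plot.bar(ax=ax_bar, title="Bar chart")

print(counts)
```

The bins argument controls how coarse the histogram is; too few bins hide structure, and too many make the chart noisy.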

4. Pie Chart and Box Plot

Pie charts display the proportion of categories; box plots show distribution and allow comparison across groups.

df.target.value_counts().plot.pie(legend=True)  # share of each class
plt.show()
df.boxplot(column='sepal length (cm)', by='target', figsize=(10, 5))  # one box per class
plt.show()
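
A box plot is easier to read once the numbers behind it are visible. The sketch below, on invented data, computes the quartiles and interquartile range per group with pandas and then draws the matching box plot. The names df_demo, group, and value are hypothetical.

```python
import pandas as pd

# Hypothetical data: two groups of five values; group "b" has one large value.
df_demo = pd.DataFrame({
    "group": ["a"] * 5 + ["b"] * 5,
    "value": [1.0, 2.0, 3.0, 4.0, 5.0, 3.0, 4.0, 5.0, 6.0, 12.0],
})

# Quartiles per group: the box spans the 0.25 to 0.75 quantiles and the
# line inside the box marks the median (0.5 quantile).
summary = (
    df_demo.groupby("group")["value"]
    .quantile([0.25, 0.5, 0.75])
    .unstack()
)
summary["iqr"] = summary[0.75] - summary[0.25]
print(summary)

# One box per group; by default, points more than 1.5 * IQR beyond the box
# are drawn as individual outlier markers.
ax = df_demo.boxplot(column="value", by="group", figsize=(6, 4))
```

Here the value 12.0 in group "b" lies more than 1.5 times the interquartile range above the upper quartile, so it appears as a separate point rather than inside the whisker.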

Practical Data Exploration

2013 American Community Survey

The survey reaches about 3.5 million households each year and covers topics such as ancestry, education, work, transport, internet use, and residence.

# Read data
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("./ss13husa.csv")
print(df.shape)  # (756065, 231)
print(df.describe())

# Read the two person-level files, keeping only the columns of interest
col = ['SCHL', 'PINCP', 'ESR']
pusa = pd.read_csv("ss13pusa.csv", usecols=col)
pusb = pd.read_csv("ss13pusb.csv", usecols=col)

# Stack the two parts into a single DataFrame
ac_survey = pd.concat([pusa, pusb], axis=0, ignore_index=True)

# Group by education level (SCHL), then compute group sizes and average income (PINCP)
group = ac_survey.groupby('SCHL')
print('Education distribution:\n' + str(group.size()))
print('Average income:\n' + str(group['PINCP'].mean()))
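
Because the survey files are large, it can help to check the concatenate-then-group pattern on a tiny table first. The sketch below uses made-up rows with the same column names; the codes and incomes are placeholders, not real survey values.

```python
import pandas as pd

# Hypothetical stand-ins for the two person-level files.
part_a = pd.DataFrame({
    "SCHL": [16, 21, 21],
    "PINCP": [20000.0, 50000.0, 70000.0],
    "ESR": [1, 1, 6],
})
part_b = pd.DataFrame({
    "SCHL": [16, 22],
    "PINCP": [30000.0, 90000.0],
    "ESR": [1, 1],
})

# Stack the rows of both parts and renumber the index.
survey = pd.concat([part_a, part_b], axis=0, ignore_index=True)

# Group by education code, then count rows and average income per group.
group = survey.groupby("SCHL")
sizes = group.size()
avg_income = group["PINCP"].mean()

print(sizes)
print(avg_income)
```

Selecting the PINCP column before calling mean() keeps the result to income only; calling mean() on the whole group would also average the ESR codes, which is not meaningful.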

Boston House Price Dataset

This dataset contains 506 samples, each with 13 features describing Boston-area housing plus the target MEDV (the median home value), for 14 columns in total.

# Read data
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

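# housing.data is assumed here to be the original UCI file: whitespace-delimited, with no header row
cols = ['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT', 'MEDV']
df = pd.read_csv("./housing.data", sep=r'\s+', header=None, names=cols)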
print(df.shape)  # (506, 14)
print(df.describe())

# Histogram of house price (MEDV)
df['MEDV'].plot.hist()
plt.show()

# Scatter plot of price vs. average number of rooms
df.plot.scatter(x='MEDV', y='RM')
plt.show()

# Correlation heatmap with a diverging colormap centered at zero
corr = df.corr()
sns.heatmap(corr, cmap='RdBu_r', vmin=-1, vmax=1, center=0)
plt.show()

With a diverging colormap centered at zero, such as RdBu_r, the heatmap colors indicate the sign and strength of each correlation: red for positive, blue for negative, and near-white for little or no correlation. RM shows a strong positive relationship with price (MEDV), while LSTAT and PTRATIO show clear negative relationships; CRIM, RAD, and AGE are comparatively weakly correlated with price.
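
Reading exact strengths off heatmap colors is imprecise, so it can help to rank the correlations numerically as well. The sketch below does this on synthetic data; the column names and coefficients are invented and do not reproduce the Boston values. It sorts each feature's correlation with the target by absolute value.

```python
import numpy as np
import pandas as pd

# Hypothetical data: price rises with rooms, falls with lower_status,
# and is unrelated to noise_feature.
rng = np.random.default_rng(42)
n = 200
rooms = rng.normal(6.0, 0.7, n)
lower_status = rng.normal(12.0, 5.0, n)
noise_feature = rng.normal(0.0, 1.0, n)
price = 9.0 * rooms - 0.9 * lower_status + rng.normal(0.0, 3.0, n)

df_demo = pd.DataFrame({
    "rooms": rooms,
    "lower_status": lower_status,
    "noise_feature": noise_feature,
    "price": price,
})

# Pairwise Pearson correlations between all columns.
corr = df_demo.corr()

# Correlation of each feature with the target, strongest first by magnitude.
ranked = corr["price"].drop("price").sort_values(key=abs, ascending=False)
print(ranked)
```

Sorting by absolute value keeps strong negative relationships near the top of the list instead of pushing them to the bottom.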
