Master Seaborn: From Installation to Advanced Visualizations in Python
This tutorial introduces Seaborn, a Python statistical visualization library built on matplotlib. It covers the library's advantages, installation, and a step‑by‑step workflow (importing data, setting up the canvas, plotting, saving the figure), then works through a practical example with code for each plot type: histogram, scatter, bar, line, box, violin, heatmap, and more.
Seaborn Overview
Seaborn is a statistical data‑visualization library built on matplotlib and tightly integrated with pandas, designed to make data exploration and understanding easier.
Advantages
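Less code: most statistical plots take a single function call
Beautiful graphics: polished default styles and color palettes
Full‑featured: distribution, categorical, relational, regression, and matrix plots
Easy to install with pip, like other mainstream Python modules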
Installation
pip install matplotlib
pip install seaborn
# Or install the development version directly from GitHub
pip install git+https://github.com/mwaskom/seaborn.git
Workflow
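Put together, the workflow steps that follow amount to something like the sketch below. The tiny DataFrame, its column names (`group`, `value`), and the output file name `workflow_demo.png` are made up for illustration and do not come from the original article.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script also runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical toy data standing in for a real dataset
dataset = pd.DataFrame({
    "group": ["a", "a", "b", "b", "c", "c"],
    "value": [1.0, 3.0, 2.0, 6.0, 5.0, 7.0],
})

sns.set_style("whitegrid")            # pick one of the five built-in styles
fig = plt.figure(figsize=(12, 6))     # 12 x 6 inch canvas
ax = sns.barplot(x="group", y="value", data=dataset)  # bar height = mean per group
ax.set_title("Mean value per group")
fig.savefig("workflow_demo.png")      # write the figure to disk
```

In a Jupyter notebook the `matplotlib.use("Agg")` line can be dropped in favor of `%matplotlib inline`.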
Import libraries
import matplotlib.pyplot as plt
import pandas as pd  # needed for pd.read_csv and pd.pivot_table below
import seaborn as sns
Jupyter inline display
%matplotlib inline
Load data
# Load a built‑in dataset by name ('tips' is one of Seaborn's sample datasets)
dataset = sns.load_dataset('tips')
# Or load an external CSV file with pandas
dataset = pd.read_csv('dataset.csv')
Set canvas
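Several plots later in the article are drawn with `ax=axes[0]` and `ax=axes[1]`. Those axes objects come from `plt.subplots`, which creates a figure and a grid of axes in one call. A minimal sketch, with placeholder titles:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed inside a notebook
import matplotlib.pyplot as plt

# One row, two columns: axes is a NumPy array holding two Axes objects
fig, axes = plt.subplots(1, 2, figsize=(12, 6))
axes[0].set_title("left panel")
axes[1].set_title("right panel")
fig.tight_layout()
```

Passing `ax=axes[0]` to a Seaborn axes-level function then draws into the left panel, and `ax=axes[1]` into the right one.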
# Create a 12×6 inch figure
plt.figure(figsize=(12, 6))
Plot example
# Set style (five options: "white", "dark", "whitegrid", "darkgrid", "ticks")
sns.set_style('white')
# Bar plot example
sns.barplot(x=x, y=y, data=dataset, ...)
Save figure
plt.savefig('jg.png')
Practical Example
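The file `cook.csv` used in this example is not included with the article. To try the plotting calls without it, a synthetic stand-in with the same column names can be generated. Everything in this sketch (the random values, the row count, the seed, the placeholder strings) is invented for illustration and is not real recipe data. It also uses `pd.cut` as an alternative way to derive the difficulty column from the same thresholds.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)   # fixed seed so the fake data is reproducible
n = 200                          # arbitrary number of fake recipes
cuisines = ['川菜', '粤菜', '鲁菜', '浙菜']   # a few of the cuisine labels used later

df = pd.DataFrame({
    '菜谱': [f'recipe_{i}' for i in range(n)],        # recipe name
    '用料': ['(placeholder)'] * n,                    # ingredient list
    '用料数': rng.integers(1, 25, size=n),            # ingredient count, 1..24
    '菜系': rng.choice(cuisines, size=n),             # cuisine
    '评分': rng.uniform(6.0, 10.0, size=n).round(1),  # rating
    '用户': [f'user_{i % 20}' for i in range(n)],     # user
})

# Same thresholds as the article: below 5 easy, 5 to 14 medium, 15 and up hard
df['难度'] = pd.cut(df['用料数'], bins=[-np.inf, 4, 14, np.inf],
                   labels=['简单', '一般', '较难']).astype(str)
```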
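# Data preparation
df = pd.read_csv('./cook.csv')  # dataset from the "菜J学Python" public account
# Column names are Chinese: 菜谱 = recipe, 用料 = ingredients, 用料数 = ingredient count,
# 难度 = difficulty, 菜系 = cuisine, 评分 = rating, 用户 = user
# Derive difficulty from ingredient count: 简单 (easy) below 5, 一般 (medium) below 15, otherwise 较难 (hard)
df['难度'] = df['用料数'].apply(lambda x: '简单' if x<5 else ('一般' if x<15 else '较难'))
# Keep required columns
df = df[['菜谱','用料','用料数','难度','菜系','评分','用户']]
# View a random sample of 5 rows
df.sample(5)
Histogram (distplot)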
# Plot histogram with density curve (the distplot default)
# Note: distplot is deprecated since Seaborn 0.11 and emits a warning on newer versions
plt.figure(figsize=(10,6))
rate = df['评分']
sns.distplot(rate, color="salmon", bins=20)
# Remove density curve
plt.figure(figsize=(10,6))
sns.distplot(rate, kde=False, color="salmon", bins=20)
# Side by side: default (left) versus rug ticks added along the x-axis (right)
fig, axes = plt.subplots(1, 2, figsize=(10,6))
sns.distplot(rate, color="salmon", bins=10, ax=axes[0])
sns.distplot(rate, color="green", bins=10, rug=True, ax=axes[1])
Scatter Plot
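# New figure with two side-by-side axes
fig, axes = plt.subplots(1, 2, figsize=(10, 6))
# Basic scatter plot, colored by difficulty
sns.scatterplot(x="用料数", y="评分", hue="难度", data=df, ax=axes[0])
# Scatter plot where marker style also varies by difficulty
sns.scatterplot(x="用料数", y="评分", hue="难度", style="难度", data=df, ax=axes[1])
Strip Plot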
# Strip plot with jitter
plt.figure(figsize=(10,6))
sns.stripplot(x="菜系", y="评分", hue="难度", jitter=1, data=df)
Swarm Plot
# Swarm plot (categorical scatter where points are spread out so they do not overlap)
plt.figure(figsize=(10,6))
sns.swarmplot(x="菜系", y="评分", hue="难度", data=df)
Bar Plot
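`barplot` aggregates the y values within each category before drawing, using the mean unless `estimator` says otherwise. The pandas `groupby` calls below compute the same numbers the bars represent. The small DataFrame with English column names is a made-up stand-in for the recipe data.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed inside a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

toy = pd.DataFrame({
    "cuisine": ["A", "A", "A", "B", "B", "C"],
    "rating":  [7.0, 8.0, 9.0, 6.0, 8.0, 9.5],
})

# What each bar will show
means = toy.groupby("cuisine")["rating"].mean()   # default estimator
mins = toy.groupby("cuisine")["rating"].min()     # estimator=min

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
sns.barplot(x="cuisine", y="rating", data=toy, ax=axes[0])                 # bar height = mean
sns.barplot(x="cuisine", y="rating", data=toy, estimator=min, ax=axes[1])  # bar height = min
print(means.to_dict())
print(mins.to_dict())
```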
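# First figure: compare aggregation functions
fig, axes = plt.subplots(1, 2, figsize=(10, 6))
# Default bar plot (mean aggregation)
sns.barplot(x='菜系', y='评分', color="r", data=df, ax=axes[0])
# Bar plot with min aggregation
sns.barplot(x='菜系', y='评分', color="salmon", data=df, estimator=min, ax=axes[1])
# Second figure: split by difficulty, vertical versus horizontal
fig, axes = plt.subplots(1, 2, figsize=(10, 6))
# Bar plot with hue (difficulty)
sns.barplot(x='菜系', y='评分', color="salmon", hue='难度', data=df, ax=axes[0])
# Horizontal bar plot (x and y swapped)
sns.barplot(x='评分', y='菜系', color="salmon", hue='难度', data=df, ax=axes[1])
Count Plot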
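`countplot` needs no y variable: it counts the rows in each category, which is what `value_counts` returns in pandas. A sketch on invented data:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed inside a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

toy = pd.DataFrame({"cuisine": ["A", "B", "A", "C", "A", "B"]})
counts = toy["cuisine"].value_counts()   # A: 3, B: 2, C: 1

fig, ax = plt.subplots(figsize=(6, 4))
sns.countplot(x="cuisine", data=toy, color="salmon", ax=ax)  # bar height = row count
ax.set_ylabel("number of rows")
```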
# New figure with two side-by-side axes
fig, axes = plt.subplots(1, 2, figsize=(10, 6))
# Count plot for categorical frequencies
sns.countplot(x='菜系', color="salmon", data=df, ax=axes[0])
# Count plot with hue
sns.countplot(x='菜系', color="salmon", hue='难度', data=df, ax=axes[1])
Line Plot
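# New figure with two side-by-side axes
fig, axes = plt.subplots(1, 2, figsize=(10, 6))
# Line plot with aggregation (default: mean of y at each x, with a confidence band)
sns.lineplot(x="用料数", y="评分", hue="菜系", data=df, ax=axes[0])
# Line plot without aggregation (every observation is drawn)
sns.lineplot(x="用料数", y="评分", hue="菜系", estimator=None, data=df, ax=axes[1])
Box Plot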
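A box plot summarizes each group by its quartiles: the box spans the first to third quartile, the line inside marks the median, and by default the whiskers reach the most extreme points within 1.5 times the interquartile range, with anything beyond drawn as an outlier. The numbers behind one box can be computed directly; the sample values here are made up.

```python
import pandas as pd

ratings = pd.Series([6.0, 7.0, 7.5, 8.0, 8.0, 8.5, 9.0, 9.5, 4.0])  # hypothetical ratings

q1, median, q3 = ratings.quantile([0.25, 0.5, 0.75])
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Points outside the fences are what the plot marks as outliers
outliers = ratings[(ratings < lower_fence) | (ratings > upper_fence)]
print(q1, median, q3, list(outliers))
```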
# New figure with two side-by-side axes
fig, axes = plt.subplots(1, 2, figsize=(10, 6))
# Basic box plot
sns.boxplot(x='菜系', y='评分', hue='难度', data=df, ax=axes[0])
# Box plot with custom category order, hue order, color, and line width
sns.boxplot(x='菜系', y='评分', hue='难度', data=df, color="salmon", linewidth=1,
            order=['清真菜','粤菜','东北菜','鲁菜','浙菜','湖北菜','川菜'],
            hue_order=['简单','一般','较难'], ax=axes[1])
Boxen Plot
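# New figure with two side-by-side axes
fig, axes = plt.subplots(1, 2, figsize=(10, 6))
# Boxen plot (enhanced box plot that shows more quantiles)
sns.boxenplot(x='菜系', y='评分', hue='难度', data=df, color="salmon", ax=axes[0])
# Boxen plot with palette
sns.boxenplot(x='菜系', y='评分', hue='难度', data=df, palette="Set2", ax=axes[1])
Violin Plot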
# New figure with two side-by-side axes
fig, axes = plt.subplots(1, 2, figsize=(10, 6))
# Violin plot with custom line width
sns.violinplot(x='菜系', y='评分', data=df, color="salmon", linewidth=1, ax=axes[0])
# Violin plot with palette and inner sticks (one line per observation)
sns.violinplot(x='菜系', y='评分', data=df, palette=sns.color_palette('Greens'), inner='stick', ax=axes[1])
Regression Plot
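`regplot` draws a scatter plot and, by default, overlays a straight line fitted by least squares, with a confidence band around it. The line itself can be approximated outside Seaborn with `np.polyfit`. The x and y values below are invented, standing in for ingredient counts and ratings.

```python
import numpy as np

# Hypothetical ingredient counts and ratings
x = np.array([2, 4, 6, 8, 10, 12], dtype=float)
y = np.array([7.0, 7.4, 7.5, 8.1, 8.2, 8.8])

slope, intercept = np.polyfit(x, y, deg=1)   # least-squares straight line
predicted = slope * x + intercept
residuals = y - predicted                    # vertical distances from the line
print(f"rating ≈ {slope:.3f} * ingredients + {intercept:.3f}")
```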
# New figure with two side-by-side axes
fig, axes = plt.subplots(1, 2, figsize=(10, 6))
# regplot with custom marker and color
sns.regplot(x='用料数', y='评分', data=df, color='r', marker='+', ax=axes[0])
# regplot without confidence interval
sns.regplot(x='用料数', y='评分', data=df, ci=None, color='g', marker='*', ax=axes[1])
LM Plot
# lmplot with hue and multiple markers, no confidence interval
# (lmplot is figure-level: it creates its own figure, so no ax argument is passed)
sns.lmplot(x='用料数', y='评分', hue='难度', data=df,
           palette=sns.color_palette('Reds'), ci=None,
           markers=['*','o','+'])
Heatmap
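`heatmap` expects a 2-D table rather than long-form rows, which is why the data is pivoted first: one row per cuisine, one column per difficulty, each cell the mean rating. A sketch with invented rows and English stand-in column names:

```python
import pandas as pd

toy = pd.DataFrame({
    "cuisine":    ["A", "A", "A", "B", "B", "B"],
    "difficulty": ["easy", "easy", "hard", "easy", "hard", "hard"],
    "rating":     [8.0, 9.0, 7.0, 6.0, 8.0, 9.0],
})

# Rows = cuisine, columns = difficulty, cells = mean rating
h = pd.pivot_table(toy, index="cuisine", columns="difficulty",
                   values="rating", aggfunc="mean")
print(h)
```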
# Create pivot table for heatmap (aggfunc='mean' avoids needing a NumPy import)
h = pd.pivot_table(df, index='菜系', columns='难度', values='评分', aggfunc='mean')
# New figure with two side-by-side axes
fig, axes = plt.subplots(1, 2, figsize=(10, 6))
# Simple heatmap
sns.heatmap(h, ax=axes[0])
# Heatmap with annotations and custom diverging palette
cmap = sns.diverging_palette(200, 20, s=100, l=50, as_cmap=True)
sns.heatmap(h, annot=True, cmap=cmap, ax=axes[1])
# Save figure
plt.savefig('jg.png')