
Master Seaborn: From Installation to Advanced Visualizations in Python

This tutorial introduces Seaborn, a Python statistical visualization library built on matplotlib. It covers the library's advantages, installation methods, and a step‑by‑step workflow (importing data, setting up the canvas, plotting, and saving the figure), then works through a practical example with full code snippets for many plot types: histogram, scatter, bar, line, box, violin, heatmap, and more.

Python Crawling & Data Mining

Sep 25, 2020


Seaborn Overview

Seaborn is a statistical data‑visualization library built on matplotlib and tightly integrated with pandas, designed to make data exploration and understanding easier.

Advantages

Less code

Beautiful graphics

Full‑featured

Easy to install, like other mainstream modules

Installation

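# Install the stable releases from PyPI
pip install matplotlib
pip install seaborn

# Or install the latest development version directly from GitHub
pip install git+https://github.com/mwaskom/seaborn.git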
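
After installing, a quick import check can confirm that both libraries are available. This is only a minimal sketch; the version strings it prints depend on your environment.

```python
import matplotlib
import seaborn as sns

# Print the installed versions to confirm the installation worked
print("matplotlib:", matplotlib.__version__)
print("seaborn:", sns.__version__)
```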

Workflow

Import libraries

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

Jupyter inline display

%matplotlib inline

Load data

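# Load one of seaborn's built‑in example datasets ('dataset' is a placeholder name)
dataset = sns.load_dataset('dataset')
# Or load an external CSV file with pandas (requires: import pandas as pd)
dataset = pd.read_csv('dataset.csv')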
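
Since 'dataset.csv' above is only a placeholder, the following sketch reads a small CSV from an in-memory string instead. The column names and values are invented for illustration; the point is simply to check the shape and column types before plotting.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for 'dataset.csv' (invented values)
csv_text = """category,value
a,1.5
b,2.0
a,3.5
"""

# pd.read_csv accepts a file path or any file-like object
dataset = pd.read_csv(io.StringIO(csv_text))

# Inspect the structure before plotting
print(dataset.shape)
print(dataset.dtypes)
```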

Set canvas

# Create a 12×6 figure
plt.figure(figsize=(12, 6))

Plot example

# Set style (five options: "white", "dark", "whitegrid", "darkgrid", "ticks")
sns.set_style('white')
# Bar plot example
sns.barplot(x=x, y=y, data=dataset, ...)

Save figure

plt.savefig('jg.png')
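
The workflow steps above can be strung together into one short script. The sketch below assumes a tiny invented DataFrame (the `group` and `score` columns are hypothetical) and writes the image to a temporary directory so it leaves nothing behind in the working directory.

```python
import os
import tempfile

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data (invented values) standing in for a real dataset
dataset = pd.DataFrame({
    "group": ["a", "a", "b", "b", "c", "c"],
    "score": [3.0, 5.0, 4.0, 6.0, 2.0, 4.0],
})

sns.set_style("whitegrid")                             # 1. choose a style
fig = plt.figure(figsize=(12, 6))                      # 2. set up the canvas
ax = sns.barplot(x="group", y="score", data=dataset)   # 3. draw the plot
ax.set_title("Mean score per group")

# 4. save the figure; a temporary directory keeps the sketch side-effect free
out_path = os.path.join(tempfile.mkdtemp(), "jg.png")
fig.savefig(out_path, dpi=150, bbox_inches="tight")
plt.close(fig)
```

Calling `savefig` before the figure is closed matters: once a figure has been closed or shown in some backends, saving it may produce an empty image.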

Practical Example

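# Data preparation
# Column names are kept in Chinese to match the CSV: 菜谱 = recipe, 用料 = ingredients,
# 用料数 = number of ingredients, 难度 = difficulty, 菜系 = cuisine, 评分 = rating, 用户 = user
df = pd.read_csv('./cook.csv')  # dataset from the "菜J学Python" public account
# Derive difficulty from the ingredient count: <5 is 简单 (easy), <15 is 一般 (medium), otherwise 较难 (hard)
df['难度'] = df['用料数'].apply(lambda x: '简单' if x<5 else ('一般' if x<15 else '较难'))
# Keep required columns
df = df[['菜谱','用料','用料数','难度','菜系','评分','用户']]
# View a random sample of 5 rows
df.sample(5)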
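
The nested conditional used to derive the difficulty column can also be written with `pd.cut`. The sketch below compares the two approaches on a handful of invented ingredient counts; the English labels are translations used only for illustration, not the values in the original dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical ingredient counts (invented values)
counts = pd.Series([3, 5, 14, 15, 20], name="n_ingredients")

# Original approach: nested conditional applied element by element
by_apply = counts.apply(
    lambda x: "easy" if x < 5 else ("medium" if x < 15 else "hard")
)

# Alternative: pd.cut with left-closed bins [-inf, 5), [5, 15), [15, inf)
by_cut = pd.cut(
    counts,
    bins=[-np.inf, 5, 15, np.inf],
    labels=["easy", "medium", "hard"],
    right=False,
)

print(by_apply.tolist())
print(by_cut.astype(str).tolist())
```

One practical difference: `pd.cut` returns an ordered categorical, so plots that use the column as `hue` can pick up the easy/medium/hard order without an explicit `hue_order`.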

Histogram (distplot)

# Plot histogram with density curve (default)
# Note: distplot is deprecated since seaborn 0.11; histplot and displot are its replacements
plt.figure(figsize=(10,6))
rate = df['评分']
sns.distplot(rate, color="salmon", bins=20)
# Remove density curve
plt.figure(figsize=(10,6))
sns.distplot(rate, kde=False, color="salmon", bins=20)
# Add rug (marginal ticks) and compare two colors
fig,axes = plt.subplots(1,2,figsize=(10,6))
sns.distplot(rate, color="salmon", bins=10, ax=axes[0])
sns.distplot(rate, color="green", bins=10, rug=True, ax=axes[1])

Scatter Plot

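# Two side-by-side panels (the figure size is an assumption)
fig, axes = plt.subplots(1, 2, figsize=(16, 6))
# Basic scatter plot
sns.scatterplot(x="用料数", y="评分", hue="难度", data=df, ax=axes[0])
# Scatter with style differentiation
sns.scatterplot(x="用料数", y="评分", hue="难度", style="难度", data=df, ax=axes[1])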

Strip Plot

# Strip plot with jitter
plt.figure(figsize=(10,6))
sns.stripplot(x="菜系", y="评分", hue="难度", jitter=1, data=df)

Swarm Plot

# Swarm plot (categorical scatter with non-overlapping points that show the distribution)
plt.figure(figsize=(10,6))
sns.swarmplot(x="菜系", y="评分", hue="难度", data=df)

Bar Plot

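# First figure: compare aggregation functions (the figure size is an assumption)
fig, axes = plt.subplots(1, 2, figsize=(16, 6))
# Default bar plot (mean aggregation)
sns.barplot(x='菜系', y='评分', color="r", data=df, ax=axes[0])
# Bar plot with min aggregation
sns.barplot(x='菜系', y='评分', color="salmon", data=df, estimator=min, ax=axes[1])
# Second figure: grouped bars, vertical and horizontal
fig, axes = plt.subplots(1, 2, figsize=(16, 6))
# Bar plot with hue (difficulty)
sns.barplot(x='菜系', y='评分', color="salmon", hue='难度', data=df, ax=axes[0])
# Horizontal bar plot
sns.barplot(x='评分', y='菜系', color="salmon", hue='难度', data=df, ax=axes[1])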
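
To make the role of `estimator` concrete, the sketch below draws the same invented data with the default mean and with a minimum, then compares the bar heights against the matching pandas `groupby` results. The column names and values are hypothetical.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical ratings per cuisine (invented values)
data = pd.DataFrame({
    "cuisine": ["x", "x", "x", "y", "y", "y"],
    "rating": [6.0, 7.0, 8.0, 5.0, 9.0, 10.0],
})

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
# Default estimator: mean of each category
sns.barplot(x="cuisine", y="rating", data=data, ax=axes[0])
# Custom estimator: minimum of each category
sns.barplot(x="cuisine", y="rating", data=data, estimator=np.min, ax=axes[1])

# Bar heights should match the corresponding groupby aggregation
mean_heights = sorted(p.get_height() for p in axes[0].patches)
min_heights = sorted(p.get_height() for p in axes[1].patches)
print(mean_heights, sorted(data.groupby("cuisine")["rating"].mean()))
print(min_heights, sorted(data.groupby("cuisine")["rating"].min()))
plt.close(fig)
```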

Count Plot

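# Two side-by-side panels (the figure size is an assumption)
fig, axes = plt.subplots(1, 2, figsize=(16, 6))
# Count plot for categorical frequencies
sns.countplot(x='菜系', color="salmon", data=df, ax=axes[0])
# Count plot with hue
sns.countplot(x='菜系', color="salmon", hue='难度', data=df, ax=axes[1])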

Line Plot

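# Two side-by-side panels (the figure size is an assumption)
fig, axes = plt.subplots(1, 2, figsize=(16, 6))
# Line plot with aggregation (default: mean of y at each x, with a confidence band)
sns.lineplot(x="用料数", y="评分", hue="菜系", data=df, ax=axes[0])
# Line plot without aggregation (every observation is drawn)
sns.lineplot(x="用料数", y="评分", hue="菜系", estimator=None, data=df, ax=axes[1])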

Box Plot

# Two side-by-side panels (the figure size is an assumption)
fig, axes = plt.subplots(1, 2, figsize=(16, 6))
# Basic box plot
sns.boxplot(x='菜系', y='评分', hue='难度', data=df, ax=axes[0])
# Box plot with custom category order, hue order, and color
sns.boxplot(x='菜系', y='评分', hue='难度', data=df, color="salmon", linewidth=1,
            order=['清真菜','粤菜','东北菜','鲁菜','浙菜','湖北菜','川菜'],
            hue_order=['简单','一般','较难'], ax=axes[1])

Boxen Plot

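# Two side-by-side panels (the figure size is an assumption)
fig, axes = plt.subplots(1, 2, figsize=(16, 6))
# Boxen plot (enhanced box plot that shows more quantiles)
sns.boxenplot(x='菜系', y='评分', hue='难度', data=df, color="salmon", ax=axes[0])
# Boxen plot with palette
sns.boxenplot(x='菜系', y='评分', hue='难度', data=df, palette="Set2", ax=axes[1])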

Violin Plot

# Two side-by-side panels (the figure size is an assumption)
fig, axes = plt.subplots(1, 2, figsize=(16, 6))
# Violin plot with custom line width
sns.violinplot(x='菜系', y='评分', data=df, color="salmon", linewidth=1, ax=axes[0])
# Violin plot with palette and inner sticks
sns.violinplot(x='菜系', y='评分', data=df, palette=sns.color_palette('Greens'), inner='stick', ax=axes[1])

Regression Plot

# Two side-by-side panels (the figure size is an assumption)
fig, axes = plt.subplots(1, 2, figsize=(16, 6))
# regplot with custom marker and color
sns.regplot(x='用料数', y='评分', data=df, color='r', marker='+', ax=axes[0])
# regplot without confidence interval
sns.regplot(x='用料数', y='评分', data=df, ci=None, color='g', marker='*', ax=axes[1])

LM Plot

# lmplot with hue and multiple markers, no confidence interval
sns.lmplot(x='用料数', y='评分', hue='难度', data=df,
            palette=sns.color_palette('Reds'), ci=None,
            markers=['*','o','+'])

Heatmap

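# Create pivot table for heatmap: rows = cuisine, columns = difficulty, cells = mean rating
# (aggfunc='mean' is equivalent to np.mean here and does not need numpy)
h = pd.pivot_table(df, index='菜系', columns='难度', values='评分', aggfunc='mean')
# Two side-by-side panels (the figure size is an assumption)
fig, axes = plt.subplots(1, 2, figsize=(16, 6))
# Simple heatmap
sns.heatmap(h, ax=axes[0])
# Heatmap with annotations and custom diverging palette
cmap = sns.diverging_palette(200, 20, s=100, l=50, as_cmap=True)
sns.heatmap(h, annot=True, cmap=cmap, ax=axes[1])
# Save figure
plt.savefig('jg.png')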
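
A heatmap needs a matrix, which is why the long-form data is pivoted first. The sketch below shows that reshaping step on a few invented rows (the column names and values are hypothetical) so the resulting table can be inspected before it is drawn.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical long-form data (invented values)
data = pd.DataFrame({
    "cuisine": ["x", "x", "y", "y", "y"],
    "difficulty": ["easy", "hard", "easy", "easy", "hard"],
    "rating": [7.0, 8.0, 6.0, 8.0, 9.0],
})

# Rows = cuisine, columns = difficulty, cells = mean rating
h = pd.pivot_table(data, index="cuisine", columns="difficulty",
                   values="rating", aggfunc="mean")
print(h)

# Draw the matrix with each cell annotated by its value
fig, ax = plt.subplots(figsize=(5, 4))
sns.heatmap(h, annot=True, fmt=".1f", cmap="Reds", ax=ax)
plt.close(fig)
```

Combinations that never occur in the data become NaN in the pivot table, and `heatmap` leaves those cells blank.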
