Data Visualization and Exploratory Graphs with Pandas in Python
This article teaches how to use Python's pandas library to create various exploratory visualizations—line, scatter, histogram, bar, pie, and box charts—on real datasets such as Iris, the 2013 American Community Survey, and the Boston Housing data, including code examples and interpretation of results.
Data visualization is the graphical representation of data, enabling users to explore data characteristics, trends, and relationships through exploratory graphs.
The article first introduces the common chart types (line, scatter, histogram, bar, pie, and box). It then explains that pandas wraps Matplotlib's plotting functions, so methods such as df_iris[["sepal length (cm)"]].plot.line() and df.plot.scatter(x='sepal length (cm)', y='sepal width (cm)') can be called directly on a DataFrame without importing Matplotlib explicitly, although Matplotlib must still be installed.
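As a rough illustration of the plotting accessor described above, the following minimal sketch calls the same two methods on a tiny hand-made table. The numbers are invented for the example and only borrow the Iris column names; they are not the real measurements. The Agg backend is an assumption made so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
import pandas as pd

# Made-up values that only reuse the Iris column names (not the real dataset).
df_iris = pd.DataFrame({
    "sepal length (cm)": [5.1, 4.9, 6.3, 5.8, 7.1, 6.5],
    "sepal width (cm)": [3.5, 3.0, 3.3, 2.7, 3.0, 3.0],
})

# pandas hands the drawing to Matplotlib and returns a Matplotlib Axes object.
ax_line = df_iris[["sepal length (cm)"]].plot.line()

# Scatter plot of one column against another; x and y are column names.
ax_scatter = df_iris.plot.scatter(x="sepal length (cm)", y="sepal width (cm)")
```

In an interactive session or notebook the figures usually appear on their own; in a plain script, importing matplotlib.pyplot and calling its show() function is the usual way to display them.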
Code examples demonstrate how to plot a line chart of the Iris dataset's sepal length, a scatter chart with variable point sizes and colors, histograms for multiple features, a pie chart of the target class distribution, and a box plot for statistical summaries, e.g., df.target.value_counts().plot.pie(legend=True) and df.boxplot(column=['target'], figsize=(10,5)).
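The histogram, pie, and box charts mentioned above might be sketched as follows. The data are again invented and only reuse Iris-style column names. Two choices here are assumptions of this sketch rather than the article's code: each chart is drawn on its own explicitly created Axes so the plots do not overlap, and the box plot summarizes a measurement column instead of the target column used in the article's call.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

# Made-up values with Iris-style column names (not the real dataset).
df = pd.DataFrame({
    "sepal length (cm)": [5.1, 4.9, 6.3, 5.8, 7.1, 6.5],
    "petal length (cm)": [1.4, 1.4, 6.0, 5.1, 5.9, 4.6],
    "target": [0, 0, 2, 2, 2, 1],
})

# One row of three panels, one per chart type.
fig, (ax_hist, ax_pie, ax_box) = plt.subplots(1, 3, figsize=(15, 5))

# Histograms of two features on shared axes; alpha keeps overlapping bars visible.
df[["sepal length (cm)", "petal length (cm)"]].plot.hist(bins=5, alpha=0.6, ax=ax_hist)

# Pie chart of how many rows fall into each target class.
counts = df.target.value_counts()
counts.plot.pie(legend=True, ax=ax_pie)

# Box plot: median, quartiles, and whiskers of one measurement column.
df.boxplot(column=["sepal length (cm)"], ax=ax_box)
```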
Two real-world datasets are then used for practical exploration. The first is the 2013 American Community Survey, loaded with df = pd.read_csv("./ss13husa.csv"). After examining its shape and descriptive statistics, the article concatenates columns such as education level, income, and work status, then groups them by education level to show the distribution of respondents and their average income ( group = df['ac_survey'].groupby(by=['SCHL']) ).
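The concatenate-then-group step can be sketched on a small invented table. Only SCHL is a column name taken from the text above; the income column name, the values, and the helper variable names are hypothetical stand-ins for the survey data.

```python
import pandas as pd

# Made-up stand-in for the survey file. "SCHL" comes from the text;
# "income" is a hypothetical column name chosen for this sketch.
df = pd.DataFrame({
    "SCHL": [16, 16, 21, 21, 21, 22],
    "income": [20000, 30000, 50000, 70000, 60000, 90000],
    "unused": ["a", "b", "c", "d", "e", "f"],
})

# Place only the columns of interest side by side (axis=1 joins column-wise).
ac_survey = pd.concat([df["SCHL"], df["income"]], axis=1)

# Group rows by education level, then summarize each group.
group = ac_survey.groupby(by=["SCHL"])
size_per_level = group.size()          # number of rows per education level
mean_income = group["income"].mean()   # average income per education level
```

Either result is a Series indexed by SCHL, so it can be passed straight to a plotting method such as plot.bar() to visualize the distribution.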
The second is the Boston Housing dataset, loaded with df = pd.read_csv("./housing.data"). The article visualizes the distribution of house prices (MEDV) with a histogram and examines relationships between individual features and price using scatter plots ( df.plot.scatter(x='MEDV', y='RM') ). It then computes Pearson correlation coefficients and displays them as a heatmap ( import matplotlib.pyplot as plt; import seaborn as sns; corr = df.corr(); sns.heatmap(corr); plt.show() ) to highlight positive and negative associations.
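A minimal sketch of the correlation step follows, using a few invented rows rather than the real housing file. RM and MEDV are column names from the text; LSTAT is assumed here as a second feature. To keep the example free of extra dependencies, the heatmap is drawn with Matplotlib's imshow as a stand-in for the sns.heatmap call the article uses.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows with Boston-style column names (not the real dataset).
df = pd.DataFrame({
    "RM": [5.0, 6.0, 6.5, 7.0, 8.0],
    "LSTAT": [20.0, 12.0, 10.0, 6.0, 3.0],
    "MEDV": [15.0, 21.0, 24.0, 30.0, 45.0],
})

# DataFrame.corr() computes pairwise Pearson coefficients by default.
corr = df.corr()

# Draw the matrix as a colored grid; the color scale is fixed to [-1, 1].
fig, ax = plt.subplots()
im = ax.imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
ax.set_xticks(range(len(corr.columns)))
ax.set_xticklabels(corr.columns)
ax.set_yticks(range(len(corr.index)))
ax.set_yticklabels(corr.index)
fig.colorbar(im, ax=ax)
```

In this toy table MEDV rises with RM and falls with LSTAT, so the first pair gets a positive coefficient and the second a negative one, which is the kind of contrast the heatmap is meant to make visible.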
The article presents all of its code snippets in code blocks and includes illustrative images of each chart type.
Python Programming Learning Circle
A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.