
Master Pandas: Install, Import Data, Create DataFrames, and Analyze with Python

This tutorial walks through installing Pandas, importing CSV and Excel files, building DataFrames from dictionaries, describing data, indexing with loc/iloc, and applying custom functions to transform columns, providing clear code examples and visual outputs.

Model Perspective

Pandas Installation and Import

Install Pandas with pip, or use the copy that ships with Anaconda. By convention the library is imported as pd:

pip install pandas        # run in a terminal
import pandas as pd       # run in Python
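A quick way to confirm the installation is to import the library and print its version. This is a minimal sketch; the variable name installed_version is only illustrative, and the value printed depends on your environment.

```python
import pandas as pd

# pd.__version__ is a string; its exact value depends on the installed release
installed_version = pd.__version__
print(installed_version)
```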

Data Import

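Read external CSV or Excel files with pd.read_csv and pd.read_excel. Example:

car = pd.read_csv('data/car-sales.csv')

A relative path such as this one is resolved against the current working directory, which matches the script's folder only if you run the script from there. The data is loaded as a DataFrame.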
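To try read_csv without a file on disk, the sketch below feeds it an in-memory buffer. The two CSV rows are hypothetical stand-ins for data/car-sales.csv, and car_sample is an illustrative name.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for data/car-sales.csv.
# Prices are quoted because they contain commas.
csv_text = '''Make,Colour,Odometer (KM),Doors,Price
Toyota,White,150043,4,"$4,000.00"
Honda,Red,87899,4,"$5,000.00"
'''

# read_csv accepts a path or any file-like object
car_sample = pd.read_csv(io.StringIO(csv_text))

print(car_sample.shape)            # (2, 5)
print(car_sample.loc[0, 'Price'])  # $4,000.00
```

pd.read_excel is called the same way with a path to a workbook, but it depends on an additional engine package (for .xlsx files, typically openpyxl) being installed.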

Creating a DataFrame

Construct a DataFrame from a dictionary that maps column names to lists of values:

make = ['Toyota','Honda','Toyota','BMW','Nissan','Toyota','Honda','Honda','Toyota','Nissan']
color = ['White','Red','Blue','Black','White','Green','Blue','Blue','White','White']
odometer = [150043,87899,32549,11179,213095,99213,45698,54738,60000,31600]
doors = [4,4,3,5,4,4,4,4,4,4]
price = ['$4,000.00','$5,000.00','$7,000.00','$22,000.00','$3,500.00','$4,500.00','$7,500.00','$7,000.00','$6,250.00','$9,700.00']
car = pd.DataFrame({'Make':make,'Colour':color,'Odometer (KM)':odometer,'Doors':doors,'Price':price})

The resulting table is displayed in Jupyter Notebook.
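Outside a notebook, a few attributes give a quick look at what was built. The sketch below uses a shortened three-row version of the table (car_small is an illustrative name) and checks its shape, column types, and first rows.

```python
import pandas as pd

# Shortened version of the car table, for illustration only
car_small = pd.DataFrame({
    'Make': ['Toyota', 'Honda', 'BMW'],
    'Odometer (KM)': [150043, 87899, 11179],
    'Price': ['$4,000.00', '$5,000.00', '$22,000.00'],
})

print(car_small.shape)    # (3, 3): rows, columns
print(car_small.dtypes)   # one dtype per column; Price is text, not numeric
print(car_small.head(2))  # first two rows
```

Note that Price is stored as text because of the dollar signs and commas, which is why it needs converting before any arithmetic.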

DataFrame example

Data Description

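Use describe to obtain summary statistics:

car.describe()

By default the output covers only numeric columns (count, mean, std, min, quartiles, max). To include categorical data, set the include parameter, e.g.

car.describe(include=['object','float','int'])

or pass include='all' to cover every column. In the combined output, statistics that do not apply to a column's type, such as the mean of a text column, appear as NaN.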
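The difference between the default and include='all' can be seen on a small table. This is a sketch on a hypothetical three-row frame (car_small), not the full car data.

```python
import pandas as pd

car_small = pd.DataFrame({
    'Make': ['Toyota', 'Honda', 'Toyota'],
    'Doors': [4, 4, 3],
})

numeric_summary = car_small.describe()            # numeric columns only
full_summary = car_small.describe(include='all')  # every column

print(numeric_summary.columns.tolist())  # ['Doors']
print(full_summary.loc['top', 'Make'])   # Toyota (most frequent value)
print(full_summary.loc['mean', 'Make'])  # nan: a mean is undefined for text
```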

Describe output
NaN illustration

Data Indexing

Selecting Single or Multiple Columns

Use bracket notation similar to dictionary indexing:

car['Price']                     # single column
car[['Make','Colour']]            # multiple columns
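The two forms return different types: a single label gives a Series, while a list of labels gives a DataFrame, even when the list holds one name. A minimal sketch on a hypothetical two-row frame:

```python
import pandas as pd

car_small = pd.DataFrame({
    'Make': ['Toyota', 'Honda'],
    'Colour': ['White', 'Red'],
    'Price': ['$4,000.00', '$5,000.00'],
})

price = car_small['Price']               # single label -> Series
subset = car_small[['Make', 'Colour']]   # list of labels -> DataFrame
one_col_df = car_small[['Price']]        # one-item list -> DataFrame

print(type(price).__name__)    # Series
print(type(subset).__name__)   # DataFrame
print(one_col_df.shape)        # (2, 1)
```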

Using loc and iloc

loc indexes by label, while iloc indexes by integer position. The example retrieves the value at row label 1 in column 'Odometer (KM)', first by label and then by position (row 1, column 2):

car.loc[1,'Odometer (KM)']       # label-based
car.iloc[1,2]                    # position-based
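With the default integer index, row labels and row positions coincide, which hides the distinction. The sketch below gives a hypothetical frame string row labels ('a', 'b', 'c') so the two accessors can be told apart.

```python
import pandas as pd

# String row labels make labels and positions visibly different
car_small = pd.DataFrame(
    {'Make': ['Toyota', 'Honda', 'BMW'],
     'Odometer (KM)': [150043, 87899, 11179]},
    index=['a', 'b', 'c'],
)

by_label = car_small.loc['b', 'Odometer (KM)']  # row label 'b', column label
by_position = car_small.iloc[1, 1]              # second row, second column
print(by_label, by_position)                    # 87899 87899

# loc only understands labels, so an integer that is not a label fails
try:
    car_small.loc[1, 'Odometer (KM)']
    label_lookup_failed = False
except KeyError:
    label_lookup_failed = True
print(label_lookup_failed)                      # True
```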

Use : to select all rows or columns, e.g. the entire 'Odometer (KM)' column:

car.loc[:, 'Odometer (KM)']
car.iloc[:, 2]
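The colon also works with start and end bounds, and here the two accessors differ: a loc slice includes its end label, while an iloc slice excludes its end position, as in ordinary Python slicing. A small sketch on a hypothetical four-row frame:

```python
import pandas as pd

car_small = pd.DataFrame({
    'Make': ['Toyota', 'Honda', 'Toyota', 'BMW'],
    'Odometer (KM)': [150043, 87899, 32549, 11179],
})

whole_column = car_small.iloc[:, 1]             # all rows, second column
by_label = car_small.loc[0:2, 'Odometer (KM)']  # labels 0, 1, 2 (end included)
by_position = car_small.iloc[0:2, 1]            # positions 0, 1 (end excluded)

print(len(whole_column), len(by_label), len(by_position))  # 4 3 2
```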

Using apply

Apply a custom function to transform a column. The following function removes dollar signs and commas from the Price column and converts it to float:

def to_num(x):
    x_new = x.replace('$','')
    x_new = x_new.replace(',','')
    return float(x_new)

Apply it to the Price column:

car['Price'].apply(to_num)
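apply returns a new Series and leaves the original column unchanged, so the result has to be assigned to be kept. The sketch below stores it in a new column (the name 'Price (num)' is hypothetical) and also shows one possible vectorized alternative using the .str accessor, which avoids calling a Python function per row.

```python
import pandas as pd

def to_num(x):
    """Strip '$' and ',' from a price string and return a float."""
    return float(x.replace('$', '').replace(',', ''))

car_small = pd.DataFrame({'Price': ['$4,000.00', '$22,000.00']})

# Assign the result; 'Price (num)' is an illustrative column name
car_small['Price (num)'] = car_small['Price'].apply(to_num)

# Vectorized alternative; regex=False treats '$' as a literal character
vectorized = (car_small['Price']
              .str.replace('$', '', regex=False)
              .str.replace(',', '', regex=False)
              .astype(float))

print(car_small['Price (num)'].tolist())  # [4000.0, 22000.0]
print(vectorized.tolist())                # [4000.0, 22000.0]
```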
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

Model Perspective

Insights, knowledge, and enjoyment from a mathematical modeling researcher and educator. Hosted by Haihua Wang, a modeling instructor and author of "Clever Use of Chat for Mathematical Modeling", "Modeling: The Mathematics of Thinking", "Mathematical Modeling Practice: A Hands‑On Guide to Competitions", and co‑author of "Mathematical Modeling: Teaching Design and Cases".
