Reading CSV and Excel Files and Merging Data with pandas in Python
This tutorial demonstrates how to import CSV and Excel stock‑listing files using pandas, adjust parsing parameters, view data summaries, and combine multiple sheets into a single DataFrame through concatenation, providing a practical example of data import and merging in Python.
The article uses real stock‑listing data to teach how to read common data files in Python with the pandas library.
Reading CSV data
CSV files are comma‑separated and can be loaded with
import pandas as pd
nasdaq = pd.read_csv('nasdaq-listings.csv')
print(nasdaq.head(10)). The df.head(), df.info() methods display the first rows and a summary of the DataFrame.
To handle missing values and parse dates, the call can be refined as
nasdaq = pd.read_csv('nasdaq-listings.csv', na_values='NAN', parse_dates=['Last Update'])
print(nasdaq.head())
nasdaq.info(), which shows proper dtypes such as datetime64[ns] for the "Last Update" column.
Reading Excel data
Excel files can be read with
nyse = pd.read_excel('listings.xlsx', sheet_name='nyse', na_values='n/a')
print(nyse.head())
nyse.info(). The pd.ExcelFile() object provides access to all sheet names:
xls = pd.ExcelFile('listings.xlsx')
exchanges = xls.sheet_names
print(exchanges) # ['amex', 'nasdaq', 'nyse'].
All sheets can be loaded at once:
listings = pd.read_excel(xls, sheet_name=exchanges, na_values='n/a')
listings['nasdaq'].info().
Merging multiple tables
Because the sheets share the same column structure, they can be stacked with
# read two sheets
nyse = pd.read_excel('listings.xlsx', sheet_name='nyse', na_values='n/a')
nasdaq = pd.read_excel('listings.xlsx', sheet_name='nasdaq', na_values='n/a')
# add source label
nyse['Exchange'] = 'NYSE'
nasdaq['Exchange'] = 'NASDAQ'
# concatenate
combined_listings = pd.concat([nyse, nasdaq])
combined_listings.info(). For all sheets, a loop can collect each DataFrame into a list and then concatenate:
listings = []
for exchange in exchanges:
df = pd.read_excel(xls, sheet_name=exchange, na_values='n/a')
df['Exchange'] = exchange
listings.append(df)
listing_data = pd.concat(listings)
listing_data.info().
The final merged DataFrame contains 8 columns, including an "Exchange" column that identifies the original sheet for each row.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Python Programming Learning Circle
A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
