Reading and Writing Excel Files with Python: xlrd, xlwt, and openpyxl
This tutorial explains how to use Python's xlrd, xlwt, and openpyxl libraries to read, write, and append data in both XLS and XLSX Excel formats, providing complete code examples, performance considerations, and test results for each operation.
Reading and Writing Excel Files with Python
Compared with other programming languages, Python's biggest advantage is its simple syntax and rich third‑party libraries. With the rise of data analysis, Python has become one of the most popular languages, and Excel is the most common tool for data import and export besides databases.
1. Preliminary Preparation
1.1. Introduction to Excel
Microsoft Excel is a spreadsheet software for Windows and macOS.
1.2. Format Differences
Excel has two formats, XLS and XLSX, which differ in file structure, version, and compatibility:
File format: XLS is a binary format based on the compound document structure, while XLSX uses XML and compression, making it smaller.
Version: XLS is generated by Excel 2003 and earlier; XLSX is generated by Excel 2007 and later.
Compatibility: XLSX is backward compatible with XLS.
1.3. Library Usage
Python's standard library includes xlrd and xlwt for XLS files only. For XLSX files you need the third‑party openpyxl library.
1.4. Overall Idea
Using the above modules we can theoretically read and write any Excel format, but repeatedly iterating over each cell is inefficient. Therefore, six utility functions are provided to simplify common operations.
2. Code Demonstration
2.1. XLS Format
2.1.1. Read XLS File
<code>def read_xls_excel(url, index):
'''
Read XLS format file
Parameters:
url: file path
index: sheet number (starting from 1)
Returns:
data: list of rows
'''
# Open the workbook
workbook = xlrd.open_workbook(url)
# Get all sheet names
sheets = workbook.sheet_names()
# Get the specified sheet
worksheet = workbook.sheet_by_name(sheets[index-1])
data = []
for i in range(0, worksheet.nrows):
da = []
for j in range(0, worksheet.ncols):
da.append(worksheet.cell_value(i, j))
data.append(da)
return data
</code>2.1.2. Write XLS File
<code>def write_xls_excel(url, sheet_name, two_dimensional_data):
'''
Write XLS format file
Parameters:
url: file path
sheet_name: sheet name
two_dimensional_data: 2‑D list to write
'''
workbook = xlwt.Workbook()
sheet = workbook.add_sheet(sheet_name)
for i in range(0, len(two_dimensional_data)):
for j in range(0, len(two_dimensional_data[i])):
sheet.write(i, j, two_dimensional_data[i][j])
workbook.save(url)
print("Write successful")
</code>2.1.3. Append to XLS File
<code>def write_xls_excel_add(url, two_dimensional_data, index):
'''
Append data to an existing XLS file
Parameters:
url: file path
two_dimensional_data: 2‑D list to append
index: target sheet number (starting from 1)
'''
workbook = xlrd.open_workbook(url)
sheets = workbook.sheet_names()
worksheet = workbook.sheet_by_name(sheets[index-1])
rows_old = worksheet.nrows
new_workbook = copy(workbook)
new_worksheet = new_workbook.get_sheet(index-1)
for i in range(0, len(two_dimensional_data)):
for j in range(0, len(two_dimensional_data[i])):
new_worksheet.write(i + rows_old, j, two_dimensional_data[i][j])
new_workbook.save(url)
print("Append successful")
</code>2.2. XLSX Format
2.2.1. Read XLSX File
<code>def read_xlsx_excel(url, sheet_name):
'''
Read XLSX format file
Parameters:
url: file path
sheet_name: sheet name
Returns:
data: list of rows
'''
workbook = openpyxl.load_workbook(url)
sheet = workbook[sheet_name]
data = []
for row in sheet.rows:
da = []
for cell in row:
da.append(cell.value)
data.append(da)
return data
</code>2.2.2. Write XLSX File
<code>def write_xlsx_excel(url, sheet_name, two_dimensional_data):
'''
Write XLSX format file
Parameters:
url: file path
sheet_name: sheet name
two_dimensional_data: 2‑D list to write
'''
workbook = openpyxl.Workbook()
sheet = workbook.active
sheet.title = sheet_name
for i in range(0, len(two_dimensional_data)):
for j in range(0, len(two_dimensional_data[i])):
sheet.cell(row=i+1, column=j+1, value=str(two_dimensional_data[i][j]))
workbook.save(url)
print("Write successful")
</code>2.2.3. Append to XLSX File
<code>def write_xlsx_excel_add(url, sheet_name, two_dimensional_data):
'''
Append data to an XLSX file
Parameters:
url: file path
sheet_name: sheet name
two_dimensional_data: 2‑D list to append
'''
workbook = openpyxl.load_workbook(url)
sheet = workbook[sheet_name]
for tdd in two_dimensional_data:
sheet.append(tdd)
workbook.save(url)
print("Append successful")
</code>3. Test Results
3.1. Read Test
Two sample Excel files were prepared (see images). The read functions returned the expected data, confirming that reading works correctly.
3.2. Write Test
Writing overwrote the original data, as shown in the output image.
3.3. Append Test
Appending data succeeded without overwriting existing content, confirming the append functions work as intended.
4. Conclusion
Data analysis should focus on data itself rather than spending excessive time on Excel formatting. This guide provides concise Python utilities for efficient Excel I/O, which should be helpful for anyone learning Python.
Python Programming Learning Circle
A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.