Using Python to Process Excel, Word, and CSV Files
This tutorial demonstrates how to install and use Python libraries such as xlrd, xlwt, xlutils, XlsxWriter, OpenPyXL, python-docx, and the csv module to read, write, modify, and format Excel, Word, and CSV files, including code examples and feature comparisons.
Follow and star the account to receive daily Python skill updates
Common Tools
Data processing is a major application scenario for Python, and Excel is the most popular data‑processing software. When working with Excel in Python, many ready‑made libraries are available, such as xlrd & xlwt & xlutils , XlsxWriter , OpenPyXL , and the Microsoft Excel COM interface. The table below compares their features:
Type
xlrd&xlwt&xlutils
XlsxWriter
OpenPyXL
Excel Open Interface
Read
Supported
Not supported
Supported
Supported
Write
Supported
Supported
Supported
Supported
Modify
Supported
Not supported
Supported
Supported
xls
Supported
Not supported
Not supported
Supported
xlsx
High version
Supported
Supported
Supported
Large file
Not supported
Supported
Supported
Not supported
Efficiency
Fast
Fast
Fast
Very slow
Functionality
Weak
Powerful
Average
Very powerful
Based on different needs, you can choose the appropriate tool. Below we mainly introduce the most commonly used xlrd & xlwt & xlutils series.
xlrd & xlwt & xlutils Introduction
The xlrd & xlwt & xlutils suite consists of three libraries:
xlrd : used for reading Excel files;
xlwt : used for writing Excel files;
xlutils : provides utility functions such as copy, split, and filter for Excel files.
Installation
Installation is straightforward using pip:
<code>$ pip install xlrd xlwt xlutils</code>Writing Excel
Below is a simple example that creates an Excel file with two sheets and writes sample data:
<code># Import xlwt
import xlwt
# Create workbook
wb = xlwt.Workbook()
# Add two sheets
sh1 = wb.add_sheet('成绩')
sh2 = wb.add_sheet('汇总')
# Write header and data (row, column)
sh1.write(0, 0, '姓名')
sh1.write(0, 1, '专业')
sh1.write(0, 2, '科目')
sh1.write(0, 3, '成绩')
sh1.write(1, 0, '张三')
sh1.write(1, 1, '信息与通信工程')
sh1.write(1, 2, '数值分析')
sh1.write(1, 3, 88)
sh1.write(2, 0, '李四')
sh1.write(2, 1, '物联网工程')
sh1.write(2, 2, '数字信号处理分析')
sh1.write(2, 3, 95)
sh1.write(3, 0, '王华')
sh1.write(3, 1, '电子与通信工程')
sh1.write(3, 2, '模糊数学')
sh1.write(3, 3, 90)
# Write summary sheet
sh2.write(0, 0, '总分')
sh2.write(1, 0, 273)
# Save file
wb.save('test.xls')
</code>Running the code generates test.xls . Opening the file shows the data as illustrated in the screenshots.
Reading Excel
Reading is equally simple:
<code># Import xlrd
import xlrd
# Open the previously created file
wb = xlrd.open_workbook('test_w.xls')
# Print number of sheets
print('Sheet count:', wb.nsheets)
# Print sheet names
print('Sheet names:', wb.sheet_names())
# Access the first sheet by index
sh1 = wb.sheet_by_index(0)
# Print rows and columns
print('Sheet %s has %d rows and %d columns' % (sh1.name, sh1.nrows, sh1.ncols))
# Print a specific cell value
print('Value of first row, second column:', sh1.cell_value(0, 1))
# Get entire row and column values
rows = sh1.row_values(0)
cols = sh1.col_values(1)
print('First row values:', rows)
print('Second column values:', cols)
# Print cell type (0‑empty, 1‑string, 2‑number, 3‑date, 4‑boolean, 5‑error)
print('Cell type of row 2, column 1:', sh1.cell(1, 0).ctype)
# Iterate over all sheets and rows
for sh in wb.sheets():
for r in range(sh.nrows):
print(sh.row(r))
</code>The output includes a table of cell‑type numeric codes. The mapping is shown below:
Value
Type
Description
0
empty
Empty cell
1
string
String value
2
number
Numeric value
3
date
Date value
4
boolean
Boolean value
5
error
Error cell
Modifying Excel
To modify an existing file, use xlutils.copy to create a writable copy, then apply changes:
<code># Import modules
import xlrd
from xlutils.copy import copy
# Open original file
readbook = xlrd.open_workbook('test_w.xls')
# Make a copy
wb = copy(readbook)
# Get first sheet and write new data in row 5
sh1 = wb.get_sheet(0)
sh1.write(4, 0, '王欢')
sh1.write(4, 1, '通信工程')
sh1.write(4, 2, '机器学习')
sh1.write(4, 3, 89)
# Get second sheet and replace total score
sh2 = wb.get_sheet(1)
sh2.write(1, 0, 362)
# Save modified file
wb.save('test.xls')
</code>Format Conversion Operations
The following example shows how to apply styles such as font color, number formatting, date formatting, cell alignment, merging, and formulas:
<code># Import xlwt
import xlwt
# Define styles
styleBR = xlwt.easyxf('font: name Times New Roman, color-index red, bold on')
styleNum = xlwt.easyxf(num_format_str='#,##0.00')
styleDate = xlwt.easyxf(num_format_str='YYYY-MM-DD')
# Create workbook and sheets
wb = xlwt.Workbook()
sh1 = wb.add_sheet('成绩')
sh2 = wb.add_sheet('汇总')
# Write header with red bold style
sh1.write(0, 0, '姓名', styleBR)
sh1.write(0, 1, '日期', styleBR)
sh1.write(0, 2, '成绩', styleBR)
# Insert data
sh1.write(1, 0, '张三')
sh1.write(1, 1, '2020-07-01', styleDate)
sh1.write(1, 2, 90, styleNum)
sh1.write(2, 0, '李四')
sh1.write(2, 1, '2020-08-02')
sh1.write(2, 2, 95, styleNum)
# Center alignment
alignment = xlwt.Alignment()
alignment.horz = xlwt.Alignment.HORZ_CENTER
style = xlwt.XFStyle()
style.alignment = alignment
# Merge cells and write total label
sh1.write_merge(3, 3, 0, 1, '总分', style)
# Write formula for sum
sh1.write(3, 2, xlwt.Formula('C2+C3'))
# Write summary sheet
sh2.write(0, 0, '总分', styleBR)
sh2.write(1, 0, 185)
# Save file
wb.save('test.xls')
</code>Python Operating Word
Installing python-docx
To manipulate Word documents, install the python-docx library (version 0.8.10 at the time of writing):
<code>$ pip install python-docx</code>Writing Word
Creating a document, adding a title, headings, paragraphs, and saving:
<code># Import libraries
from docx import Document
from docx.shared import Pt, Inches
from docx.oxml.ns import qn
# Create a new document
doc1 = Document()
# Add a title (level 0)
doc1.add_heading('How to Create and Operate Word with Python', 0)
# Save the document
doc1.save('word1.docx')
</code>The resulting file displays the title as shown in the screenshot.
Headings and Paragraphs
Below is an example that adds headings of different levels, paragraphs, and saves the file:
<code>from docx import Document
doc1 = Document()
doc1.add_heading('How to Create and Operate Word with Python', 0)
doc1.add_paragraph('Word documents are widely used in daily life and work. We usually edit them with WPS or Office, but they can also be generated programmatically with Python.')
doc1.add_heading('Install python-docx', 1)
doc1.add_paragraph('The installation involves two steps:')
doc1.add_heading('Step 1: Install Python', 2)
doc1.add_paragraph('Download the installer from the official Python website and install it.')
doc1.add_heading('Step 2: Install python-docx', 3)
doc1.add_paragraph('Open CMD (Win+R) and run "pip install python-docx".')
doc1.save('word2.docx')
</code>Font and Citation
The following code demonstrates how to set font size, style, color, and add a block quote:
<code>from docx import Document
from docx.shared import Pt, RGBColor
from docx.oxml.ns import qn
doc1 = Document()
doc1.add_heading('How to Use Python to Create and Operate Word', 0)
doc1.add_paragraph('Word documents are widely used...')
doc1.add_heading('Install python-docx', 1)
doc1.add_paragraph('Installation steps...')
# Add a paragraph with custom font size
paragraph = doc1.add_paragraph('This is the installation description!')
run = paragraph.add_run('(Note: font size set to 20)')
run.font.size = Pt(20)
# Set English font
run = doc1.add_paragraph('Here we set the English font: ').add_run('This Font is Times New Roman')
run.font.name = 'Times New Roman'
# Set Chinese font
run = doc1.add_paragraph('Here we set the Chinese font: ').add_run('Current font is SimHei')
run.font.name = 'SimHei'
run._element.rPr.rFonts.set(qn('w:eastAsia'), 'SimHei')
# Italic, bold, underline, color
run = doc1.add_paragraph('This segment: ').add_run('Italic text')
run.italic = True
run = doc1.add_paragraph('This segment: ').add_run('Bold text')
run.bold = True
run = doc1.add_paragraph('This segment: ').add_run('Underlined text')
run.underline = True
run = doc1.add_paragraph('This segment: ').add_run('Red text')
run.font.color.rgb = RGBColor(0xFF, 0x00, 0x00)
# Add a block quote
doc1.add_paragraph('Use Python to change life, change the world, FIGHTING.', style='Intense Quote')
doc1.save('word2.docx')
</code>Lists
Unordered and ordered lists can be created as follows:
<code>from docx import Document
doc2 = Document()
doc2.add_paragraph('Which of the following is not an animal:')
# Unordered list
for item in ['Apple', 'Xiyangyang', 'Lanyangyang', 'Feiyangyang', 'Grey Wolf']:
doc2.add_paragraph(item, style='List Bullet')
# Ordered list
for item in ['Reach CSDN blog expert', 'Exercise three days a week', 'Learn a new topic daily', 'Read 50 books', 'Reduce overtime']:
doc2.add_paragraph(item, style='List Number')
doc2.save('word1.docx')
</code>Images and Tables
Insert images and tables into a Word document:
<code>from docx import Document
from docx.shared import Inches
doc2 = Document()
doc2.add_heading('Images', level=2)
# Add picture (adjust path as needed)
doc2.add_picture('C:/Users/Y/Pictures/python-logo.png', width=Inches(5.5))
# Add table
doc2.add_heading('Table', level=2)
table = doc2.add_table(rows=1, cols=4)
hdr_cells = table.rows[0].cells
hdr_cells[0].text = 'ID'
hdr_cells[1].text = 'Name'
hdr_cells[2].text = 'Occupation'
records = (
(1, '张三', '电工'),
(2, '张五', '老板'),
(3, '马六', 'IT'),
(4, '李四', '工程师')
)
for id, name, work in records:
row_cells = table.add_row().cells
row_cells[0].text = str(id)
row_cells[1].text = name
row_cells[2].text = work
# Manual page break
doc2.add_page_break()
doc2.save('word1.docx')
</code>Reading Word Files
The following code reads paragraphs and tables from existing Word documents:
<code>from docx import Document
# Open first document
doc1 = Document('word1.docx')
paragraphs = [p.text for p in doc1.paragraphs]
print('###### Content of word1.docx ######')
for p in paragraphs:
print(p)
# Open second document
doc2 = Document('word2.docx')
print('\n###### Content of word2.docx ######')
paragraphs2 = [p.text for p in doc2.paragraphs]
for p in paragraphs2:
print(p)
# Read tables
for table in doc2.tables:
for row in table.rows:
for cell in row.cells:
print(cell.text, end=' ')
print()
print('\n')
</code>Python Operating CSV
Introduction
CSV (Comma‑Separated Values) stores tabular data as plain text. It is widely used for data exchange between programs because of its simplicity and low memory consumption.
CSV vs Excel
CSV
Excel
File extension .csv
File extensions .xls or .xlsx
Plain‑text file
Binary file
Stores data only, no formatting or formulas
Can store data, formatting, formulas, etc.
Can be opened with Excel or any text editor
Can only be opened with Excel‑compatible tools
Header row written once
Each column has explicit start and end markers
Consumes less memory when importing
Consumes more memory
Basic Usage
The csv module provides reader , writer , DictReader , and DictWriter classes for reading and writing CSV files.
writer(csvfile, dialect='excel', **fmtparams)
Creates a writer object that converts user data into delimited strings.
<code>import csv
with open('test.csv', 'w', newline='') as csvfile:
writer = csv.writer(csvfile)
writer.writerow(['id', 'name', 'age'])
data = [
('1001', '张三', '21'),
('1002', '李四', '31')
]
writer.writerows(data)
</code>reader(csvfile, dialect='excel', **fmtparams)
Creates a reader object that iterates over rows in the CSV file.
<code>import csv
with open('test.csv', newline='') as csvfile:
reader = csv.reader(csvfile, delimiter=' ')
for row in reader:
print(', '.join(row))
</code>Sniffer Class
Detects CSV dialect and whether a file has a header:
<code>import csv
with open('test.csv', newline='') as csvfile:
dialect = csv.Sniffer().sniff(csvfile.read(1024))
csvfile.seek(0)
reader = csv.reader(csvfile, dialect)
for row in reader:
print(row)
</code>Reader Object
Provides next() , dialect , line_num , and for DictReader also fieldnames .
Writer Object
Provides writerow(row) , writerows(rows) , writeheader() , and dialect .
Read/Write/Append Modes
<code>'r' : read
'w' : write (truncates existing file)
'a' : append
'r+' : read/write (file must exist)
'w+' : read/write (creates file if missing)
'a+' : append/read (creates file if missing)
Binary modes add a 'b' suffix, e.g., 'rb', 'wb', 'ab', etc.
</code>Python‑excel official site: www.python-excel.org/
Promotional QR Code: Scan the QR code below to receive free Python public‑course materials, including e‑books, tutorials, project templates, source code, and more.
Scan the QR code to get the resources for free
Click Read Original for more details.
Python Programming Learning Circle
A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.