Fundamentals 31 min read

Using Python to Process Excel, Word, and CSV Files

This tutorial demonstrates how to install and use Python libraries such as xlrd, xlwt, xlutils, XlsxWriter, OpenPyXL, python-docx, and the csv module to read, write, modify, and format Excel, Word, and CSV files, including code examples and feature comparisons.

Python Programming Learning Circle

Apr 16, 2024

Using Python to Process Excel, Word, and CSV Files

Follow and star the account to receive daily Python skill updates

Common Tools

Data processing is a major application scenario for Python, and Excel is the most popular data‑processing software. When working with Excel in Python, many ready‑made libraries are available, such as xlrd & xlwt & xlutils, XlsxWriter, OpenPyXL, and the Microsoft Excel COM interface. The table below compares their features:

Type

xlrd&xlwt&xlutils

XlsxWriter

OpenPyXL

Excel Open Interface

Read

Supported

Not supported

Supported

Write

Supported

Modify

Supported

Not supported

Supported

xls

Supported

Not supported

Supported

xlsx

High version

Supported

Large file

Not supported

Supported

Not supported

Efficiency

Fast

Very slow

Functionality

Weak

Powerful

Average

Very powerful

Based on different needs, you can choose the appropriate tool. Below we mainly introduce the most commonly used xlrd & xlwt & xlutils series.

xlrd & xlwt & xlutils Introduction

The xlrd & xlwt & xlutils suite consists of three libraries: xlrd: used for reading Excel files; xlwt: used for writing Excel files; xlutils: provides utility functions such as copy, split, and filter for Excel files.

Installation

Installation is straightforward using pip:

$ pip install xlrd xlwt xlutils

Writing Excel

Below is a simple example that creates an Excel file with two sheets and writes sample data:

# Import xlwt
import xlwt

# Create workbook
wb = xlwt.Workbook()

# Add two sheets
sh1 = wb.add_sheet('成绩')
sh2 = wb.add_sheet('汇总')

# Write header and data (row, column)
sh1.write(0, 0, '姓名')
sh1.write(0, 1, '专业')
sh1.write(0, 2, '科目')
sh1.write(0, 3, '成绩')

sh1.write(1, 0, '张三')
sh1.write(1, 1, '信息与通信工程')
sh1.write(1, 2, '数值分析')
sh1.write(1, 3, 88)

sh1.write(2, 0, '李四')
sh1.write(2, 1, '物联网工程')
sh1.write(2, 2, '数字信号处理分析')
sh1.write(2, 3, 95)

sh1.write(3, 0, '王华')
sh1.write(3, 1, '电子与通信工程')
sh1.write(3, 2, '模糊数学')
sh1.write(3, 3, 90)

# Write summary sheet
sh2.write(0, 0, '总分')
sh2.write(1, 0, 273)

# Save file
wb.save('test.xls')

Running the code generates test.xls. Opening the file shows the data as illustrated in the screenshots.

Reading Excel

Reading is equally simple:

# Import xlrd
import xlrd

# Open the previously created file
wb = xlrd.open_workbook('test_w.xls')

# Print number of sheets
print('Sheet count:', wb.nsheets)

# Print sheet names
print('Sheet names:', wb.sheet_names())

# Access the first sheet by index
sh1 = wb.sheet_by_index(0)

# Print rows and columns
print('Sheet %s has %d rows and %d columns' % (sh1.name, sh1.nrows, sh1.ncols))

# Print a specific cell value
print('Value of first row, second column:', sh1.cell_value(0, 1))

# Get entire row and column values
rows = sh1.row_values(0)
cols = sh1.col_values(1)
print('First row values:', rows)
print('Second column values:', cols)

# Print cell type (0‑empty, 1‑string, 2‑number, 3‑date, 4‑boolean, 5‑error)
print('Cell type of row 2, column 1:', sh1.cell(1, 0).ctype)

# Iterate over all sheets and rows
for sh in wb.sheets():
    for r in range(sh.nrows):
        print(sh.row(r))

The output includes a table of cell‑type numeric codes. The mapping is shown below:

Value

Type

Description

empty

Empty cell

string

String value

number

Numeric value

date

Date value

boolean

Boolean value

error

Error cell

Modifying Excel

To modify an existing file, use xlutils.copy to create a writable copy, then apply changes:

# Import modules
import xlrd
from xlutils.copy import copy

# Open original file
readbook = xlrd.open_workbook('test_w.xls')

# Make a copy
wb = copy(readbook)

# Get first sheet and write new data in row 5
sh1 = wb.get_sheet(0)
sh1.write(4, 0, '王欢')
sh1.write(4, 1, '通信工程')
sh1.write(4, 2, '机器学习')
sh1.write(4, 3, 89)

# Get second sheet and replace total score
sh2 = wb.get_sheet(1)
sh2.write(1, 0, 362)

# Save modified file
wb.save('test.xls')

Format Conversion Operations

The following example shows how to apply styles such as font color, number formatting, date formatting, cell alignment, merging, and formulas:

# Import xlwt
import xlwt

# Define styles
styleBR = xlwt.easyxf('font: name Times New Roman, color-index red, bold on')
styleNum = xlwt.easyxf(num_format_str='#,##0.00')
styleDate = xlwt.easyxf(num_format_str='YYYY-MM-DD')

# Create workbook and sheets
wb = xlwt.Workbook()
sh1 = wb.add_sheet('成绩')
sh2 = wb.add_sheet('汇总')

# Write header with red bold style
sh1.write(0, 0, '姓名', styleBR)
sh1.write(0, 1, '日期', styleBR)
sh1.write(0, 2, '成绩', styleBR)

# Insert data
sh1.write(1, 0, '张三')
sh1.write(1, 1, '2020-07-01', styleDate)
sh1.write(1, 2, 90, styleNum)
sh1.write(2, 0, '李四')
sh1.write(2, 1, '2020-08-02')
sh1.write(2, 2, 95, styleNum)

# Center alignment
alignment = xlwt.Alignment()
alignment.horz = xlwt.Alignment.HORZ_CENTER
style = xlwt.XFStyle()
style.alignment = alignment

# Merge cells and write total label
sh1.write_merge(3, 3, 0, 1, '总分', style)

# Write formula for sum
sh1.write(3, 2, xlwt.Formula('C2+C3'))

# Write summary sheet
sh2.write(0, 0, '总分', styleBR)
sh2.write(1, 0, 185)

# Save file
wb.save('test.xls')

Python Operating Word

Installing python-docx

To manipulate Word documents, install the python-docx library (version 0.8.10 at the time of writing):

$ pip install python-docx

Writing Word

Creating a document, adding a title, headings, paragraphs, and saving:

# Import libraries
from docx import Document
from docx.shared import Pt, Inches
from docx.oxml.ns import qn

# Create a new document
doc1 = Document()

# Add a title (level 0)
doc1.add_heading('How to Create and Operate Word with Python', 0)

# Save the document
doc1.save('word1.docx')

The resulting file displays the title as shown in the screenshot.

Headings and Paragraphs

Below is an example that adds headings of different levels, paragraphs, and saves the file:

from docx import Document

doc1 = Document()

doc1.add_heading('How to Create and Operate Word with Python', 0)

doc1.add_paragraph('Word documents are widely used in daily life and work. We usually edit them with WPS or Office, but they can also be generated programmatically with Python.')

doc1.add_heading('Install python-docx', 1)

doc1.add_paragraph('The installation involves two steps:')

doc1.add_heading('Step 1: Install Python', 2)

doc1.add_paragraph('Download the installer from the official Python website and install it.')

doc1.add_heading('Step 2: Install python-docx', 3)

doc1.add_paragraph('Open CMD (Win+R) and run "pip install python-docx".')

doc1.save('word2.docx')

Font and Citation

The following code demonstrates how to set font size, style, color, and add a block quote:

from docx import Document
from docx.shared import Pt, RGBColor
from docx.oxml.ns import qn

doc1 = Document()

doc1.add_heading('How to Use Python to Create and Operate Word', 0)

doc1.add_paragraph('Word documents are widely used...')

doc1.add_heading('Install python-docx', 1)

doc1.add_paragraph('Installation steps...')

# Add a paragraph with custom font size
paragraph = doc1.add_paragraph('This is the installation description!')
run = paragraph.add_run('(Note: font size set to 20)')
run.font.size = Pt(20)

# Set English font
run = doc1.add_paragraph('Here we set the English font: ').add_run('This Font is Times New Roman')
run.font.name = 'Times New Roman'

# Set Chinese font
run = doc1.add_paragraph('Here we set the Chinese font: ').add_run('Current font is SimHei')
run.font.name = 'SimHei'
run._element.rPr.rFonts.set(qn('w:eastAsia'), 'SimHei')

# Italic, bold, underline, color
run = doc1.add_paragraph('This segment: ').add_run('Italic text')
run.italic = True
run = doc1.add_paragraph('This segment: ').add_run('Bold text')
run.bold = True
run = doc1.add_paragraph('This segment: ').add_run('Underlined text')
run.underline = True
run = doc1.add_paragraph('This segment: ').add_run('Red text')
run.font.color.rgb = RGBColor(0xFF, 0x00, 0x00)

# Add a block quote
doc1.add_paragraph('Use Python to change life, change the world, FIGHTING.', style='Intense Quote')

doc1.save('word2.docx')

Lists

Unordered and ordered lists can be created as follows:

from docx import Document

doc2 = Document()

doc2.add_paragraph('Which of the following is not an animal:')
# Unordered list
for item in ['Apple', 'Xiyangyang', 'Lanyangyang', 'Feiyangyang', 'Grey Wolf']:
    doc2.add_paragraph(item, style='List Bullet')

# Ordered list
for item in ['Reach CSDN blog expert', 'Exercise three days a week', 'Learn a new topic daily', 'Read 50 books', 'Reduce overtime']:
    doc2.add_paragraph(item, style='List Number')

doc2.save('word1.docx')

Images and Tables

Insert images and tables into a Word document:

from docx import Document
from docx.shared import Inches

doc2 = Document()

doc2.add_heading('Images', level=2)
# Add picture (adjust path as needed)
doc2.add_picture('C:/Users/Y/Pictures/python-logo.png', width=Inches(5.5))

# Add table
doc2.add_heading('Table', level=2)
table = doc2.add_table(rows=1, cols=4)
hdr_cells = table.rows[0].cells
hdr_cells[0].text = 'ID'
hdr_cells[1].text = 'Name'
hdr_cells[2].text = 'Occupation'

records = (
    (1, '张三', '电工'),
    (2, '张五', '老板'),
    (3, '马六', 'IT'),
    (4, '李四', '工程师')
)
for id, name, work in records:
    row_cells = table.add_row().cells
    row_cells[0].text = str(id)
    row_cells[1].text = name
    row_cells[2].text = work

# Manual page break
doc2.add_page_break()

doc2.save('word1.docx')

Reading Word Files

The following code reads paragraphs and tables from existing Word documents:

from docx import Document

# Open first document
doc1 = Document('word1.docx')
paragraphs = [p.text for p in doc1.paragraphs]
print('###### Content of word1.docx ######')
for p in paragraphs:
    print(p)

# Open second document
doc2 = Document('word2.docx')
print('
###### Content of word2.docx ######')
paragraphs2 = [p.text for p in doc2.paragraphs]
for p in paragraphs2:
    print(p)

# Read tables
for table in doc2.tables:
    for row in table.rows:
        for cell in row.cells:
            print(cell.text, end='  ')
        print()
    print('
')

Python Operating CSV

Introduction

CSV (Comma‑Separated Values) stores tabular data as plain text. It is widely used for data exchange between programs because of its simplicity and low memory consumption.

CSV vs Excel

CSV

Excel

File extension .csv

File extensions .xls or .xlsx

Plain‑text file

Binary file

Stores data only, no formatting or formulas

Can store data, formatting, formulas, etc.

Can be opened with Excel or any text editor

Can only be opened with Excel‑compatible tools

Header row written once

Each column has explicit start and end markers

Consumes less memory when importing

Consumes more memory

Basic Usage

The csv module provides reader, writer, DictReader, and DictWriter classes for reading and writing CSV files.

writer(csvfile, dialect='excel', **fmtparams)

Creates a writer object that converts user data into delimited strings.

import csv

with open('test.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(['id', 'name', 'age'])
    data = [
        ('1001', '张三', '21'),
        ('1002', '李四', '31')
    ]
    writer.writerows(data)

reader(csvfile, dialect='excel', **fmtparams)

Creates a reader object that iterates over rows in the CSV file.

import csv

with open('test.csv', newline='') as csvfile:
    reader = csv.reader(csvfile, delimiter=' ')
    for row in reader:
        print(', '.join(row))

Sniffer Class

Detects CSV dialect and whether a file has a header:

import csv

with open('test.csv', newline='') as csvfile:
    dialect = csv.Sniffer().sniff(csvfile.read(1024))
    csvfile.seek(0)
    reader = csv.reader(csvfile, dialect)
    for row in reader:
        print(row)

Reader Object

Provides next(), dialect, line_num, and for DictReader also fieldnames.

Writer Object

Provides writerow(row), writerows(rows), writeheader(), and dialect.

Read/Write/Append Modes

'r'   : read
'w'   : write (truncates existing file)
'a'   : append
'r+'  : read/write (file must exist)
'w+'  : read/write (creates file if missing)
'a+'  : append/read (creates file if missing)

Binary modes add a 'b' suffix, e.g., 'rb', 'wb', 'ab', etc.

Python‑excel official site: www.python-excel.org/

Promotional QR Code: Scan the QR code below to receive free Python public‑course materials, including e‑books, tutorials, project templates, source code, and more.

Scan the QR code to get the resources for free

Click Read Original for more details.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Python CSV Excel Word python-docx DataProcessing xlrd

Written by

Python Programming Learning Circle

A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.