Fundamentals 28 min read

Python Tutorial: Working with Excel, Word, and CSV Files

This article provides a comprehensive Python tutorial covering how to read, write, and modify Excel files using xlrd/xlwt/xlutils, manipulate Word documents with python-docx, and handle CSV data via the csv module, including code examples, library installation, and practical usage tips.

Python Programming Learning Circle

Oct 26, 2024

Python Tutorial: Working with Excel, Word, and CSV Files

Common Tools for Excel

Data processing is a major use case for Python, and Excel is the most popular data‑processing software. Python offers many ready‑made libraries for handling Excel, such as xlrd & xlwt & xlutils, XlsxWriter, and OpenPyXL. On Windows you can also call Microsoft Excel’s open interface directly. The table below compares the features of these tools:

Type

xlrd&xlwt&xlutils

XlsxWriter

OpenPyXL

Excel Open Interface

Read

Supported

Not supported

Supported

Write

Supported

Modify

Supported

Not supported

Supported

xls

Supported

Not supported

Supported

xlsx

High version

Supported

Large file

Not supported

Supported

Not supported

Efficiency

Fast

Very slow

Functionality

Weak

Powerful

Average

Very powerful

Below we mainly introduce the most commonly used xlrd & xlwt & xlutils series.

xlrd & xlwt & xlutils Introduction

The three libraries consist of: xlrd: read Excel files; xlwt: write Excel files; xlutils: utilities for copying, splitting, filtering, etc.

Installation

Install the three libraries with pip:

$ pip install xlrd xlwt xlutils

Writing Excel

Example code for writing an Excel file:

# Import xlwt library
import xlwt

# Create an xls workbook object
wb = xlwt.Workbook()

# Add two sheets
sh1 = wb.add_sheet('成绩')
sh2 = wb.add_sheet('汇总')

# Write header and data (row, column)
sh1.write(0, 0, '姓名')
sh1.write(0, 1, '专业')
sh1.write(0, 2, '科目')
sh1.write(0, 3, '成绩')

sh1.write(1, 0, '张三')
sh1.write(1, 1, '信息与通信工程')
sh1.write(1, 2, '数值分析')
sh1.write(1, 3, 88)

sh1.write(2, 0, '李四')
sh1.write(2, 1, '物联网工程')
sh1.write(2, 2, '数字信号处理分析')
sh1.write(2, 3, 95)

sh1.write(3, 0, '王华')
sh1.write(3, 1, '电子与通信工程')
sh1.write(3, 2, '模糊数学')
sh1.write(3, 3, 90)

# Write second sheet
sh2.write(0, 0, '总分')
sh2.write(1, 0, 273)

# Save file
wb.save('test.xls')

Running the code generates test.xls which can be opened to view the data.

Reading Excel

Example code for reading the previously created file:

# Import xlrd library
import xlrd

# Open the workbook
wb = xlrd.open_workbook('test_w.xls')

print('sheet count:', wb.nsheets)
print('sheet names:', wb.sheet_names())

sh1 = wb.sheet_by_index(0)
print('sheet %s has %d rows %d cols' % (sh1.name, sh1.nrows, sh1.ncols))
print('Value of first row second column:', sh1.cell_value(0, 1))

rows = sh1.row_values(0)
cols = sh1.col_values(1)
print('First row values:', rows)
print('Second column values:', cols)
print('Cell type of second row first column:', sh1.cell(1, 0).ctype)

for sh in wb.sheets():
    for r in range(sh.nrows):
        print(sh.row(r))

The output includes cell types where 1 represents a string, as shown in the following table:

Value

Type

Description

empty

Empty

string

String

number

Number

date

Date

boolean

Boolean

error

Error

Modifying Excel

Modifications are performed via xlutils ’s copy method:

# Import modules
import xlrd
from xlutils.copy import copy

# Open the original file
readbook = xlrd.open_workbook('test_w.xls')

# Copy the workbook
wb = copy(readbook)

# Get first sheet and write new data in row 5
sh1 = wb.get_sheet(0)
sh1.write(4, 0, '王欢')
sh1.write(4, 1, '通信工程')
sh1.write(4, 2, '机器学习')
sh1.write(4, 3, 89)

# Modify second sheet
sh1 = wb.get_sheet(1)
sh1.write(1, 0, 362)

# Save
wb.save('test.xls')

The resulting file shows the added row and updated total.

Formatting Operations

Using xlwt you can set fonts, colors, alignment, merging, and formulas:

# Import xlwt
import xlwt

# Define styles
styleBR = xlwt.easyxf('font: name Times New Roman, color-index red, bold on')
styleNum = xlwt.easyxf(num_format_str='#,##0.00')
styleDate = xlwt.easyxf(num_format_str='YYYY-MM-DD')

wb = xlwt.Workbook()
sh1 = wb.add_sheet('成绩')
sh2 = wb.add_sheet('汇总')

# Write header with style
sh1.write(0, 0, '姓名', styleBR)
sh1.write(0, 1, '日期', styleBR)
sh1.write(0, 2, '成绩', styleBR)

# Insert data with formatting
sh1.write(1, 0, '张三')
sh1.write(1, 1, '2020-07-01', styleDate)
sh1.write(1, 2, 90, styleNum)

# Center alignment and merge cells
alignment = xlwt.Alignment()
alignment.horz = xlwt.Alignment.HORZ_CENTER
style = xlwt.XFStyle()
style.alignment = alignment
sh1.write_merge(3, 3, 0, 1, '总分', style)
sh1.write(3, 2, xlwt.Formula('C2+C3'))

# Write second sheet
sh2.write(0, 0, '总分', styleBR)
sh2.write(1, 0, 185)

wb.save('test.xls')

Python Manipulating Word Documents

Installing python-docx

$ pip install python-docx
# Output shows successful installation of python-docx 0.8.10

Creating a Word Document

Use Document() and add_heading to create titles, sections, paragraphs, and more.

# Import libraries
from docx import Document
from docx.shared import Pt, Inches
from docx.oxml.ns import qn

# Create a new document
doc1 = Document()

doc1.add_heading('如何使用 Python 创建和操作 Word', 0)

doc1.add_paragraph('Word 文档在我们现在的生活和工作中都用的比较多...')

doc1.add_heading('安装 python-docx 库', 1)

doc1.add_paragraph('现在开始我们来介绍如何安装 python-docx 库，具体需要以下两步操作：')

doc1.add_heading('第一步：安装 Python', 2)

doc1.add_paragraph('在python官网下载python安装包进行安装。')

doc1.add_heading('第二步：安装 python-docx 库', 3)

doc1.add_paragraph('window下win+R输入CMD打开命令行，输入pip install python-docx即可下载。')

doc1.save('word1.docx')

Formatting Text and Adding Quotes

# Continue from previous document
paragraph = doc1.add_paragraph('这是第二步的安装描述！')
run = paragraph.add_run('(注意：这里设置了字号为20)')
run.font.size = Pt(20)

run = doc1.add_paragraph('这里设置英文字体：').add_run('This Font is Times New Roman ')
run.font.name = 'Times New Roman'

run = doc1.add_paragraph('这里设置中文字体：').add_run('当前字体为黑体')
run.font.name = '黑体'
run._element.rPr.rFonts.set(qn('w:eastAsia'), '黑体')

run = doc1.add_paragraph('这段设置：').add_run('文字的是斜体 ')
run.italic = True

run = doc1.add_paragraph('这段再设置：').add_run('这里设置粗体')
run.bold = True

run = doc1.add_paragraph('这段为下划线：').add_run('这里设置带下划线')
run.underline = True

run = doc1.add_paragraph('这段字体为红色：').add_run('这里设置字体为红色')
run.font.color.rgb = RGBColor(0xFF, 0x00, 0x00)

doc1.add_paragraph('这里是我们引用的一段话：用Python改变人生，改变世界，FIGHTING。', style='Intense Quote')

doc1.save('word2.docx')

Lists, Images, and Tables

# Unordered and ordered lists
from docx import Document

doc2 = Document()

doc2.add_paragraph('哪个不是动物：')
for item in ['苹果', '喜洋洋', '懒洋洋', '沸洋洋', '灰太狼']:
    doc2.add_paragraph(item, style='List Bullet')

doc2.add_paragraph('2020年度计划：')
for item in ['CSDN达到博客专家', '每周健身三天', '每天学习一个新知识点', '学习50本书', '减少加班时间']:
    doc2.add_paragraph(item, style='List Number')

# Add image
from docx.shared import Inches

doc2.add_heading('图片', level=2)

doc2.add_picture('C:/Users/Y/Pictures/python-logo.png', width=Inches(5.5))

# Add table
doc2.add_heading('表格', level=2)

table = doc2.add_table(rows=1, cols=4)
hdr_cells = table.rows[0].cells
hdr_cells[0].text = '编号'
hdr_cells[1].text = '姓名'
hdr_cells[2].text = '职业'

records = ((1, '张三', '电工'), (2, '张五', '老板'), (3, '马六', 'IT'), (4, '李四', '工程师'))
for id, name, work in records:
    row_cells = table.add_row().cells
    row_cells[0].text = str(id)
    row_cells[1].text = name
    row_cells[2].text = work

doc2.add_page_break()

doc2.save('word1.docx')

Reading Word Files

# Read existing documents
from docx import Document

doc1 = Document('word1.docx')
paragraphs = [p.text for p in doc1.paragraphs]
print('###### 输出word1文章的内容 ######')
for p in paragraphs:
    print(p)

doc2 = Document('word2.docx')
print('
###### 输出word2文章内容 ######')
for p in [p.text for p in doc2.paragraphs]:
    print(p)

# Print tables
for table in doc2.tables:
    for row in table.rows:
        for cell in row.cells:
            print(cell.text, end='  ')
        print()
    print('
')

Python Working with CSV Files

Introduction

CSV (Comma‑Separated Values) stores tabular data as plain text. It is widely used for data exchange between programs because of its simplicity and low memory consumption compared with binary Excel files.

CSV vs Excel

CSV

Excel

File extension .csv

File extension .xls or .xlsx

Plain‑text file

Binary file

Stores data only, no formatting or formulas

Can store data, formatting, formulas, etc.

Can be opened with Excel or any text editor

Can only be opened with Excel‑type tools

Only one header row

Each cell has start and end markers

Lower memory usage when importing

Higher memory usage

Basic Usage

The csv module provides reader, writer, DictReader, and DictWriter for reading and writing CSV data.

Writer Example

import csv

with open('test.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(['id', 'name', 'age'])
    data = [('1001', '张三', '21'), ('1002', '李四', '31')]
    writer.writerows(data)

Reader Example

import csv

with open('test.csv', newline='') as csvfile:
    reader = csv.reader(csvfile, delimiter=' ')
    for row in reader:
        print(', '.join(row))

Sniffer Class

Detects CSV dialects and whether a file has a header row.

import csv

with open('test.csv', newline='') as csvfile:
    dialect = csv.Sniffer().sniff(csvfile.read(1024))
    csvfile.seek(0)
    reader = csv.reader(csvfile, dialect)
    for row in reader:
        print(row)

Reader Object Methods

next()

: returns the next row. dialect: read‑only description of the CSV format. line_num: number of lines read so far. fieldnames: column names for DictReader.

Writer Object Methods

writerow(row)

: writes a single row. writerows(rows): writes multiple rows. writeheader(): writes a header row (for DictWriter). dialect: read‑only description used by the writer.

Read/Write Modes

Typical file modes: 'r' read, 'w' write, 'a' append, 'r+' read/write, 'w+' write/read, 'a+' append/read. Add a b for binary files.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

FileIO DataProcessing

Written by

Python Programming Learning Circle

A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.