Python Word Automation and Data Reporting Tutorial with python-docx, win32com, matplotlib, and xlrd
This tutorial walks through setting up a Python environment, installing libraries, creating and editing Word documents with python-docx, converting files and generating PDFs using win32com, extracting Excel data with xlrd, visualizing scores with matplotlib, and automating batch document generation with docx-mailmerge.
It begins with setting up the Python environment and installing the required libraries: python-docx, xlrd, matplotlib, win32com (provided by the pywin32 package), and docx-mailmerge.
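As a quick way to confirm the setup, the following sketch reports which of the libraries are not yet importable. The helper name `missing_packages` and the `REQUIRED` mapping are illustrative, not part of the original tutorial; the mapping pairs each import name with the package name used at install time.

```python
from importlib.util import find_spec

# Import name -> package name to install (illustrative mapping).
REQUIRED = {
    "docx": "python-docx",
    "xlrd": "xlrd",
    "matplotlib": "matplotlib",
    "win32com": "pywin32",        # Windows only
    "mailmerge": "docx-mailmerge",
}


def missing_packages(required=REQUIRED):
    """Return the install names of all modules that cannot be found."""
    return [
        pip_name
        for module, pip_name in required.items()
        if find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Install with: pip install " + " ".join(missing))
    else:
        print("All required libraries are available.")
```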
It shows how to create new Word documents and add headings, paragraphs, tables, images, and page breaks using python-docx, with example code.
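To illustrate the elements listed above, here is a minimal sketch that adds a heading, a paragraph with a bold run, a table, and a page break. The function name `build_sample_document`, the sample text, and the "Table Grid" style choice are assumptions made for this example. python-docx is imported inside the function so the definition can be loaded even where the library is not installed.

```python
def build_sample_document(path, rows):
    """Create a small .docx from (name, score) rows and save it to path."""
    from docx import Document  # python-docx

    document = Document()
    document.add_heading("Sample Report", level=1)

    # A paragraph is made of runs; formatting is applied per run.
    paragraph = document.add_paragraph("This paragraph was added with ")
    paragraph.add_run("python-docx").bold = True

    # One header row, then one row per record.
    table = document.add_table(rows=1, cols=2)
    table.style = "Table Grid"
    header = table.rows[0].cells
    header[0].text = "Name"
    header[1].text = "Score"
    for name, score in rows:
        cells = table.add_row().cells
        cells[0].text = name
        cells[1].text = str(score)

    document.add_page_break()
    document.add_paragraph("Content after the page break.")
    document.save(path)
    return document
```

An image can be added in the same way with `document.add_picture(path, width=...)`, given an image file on disk.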
from docx import Document
document = Document()
document.save('new.docx')

It explains how to open and modify existing Word files, insert text at specific ranges, and save changes.
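One way such an edit might look, as a sketch: the hypothetical helper `insert_before_match` opens a file, inserts a new paragraph before every paragraph containing a marker string, and saves the result under a new name. Locating the position by marker text is an assumption made for this example; the original may select positions differently.

```python
def insert_before_match(src_path, dst_path, marker, new_text):
    """Insert new_text as a paragraph before each paragraph containing marker.

    Returns the number of paragraphs inserted.
    """
    from docx import Document  # python-docx

    document = Document(src_path)
    inserted = 0
    # document.paragraphs returns a fresh list, so inserting while looping is safe.
    for paragraph in document.paragraphs:
        if marker in paragraph.text:
            paragraph.insert_paragraph_before(new_text)
            inserted += 1
    document.save(dst_path)
    return inserted
```

Saving to a different file name, as the tutorial's own snippet does, leaves the original document untouched.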
from docx import Document
document = Document('exist.docx')
document.save('new.docx')

Using win32com, the guide covers converting .doc to .docx, converting Word to PDF, and controlling Word application visibility.
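The Word-to-PDF conversion and the visibility switch could be sketched as follows. The function names `pdf_path_for` and `word_to_pdf` are hypothetical; the value 17 is `wdFormatPDF` in Word's `WdSaveFormat` enumeration. The sketch assumes Windows with Microsoft Word and pywin32 installed, so win32com is imported only when the conversion is actually run.

```python
import os

WD_FORMAT_PDF = 17  # wdFormatPDF in Word's WdSaveFormat enumeration


def pdf_path_for(doc_path):
    """Return the absolute path of the PDF that sits next to doc_path."""
    root, _ = os.path.splitext(os.path.abspath(doc_path))
    return root + ".pdf"


def word_to_pdf(doc_path, visible=False):
    """Convert a Word file to PDF through the Word application (Windows only)."""
    from win32com.client import Dispatch  # pywin32

    pdf_path = pdf_path_for(doc_path)
    word = Dispatch("Word.Application")
    word.Visible = visible  # False keeps the Word window hidden
    try:
        # Word expects absolute paths.
        doc = word.Documents.Open(os.path.abspath(doc_path))
        try:
            doc.SaveAs(pdf_path, WD_FORMAT_PDF)
        finally:
            doc.Close()
    finally:
        word.Quit()  # always release the Word process
    return pdf_path
```

The `try`/`finally` blocks are a design choice of this sketch: they close the document and quit Word even when the conversion fails, so no hidden Word process is left running.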
import os
from win32com.client import Dispatch

def TransDocToDocx(oldDocName, newDocxName):
    word = Dispatch('Word.Application')
    # Word resolves relative paths against its own working directory, so pass absolute paths.
    doc = word.Documents.Open(os.path.abspath(oldDocName))
    doc.SaveAs(os.path.abspath(newDocxName), 12)  # 12 = wdFormatXMLDocument (.docx)
    doc.Close()
    word.Quit()

Data extraction from Excel with xlrd is presented, followed by merging names and scores, sorting, and visualizing results with matplotlib bar charts.
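For the merging and sorting step, a minimal sketch, assuming the names and scores arrive as two parallel lists (as `nameList` and `scoreList` do in this tutorial) and that higher scores should come first. `merge_and_sort` is a hypothetical helper name, and the descending order is an assumption.

```python
def merge_and_sort(names, scores):
    """Pair each name with its score and sort the pairs by score, highest first."""
    if len(names) != len(scores):
        raise ValueError("names and scores must have the same length")
    pairs = list(zip(names, scores))
    # sorted() is stable, so students with equal scores keep their original order.
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)
```

The result is a list of `(name, score)` tuples, which is the shape that `GenerateScorePic` and `GenerateScoreReport` iterate over.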
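import xlrd

# Note: xlrd 2.0 and later read only .xls files; opening an .xlsx workbook requires xlrd < 2.0.
xlsx = xlrd.open_workbook('学生成绩表格.xlsx')  # student score spreadsheet
sheet = xlsx.sheet_by_index(0)
# Skip the header row (row 0); column 1 holds names and column 3 holds scores.
nameList = [str(sheet.cell_value(i, 1)) for i in range(1, sheet.nrows)]
scoreList = [int(sheet.cell_value(i, 3)) for i in range(1, sheet.nrows)]

import matplotlib.pyplot as plt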
def GenerateScorePic(scoreList):
    # scoreList is a list of (name, score) pairs.
    xNameList = [s[0] for s in scoreList]
    yScoreList = [s[1] for s in scoreList]
    plt.figure(figsize=(10, 5))
    plt.bar(x=xNameList, height=yScoreList, color='steelblue', alpha=0.8)
    plt.title('Student Score Bar Chart')
    plt.xlabel('Student Name')
    plt.ylabel('Student Score')
    plt.xticks(rotation=90)
    plt.tight_layout()
    plt.savefig('studentScore.jpg')
    plt.show()

The generated chart image is then inserted into a Word report, which includes a summary table of scores.
from docx import Document
from docx.shared import Inches

def GenerateScoreReport(scoreOrder, picPath):
    document = Document()
    document.add_heading('Data Analysis Report', 0)
    # Add a table with names and scores
    table = document.add_table(rows=1, cols=2)
    hdr_cells = table.rows[0].cells
    hdr_cells[0].text = 'Student Name'
    hdr_cells[1].text = 'Student Score'
    for name, score in scoreOrder:
        row_cells = table.add_row().cells
        row_cells[0].text = name
        row_cells[1].text = str(score)
    # Insert the chart image
    document.add_picture(picPath, width=Inches(6))
    document.save('学生成绩报告.docx')  # student score report

Finally, the article introduces docx-mailmerge for batch generation of personalized documents such as salary certificates, providing a function to produce thousands of files efficiently.
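A batch version could be sketched as below. The helper names `output_name` and `generate_certificates`, the output naming scheme, and the shape of `records` (one dict per person, keyed by the template's merge field names such as `name`, `id`, `year`, `salary`, and `job`) are assumptions made for this example, not the article's own function. docx-mailmerge is imported inside the function so the naming helper can be used without it.

```python
import os


def output_name(record, index):
    """Build a file name such as '0007_<name>.docx' for one record."""
    return f"{index:04d}_{record['name']}.docx"


def generate_certificates(template, records, out_dir):
    """Write one merged document per record and return the written paths."""
    from mailmerge import MailMerge  # docx-mailmerge

    os.makedirs(out_dir, exist_ok=True)
    written = []
    for index, record in enumerate(records, start=1):
        # merge() fills the fields in place, so load a fresh template each time.
        document = MailMerge(template)
        document.merge(**record)
        target = os.path.join(out_dir, output_name(record, index))
        document.write(target)
        written.append(target)
    return written
```

The numeric prefix keeps file names unique when two people share a name.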
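from mailmerge import MailMerge

template = '薪资证明模板.docx'  # salary certificate template
document = MailMerge(template)
# Keyword names must match the merge field names defined in the template.
document.merge(
    name='唐星',
    id='1010101010',
    year='2020',
    salary='99999',
    job='Embedded Software Development Engineer'
)
document.write('生成的1份证明.docx')  # the one generated certificate

Python Programming Learning Circle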