Automate Word with Python: Master win32com for Document Manipulation
This tutorial explains how to use Python's win32com library to control Microsoft Word, covering installation, creating and displaying documents, working with Selection, Range, Font, ParagraphFormat, PageSetup and Styles objects, and providing a complete example that formats a document to meet national standards.
1. Hello, world!
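Install the pypiwin32 package (e.g., pip install pypiwin32; the same library is also published on PyPI as pywin32) and use Python's IDLE for interactive testing. The examples require Windows with Microsoft Word installed.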
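Before opening IDLE, it can help to confirm that the package is visible to the interpreter you are about to use. The check below is a minimal sketch using only the standard library; win32com_available is a hypothetical helper name introduced here, not something the package provides.

```python
import importlib.util


def win32com_available() -> bool:
    """Return True if the win32com package can be found by this interpreter."""
    # find_spec returns None for a top-level package that is not installed
    return importlib.util.find_spec("win32com") is not None


print("win32com installed:", win32com_available())
```

If this prints False, the install probably went into a different Python environment than the one IDLE is running.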
2. Show Word
Create a Word application object and make it visible:
<code>from win32com.client import Dispatch
app = Dispatch('Word.Application')
app.Visible = 1  # show the Word window</code>
Then add a new document:
<code>doc = app.Documents.Add()</code>
3. Input Text
Obtain the current selection (the only Selection in the document) and write text:
<code>s = app.Selection
s.Text = 'Hello, world!'</code>
The selection now contains the string "Hello, world!".
4. Inspect Selection
The Selection object represents the insertion point (cursor) or the highlighted range. Text is its default property and can be read or set. Through win32com, calling s() returns the same value as s.Text, loosely comparable to the way __str__ gives a Python object its text form.
5. Word Object Model Overview
Key objects:
Application: the Word application, containing menus, toolbars, documents, etc.
Document: a Word file; multiple documents can be open simultaneously.
Selection: the current cursor or highlighted range; only one can be active at a time.
Range: a continuous region defined by start and end positions, independent of the selection.
Font: font properties of a range or selection.
ParagraphFormat: paragraph formatting such as alignment, indentation, line spacing.
PageSetup: page layout settings (margins, size, grid).
Styles: collection of document styles (e.g., Normal, Heading 1).
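The objects listed above hang off one another, so most work consists of walking from a Document down to the child object that holds the property you want. The function below is a minimal sketch of that navigation; describe_document is a hypothetical helper name, and doc is assumed to be a Document object such as the one returned by app.Documents.Add().

```python
def describe_document(doc):
    """Collect a few facts about a Word Document by walking its child objects."""
    whole = doc.Range()  # a Range covering the document's main text
    return {
        "name": doc.Name,                              # Document
        "paragraphs": doc.Paragraphs.Count,            # Document -> Paragraphs
        "font": whole.Font.Name,                       # Range -> Font
        "alignment": whole.ParagraphFormat.Alignment,  # Range -> ParagraphFormat
        "top_margin_pt": doc.PageSetup.TopMargin,      # Document -> PageSetup
        "style_count": doc.Styles.Count,               # Document -> Styles
    }
```

In an interactive session, print(describe_document(doc)) gives a quick overview of the document you just created or opened.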
6. Working with Selection
Common operations:
<code># Replace selected text
s.Text = 'Hello, world!'
# Insert text at the cursor
s.TypeText('Hello, world!')
# Copy, paste, delete
s.Copy()
s.Paste()
s.Delete()
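# Move the cursor
s.MoveLeft()       # one character to the left
s.MoveRight(1, 2)  # Unit=1 (wdCharacter), Count=2: two characters to the right
# Set selection range by character indices
s.Start = 0
s.End = n  # n: the end character position, an integer you choose</code>
7. Using Range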
Obtain a range from a document or selection:
<code>r = doc.Range()  # Document.Range is a method
# or
r = s.Range  # Selection.Range is a property, so no parentheses</code>
Ranges can be manipulated much like selections, but they exist independently of the visible cursor and do not move it.
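Because a Range is addressed by start and end positions, it can format part of a document while the cursor stays where it is. The following is a minimal sketch, assuming doc is a Document object obtained as in section 2; bold_span is a hypothetical helper name.

```python
def bold_span(doc, start, end):
    """Make the characters between two positions bold and return their text."""
    r = doc.Range(start, end)  # Range(Start, End), given as character positions
    r.Font.Bold = True         # formatting applies to this range only
    return r.Text
```

For example, after typing 'Hello, world!' into a new document, bold_span(doc, 0, 5) should embolden the first word without changing the current selection.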
8. Font and ParagraphFormat
Access and modify font and paragraph settings:
<code>font = s.Font
font.Name = '仿宋'  # FangSong
font.Size = 16      # in points
pf = s.ParagraphFormat
pf.Alignment = 0        # left (wdAlignParagraphLeft)
pf.LineSpacingRule = 0  # single (wdLineSpaceSingle)
pf.LeftIndent = 21      # in points
pf.RightIndent = 21     # in points</code>
9. PageSetup
Configure page margins and layout (values are in points; 1 cm ≈ 28.35 pt):
<code>cm_to_points = 28.35
ps = doc.PageSetup
ps.TopMargin = 3.3 * cm_to_points
ps.BottomMargin = 3.3 * cm_to_points
ps.LeftMargin = 2.8 * cm_to_points
ps.RightMargin = 2.6 * cm_to_points
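ps.LayoutMode = 1  # wdLayoutModeGrid: fixed lines per page and characters per line
ps.CharsLine = 28  # characters per line
ps.LinesPage = 22  # lines per page
ps.FooterDistance = 2.8 * cm_to_points
ps.OddAndEvenPagesHeaderFooter = 0  # same header/footer on odd and even pages</code>
10. Styles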
Modify the built‑in "Normal" style to use the required font and size:
<code>normal = doc.Styles(-1)  # -1 is wdStyleNormal
normal.Font.Name = '仿宋'  # FangSong
normal.Font.Size = 16</code>
11. Problem‑Solving Approach
When it is not obvious how to reach a feature, record a macro in Word and read the VBA it generates, consult the .NET API documentation for the Word object model, or open the Object Browser (F2 in the VBA editor) to find which objects expose the properties you need.
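Recorded macros refer to enumeration members by name (wdAlignParagraphLeft and so on), while the late-bound Dispatch calls in this tutorial need the underlying numbers. The table below is a small sketch that maps the names behind the numeric literals used in this tutorial to their values; WORD_CONSTANTS and wd are hypothetical names introduced for illustration.

```python
# Numeric values of the Word enumeration members used in this tutorial.
WORD_CONSTANTS = {
    "wdCharacter": 1,               # Unit argument of MoveLeft / MoveRight
    "wdAlignParagraphLeft": 0,      # ParagraphFormat.Alignment
    "wdLineSpaceSingle": 0,         # ParagraphFormat.LineSpacingRule
    "wdLayoutModeGrid": 1,          # PageSetup.LayoutMode
    "wdStyleNormal": -1,            # doc.Styles(-1)
    "wdSeekPrimaryFooter": 4,       # View.SeekView
    "wdPageNumberStyleArabic": 0,   # PageNumbers.NumberStyle
    "wdAlignPageNumberOutside": 4,  # PageNumbers.Add(4)
    "wdBorderBottom": -3,           # Borders(-3)
    "wdLineStyleNone": 0,           # Border.LineStyle
}


def wd(name):
    """Look up a Word constant by the name seen in recorded VBA."""
    try:
        return WORD_CONSTANTS[name]
    except KeyError:
        raise KeyError(
            f"{name} is not in the table; look it up in the Word VBA reference"
        ) from None
```

With this in place, pf.Alignment = wd('wdAlignParagraphLeft') reads closer to the recorded macro than pf.Alignment = 0. pywin32 can also supply these names through win32com.client.constants once the Word type library has been loaded with win32com.client.gencache.EnsureDispatch('Word.Application'); the table simply avoids that extra step.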
12. Full Example: Formatting a Document to the National Standard
The script below sets page margins, font, line/character grid, and custom page numbers according to the official standard:
<code>from win32com.client import Dispatch
app = Dispatch('Word.Application')
app.Visible = True
doc = app.Documents.Open('path/to/your.docx')
# Page margins (cm → points)
cm_to_points = 28.35
ps = doc.PageSetup
ps.TopMargin = ps.BottomMargin = 3.3 * cm_to_points
ps.LeftMargin = 2.8 * cm_to_points
ps.RightMargin = 2.6 * cm_to_points
# Normal style font
normal = doc.Styles(-1)  # -1 is wdStyleNormal
for attr in ['Name', 'NameFarEast', 'NameAscii', 'NameOther']:
    setattr(normal.Font, attr, '仿宋')  # FangSong for every script category
normal.Font.Size = 16
# Grid settings
ps.LayoutMode = 1
ps.CharsLine = 28
ps.LinesPage = 22
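# Page numbers in footer
w = doc.Windows(1)
w.View.SeekView = 4  # wdSeekPrimaryFooter: move the selection into the footer
s = w.Selection
s.HeaderFooter.PageNumbers.StartingNumber = 1
s.HeaderFooter.PageNumbers.NumberStyle = 0  # wdPageNumberStyleArabic
s.WholeStory()  # select the entire footer
s.Delete()      # clear any existing footer content
s.HeaderFooter.PageNumbers.Add(4)  # wdAlignPageNumberOutside
s.MoveLeft(1, 2)  # Unit=1 (wdCharacter), Count=2
s.TypeText('— ')
s.MoveRight()
s.TypeText(' —')
s.Font.Name = '宋体'  # SimSun
s.Font.Size = 14
s.ParagraphFormat.LeftIndent = s.ParagraphFormat.RightIndent = 21  # points
# Remove bottom border from header style
header_style = doc.Styles('页眉')  # name of the built-in Header style in Chinese-language Word
header_style.ParagraphFormat.Borders(-3).LineStyle = 0  # wdBorderBottom, wdLineStyleNone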
</code>
Running this script applies the margins, fonts, grid and page-number format above to the opened document; check the result against the edition of the standard you are targeting.
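The script uses 28.35 points per centimetre, which is a rounded figure. The sketch below derives the same conversion from the exact definitions (72 points per inch, 2.54 cm per inch) and gathers the four margins in one place. The helper names are hypothetical, and page_setup is assumed to be a PageSetup object such as doc.PageSetup.

```python
POINTS_PER_INCH = 72.0
CM_PER_INCH = 2.54


def cm_to_points(cm):
    """Convert centimetres to points (1 inch = 72 pt = 2.54 cm)."""
    return cm * POINTS_PER_INCH / CM_PER_INCH


def margins_in_points(top, bottom, left, right):
    """Return the four page margins, given in cm, as a dict of point values."""
    return {
        "TopMargin": cm_to_points(top),
        "BottomMargin": cm_to_points(bottom),
        "LeftMargin": cm_to_points(left),
        "RightMargin": cm_to_points(right),
    }


def apply_margins(page_setup, margins):
    """Copy each margin onto a PageSetup object, keyed by property name."""
    for name, value in margins.items():
        setattr(page_setup, name, value)


# The margin values used in the script above, in centimetres
margins = margins_in_points(3.3, 3.3, 2.8, 2.6)
```

Calling apply_margins(doc.PageSetup, margins) would then replace the four margin assignments in the script. The difference from the rounded factor is a few thousandths of a point per centimetre, so this is mainly a readability choice.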