Python XML Parsing with xml.etree.ElementTree and xml.dom.minidom

This tutorial explains how to parse, query, modify, and delete XML data in Python using the built‑in xml.etree.ElementTree and xml.dom.minidom modules, with step‑by‑step code examples for reading files, handling strings, accessing elements and attributes, and writing updated XML back to disk.

Python Programming Learning Circle

Nov 10, 2023

Python provides two built‑in modules for working with XML: xml.etree.ElementTree (a lightweight tree API) and xml.dom.minidom (a minimal DOM implementation). Both can read XML from files or strings, navigate the hierarchical structure, and modify the document.

What is XML? XML (eXtensible Markup Language) is a markup language for representing structured data. It looks similar to HTML, but it is designed to store and exchange data between systems (for example, between a client and a server) rather than to display it.

The example file sample.xml is used throughout the tutorial:

<?xml version="1.0" encoding="UTF-8"?>
<metadata>
    <food>
        <item name="breakfast">Idly</item>
        <price>$2.5</price>
        <description>Two idly's with chutney</description>
        <calories>553</calories>
    </food>
    ... (other food items) ...
</metadata>

Using xml.etree.ElementTree

Parse a file with parse():

import xml.etree.ElementTree as ET
mytree = ET.parse('sample.xml')
myroot = mytree.getroot()
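The parse() call above assumes sample.xml already exists in the working directory. As a minimal sketch that runs anywhere, the snippet below first writes a trimmed copy of the sample document to a temporary directory and then parses it; the helper name load_root and the temporary path are illustrative assumptions, not part of the original tutorial.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# A trimmed copy of the tutorial's sample document (one food entry only).
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<metadata>
    <food>
        <item name="breakfast">Idly</item>
        <price>$2.5</price>
        <description>Two idly's with chutney</description>
        <calories>553</calories>
    </food>
</metadata>
"""


def load_root(path):
    """Parse the XML file at `path` and return (tree, root). Hypothetical helper."""
    tree = ET.parse(path)  # builds an ElementTree from the file
    return tree, tree.getroot()


# Write the sample to a temporary directory so the sketch does not depend
# on a file already being present.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.xml")
    with open(sample_path, "w", encoding="utf-8") as fh:
        fh.write(SAMPLE)
    tree, root = load_root(sample_path)
    root_tag = root.tag
    child_tags = [child.tag for child in root]

print(root_tag)    # metadata
print(child_tags)  # ['food']
```

Keeping the tree object around matters later: write() is a method of the tree, not of the root element.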

Parse a string with fromstring():

import xml.etree.ElementTree as ET
data = '''<?xml version="1.0" encoding="UTF-8"?>
<metadata>
    <food>
        <item name="breakfast">Idly</item>
        <price>$2.5</price>
        <description>Two idly's with chutney</description>
        <calories>553</calories>
    </food>
</metadata>'''
myroot = ET.fromstring(data)
print(myroot.tag)
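fromstring() raises xml.etree.ElementTree.ParseError when the text is not well-formed. A small sketch of guarding against that is shown here; the wrapper name parse_xml_text and the choice to return None on failure are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def parse_xml_text(text):
    """Return the root element, or None if `text` is not well-formed XML."""
    try:
        return ET.fromstring(text)
    except ET.ParseError as exc:
        print("Could not parse XML:", exc)
        return None


good = parse_xml_text("<metadata><food><price>$2.5</price></food></metadata>")
bad = parse_xml_text("<metadata><food></metadata>")  # <food> is never closed

print(good.tag)  # metadata
print(bad)       # None
```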

Access root element, child tags, and text:

print(myroot)                     # <Element 'metadata' at 0x...>
print(myroot[0].tag)               # food
for x in myroot[0]:
    print(x.tag, x.attrib)        # item {'name': 'breakfast'} etc.
for x in myroot[0]:
    print(x.text)                 # Idly, $2.5, Two idly's with chutney, 553
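Besides indexing and the attrib dictionary, an element offers get() for reading a single attribute with an optional default. The sketch below uses a shortened inline copy of the sample data; the attribute name cuisine is hypothetical and is only there to show the default being returned.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<metadata><food>"
    '<item name="breakfast">Idly</item>'
    "<price>$2.5</price>"
    "</food></metadata>"
)
food = root[0]  # first child of <metadata>
item = food[0]  # first child of <food>

name = item.get("name")               # attribute present
missing = item.get("cuisine", "n/a")  # attribute absent, so the default is returned

# Map each child tag of <food> to its text content.
summary = {child.tag: child.text for child in food}

print(name, missing)  # breakfast n/a
print(summary)        # {'item': 'Idly', 'price': '$2.5'}
```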

Find specific elements and attributes:

for x in myroot.findall('food'):
    item = x.find('item').text
    price = x.find('price').text
    print(item, price)
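The loop above fails with AttributeError if a food entry lacks an item or price child, because find() returns None in that case. One possible alternative is findtext() with a default, and find() also accepts a limited XPath-style predicate. In this sketch the second food entry ("Hypothetical Dish") is invented purely to show the missing-child case.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<metadata>"
    '<food><item name="breakfast">Idly</item><price>$2.5</price></food>'
    # Hypothetical second entry with no <price>, for illustration only.
    '<food><item name="lunch">Hypothetical Dish</item></food>'
    "</metadata>"
)

rows = []
for food in root.findall("food"):
    # findtext() returns the default instead of raising when the child is missing.
    item = food.findtext("item", default="(no item)")
    price = food.findtext("price", default="(no price)")
    rows.append((item, price))

# Limited XPath support: search all descendants for an attribute match.
breakfast = root.find(".//item[@name='breakfast']")

print(rows)            # [('Idly', '$2.5'), ('Hypothetical Dish', '(no price)')]
print(breakfast.text)  # Idly
```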

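Modify XML: add, update, or delete nodes. These snippets assume myroot is the root returned by mytree.getroot(). If myroot was rebound by the fromstring() example, re-run the parse() example first; otherwise mytree.write() saves the unmodified tree that was read from the file.

# Append to each description's text and add a new attribute
for description in myroot.iter('description'):
    new_desc = description.text + ' will be served'
    description.text = new_desc
    description.set('updated', 'yes')
mytree.write('new.xml')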

# Add a new sub‑element
ET.SubElement(myroot[0], 'speciality')
for x in myroot.iter('speciality'):
    x.text = 'South Indian Special'
mytree.write('output5.xml')

# Delete an attribute
myroot[0][0].attrib.pop('name', None)
mytree.write('output5.xml')

# Remove a child element
myroot[0].remove(myroot[0][0])
mytree.write('output6.xml')

# Reset an element: clear() removes all its children, attributes, and text
myroot[0].clear()
mytree.write('output7.xml')
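To inspect the effect of these edits without writing files, the same operations can be applied to an in-memory tree and serialized with ET.tostring(). This is a condensed sketch under that assumption; the inline document is a shortened copy of the sample, and the (desc.text or "") guard is an addition to cover elements with no text.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<metadata><food>"
    '<item name="breakfast">Idly</item>'
    "<description>Two idly's with chutney</description>"
    "</food></metadata>"
)
food = root[0]

# Update text and add an attribute; guard against an empty element (text is None).
desc = food.find("description")
desc.text = (desc.text or "") + " will be served"
desc.set("updated", "yes")

# Add a sub-element and set its text in one step.
ET.SubElement(food, "speciality").text = "South Indian Special"

# Delete an attribute, then remove the element itself from its parent.
item = food.find("item")
item.attrib.pop("name", None)
food.remove(item)

# Serialize to a str instead of writing to disk.
result = ET.tostring(root, encoding="unicode")
print(result)
```

Note that remove() must be called on the direct parent of the element being removed, which is why the sketch keeps a reference to food.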

Using xml.dom.minidom

Parse a file with parse():

from xml.dom import minidom
p1 = minidom.parse('sample.xml')
print(p1)

Parse a string with parseString():

p3 = minidom.parseString('<myxml>Using<empty/> parseString</myxml>')
print(p3)
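Printing the parsed object only shows that it is a Document. To look inside it, one option is to start from documentElement (the root element) and walk childNodes, as in this short sketch that reuses the string from the example above; the variable names are illustrative.

```python
from xml.dom import minidom

doc = minidom.parseString("<myxml>Using<empty/> parseString</myxml>")

root = doc.documentElement  # the <myxml> element
root_name = root.tagName

# Classify each direct child as text or element by its nodeType.
kinds = []
for node in root.childNodes:
    if node.nodeType == node.TEXT_NODE:
        kinds.append(("text", node.data))
    elif node.nodeType == node.ELEMENT_NODE:
        kinds.append(("element", node.tagName))

xml_text = doc.toxml()  # serialize the whole document back to a string

print(root_name)  # myxml
print(kinds)      # [('text', 'Using'), ('element', 'empty'), ('text', ' parseString')]
```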

Access elements by tag name:

dat = minidom.parse('sample.xml')
item_node = dat.getElementsByTagName('item')[0]
print(item_node)                     # <DOM Element: item at ...>
print(item_node.attributes['name'].value)   # breakfast
print(item_node.firstChild.data)            # Idly
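Indexing attributes['name'] raises KeyError when the attribute is absent, and firstChild is None for an empty element. A slightly more defensive sketch uses getAttribute() and hasAttribute() and joins the text nodes explicitly; the attribute name cuisine is hypothetical and the inline XML is a shortened copy of the sample.

```python
from xml.dom import minidom

doc = minidom.parseString(
    "<metadata><food>"
    '<item name="breakfast">Idly</item>'
    "</food></metadata>"
)
item = doc.getElementsByTagName("item")[0]

name = item.getAttribute("name")            # attribute present
cuisine = item.getAttribute("cuisine")      # absent: returns an empty string
has_cuisine = item.hasAttribute("cuisine")  # distinguishes absent from empty

# Join every direct text node instead of relying on firstChild alone.
text = "".join(
    node.data for node in item.childNodes if node.nodeType == node.TEXT_NODE
)

print(name, repr(cuisine), has_cuisine)  # breakfast '' False
print(text)                              # Idly
```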

Iterate over all items and count them:

items = dat.getElementsByTagName('item')
for x in items:
    print(x.firstChild.data)
print('Total items:', len(items))
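minidom can also modify a document. New nodes are created through the Document object and then attached with appendChild(). The sketch below works on an inline document so it runs without sample.xml; the second food entry ("Hypothetical Dish") is invented for illustration, and the speciality element mirrors the ElementTree example.

```python
from xml.dom import minidom

doc = minidom.parseString(
    "<metadata>"
    '<food><item name="breakfast">Idly</item></food>'
    # Hypothetical second entry, for illustration only.
    '<food><item name="lunch">Hypothetical Dish</item></food>'
    "</metadata>"
)

items = doc.getElementsByTagName("item")
names = [node.getAttribute("name") for node in items]
total = len(items)

# Create a new element with a text child and attach it to the first <food>.
first_food = doc.getElementsByTagName("food")[0]
speciality = doc.createElement("speciality")
speciality.appendChild(doc.createTextNode("South Indian Special"))
first_food.appendChild(speciality)

added_text = doc.getElementsByTagName("speciality")[0].firstChild.data

# Serialize with indentation for readability.
pretty = doc.toprettyxml(indent="  ")
print(names, total)  # ['breakfast', 'lunch'] 2
print(pretty)
```

To save the result, the string returned by toprettyxml() or toxml() can be written to a file opened with an explicit encoding.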
