
XML File Parsing Methods and a Groovy‑Based DOM4J Utility Class

The article explains four XML parsing approaches (DOM, SAX, JDOM, DOM4J), highlights the use of Groovy for scripting, provides a sample XML snippet, and presents a complete Groovy/Java utility class that leverages dom4j to parse XML files for backend applications.

FunTester

XML files can be parsed in four ways: DOM, SAX, JDOM, and DOM4J. The first two are the basic, platform-independent approaches provided by the official API, while the latter two are Java-specific libraries built on top of those basic approaches. The author has already implemented a DOM-based XML parsing class; this article presents a dom4j-based one.
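As a rough illustration of how the two basic approaches differ, the sketch below counts elements with the JDK's built-in DOM and SAX parsers. It is plain Java and uses only the standard javax.xml.parsers API; the class name, method names, and sample XML are assumptions made for this example and do not come from the FunTester project.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class DomVsSaxSketch {
    // Hypothetical sample document, not the XML file from the article.
    static final String SAMPLE = "<cases><case id=\"1\"/><case id=\"2\"/></cases>";

    // DOM: the whole document is loaded into an in-memory tree, then queried.
    static int countWithDom(String xml, String tag) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return doc.getElementsByTagName(tag).getLength();
    }

    // SAX: the parser streams through the input and fires a callback per element.
    static int countWithSax(String xml, String tag) throws Exception {
        int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if (qName.equals(tag)) {
                    count[0]++;
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countWithDom(SAMPLE, "case")); // prints 2
        System.out.println(countWithSax(SAMPLE, "case")); // prints 2
    }
}
```

DOM keeps the full tree available for random access at the cost of memory, while SAX holds nothing beyond what the handler chooses to record. dom4j and JDOM wrap these mechanisms in a more convenient, Java-friendly object model.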

The code is again written in Groovy; readers interested in the language can refer to the article "From Java to Groovy Evolution". More advanced topics, such as using Groovy in JMeter to run Java-style (that is, Groovy) scripts, are covered on the public account.

Below is a truncated example of the XML file being parsed:

...

The following utility class, written in Groovy/Java, uses dom4j to read and traverse the XML structure, converting each element into a NodeInfo object with attributes and child nodes:

package com.fun.utils.xml

import com.fun.base.exception.FailException
import com.fun.frame.SourceCode
import org.dom4j.*
import org.dom4j.io.SAXReader
import org.slf4j.Logger
import org.slf4j.LoggerFactory

/**
 * Utility class based on dom4j for parsing XML files
 */
class XMLUtil2 extends SourceCode {
    private static Logger logger = LoggerFactory.getLogger(XMLUtil2.class)

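    /**
     * Parses the XML at a local file path or http(s) URL and returns
     * one NodeInfo per direct child element of the root.
     */
    static List<NodeInfo> parse(String path) {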
        SAXReader reader = new SAXReader()
        try {
            Document document = reader.read(path.startsWith("http") ? new URL(path) : new File(path))
            Element rootElement = document.getRootElement()
            def iterator = rootElement.elementIterator()
            List<NodeInfo> info = new ArrayList<>()
            while (iterator.hasNext()) {
                info << parseNode(iterator.next() as Element)
            }
            return info
        } catch (DocumentException e) {
            logger.error("Failed to parse file ${path}!", e)
        }
        FailException.fail("Failed to parse file ${path}!")
    }

    static NodeInfo parseNode(Element e) {
        if (e.getNodeType() != Node.ELEMENT_NODE) return null
        def info = new NodeInfo()
        List<Attribute> attributes = e.attributes()
        List<Attr> attrs = new ArrayList<>()
        attributes.each {
            attrs << new Attr(it.name, it.value)
        }
        info.setAttrs(attrs)
        List<NodeInfo> children = new ArrayList<>()
        def iterator = e.elementIterator()
        // Recurse into every child element, not only the first one
        while (iterator.hasNext()) {
            children << parseNode(iterator.next() as Element)
        }
        info.setChildren(children)
        return info
    }
}
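The NodeInfo and Attr classes used by the utility are not shown in the article, so the class cannot be run from this text alone. As a self-contained approximation of the same recursive idea, the following Java sketch walks a document with the JDK's DOM API instead of dom4j. SimpleNode is a hypothetical stand-in for NodeInfo, and the sample XML is invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class NodeTreeSketch {
    // Hypothetical stand-in for the article's NodeInfo, whose source is not shown.
    static class SimpleNode {
        final String name;
        final Map<String, String> attrs = new LinkedHashMap<>();
        final List<SimpleNode> children = new ArrayList<>();

        SimpleNode(String name) {
            this.name = name;
        }
    }

    // Convert one element, then recurse into each of its child elements.
    static SimpleNode toNode(Element e) {
        SimpleNode node = new SimpleNode(e.getTagName());
        NamedNodeMap attributes = e.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Node a = attributes.item(i);
            node.attrs.put(a.getNodeName(), a.getNodeValue());
        }
        NodeList kids = e.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            // Skip text, comment, and other non-element nodes.
            if (kids.item(i) instanceof Element) {
                node.children.add(toNode((Element) kids.item(i)));
            }
        }
        return node;
    }

    // Parse an XML string and convert the tree starting at the root element.
    static SimpleNode parse(String xml) throws Exception {
        Element root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        return toNode(root);
    }

    public static void main(String[] args) throws Exception {
        SimpleNode root = parse(
                "<suite name=\"demo\"><case id=\"1\"><step/></case><case id=\"2\"/></suite>");
        System.out.println(root.name + " has " + root.children.size() + " children");
        // prints: suite has 2 children
    }
}
```

Unlike the utility class, which returns a list built from the root's child elements, this sketch returns the root itself; either shape works as long as callers know where traversal starts.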

Future optimizations and improvements are planned; the latest code can be found at the author's GitHub repository (https://github.com/JunManYuanLong/FunTester). The console output is split into header and footer sections, accompanied by illustrative images.

The article was originally published on the "FunTester" public account, an original content sharing platform recommended by Tencent Cloud, Juejin, and Zhihu.
