Understanding Selenium WebDriver, HTML Fundamentals, and Element Locators (XPath & CSS Selectors)
This article explains the distinction between selenium‑webdriver and browser drivers, introduces HTML basics and element attributes, and details Selenium's element‑locating APIs with practical examples of XPath and CSS selector strategies for UI automation.
In my current work I use pytest+selenium as the UI automation framework. Along the way I learned that selenium‑webdriver is a client library that exposes a programming API for controlling browsers, while a webdriver (browser driver) is the driver software provided by the browser vendor, which implements the W3C WebDriver protocol.
The W3C WebDriver protocol defines a remote‑control interface that allows scripts to operate browsers across platforms and languages, effectively giving us a programmable backdoor to manipulate the DOM.
In summary, Selenium commands drive the webdriver, which in turn interacts with the browser engine. Because browsers differ in how they handle elements, each browser has its own driver, and selenium‑webdriver must account for each one.
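To make this layering concrete, the sketch below builds (but does not send) the HTTP request a client library would issue for a "find element" command under the W3C WebDriver protocol. The endpoint path and the `using`/`value` body fields follow the specification; the helper name `build_find_element_request`, the session id, and the selector are illustrative assumptions.

```python
import json

# Location strategies defined by the W3C WebDriver specification.
W3C_STRATEGIES = {"css selector", "link text", "partial link text", "tag name", "xpath"}


def build_find_element_request(session_id, using, value):
    """Return (method, path, body) for a W3C 'Find Element' command.

    Hypothetical helper: it only assembles the request and never
    opens a network connection.
    """
    if using not in W3C_STRATEGIES:
        raise ValueError(f"unsupported location strategy: {using!r}")
    body = json.dumps({"using": using, "value": value})
    return "POST", f"/session/{session_id}/element", body


method, path, body = build_find_element_request("abc123", "css selector", "#kw")
print(method, path, body)
```

A browser driver listens for requests of this shape and translates them into browser-specific operations, which is why the same Selenium script can target different browsers.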
HTML Overview
UI automation operates on HTML documents, so a quick recap of HTML is useful. HTML consists of tags, attributes, and text. Example tags include <!DOCTYPE HTML>, <!--...-->, <html>, and <body>.
Tags are enclosed in angle brackets and usually appear in pairs (e.g., <html>…</html>), but some, such as <br> or <hr>, are void elements with no closing tag. An element is a start tag together with its content and end tag, and element attributes are key‑value pairs written inside the start tag.
Boolean attributes such as disabled can be declared without a value to indicate a disabled input field.
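The tag and attribute structure described above can be inspected with Python's standard `html.parser` module. This is a minimal sketch: the class name `AttributeCollector` and the HTML snippet are made up for illustration. It shows that a boolean attribute written without a value is reported with the value `None`.

```python
from html.parser import HTMLParser


class AttributeCollector(HTMLParser):
    """Record every start tag together with its attributes."""

    def __init__(self):
        super().__init__()
        self.elements = []  # list of (tag, {attribute: value}) tuples

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for a
        # boolean attribute written without a value, e.g. `disabled`.
        self.elements.append((tag, dict(attrs)))


# Hypothetical snippet, not taken from a real page.
SNIPPET = '<form><input id="kw" name="wd" disabled><br></form>'

collector = AttributeCollector()
collector.feed(SNIPPET)
for tag, attributes in collector.elements:
    print(tag, attributes)
```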
<!DOCTYPE HTML>   # DOCTYPE indicates an HTML document
<!--...-->        # HTML comment
<html>            # Root element
<body>            # Body content
Selenium API
Selenium provides many locating APIs based on different element attributes. The most common methods are:
| Locator | Single‑element API | Multiple‑element API | Example |
|---|---|---|---|
| id | find_element_by_id() | find_elements_by_id() | driver.find_element_by_id("result_logo") |
| name | find_element_by_name() | find_elements_by_name() | driver.find_element_by_name("f") |
| class_name | find_element_by_class_name() | find_elements_by_class_name() | driver.find_element_by_class_name("fm") |
| tag_name | find_element_by_tag_name() | find_elements_by_tag_name() | driver.find_element_by_tag_name("a") |
| link_text | find_element_by_link_text() | find_elements_by_link_text() | driver.find_element_by_link_text("index") |
| partial_link_text | find_element_by_partial_link_text() | find_elements_by_partial_link_text() | driver.find_element_by_partial_link_text("in") |
| xpath | find_element_by_xpath() | find_elements_by_xpath() | |
| css selector | find_element_by_css_selector() | find_elements_by_css_selector() | |
find_element returns the first matching element and raises NoSuchElementException if nothing matches; find_elements returns a list of all matches, which is simply empty if nothing matches. Because unique attributes are rare, XPath or CSS selectors are often preferred.
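Selenium 4 deprecated the `find_element_by_*` helpers and removed them in 4.3, in favour of `find_element(By.<STRATEGY>, value)` and `find_elements(...)`. The sketch below uses that newer call shape and relies on `find_elements` returning an empty list to implement an optional lookup without exception handling. `find_first_or_none` and `StubDriver` are hypothetical names. The stub stands in for a real browser session so the example runs anywhere, and the strategy strings are the values of Selenium's `By.ID` and `By.CSS_SELECTOR` constants.

```python
# String values of selenium.webdriver.common.by.By.ID and By.CSS_SELECTOR.
BY_ID = "id"
BY_CSS_SELECTOR = "css selector"


def find_first_or_none(driver, by, value):
    """Return the first matching element, or None when nothing matches.

    Relies on find_elements() returning an empty list instead of
    raising NoSuchElementException.
    """
    matches = driver.find_elements(by, value)
    return matches[0] if matches else None


class StubDriver:
    """Hypothetical stand-in for a WebDriver session (no browser needed)."""

    def __init__(self, page):
        self._page = page  # maps (by, value) -> list of fake elements

    def find_elements(self, by, value):
        return list(self._page.get((by, value), []))


driver = StubDriver({(BY_ID, "kw"): ["<search box>"]})
print(find_first_or_none(driver, BY_ID, "kw"))
print(find_first_or_none(driver, BY_CSS_SELECTOR, "#missing"))

# With a real session the call is the same, for example:
#   from selenium import webdriver
#   from selenium.webdriver.common.by import By
#   driver = webdriver.Chrome()
#   box = find_first_or_none(driver, By.ID, "kw")
```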
Best locating practices:
Avoid deep hierarchical XPath; prefer relative paths.
Prefer CSS selectors over XPath where either would work: they are usually shorter and are often reported to be faster, although the difference depends on the browser and is frequently small.
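The first recommendation can be illustrated without a browser. The sketch below uses the limited XPath subset in Python's `xml.etree.ElementTree` rather than a real WebDriver session, and both page versions are hypothetical. It shows a position-based path breaking after a small layout change while an attribute-based path keeps working.

```python
import xml.etree.ElementTree as ET

# Two hypothetical versions of the same page; the second wraps the
# form in one extra <div>, as a routine layout change might.
BEFORE = "<html><body><div><form><input id='kw'/></form></div></body></html>"
AFTER = "<html><body><div><div><form><input id='kw'/></form></div></div></body></html>"

# ElementTree paths are evaluated from the root element (<html>), so
# "./body/..." plays the role of "/html/body/..." in full XPath.
ABSOLUTE = "./body/div[1]/form/input[1]"
RELATIVE = ".//input[@id='kw']"

for label, markup in (("before", BEFORE), ("after", AFTER)):
    root = ET.fromstring(markup)
    absolute_hit = root.find(ABSOLUTE) is not None
    relative_hit = root.find(RELATIVE) is not None
    print(label, "absolute:", absolute_hit, "relative:", relative_hit)
```

In this sketch the position-based path stops matching once the extra wrapper appears, while the attribute-based path matches in both versions.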
XPath Locating
XPath (XML Path Language) works on HTML because HTML's tree structure mirrors XML. Nodes represent elements, attributes, or text. XPath expressions can be absolute (a full path starting from the root) or relative (starting with //, which matches at any depth in the document).
Examples:
/html/body/div[1]/div[1]/div[5]/div/div/form/span/input[1]   # absolute path
//input[@id='kw']   # relative path
Predicates inside [] filter node sets and can combine conditions with the logical operators and and or; the union operator | merges the results of two path expressions. Axes such as parent, child, ancestor, and descendant allow navigation relative to the current node.
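Predicates can be tried out without a browser using the XPath subset in Python's `xml.etree.ElementTree`. That subset covers attribute tests, chained predicates, and positions such as `last()-1`, but not `and`, `or`, `|`, or most axes, so this is only a partial sketch. The form markup is a hypothetical example.

```python
import xml.etree.ElementTree as ET

# Hypothetical form markup used only for this demonstration.
FORM = ET.fromstring(
    "<form>"
    "<div class='row'><input name='wd' type='text'/></div>"
    "<div class='row'><input name='btn' type='submit'/></div>"
    "<div class='footer'/>"
    "</form>"
)

# Attribute predicate: inputs whose type attribute equals 'text'.
text_inputs = FORM.findall(".//input[@type='text']")

# Chained predicates: both conditions must hold for a match.
chained = FORM.findall(".//input[@name='wd'][@type='text']")

# Positional predicate: the second-to-last <div> child of the form.
second_to_last = FORM.find("./div[last()-1]")

print([e.get("name") for e in text_inputs])
print([e.get("name") for e in chained])
print(second_to_last.find("input").get("name"))
```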
Example using axes:
//form/div[last()-1]/ancestor::div[@class='modal-content']
CSS Selector Locating
CSS selectors are another powerful way to locate elements. They are often cited as faster than XPath because browsers evaluate CSS selectors natively as part of their rendering engine.
| Selector | Example | Meaning |
|---|---|---|
| element | $('input') | All input elements |
| #id | $('#kw') | Element with id="kw" |
| .class | $('.s_ipt') | Elements with class s_ipt |
| [attribute] | $('[type]') | Elements that have a type attribute |
| [attribute=value] | $('[name="wd"]') | Elements where name="wd" |
| e1>e2 | $('span>input') | input that is a direct child of a span |
| e1 e2 | $('a div') | div that is a descendant of an a |
| e1+e2 | $('div+a') | a immediately following a div |
| e1:nth-child(n) | $('span>span:nth-child(1)') | span that is the first child of a span |
Examples in Selenium code:
find_element_by_css_selector("input#kw")   # input with id="kw"
find_element_by_css_selector("input.s_ipt")   # input with class="s_ipt"
find_element_by_css_selector('a[src$=".pdf"]')   # <a> whose src ends with .pdf
find_element_by_css_selector("[name='wd'][autocomplete='off']")   # element with two specific attributes
About the Author
The author, Ze Yang, is a DevOps practitioner who shares enterprise‑level DevOps operations and development techniques, focusing on Linux, automation, and related courses.
DevOps Cloud Academy
Exploring industry DevOps practices and technical expertise.