Understanding Selenium and Appium: History, Architecture, and WebDriver Protocol
This article explains the origins, evolution, and technical architecture of Selenium and Appium, detailing their shared WebDriver foundation, the JSON Wire Protocol, code examples, and how client, service, and browser components interact during automated web and mobile testing.
Selenium and Appium are the most frequently used tools for web and mobile automation testing, respectively, and are encountered daily by testers.
1. Introduction
Two short histories set the scene. Selenium began in 2004, when Jason Huggins launched Selenium 1.0 with the Selenium RC module. In 2006 Simon Stewart started the WebDriver project as a competitor, in 2011 Selenium 2.0 merged Selenium 1.0 and WebDriver, and in 2016 Selenium 3.0 removed RC entirely. Appium's story starts in 2011 with Dan Cuellar's iOSAuto project. Jason Huggins advised the project in 2012, which led to its renaming to Appium, and Appium 1.0 was released in 2014, extending the WebDriver protocol.
Jason Huggins, who founded Selenium, was also involved in Appium's early days, which helps explain why Selenium and Appium both operate via WebDriver.
WebDriver uses native browser APIs, providing higher speed and stability, but each browser implements its own driver (e.g., FirefoxDriver, ChromeDriver) due to differences in element handling.
2. Understanding
WebDriver is standardized by the W3C, and that standard grew out of the Selenium project's JSON Wire Protocol (JSONWP). Every browser-specific driver implements the protocol and exposes it as a web service listening on a port. Selenium and Appium communicate with that service using JSONWP, which defines the JSON-based messages exchanged between client and driver.
JSON (JavaScript Object Notation) is a lightweight, language‑independent data‑exchange format with a clear hierarchical structure, making it ideal for transmitting test commands.
JSONWP is a predefined communication protocol that exposes a RESTful API. The Mobile JSON Wire Protocol (MJSONWP) extends JSONWP with mobile-specific actions, which is how Appium builds on WebDriver.
Since both Selenium and Appium are based on WebDriver, the article uses Selenium as an example.
2.1 Architecture
Execution is performed by WebDriver and can be divided into two main service functions: receiving (translating) requests and executing commands.
1> The service receives a request, translates it into a command, and returns the result.
2> The service sends the translated command to the browser for execution.
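The two functions above can be sketched in a few lines of Java. This is only an illustration of the translate-then-execute split, not the code of any real driver: the class ServiceSketch, the FakeBrowser stand-in, and the command names "get" and "getCurrentUrl" are all assumptions made for the example.

```java
// Illustrative only: a toy service that translates requests and executes commands.
class ServiceSketch {

    // Hypothetical stand-in for a real browser; it only remembers the current URL.
    static final class FakeBrowser {
        String currentUrl = "about:blank";
    }

    // Function 1: translate an HTTP method and path into a command name.
    static String translate(String method, String path) {
        boolean isUrlPath = path.matches("/session/[^/]+/url");
        if (method.equals("POST") && isUrlPath) {
            return "get";            // navigate to a URL
        }
        if (method.equals("GET") && isUrlPath) {
            return "getCurrentUrl";  // read the current URL
        }
        throw new IllegalArgumentException("No command for " + method + " " + path);
    }

    // Function 2: execute the translated command against the browser
    // and return the value that would go back to the client.
    static String execute(String command, String argument, FakeBrowser browser) {
        switch (command) {
            case "get":
                browser.currentUrl = argument;
                return "";
            case "getCurrentUrl":
                return browser.currentUrl;
            default:
                throw new IllegalArgumentException("Unsupported command: " + command);
        }
    }

    public static void main(String[] args) {
        FakeBrowser browser = new FakeBrowser();
        String command = translate("POST", "/session/example-session-id/url");
        execute(command, "http://www.google.com", browser);
        System.out.println(execute(translate("GET", "/session/example-session-id/url"), null, browser));
    }
}
```

Keeping translation and execution as separate steps is what lets the same HTTP-facing layer sit in front of very different browsers.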
Requests are sent via HttpClient with a JSON‑formatted body that tells the browser what to do. The client uses a WebElement object to interact with page elements.
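To show what such a request might look like, the sketch below builds (but does not send) an HTTP POST with a JSON body using the JDK's java.net.http API (Java 11 or later). Selenium ships its own HTTP client classes, so this is not its actual code; the class name NavigateRequestSketch, the driver address http://localhost:9515, and the session ID are assumptions made for the example.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Illustrative only: not Selenium's actual client code.
class NavigateRequestSketch {

    // Builds the JSON body for a "navigate to URL" command.
    // Assumes the URL contains no characters that need JSON escaping.
    static String navigateBody(String url) {
        return "{\"url\":\"" + url + "\"}";
    }

    // Builds a POST request to <baseUrl>/session/<sessionId>/url.
    static HttpRequest navigateRequest(String baseUrl, String sessionId, String url) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/session/" + sessionId + "/url"))
                .header("Content-Type", "application/json; charset=utf-8")
                .POST(HttpRequest.BodyPublishers.ofString(navigateBody(url)))
                .build();
    }

    public static void main(String[] args) {
        // Hypothetical driver address and session ID.
        HttpRequest request = navigateRequest(
                "http://localhost:9515", "example-session-id", "http://www.google.com");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(navigateBody("http://www.google.com"));
    }
}
```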
The CommandExecutor maintains a map that converts simple command keys into URLs, following REST principles. The relevant code snippet is:
nameToUrl = ImmutableMap.builder()
    .put(NEW_SESSION, post("/session"))
    .put(QUIT, delete("/session/:sessionId"))
    .put(GET_CURRENT_WINDOW_HANDLE, get("/session/:sessionId/window_handle"))
    .put(GET_WINDOW_HANDLES, get("/session/:sessionId/window_handles"))
    .put(GET, post("/session/:sessionId/url"))
    // The Alert API is still experimental and should not be used.
    .put(GET_ALERT, get("/session/:sessionId/alert"))
    .put(DISMISS_ALERT, post("/session/:sessionId/dismiss_alert"))
    .put(ACCEPT_ALERT, post("/session/:sessionId/accept_alert"))
    .put(GET_ALERT_TEXT, get("/session/:sessionId/alert_text"))
    .put(SET_ALERT_VALUE, post("/session/:sessionId/alert_text"))
    // ... remaining commands omitted from this excerpt ...
    .build();
2.2 Process
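Before any request can be sent, a command name has to be turned into a concrete HTTP method and path. The sketch below imitates that lookup with a plain java.util.Map and fills in the :sessionId placeholder. The class CommandUrlSketch, the Route helper, and the string keys are hypothetical stand-ins for illustration; they are not Selenium's actual types.

```java
import java.util.Map;

// Illustrative only: a simplified stand-in for the command-to-URL map.
class CommandUrlSketch {

    // An HTTP method plus a path template such as "/session/:sessionId/url".
    static final class Route {
        final String method;
        final String template;

        Route(String method, String template) {
            this.method = method;
            this.template = template;
        }
    }

    // A few entries mirroring the excerpt; the string keys are hypothetical.
    static final Map<String, Route> NAME_TO_URL = Map.of(
            "newSession", new Route("POST", "/session"),
            "quit", new Route("DELETE", "/session/:sessionId"),
            "get", new Route("POST", "/session/:sessionId/url"),
            "getAlertText", new Route("GET", "/session/:sessionId/alert_text"));

    // Resolves a command name to "METHOD /concrete/path".
    static String resolve(String command, String sessionId) {
        Route route = NAME_TO_URL.get(command);
        if (route == null) {
            throw new IllegalArgumentException("Unknown command: " + command);
        }
        return route.method + " " + route.template.replace(":sessionId", sessionId);
    }

    public static void main(String[] args) {
        System.out.println(resolve("get", "example-session-id"));
        System.out.println(resolve("quit", "example-session-id"));
    }
}
```

Note that the GET command, which navigates to a page, maps to an HTTP POST: the command name and the HTTP verb are independent of each other.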
Typical usage example:
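WebDriver driver = new ChromeDriver();
driver.get("http://www.google.com");
The client, service, and browser are tied together by a sessionId. The service generates it when the session is created, and the client then includes it in every subsequent request so that the service can route the command to the right browser. When a new WebDriver instance starts, the driver service is launched alongside the test code (for ChromeDriver, as a separate chromedriver process).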
Example request and response flow:
The client sends a POST request to /session/.../url with the JSON body {"url":"http://google.com"}.
The server translates the request, forwards it to the browser, and returns a JSON response such as {"sessionId":"285b12e4-2b8a-4fe6-90e1-c35cba245956","name":"get","status":0,"value":""}, where sessionId identifies the session, name indicates the method, status shows whether execution succeeded, and value holds the result.
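To make the four response fields concrete, the sketch below pulls them out of the example response and treats a status of 0 as success. Because the JDK has no built-in JSON parser, it uses a deliberately naive regular expression that only copes with flat objects holding plain string or integer values. The class ResponseSketch and its helper methods are hypothetical; a real client would use a proper JSON library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: naive field extraction from a flat JSON response.
class ResponseSketch {

    // Returns the value of a top-level field without surrounding quotes,
    // or null if the field is absent. No nesting, no escaped quotes.
    static String field(String json, String name) {
        Pattern pattern = Pattern.compile(
                "\"" + Pattern.quote(name) + "\"\\s*:\\s*(\"([^\"]*)\"|-?\\d+)");
        Matcher matcher = pattern.matcher(json);
        if (!matcher.find()) {
            return null;
        }
        // Group 2 is set for string values, group 1 alone for integers.
        return matcher.group(2) != null ? matcher.group(2) : matcher.group(1);
    }

    // A status of 0 means the command executed successfully.
    static boolean succeeded(String json) {
        return "0".equals(field(json, "status"));
    }

    public static void main(String[] args) {
        String response = "{\"sessionId\":\"285b12e4-2b8a-4fe6-90e1-c35cba245956\","
                + "\"name\":\"get\",\"status\":0,\"value\":\"\"}";
        System.out.println("session: " + field(response, "sessionId"));
        System.out.println("command: " + field(response, "name"));
        System.out.println("success: " + succeeded(response));
    }
}
```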
360 Quality & Efficiency
360 Quality & Efficiency focuses on seamlessly integrating quality and efficiency in R&D, sharing 360’s internal best practices with industry peers to foster collaboration among Chinese enterprises and drive greater efficiency value.